
hashcode

 
Greenhorn
K&B states the following:


In real-life hashing, it’s not uncommon to have more than one entry in a bucket.



which basically means that two objects can have the same hash code


The default hashCode method in class Object virtually always comes up with a unique number (hashcode) for each object



which contradicts the above quote.

Is the word "virtually" playing some trick here?
 
author
Sheriff

ikneet singh wrote:


In real-life hashing, it’s not uncommon to have more than one entry in a bucket.



which basically means that two objects can have the same hash code



No. There isn't a one-to-one mapping between buckets and hash codes; if there were, you would quickly run out of memory. So objects with different hash codes can share the same bucket.
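To see this concretely, here is a small sketch (a simplification: the real HashMap also "fiddles" with the hash before masking) showing two distinct String hash codes landing in the same bucket when the table has the default capacity of 16:

```java
public class SharedBucket {
    public static void main(String[] args) {
        int capacity = 16;                       // typical default HashMap capacity
        int h1 = "a".hashCode();                 // 97
        int h2 = "q".hashCode();                 // 113
        // Different hash codes...
        System.out.println(h1 == h2);            // false
        // ...but the same bucket index, because they agree in their low four bits.
        System.out.println((h1 & (capacity - 1)) == (h2 & (capacity - 1))); // true
    }
}
```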

Henry
 
Bartender
That is correct. The default hashCode() method will produce unique hash codes with high probability, but uniqueness is not guaranteed.
 
Henry Wong
author
Sheriff

ikneet singh wrote:


The default hashCode method in class Object virtually always comes up with a unique number (hashcode) for each object



which contradicts the above quote.

Is the word "virtually" playing some trick here?



It means that the hash code generator for the Object class will try to generate a unique hash code for each object, but it isn't guaranteed to succeed. You can have two distinct Object instances with the same hash code.

Henry
 
Marshal
If you have 4294967297 instances of a particular class, please check whether they can all have different hash codes. (They can't: a hash code is a 32-bit int, so there are only 4294967296 possible values, and by the pigeonhole principle at least two instances must share one.)
What happens inside a hash map implementation is that the following formula is applied, after some fiddling with the hash code:-
buckets[h & (c - 1)]
…where h is the hash code and c is the capacity of the backing array. That formula works really nicely as long as c is an exact power of 2: you use the rightmost however-many bits of the (fiddled) hash code to determine the bucket to put the key into. When the array is nearly full (look up the load factor), the backing array is doubled in size and all the K-V pairs are redistributed into the new buckets.
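The masking step can be tried out directly. This is a sketch, not the JDK source; in the real HashMap the "fiddling" is a spreading of the high bits (h ^ (h >>> 16)) before the mask is applied:

```java
public class IndexDemo {
    // Assumes c is an exact power of two, as in the post above; then
    // h & (c - 1) keeps only the rightmost bits of h, so the result is
    // always a valid index, even when the hash code is negative.
    static int bucketIndex(int h, int c) {
        return h & (c - 1);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(42, 16));  // 10 (42 = 0b101010, low 4 bits = 1010)
        System.out.println(bucketIndex(-7, 16));  // 9 (the mask handles negative values)
    }
}
```

Note that the plain remainder operator would not work here: in Java, -7 % 16 is -7, which is why the bitwise mask is used.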
 
Campbell Ritchie
Marshal
I merged your post with the following thread. I hope that is okay with you.
 
ikneet singh
Greenhorn


Even if the two instances happen to hash to the same bucket, the get()
method will almost certainly return null, as HashMap has an optimization that
caches the hash code associated with each entry and doesn’t bother checking for
object equality if the hash codes don’t match.



Why return null if the bucket is the same? Can't the equals() method be used to match the objects in the bucket?
What is this "optimization" and "caching" that the HashMap carries out? Will all hash-based collections perform this optimization?
 
Bartender
You need to quote your sources. I am afraid you have only an incomplete piece of the explanation.
 
ikneet singh
Greenhorn
The quote is from Effective Java, 2nd edition, Item 9: Always override hashCode when you override equals.

Effective Java wrote:




Suppose you attempt to use this class with a HashMap:



At this point, you might expect m.get(new PhoneNumber(707, 867, 5309)) to
return "Jenny", but it returns null. Notice that two PhoneNumber instances are
involved: one is used for insertion into the HashMap, and a second, equal, instance
is used for (attempted) retrieval. The PhoneNumber class’s failure to override
hashCode causes the two equal instances to have unequal hash codes, in violation
of the hashCode contract. Therefore the get method is likely to look for the phone
number in a different hash bucket from the one in which it was stored by the put
method.
Even if the two instances happen to hash to the same bucket, the get()
method will almost certainly return null, as HashMap has an optimization that
caches the hash code associated with each entry and doesn’t bother checking for
object equality if the hash codes don’t match.



Why return null if the bucket is the same? Can't the equals() method be used to match the objects in the bucket?
What is this "optimization" and "caching" that the HashMap carries out? Will all hash-based collections perform this optimization?
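For context, here is a hedged reconstruction of the situation the book describes (the book's own code is not in the quote, so the fields and constructor below are my assumption): a class that overrides equals() but not hashCode(), so two "equal" instances almost always land in different buckets.

```java
import java.util.HashMap;
import java.util.Map;

public class PhoneNumber {
    private final int area, prefix, line;

    public PhoneNumber(int area, int prefix, int line) {
        this.area = area; this.prefix = prefix; this.line = line;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof PhoneNumber)) return false;
        PhoneNumber p = (PhoneNumber) o;
        return p.area == area && p.prefix == prefix && p.line == line;
    }

    // hashCode() deliberately NOT overridden: each instance gets an
    // identity-based hash from Object.hashCode().

    public static void main(String[] args) {
        Map<PhoneNumber, String> m = new HashMap<>();
        m.put(new PhoneNumber(707, 867, 5309), "Jenny");
        // A second, equal instance is used for the lookup, but it has a
        // different identity hash code, so this almost certainly prints null.
        System.out.println(m.get(new PhoneNumber(707, 867, 5309)));
    }
}
```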
 
Campbell Ritchie
Marshal
Please don't create two threads for one question.

The bucket will not be the same (unless you are very lucky). The hash code is used to find the bucket; if the wrong bucket is found then no value will be found. The reason for not only using equals() is given in the same chapter of Effective Java™. Without using hash codes to find the right bucket, the performance of a hash map will reduce to that of a linked list because every K‑V pair may need to be explored until a match is found.
 
Rancher
It is saying that HashMap has three steps.

1. Find the "bucket" based on the hash.
2. Cycle through the entries comparing the hashes.
3. For any that match, then do an equals.

So even if the object happens, by accident, to hash into the same bucket, the lookup will still fail if the full hash codes don't match, even if the objects would otherwise be classed as equal.
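The three steps above can be sketched as a toy map. This is an illustration only, not the real JDK source (the real HashMap also spreads the hash bits, resizes, and treeifies long chains), but each node caches its hash so equals() runs only when the hashes match:

```java
public class MiniMap<K, V> {
    private static class Node<K, V> {
        final int hash; final K key; final V value; Node<K, V> next;
        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    private final Node<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    MiniMap(int capacity) {                      // capacity must be a power of two
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    void put(K key, V value) {
        int h = key.hashCode();
        int i = h & (buckets.length - 1);        // step 1: find the bucket
        buckets[i] = new Node<>(h, key, value, buckets[i]);  // hash cached in the node
    }

    V get(K key) {
        int h = key.hashCode();
        for (Node<K, V> n = buckets[h & (buckets.length - 1)]; n != null; n = n.next) {
            // step 2: compare cached hashes; step 3: equals() only if they match
            if (n.hash == h && key.equals(n.key)) return n.value;
        }
        return null;
    }

    public static void main(String[] args) {
        MiniMap<String, Integer> m = new MiniMap<>(16);
        m.put("a", 1);
        System.out.println(m.get("a")); // 1
        System.out.println(m.get("b")); // null
    }
}
```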
 
salvin francis
Bartender
Let me demonstrate with a small example.
Let's say you have an Employee object with a name property, and you decide that the hash code is the first character of the employee's name. So "Abraham" hashes to 65 (since it starts with 'A').
Do you think this will be a valid hash code for all the millions of employees in your database?

Do you think that a hash map containing more than 26 employees won't work?

The answer: yes, this is a very poor implementation, but it is definitely a valid hash code.
And even a hash map of 1000 employees will still function correctly
(assuming that equals() is properly implemented, for example on employee id).
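The first-letter scheme above, written out as code (the Employee class and its fields are hypothetical, of course). It is legal because equal objects always get equal hash codes; it is poor because every name starting with the same letter collides into one bucket chain:

```java
public class Employee {
    final String id, name;

    Employee(String id, String name) { this.id = id; this.name = name; }

    // Valid but poor: only 26-ish distinct values for millions of employees.
    @Override public int hashCode() { return name.charAt(0); }  // 'A' -> 65

    // Equality is based on the id, as suggested above.
    @Override public boolean equals(Object o) {
        return o instanceof Employee && ((Employee) o).id.equals(id);
    }

    public static void main(String[] args) {
        System.out.println(new Employee("1", "Abraham").hashCode()); // 65
        System.out.println(new Employee("2", "Alice").hashCode());   // 65 (collision)
        // Lookups still work: the colliding entries are resolved by equals().
    }
}
```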
 
Bartender

ikneet singh wrote:why return null if the bucket is the same? can't the equals() method be used to match the objects in the bucket?


It IS; but (as others have said) the bucket will not be the same unless you're very lucky, because in the absence of a custom hashCode() method you get the result of Object.hashCode() - an identity-based value that is similar to (but not necessarily the same as) the object's memory address.

So if you don't write your own hashCode() method, the likelihood of two different objects returning the same hashcode is very remote, even if they are "equal".

And that's why it's so important to implement the two methods together: because they are defined that way.

Specifically: if a.equals(b) returns true, then a.hashCode() and b.hashCode() MUST return the same value - and if there is any situation where they don't, then all bets are off.
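A minimal sketch of implementing the two methods together so that contract holds: both methods are computed from the same fields, so equal objects can never disagree on their hash codes.

```java
import java.util.Objects;

public class Point {
    final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }

    // Built from the same fields as equals(), satisfying the contract.
    @Override public int hashCode() { return Objects.hash(x, y); }

    public static void main(String[] args) {
        Point a = new Point(1, 2), b = new Point(1, 2);
        System.out.println(a.equals(b));                    // true
        System.out.println(a.hashCode() == b.hashCode());   // true, as required
    }
}
```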

what is this "optimization" and "caching" that the HashMap will carry out ? Will all hash based collections work out this optimization ?


Maybe not, but if what you read is indeed true, all HashMaps will.

TBH, I'm not aware of any caching in HashMap - because the structure itself is actually a form of cache - but I'm pretty sure that the class does do further massaging to the hashcode that you generate, and then masks the result according to the current size of the map.

However, this really isn't stuff you need to worry about. The whole point of using classes like HashMap is that they've (hopefully) been written by someone who has forgotten more about hashmaps than you (or I) are ever likely to know. You simply have to trust that they know their business.

Winston
 
ikneet singh
Greenhorn
Thanks, everyone.
 