• Post Reply Bookmark Topic Watch Topic
  • New Topic

Hashcode() & hashtables

 
Roger Zhao
Ranch Hand
Posts: 73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Jst see the words about Hashcode() in API docs:
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.
However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Wanna know: In such cases, what about the relative improvement on the performance of hashtables?
thanks.
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Moving this to the Performance forum.
 
Mark Herschberg
Sheriff
Posts: 6037
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Roger Zhao:

Wanna know: In such cases, what about the relative improvement on the performance of hashtables?

I haven't looked at JVM implementations in this respect, but here's the theory...
Each instance is converted into a single number (based on instance properties). This number is used as an index into the hashtable. The instances are placed into bins in the hashtable, based on their hash number. the hashing function is somewhat expensive, but constant time, O(1).
When two objects go into the same bin, they are said to collide. The hashing algorithm then needs to decide what to do in the case of a collision. Two common options are given below:
1) Repeated hashing. The object is hashed again (possibly taking into account information about the first bucket). If many instances collide, you can easily double the cost of using the hashmap.
2) Chaining. When you have multiple items in the buck, you simply chain them together, in, say, an array. Now you incur not only the hash cost, but also a search cost.
In short, collisions cost more work, and so should be avoided.
--Mark
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As far as I know, both Hashtable and HashMap use chaining (take a look at the source code provided with the JDK when in doubt).
So, when you put different Objects with the same hashcode into it, you basically get the performance of a list instead that of a hashtable. You only get O(1) behaviour if hashcodes are "reasonably distributed".
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!