• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Hashcode() & hashtables

 
Ranch Hand
Posts: 73
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Jst see the words about Hashcode() in API docs:
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.
However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Wanna know: In such cases, what about the relative improvement on the performance of hashtables?
thanks.
 
mister krabs
Posts: 13974
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Moving this to the Performance forum.
 
Author
Posts: 6055
8
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Roger Zhao:

Wanna know: In such cases, what about the relative improvement on the performance of hashtables?


I haven't looked at JVM implementations in this respect, but here's the theory...
Each instance is converted into a single number (based on instance properties). This number is used as an index into the hashtable. The instances are placed into bins in the hashtable, based on their hash number. the hashing function is somewhat expensive, but constant time, O(1).
When two objects go into the same bin, they are said to collide. The hashing algorithm then needs to decide what to do in the case of a collision. Two common options are given below:
1) Repeated hashing. The object is hashed again (possibly taking into account information about the first bucket). If many instances collide, you can easily double the cost of using the hashmap.
2) Chaining. When you have multiple items in the buck, you simply chain them together, in, say, an array. Now you incur not only the hash cost, but also a search cost.
In short, collisions cost more work, and so should be avoided.
--Mark
 
author
Posts: 14112
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
As far as I know, both Hashtable and HashMap use chaining (take a look at the source code provided with the JDK when in doubt).
So, when you put different Objects with the same hashcode into it, you basically get the performance of a list instead that of a hashtable. You only get O(1) behaviour if hashcodes are "reasonably distributed".
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic