• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Rehashing

 
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
In a HashMap (or any other hash data structure in Java), the hashcode is used to determine the 'bucket' in which the value is present. So, I feel the 'buckets' are only created when the hashmap gets a new item with a new hashcode. I don't know whether the question makes any sense or not but here I go -

How will rehashing be done if we keep getting the items with the same hashcode? Or may be the question is - What will be 'rehashed' if we keep getting items with the same hashcode?

Regards,
Vijay.
 
author
Posts: 23951
142
jQuery Eclipse IDE Firefox Browser VI Editor C++ Chrome Java Linux Windows
  • Likes 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

How will rehashing be done if we keep getting the items with the same hashcode? Or may be the question is - What will be 'rehashed' if we keep getting items with the same hashcode?



This is actually an implementation detail -- and not really that relevent to know, but since you are interested...

The number of buckets is determined when the hashmap is first instantiated. When a element is added, it is normalized to find the right bucket to place it in. For example, if there is 10 buckets, the range of possible hashcodes (an int value) is divided by 10 groups to determine which bucket to place. There is no bucket that matches the hashcode. Two hashcodes may end up in the same bucket by this normalization.

Anyway, the hashmap also keeps track of the number of elements, and when it exceeds a certain value. It will change the number of buckets. This is needed in order to keep access at constant time -- lower the need to search though buckets. When this happens, the hashmap needs to rehash every element, in order to determine which buckets it should go to. It can't move all the elements in the bucket (together, to a new bucket), because two elements that used to be in the same bucket, prior to the rehash, may not be in the same bucket, after.

Henry

 
Vijay Raj
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks Henry.

I was under the impression that every bucket has an unique hashcode assigned to it.

Correct me if I am wrong - Rehashing will require more number of buckets. Like you said, if we had 10 buckets and its increased to 20 buckets, the range of possible hashcodes will now be divided into 20 groups. Is this where rehashing is done?

Regards,
Vijay.
 
Henry Wong
author
Posts: 23951
142
jQuery Eclipse IDE Firefox Browser VI Editor C++ Chrome Java Linux Windows
  • Likes 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Vijay Raj wrote:
I was under the impression that every bucket has an unique hashcode assigned to it.



Unfortunately, this impression is not correct.

If an array was created that had a bucket for every possible hashcode, you'll run out of memory. And if the bucket were to be created on the fly, then you need a mechanism to find the bucket -- probably some sort of list. Traversing the list of buckets can't be done in constant time (order of one), and that is a design requirement of hashing collections.

Vijay Raj wrote:Correct me if I am wrong - Rehashing will require more number of buckets. Like you said, if we had 10 buckets and its increased to 20 buckets, the range of possible hashcodes will now be divided into 20 groups. Is this where rehashing is done?



Yes.

Henry
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic