• Post Reply Bookmark Topic Watch Topic
  • New Topic

Rehashing  RSS feed

 
Vijay Raj
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
In a HashMap (or any other hash data structure in Java), the hashcode is used to determine the 'bucket' in which the value is present. So, I feel the 'buckets' are only created when the hashmap gets a new item with a new hashcode. I don't know whether the question makes any sense or not but here I go -

How will rehashing be done if we keep getting the items with the same hashcode? Or may be the question is - What will be 'rehashed' if we keep getting items with the same hashcode?

Regards,
Vijay.
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How will rehashing be done if we keep getting the items with the same hashcode? Or may be the question is - What will be 'rehashed' if we keep getting items with the same hashcode?


This is actually an implementation detail -- and not really that relevent to know, but since you are interested...

The number of buckets is determined when the hashmap is first instantiated. When a element is added, it is normalized to find the right bucket to place it in. For example, if there is 10 buckets, the range of possible hashcodes (an int value) is divided by 10 groups to determine which bucket to place. There is no bucket that matches the hashcode. Two hashcodes may end up in the same bucket by this normalization.

Anyway, the hashmap also keeps track of the number of elements, and when it exceeds a certain value. It will change the number of buckets. This is needed in order to keep access at constant time -- lower the need to search though buckets. When this happens, the hashmap needs to rehash every element, in order to determine which buckets it should go to. It can't move all the elements in the bucket (together, to a new bucket), because two elements that used to be in the same bucket, prior to the rehash, may not be in the same bucket, after.

Henry

 
Vijay Raj
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks Henry.

I was under the impression that every bucket has an unique hashcode assigned to it.

Correct me if I am wrong - Rehashing will require more number of buckets. Like you said, if we had 10 buckets and its increased to 20 buckets, the range of possible hashcodes will now be divided into 20 groups. Is this where rehashing is done?

Regards,
Vijay.
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Vijay Raj wrote:
I was under the impression that every bucket has an unique hashcode assigned to it.


Unfortunately, this impression is not correct.

If an array was created that had a bucket for every possible hashcode, you'll run out of memory. And if the bucket were to be created on the fly, then you need a mechanism to find the bucket -- probably some sort of list. Traversing the list of buckets can't be done in constant time (order of one), and that is a design requirement of hashing collections.

Vijay Raj wrote:Correct me if I am wrong - Rehashing will require more number of buckets. Like you said, if we had 10 buckets and its increased to 20 buckets, the range of possible hashcodes will now be divided into 20 groups. Is this where rehashing is done?


Yes.

Henry
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!