# About Buckets used in Hashcode in java

Karan Kaw

Greenhorn

Posts: 14

posted 3 years ago

---Source(A Programmers Guide to JAVA CERTIFICATION by Khalid Mughal)

My Question is :

Does my hashcode necessarily have to start from Zero onwards so that Array of Buckets maintained by hashtable gets correct Zero-based Index?

How we ensure that hashcode() value must start from '0' as I believe hashcode gives some arbitrary Number??

When we create hashcode based storage(map/set); Hash based Systems Use an array of Buckets.

For storing, we perform hashcode() on Key(object), That gives us index of Bucket in which we actually save 'Key-Value'(map) or Value(set)

---Source(A Programmers Guide to JAVA CERTIFICATION by Khalid Mughal)

My Question is :

Does my hashcode necessarily have to start from Zero onwards so that Array of Buckets maintained by hashtable gets correct Zero-based Index?

How we ensure that hashcode() value must start from '0' as I believe hashcode gives some arbitrary Number??

posted 3 years ago

You don't have to worry about that. HashMap etc all have their own internal mechanism to make sure hash codes can be used to locate buckets. If you want to know how this works you can check out the source code of HashMap inside the src.zip file in your JDK's root folder.

hashCode() doesn't necessarily give an arbitrary number. Most classes have a deterministic way of calculating it; String for instance even describes the calculation. And like I said before, you don't need to ensure that values start with 0, negative values are just fine.

Karan Kaw wrote:Does my hashcode necessarily have to start from Zero onwards so that Array of Buckets maintained by hashtable gets correct Zero-based Index?

You don't have to worry about that. HashMap etc all have their own internal mechanism to make sure hash codes can be used to locate buckets. If you want to know how this works you can check out the source code of HashMap inside the src.zip file in your JDK's root folder.

How we ensure that hashcode() value must start from '0' as I believe hashcode gives some arbitrary Number??

hashCode() doesn't necessarily give an arbitrary number. Most classes have a deterministic way of calculating it; String for instance even describes the calculation. And like I said before, you don't need to ensure that values start with 0, negative values are just fine.

SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6 - OCEJPAD 6

How To Ask Questions How To Answer Questions

posted 3 years ago

There's not a one-to-one mapping between hash codes and buckets. Usually a modulo is used to make this mapping. So, if your hash has 100 buckets, you take the hash code of an object and mod it by 100 to determine the bucket to use. Modulos are automatically 0-based. Conflicts will arise as different hash codes will be mapped to the same bucket, so every hashing algorithm must have a way to resolve these conflicts, for example by creating a linked list in the bucket instead of a single object. A prime number of buckets tends to reduce conflicts, but doesn't eliminate them.

posted 3 years ago

No. The rules for hashcode are just one rule: if two objects are equal then they must have the same hashcode. That's all.

In particular the concept of hashcodes "starting from" some point doesn't mean anything. They are indeed, as you say, just some (nearly) arbitrary number. Beyond that you don't need to know how the hashtable class manages those buckets, and really you don't even need to know that there are buckets involved at all.

At least that's true for beginners and for learners. It may happen that your hashcode function turns out to be extremely important from a performance point of view, in which case you may need to know something about the internal implementation details of the hashtable implementation you are using, but that is extremely unlikely.

Karan Kaw wrote:Does my hashcode necessarily have to start from Zero onwards so that Array of Buckets maintained by hashtable gets correct Zero-based Index?

How we ensure that hashcode() value must start from '0' as I believe hashcode gives some arbitrary Number??

No. The rules for hashcode are just one rule: if two objects are equal then they must have the same hashcode. That's all.

In particular the concept of hashcodes "starting from" some point doesn't mean anything. They are indeed, as you say, just some (nearly) arbitrary number. Beyond that you don't need to know how the hashtable class manages those buckets, and really you don't even need to know that there are buckets involved at all.

At least that's true for beginners and for learners. It may happen that your hashcode function turns out to be extremely important from a performance point of view, in which case you may need to know something about the internal implementation details of the hashtable implementation you are using, but that is extremely unlikely.

Karan Kaw

Greenhorn

Posts: 14

posted 3 years ago

Hi Folks,

Yes You are right I do not need to bother about these Intrinsic Details of Hashcode.

But, Just Out of curiosity, I wanted to know/guess how Arbitrary Values generated by Hashcode relate to index of Bucket Array.

For sake of understanding, Its good enough to know

as Greg has Quoted here.

Thanks , All of You.

Really This is Really useful Knowledge Sharing Forum.

Yes You are right I do not need to bother about these Intrinsic Details of Hashcode.

But, Just Out of curiosity, I wanted to know/guess how Arbitrary Values generated by Hashcode relate to index of Bucket Array.

For sake of understanding, Its good enough to know

Modulos are automatically 0-based

as Greg has Quoted here.

Thanks , All of You.

Really This is Really useful Knowledge Sharing Forum.

Campbell Ritchie

Sheriff

Posts: 49451

64

posted 3 years ago

Most people would not use 100 buckets, but 128. Then you can useGreg Charles wrote:There's not a one-to-one mapping between hash codes and buckets. Usually a modulo is used to make this mapping. So, if your hash has 100 buckets, you take the hash code of an object and mod it by 100 to determine the bucket to use. Modulos are automatically 0-based. . . .

`h & c - 1`where h is hash code and c capacity. That always gives a result which matches an array index, unlike % which can give negative results.