• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

HashMap collisions question

 
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
hi everybody!
I have a quick question about HashMap. Even for the best hashing algorithm we should probably consider the case when for two distinct objects of the same type hashCode() method returns same integer number. So my understanding is that calling MyHashMap.put(String term, Object myLovelyObject) does not guarantee that for two different Strings we will get two different locations in our HashMap (collision may occur). So given that all strings in my program are unique, do we still have to manually check the return value of MyHashMap.put(...) method to be a null value, just to make sure we don't accidentally dump anything? I am a little confused. Please help me.

Thanks a lot.
 
author & internet detective
Posts: 41860
908
Eclipse IDE VI Editor Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Nikolay,
Welcome to JavaRanch!

A collision will not cause anything to get dumped. However, storing something with the same key will. Suppost I make the following calls:

Also, assume I have a really bad hashing algorithm that always maps to the same hash. At the end of the code, I will have to values in my map. key1/value2 and key2/value3. They will be stored under the same hash and not give me increased performance (over a list), but they will both be stored.
 
Nikolay Gudovich
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
and I am glad to become part of JavaRanch community thank you.

Well, this means we can disregard the collision issue all together. ( I was suspecting that - otherwise that's too much low-level detail for a Java programmer ) However, I am also a little interested how HashMap implements the collision case internally. You may say "go diggin' source code for HashMap" but I still hope somebody could give me a clue.
 
author and iconoclast
Posts: 24207
46
Mac OS X Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
A hash table (Java's, or any other) is basically an array of lists.



The horizontal lines are the cells of the array. Each cell is called a "bucket". The hashcode of your key is used to choose a bucket by computing an index into the main array (using shifts and/or the modulo operator, generally).

Once the map has chosen a bucket, it stores the [key, value] pair into that bucket (in the case of java.util.Map, as a Map.Entry object). If there's already something in the bucket, that's fine; the new entry is just added to that bucket's list.

To look up an item from the key, you compute the bucket index from the hash code, and then search the list in that bucket for the key. In a lightly loaded map with a good hash function, most buckets will have 0 or 1 item in them, so the search takes constant time. If you have a bad hash function or an overloaded map, this searching can come to take a long time. In the diagram I drew, some buckets are empty, while others have 1, 2, or 3 pairs in them. This is typical.

There are various fancy-pants ways of improving this general scheme, but that's how it works in broad strokes.

Bob Sedgewick's "Algorithms" books are worth owning; you'll learn this kind of fundamental computer science from them.
 
Nikolay Gudovich
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
that's me again.
well I did go to source code (HashMap.java). It was scary first but after about 10 mins I knew all the internals of HashMap. It uses an array of Entry references and each Entry is actually a linked list node. So actual Hash table contains reference to head node and every node has a link to next node. When we have a collision it stores new Object in a new node of the Linked List. When we call MyHashMap.get(Object key) it does not just go to cell of the table but it goes through the linked list and compares passed key Object and stored Object using .equals(key) method. Well, this again proves I shouldn't be scared of source code

Anyway, thanks for help and I will definitely continue with JavaRanch and Head First books!!!
 
Right! We're on it! Let's get to work tiny ad!
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic