• Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

What is a HashSet?

 
Franz Fountain
Ranch Hand
Posts: 58
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What exactly is a HashSet? I understand it is an unordered Set, but what does it have to do with the hashCode() method? In K&B p. 542 it says "the more efficient your hashCode() implementation the better access performance you'll get". So the hashcode is used somehow for storage and retrieval.

I guess a corollary to this question is how is a HashSet stored? Is it stored as a tree sorted by hashcode?
 
Rick Reumann
Ranch Hand
Posts: 281
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Go back a few pages ( 529-525 ). I think that will help answer a lot of your questions.
 
Franz Fountain
Ranch Hand
Posts: 58
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I looked at the section you mentioned in K&B. It talks about hashcodes. I understand hashcodes and I understand sets. I just don't understand what hashcodes have to do with sets. This is not explained explicitly in K&B.


But anyway back to the question. I understand how hashcodes and maps work together. This is the obvious use of hashcodes to store and retrieve information with good performance.

The not so obvious one is sets. Sets are collections of unique objects. A TreeSet makes sense to me. Since the tree is maintained in order, there can be a search for an element and if it matches then it isn't added to the TreeSet.

A HashSet doesn't say how the set is stored which seems important. Is it an array or linked list or a tree? I don't know. It just feels very fuzzy to me. Does anyone have any more concrete information regarding HashSets?

Thanks
 
Keith Lynn
Ranch Hand
Posts: 2409
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
A HashSet implements the Set interface backed up by an instance of a HashMap.

A HashMap has a certain number of "buckets". The hashCode() is used to determine in which bucket the object will be placed.

The equals() method is used to distinguish between the objects in a bucket.
 
Franz Fountain
Ranch Hand
Posts: 58
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
So your saying HashSet is really using HashMap under the covers. So does this mean that the [key, value] pair is a same to same relationship. For example in the case of a set of Strings:

Key Value
"apple" "apple"
"grape" "grape"
"cherry" "cherry"

Is that the way it works?

[ November 22, 2006: Message edited by: Franz Fountain ]
[ November 22, 2006: Message edited by: Franz Fountain ]
 
Satish Kota
Ranch Hand
Posts: 88
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
So your saying HashSet is really using HashMap under the covers. So does this mean that the [key, value] pair is a same to same relationship. For example in the case of a set of Strings:

Key Value
"apple" "apple"
"grape" "grape"
"cherry" "cherry"

Is that the way it works?


HashSet internally uses HashMap. To be more precise this is how it is maintained inside an HashMap object:



So when you add an element to HashMap using add(E o), this method internally calls map.put(o, PRESENT) method. And here PRESENT is an Object reference. Which goes like this:


So when you store "apple", "grape", "cherry" in your HashSet it gets stored in HashMap as

KEY VALUE
--------------
"apple" PRESENT
"grape" PRESENT
"cherry" PRESENT
--------------
 
Franz Fountain
Ranch Hand
Posts: 58
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi Satish,

Thank you very much. That's very interesting. So PRESENT is a constant object. I assume at retrieval time if Object == PRESENT, then the Key is returned as the Value.

Is that right? Anyway, you've answered my original question. The rest is just implementation details. Still it's very interesting to see how it's implemented. It's great that Sun releases the source code unlike Microsoft. Actually if you've ever seen any MS source code (they do release some), it's unbelievably convoluted. Sun sources seem to be so much clearer and easier to read. Just a different culture I guess. Go Java!
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic