• Post Reply Bookmark Topic Watch Topic
  • New Topic

Hashtable Key Violations  RSS feed

 
Rob Mech
Ranch Hand
Posts: 56
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Ok, I ran into a very intresting question on the Whizlabs SCJP simulator I just have to know the underlying answer to.

The question was what will happen when you run the following code:



The answer is put "Java:Great Language"

Alright, I take issue with this. Hopefully some guru can explain to me why the decision was made to overwrite a primary key. This kind of goes against the purpose of a key/value pair to some degree. In fact I think it's kind of sloppy.

I would think that either it would ignore the new insert (as key was violated) or throw a checked exception. To think that you could inadvertently overwrite a value without ryhme or reason is maddening. How many bugs are going to take hours to track down on something like this?

Can somone tell me why they would choose to implement the Hashtable in such a way?
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I can't think of any language or library in which an associative array is implemented any other way -- can you? Everything I can think of, from C++ to Perl to Ruby to Lisp and back, does it this same way.
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I believe the .NET Hashtable throws an exception if the element (with the same key) already exists. So technically, you can say this is done in C#, J#, Managed C++, and Visual Basic...


Can somone tell me why they would choose to implement the Hashtable in such a way?


Personally, I have never encountered a bug related to this behavior, certainly not one that took "hours to track down". I guess I am just used to the Java implementation.

Henry
 
Peter Chase
Ranch Hand
Posts: 1970
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
put() returns the previous value for the key, or null if none. Good practice, if you do not expect to overwrite keys, is to check that return value with an "if" or an "assert".

Note that null values are allowed in hashtables, which does somewhat pervert the above method. You can use containsKey() to get over that.
 
Rob Mech
Ranch Hand
Posts: 56
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well when I reviewed the API doc it wasnt very clear on the duplicate key issue. I did read the same note about it returning a value of the previous key.

Still not sure why it would be implemented that way though. .NET aside, from a database perspective youre never allowed to insert a duplicate key into any index either.
 
Aditya Prasann
Greenhorn
Posts: 4
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
A Hashtable internally contains buckets in which it stores the key/value pairs. The Hashtable uses the key's hashcode to determine to which bucket the key/value pair should map. When you pass a key/value to the Hashtable, it queries the key's hashcode. The Hashtable uses that code to determine the bucket in which to place the key/value. Hashcodes, however, represent only half the picture. The hashcode only tells the Hashtable into which bucket to drop the key/value. Sometimes, however, multiple objects may map to the same bucket, an event known as a collision. In case of collision it looks for the implementation of key's equals method. If two keys are 'unequal' by implementation of equals method it will store the two key/value pair at the same location. Here in your example as you are using String as keys(which are immutable), the value gets replaced as the two keys are equal.
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Henry Wong:
I believe the .NET Hashtable throws an exception if the element (with the same key) already exists. So technically, you can say this is done in C#, J#, Managed C++, and Visual Basic...


Ignorance, as they say, is bliss

Boy, I hope this is a property you can switch on or off. I can see where it would be useful to have this behavior sometimes, but it would be a real pain in the arse to have it act this way all the time.

Maybe there's a "replace()" method that works even if there's no previous value for that key?
 
Peter Chase
Ranch Hand
Posts: 1970
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Rob Mech:
from a database perspective youre never allowed to insert a duplicate key into any index either.


OK, but this isn't a relational database, it's a low-level in-memory data structure. There's no reason why Java hash maps should behave like relational database indexes, because they're not the same thing.

There is a valid debate to be had about whether maps should allow replacement of existing entries - some applications will find it useful, others a pain. But I don't really see how the behaviour of relational databases informs that debate.

Are you coming to Java from an SQL background, by any chance?
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The class Hashtable is more or less a legacy class nowadays; in general, you're better off using HashMap instead. Often there's little real difference, except to confuse people. But one difference is that the API for HashMap's put() is a little more clear: "If the map previously contained a mapping for the key, the old value is replaced." This is also true of their shared interface, Map. Like Henry, I've never seen a problem with this behavior. I guess it's just a matter of what expectations you bring with you. I learned about hash tables before I learned about databases, so this never seemed strange.

Thinking about this, I can see where throwing an exception would be useful sometimes, though not always. But silently ignoring a new put()? That seems far more likely to induce bugs, to me. But maybe that's just because I'm used to put() behaving the way it does.

There's also a newer interface, ConcurrentMap, which adds putIfAbsent(), replace(), and a conditional remove(). These are primarily intended for use in multithreaded code, where it can be important to make these slightly more complex operations atomic. But they may be of interest even in single-threaded code. Note that these methods don't throw exceptions; they just return null if they have no effect. Which is ok since putIfAbsent() has a nice clear name, and it's not surprising that it behaves the way it does.
 
Rob Mech
Ranch Hand
Posts: 56
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Peter Chase:


OK, but this isn't a relational database, it's a low-level in-memory data structure. There's no reason why Java hash maps should behave like relational database indexes, because they're not the same thing.

There is a valid debate to be had about whether maps should allow replacement of existing entries - some applications will find it useful, others a pain. But I don't really see how the behaviour of relational databases informs that debate.

Are you coming to Java from an SQL background, by any chance?


I would say a "Database" background. I've been using relational data structures in various databases MySql, MSSQL, Oracle, FoxPro for some time.


Originally posted by Jim Yingst:

Thinking about this, I can see where throwing an exception would be useful sometimes, though not always. But silently ignoring a new put()? That seems far more likely to induce bugs, to me. But maybe that's just because I'm used to put() behaving the way it does.


See that's what I thought too. I wasnt looking to start a flame war here I'm simply trying to understand the reasoning behind this. When you consider that youre free to update a primary key value with another key/value pair it could introduce considerable amount of bugs. Granted this is not a database but uniquie should be unique and it should at least warn you that youre about to do something really stupid.

From a simplicty standpoint I can understand why this would be. It's easy to hold new configuration value pairs, or update a pair in the Map. Makes perfect sense there. If I was however trying to read in a large client supplied file of key/values lets say an account number and an account balance I'd have to do this duplicate checking myself to ensure that the file had the proper integrity (ok lets leave out the reasons why i'd do this anyhow and stick to the topic).

I'm cool with how it works, I'm just simply trying to understand the reasoning. One of those "How-it-Works" or more to the point "Why it works".
 
Peter Chase
Ranch Hand
Posts: 1970
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You could create a wrapper class that takes any Map and makes it throw if you try to replace a key. Similar idea to Collections.unmodifiableMap().
 
Sandeep Deb
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
HashMap, HashTable are associative arrays where one key can map to one value. There are variances of associative arrays called multi maps which can store multiple values against one key. Java API doesn't provide a default implementation of a multimap, but you can find one in Apache's collection framework

MultiMap

You can find more information about MultiMaps here

- Sandeep
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!