The hash code is calculated from the contents of the String. You can see an overview of how it is calculated in the Sun JVM by looking at the API documentation for String.hashCode().
So, knowing that the hash code is always calculated from the actual characters in the String, it makes sense that the hash code is always going to be the same for Strings that contain identical characters.
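As a rough illustration, the formula given in the String API documentation, s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], can be recomputed by hand. The class and method names below are my own and purely illustrative; this is a sketch of the documented formula, not the JDK's actual source.

```java
class ManualStringHash {
    // Recomputes the documented String hash:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], in int arithmetic.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow simply wraps around
        }
        return h;
    }

    public static void main(String[] args) {
        String text = "Hello, World";
        // Same characters in, same number out, every time.
        System.out.println(manualHash(text) == text.hashCode()); // true
    }
}
```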
However, there is nothing to state that a hashing function must return a different value for different contents. It is quite plausible for two different contents to return the same hash code. For example, on my computer, "aa" and "bB" give the same hash code (3104 in both cases).
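A minimal sketch of such a check, covering both String literals and explicitly constructed Strings (the class and method names are my own, purely illustrative):

```java
class HashCollisionDemo {
    // True when two Strings report the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // String literals
        System.out.println("aa".hashCode()); // 3104
        System.out.println("bB".hashCode()); // 3104

        // Explicitly constructed Strings
        System.out.println(new String("aa").hashCode()); // 3104
        System.out.println(new String("bB").hashCode()); // 3104

        System.out.println(sameHash("aa", "bB")); // true
    }
}
```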
So you can see that no matter whether I constructed the Strings or whether I just used them, I still got the same hash code for two different Strings. This is allowed by the contract for hash codes.
Hash codes therefore cannot be used to determine equality. (Actually, I suppose it is feasible that you could write a class that uses an algorithm that produces unique values for distinct objects. However, this is likely to defeat the purpose of hashing, namely to quickly get a known value for an object that will enable you to look that object up in a collection such as a HashMap.)
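To show that a shared hash code does not confuse a collection, here is a small sketch that stores the two colliding keys from above in a HashMap. The class name and the stored values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("aa", 1);
        map.put("bB", 2); // same hash code as "aa", but not equal to it
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        // The hash code narrows the search; equals() picks the right entry.
        System.out.println(map.size());    // 2
        System.out.println(map.get("aa")); // 1
        System.out.println(map.get("bB")); // 2
    }
}
```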
We then get to the question of equality, which takes us to Java memory allocation. You can read more than you ever wanted to know about this in The Structure of the Java Virtual Machine.
But to make this a bit simpler (at the risk of painting a slightly inaccurate picture), imagine we had 10 memory locations.
Now let's try adding some code that allocates some constants:
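A minimal sketch of such code, assuming the variable names s1, s2, and s3 and the value "Hello, World" used later in this post (the class and method names are illustrative only):

```java
class LiteralPoolDemo {
    // Three variables, one literal: all of them refer to the same constant.
    static boolean literalsShareOneObject() {
        String s1 = "Hello, World";
        String s2 = "Hello, World";
        String s3 = "Hello, World";
        return s1 == s2 && s2 == s3;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject()); // true
    }
}
```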
At the end of that, only one memory location has been used, and all three variables point to it: s1, s2, and s3 are all pointing to the constant in constant memory pool location #7.
Whereas when we use "new" to create new objects, they will be allocated on the heap:
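A sketch of what that looks like, assuming the names s4 and s5 used later in this post (the class and method names are illustrative only):

```java
class HeapStringDemo {
    // Each "new" produces its own object, even for identical contents.
    static boolean newCreatesDistinctObjects() {
        String s4 = new String("Hello, World");
        String s5 = new String("Hello, World");
        return s4 != s5;
    }

    public static void main(String[] args) {
        System.out.println(newCreatesDistinctObjects()); // true
    }
}
```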
Now we have some heap space being used as well: s4 and s5 each point to their own object (memory locations 1 and 3).
Now as to how this affects you:
hashCode(): The hash code for s1, s2, s3, s4, and s5 is always computed from the actual characters in the String "Hello, World". So in all five cases the hash code will be identical.
However, as previously stated, this is useless to you for determining equality - it is quite plausible that another string could have an identical hash code.
==: For objects, the "==" operator compares the memory locations of the two objects. In my examples s1, s2, and s3 are all pointing to the same memory location in the constant memory pool, so the "==" operator will show that they are the same object.
However, s4 and s5 are each pointing to different objects (memory locations 1 and 3 on the heap), so the "==" operator will show them as being different objects.
.equals(): The String .equals() method will look at the individual characters in the String and determine whether each character is equal. It will indicate that the contents of s1, s2, s3, s4, and s5 are all equivalent, even though they are in different memory locations.
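The three checks can be put side by side in a short sketch. The helper describe() and the class name are my own inventions for illustration; the variable names follow the ones used in this post.

```java
class StringComparisonDemo {
    // Reports all three comparisons for a pair of Strings.
    static String describe(String a, String b) {
        return "hashCode equal: " + (a.hashCode() == b.hashCode())
                + ", ==: " + (a == b)
                + ", equals: " + a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = "Hello, World";
        String s2 = "Hello, World";
        String s4 = new String("Hello, World");
        String s5 = new String("Hello, World");

        // Two literals: same object, so everything matches.
        System.out.println("s1 vs s2 -> " + describe(s1, s2));
        // Literal vs constructed: same contents, different objects.
        System.out.println("s1 vs s4 -> " + describe(s1, s4));
        // Two constructed Strings: same contents, different objects.
        System.out.println("s4 vs s5 -> " + describe(s4, s5));
    }
}
```

Only the first pair reports true for "==". All three pairs report true for the hash code comparison and for equals.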
So it all depends on what you want to determine.
If you want a fast way to look something up in a collection then you want to use the hash function (actually the collection classes will use the hashing function themselves - you don't really need to worry about it unless you are implementing your own classes that need to be stored in a collection).
If you need to know whether two Strings contain the same characters, then you need to use the .equals() method.
If you need to know whether two Strings are actually the same object using the same memory location, then you can use the == operator.
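For the case mentioned above where you implement your own class that needs to be stored in a collection, a minimal sketch might look like this. GridPoint is a hypothetical class invented for the example; the point is only that equals() and hashCode() are overridden together, so that equal objects always report the same hash code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;                 // same object
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;                    // same contents
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // equal points give equal hash codes
    }

    public static void main(String[] args) {
        Set<GridPoint> seen = new HashSet<>();
        seen.add(new GridPoint(1, 2));
        System.out.println(seen.contains(new GridPoint(1, 2))); // true
        System.out.println(seen.contains(new GridPoint(2, 1))); // false
    }
}
```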
Regards, Andrew