• Post Reply Bookmark Topic Watch Topic
  • New Topic

HashMap Storage if it exceeds Interger range  RSS feed

 
raghu kalachar
Greenhorn
Posts: 19
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As many of us know that initially hash map allocates a memory with default initial capacity of 16 and the default load factor of 0.75.Now when we try to store values into hashmap first it calculates bucket location by calling hashcode function on hash map key.Suppose if user defined hashcode methods returns a value which is greater than integer range and that values exceeds initial capacity address of hashmap,Then how the value will get stored in hashmap?
 
A.J. Côté
Ranch Hand
Posts: 417
Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am not sure that I understand your question. hashCode() must return an int:

javadoc:

public int hashCode()

Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.

The general contract of hashCode is:

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)

Returns:
a hash code value for this object.
See Also:
equals(java.lang.Object), System.identityHashCode(java.lang.Object)


here is the hash() method in HashMap:

 
Campbell Ritchie
Marshal
Posts: 56578
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Do you mean that you are going to try to enter more than 2147483647 K‑V pairs in the Map? You will probably run out of heap space long before that happens; if each Map.Entry object occupies 24B and each of the K or V in the pair also occupies 24B, then you are looking at something in the region of 144GiB heap space used. I have never seen a computer on sale with enough RAM to support such a big heap, but presumably there are supercomputers which are large enough.

The standard HashMap object has an array (Map.Entry[]) behind the scenes. It is arranged that the length of the array is always an exact power of 2, so the largest you are going to get is 2³⁰ = 1073741824 elements. When you exceed loadFactor × size (for the default hash map settings, 12 elements) the array is doubled in size and all the K‑V pairs rehashed and put into the new array. The old array is probably eligible for garbage collection. The Map uses the formula
h & c - 1
to determine the array index to put the K‑V pair; that location in the array is what you call a “bucket”. In that formula h is the rehashed hash code and c the capacity of the array and that formula is a bit like a remainder but it only works when c is an exact power of 2. It never gives a negative value. So that formula is exactly what you want to determine bucket indices. Remember that the − operator has a higher precedence than &. That formula never allows results larger than the capacity of the array provided c is a power of 2.

You cannot create an array larger than 2147483648 elements, not even if you have enough heap space. If you try to double the size from 2³⁰ = 1073741824 to 2³¹, you end up with 2147483648, which as you know is a non‑existent number for an int (it would overflow and turn negative). If you had enough memory to create such an array, you would suffer some sort of Exception.
 
Rob Spoor
Sheriff
Posts: 21135
87
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Technically it's still possible to have a Map or Collection with more elements than Integer.MAX_VALUE. A LinkedList for instance has no limitations other than memory, and a HashMap is only limited by Integer.MAX_VALUE entries per bucket. The API of both are aware of this; size() is explicitly documented to return Integer.MAX_VALUE if more entries exist. Both LinkedList and HashMap fail to do this though, as they store the size in a simple int.
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16060
88
Android IntelliJ IDE Java Scala Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
raghu kalachar wrote:Suppose if user defined hashcode methods returns a value which is greater than integer range ...

That is not possible. The hashCode() method returns an int, so it can never return a value which is outside the range of values that an int can hold.
 
Campbell Ritchie
Marshal
Posts: 56578
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
To OP: I presume you are familiar with the concept of arithmetic overflow? If you are risking exceeding the bounds of the int datatype you suffer overflow, so you remain within the bounds of int. You usually regard overflow as a serious error, but in the case of a hash code you tolerate it. That is possibly why Java® arithmetic does not throw an exception when overflow occurs.
 
Rodion Gork
Ranch Hand
Posts: 47
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm not sure I get the author's question right, but in MapDB there was something:

http://www.mapdb.org/apidocs/org/mapdb/LongMap.html

It is not directly HashMap with long hashes - rather Map with long keys. But it should be easy to built what you want upon it if it is not enough.
 
Campbell Ritchie
Marshal
Posts: 56578
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Yes, you can design hashing algorithms which return datatypes other than int. MD5 and SHA256 are that sort of program.
Would you need a lot of memory to sue that LongMap successfully?
 
Rodion Gork
Ranch Hand
Posts: 47
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As I remember MapDB project is dedicated to persistent collections, so that they could use disk storage - that is why they care about such large maps

Though I did not watch closely this thing during latest years...
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!