
Effect of HashMap unnecessarily large value for initialCapacity?

 
rohit chavan
Ranch Hand
Posts: 133
Hibernate Java Oracle
Hi,

I stumbled upon some legacy code where the functional requirement calls for roughly 1500 entries in a HashMap, but the map being created has an initial capacity of almost 6000.



What puzzles me here is: if we create a HashMap with an unnecessarily large value as initialCapacity, does it have an impact on the performance of the code?

i.e. would it be quicker to perform a get() on the HashMap if it had an initialCapacity of 1800?

- Rohit

 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
rohit chavan wrote:if we create a HashMap with an unnecessarily large value as initialCapacity, does it have an impact on the performance of the code?

Nope. In fact it'll probably ensure that performance is optimal.

What it might do is make the HashMap unnecessarily big; but since you're only talking about space for references (4/8 bytes) it would have to be extraordinarily huge to make much difference. And 6,000 is NOT "extraordinary".

If you need to make 6,000 HashMaps on the other hand...

Winston
 
rohit chavan
Ranch Hand
Posts: 133
Hibernate Java Oracle
Thank you Winston,

It was really helpful in clearing up my doubt.

- Rohit
 
Martin Vajsar
Sheriff
Posts: 3752
62
Chrome Netbeans IDE Oracle
There can be a performance impact for iteration over the entries in the map. The cost of a full iteration is roughly proportional to the size of the map plus the capacity of the map, because every bucket has to be visited, not just the occupied ones. So if much of the capacity is unused, iterations over the map can be slower. Gets from the map won't be impacted.
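To make that concrete, here is a rough micro-benchmark sketch (the class and method names are made up for illustration, and nanosecond timings like this are noisy; treat the numbers as indicative only). Both maps hold the same 1500 entries, but one was created with about a million buckets:

```java
import java.util.HashMap;
import java.util.Map;

public class IterationCost {

    // Walk every entry via the entrySet iterator and return the elapsed time.
    static long timeIteration(Map<Integer, Integer> map) {
        long start = System.nanoTime();
        long sum = 0;
        for (Map.Entry<Integer, Integer> e : map.entrySet()) {
            sum += e.getValue();
        }
        // Use the sum so the loop cannot be optimised away entirely.
        if (sum == Long.MIN_VALUE) System.out.println(sum);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> modest = new HashMap<>();        // default capacity, grows as needed
        Map<Integer, Integer> huge   = new HashMap<>(1 << 20); // ~1M buckets, mostly empty
        for (int i = 0; i < 1500; i++) {
            modest.put(i, i);
            huge.put(i, i);
        }
        System.out.println("default capacity: " + timeIteration(modest) + " ns");
        System.out.println("huge capacity:    " + timeIteration(huge) + " ns");
    }
}
```

The second iteration typically takes noticeably longer, even though both maps contain identical entries, because the iterator has to skip over all the empty buckets.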
 
Campbell Ritchie
Marshal
Posts: 56536
172
What makes you think it has a 6000 capacity? Hash‑based collections usually use h & (c - 1), where h is the hash code, c the capacity, and & the bitwise AND operator. That allows them to find the “bucket”, which lives in an array. Obviously that trick with bitwise AND only works if the capacity is an exact power of 2, so your 6000 map will actually have a capacity of 8192. It will not enlarge to 16384 until you put 6144 K‑V pairs in it, if you use the default load factor of 75%.
The 8192‑element array would use 32KiB (8192 references of 4 bytes each) or slightly more for its storage, but you usually have much more memory than that available.
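For illustration, the power‑of‑two rounding and the bucket‑index trick can be sketched like this (HashMap's real implementation uses a bit‑twiddling routine rather than a loop, and the class name here is made up):

```java
public class BucketIndex {

    // Round n up to the next power of two — effectively what HashMap
    // does with the requested initial capacity.
    static int tableSizeFor(int n) {
        int size = 1;
        while (size < n) size <<= 1;
        return size;
    }

    // Map a hash code to a bucket index. This only works when
    // capacity is an exact power of two.
    static int bucketIndex(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    public static void main(String[] args) {
        int capacity = tableSizeFor(6000);
        System.out.println(capacity);                 // 8192
        System.out.println((int) (capacity * 0.75f)); // 6144 — resize threshold at load factor 0.75
        System.out.println(bucketIndex("key".hashCode(), capacity)); // some index in [0, 8191]
    }
}
```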
 
rohit chavan
Ranch Hand
Posts: 133
Hibernate Java Oracle
Martin Vajsar wrote:So, if the capacity is not used, iterations over the map can be impacted. Gets from the map won't be impacted.


I am doing get operations only, and not iterating over the map. (However, if "iterating" covers the way Java gets to the bucket and hands back the value for a key, then I am a bit confused by the two statements.)


Campbell Ritchie wrote:
What makes you think it has a 6000 capacity? Hash‑based collections usually use h & (c - 1), where h is the hash code, c the capacity, and & the bitwise AND operator. That allows them to find the “bucket”, which lives in an array. Obviously that trick with bitwise AND only works if the capacity is an exact power of 2, so your 6000 map will actually have a capacity of 8192. It will not enlarge to 16384 until you put 6144 K‑V pairs in it, if you use the default load factor of 75%.
The 8192 array would use 32KiB or slightly more for its storage, but you usually have much more memory than that available.


I agree, but what I am trying to find out is: if my HashMap is filled to at most 30% of the declared capacity (6000 or 8192), does that have a performance impact on get operations? Would it be better if the HashMap were declared with a capacity that matches what is actually used?

Thank you,

-Rohit
 
Martin Vajsar
Sheriff
Posts: 3752
62
Chrome Netbeans IDE Oracle
rohit chavan wrote:
Martin Vajsar wrote:So, if the capacity is not used, iterations over the map can be impacted. Gets from the map won't be impacted.


I am doing get operations only, and not iterating over the map. (However, if "iterating" covers the way Java gets to the bucket and hands back the value for a key, then I am a bit confused by the two statements.)

No, the iteration I'm speaking about is the iteration using the iterator provided by the Map, such as:

for (Key k : map.keySet())

or

for (Map.Entry<Key, Value> entry : map.entrySet())

or using the iterator explicitly. These operations also depend on the number of buckets in the map (all buckets, including empty ones, are visited). If you never use these, you're safe.

I personally tend not to set the initial capacity, unless the capacity is known beforehand (such as copying one collection exactly into another), or the resizing comes up as a bottleneck in the profiler. The second reason is hypothetical; I've never seen it, but I guess it can happen sometimes.
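For what it's worth, the "capacity known beforehand" case can be sketched like this (the 0.75 divisor matches the default load factor; the class name and entry count are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class PreSizing {
    public static void main(String[] args) {
        Map<String, Integer> source = Map.of("a", 1, "b", 2, "c", 3);

        // Copy constructor: sizes the new map from the source automatically.
        Map<String, Integer> copy = new HashMap<>(source);

        // Known entry count: pick a capacity big enough that no resize
        // occurs at the default load factor of 0.75.
        int expected = 1500;
        Map<String, Integer> presized = new HashMap<>((int) (expected / 0.75f) + 1);

        System.out.println(copy.size());      // 3
        System.out.println(presized.size());  // 0 — capacity reserved, no entries yet
    }
}
```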
 
Campbell Ritchie
Marshal
Posts: 56536
172
If your initial capacity is too small, the map has to resize as it grows, but each resize only takes a few μs and only happens a few times.
To add to what MV said earlier, if you go through the documentation for HashMap, it tells you what the performance of the operations is: constant time for get and put, assuming the hash function disperses the elements properly among the buckets.
 