Win a copy of Spring in Action (5th edition) this week in the Spring forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Bear Bibeault
  • Devaka Cooray
  • Liutauras Vilda
  • Jeanne Boyarsky
Sheriffs:
  • Knute Snortum
  • Junilu Lacar
  • paul wheaton
Saloon Keepers:
  • Ganesh Patekar
  • Frits Walraven
  • Tim Moores
  • Ron McLeod
  • Carey Brown
Bartenders:
  • Stephan van Hulst
  • salvin francis
  • Tim Holloway

hashmap vs searching  RSS feed

 
Ranch Hand
Posts: 61
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
we can do searching via hash map and by some searching api available,can any one tell which one is better in terms of time of execution or any other measure






thanks for support
 
Ranch Hand
Posts: 89
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
HashMap is very fast for exact-match look-up. Of course, you need to consider size of your data.

To choose between HashMap and "some searching api available" you need to know (1) more about what you really need and (2) more about "some searching api available".
 
Ranch Hand
Posts: 259
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Can you please mention to what the search is related to ?
 
Puneet N Vyas
Ranch Hand
Posts: 61
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
say simple 100 integer values i have to search,which one will be best?
 
Vilmantas Baranauskas
Ranch Hand
Posts: 89
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you have 100 integer values and define your search as "do I have value X" then following would work:

 
Bartender
Posts: 9550
12
Linux Mac OS X Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Originally posted by Vilmantas Baranauskas:
. . .then following would work:



"would work", yes, but OP is asking which is better.
My gut reaction is that a hash map would be faster for simple cases like your example. I would bet there are other cases where maintaining a map is impractical due to memory constraints, hash collisions and so on and a good search algorithm would perform better.
 
author & internet detective
Posts: 38911
684
Eclipse IDE Java VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I would expect it to depend on how many times you call contains(), how many hash conflicts there are (for integers this should be low) and how big the dataset is. For 100 items, it is unlikely to matter either way.
 
Author and all-around good cowpoke
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you do any measurements with real data, kindly let us know how it turns out.

In addition to the factors others have mentioned, I would consider how often the data changes. For a large set of unchanging strings there are some reallllly fast algorithms as used in spell checkers. Naturally if the dataset changes alot the cost of building the spell checker data structure is not worth the fast individual lookups.

Bill
 
author
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
For a limited range of integer values, a boolean array would be another option, and probably even faster. Hard to tell whether that would worth it without knowing more. When in doubt, I would always start with a HashSet and use a profiler to find the bottleneck if that's not fast enough.
 
Vilmantas Baranauskas
Ranch Hand
Posts: 89
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Some numbers.

Pentium M 2GHz, Java 1.6 on linux 2.6.24.

1.000.000 random integers are put into HashSet. Takes 3991ms.

10.000 look ups of random integers takes 24ms.
 
Rancher
Posts: 4686
7
Linux Mac OS X VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Originally posted by fundupuneet vyas:
say simple 100 integer values i have to search,which one will be best?



If you have 100 values, it doesn't matter.
Premature optimization.

I'd use a HashMap just because its easy. Or if I knew more about the domain, I might use a self optimizing table.
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If the integers are in a reasonable range, say 0 to 1000000 you could use a boolean array.

boolean[] flag = new boolean[1000000];

then to test if N is in the array, just look at flag[N]

Bill
 
Ilja Preuss
author
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Originally posted by William Brogden:
If the integers are in a reasonable range, say 0 to 1000000 you could use a boolean array.

boolean[] flag = new boolean[1000000];

then to test if N is in the array, just look at flag[N]

Bill



Bill, thanks for making my suggestion more concrete...
 
Wanderer
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
My experience is that a byte[] array typically uses one full byte for every boolean. Thus, it's eight times as big as it needs to be. This is not specified anywhere that I'm aware of, and it's possible that on some platforms, a boolean[] array uses only one bit per boolean. It's also possible that some platforms use more than one byte per boolean. Anyway though, if you want to use space a little more effectively, a java.util.BitSet may be a good idea. This generally uses one bit per boolean, if you get the size right in the first place and don't need it to resize. It's probably a little slower than a byte[] array - perhaps a lot slower, if you aren't doing anything else in your program. Usually that doesn't matter much though.
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!