# choosing a data structure

I know Vectors are not really efficient when dealing with large numbers of objects. However, if I am dealing with solely 200 objects is it ok to use either a vector or list? Joe

Ranch Hand

See that the SET interface already assures you can only put something in once which takes care of part of your problem. That saves you a few lines of code and more importantly communicates to the reader your intent to only allow each object once.

It's not clear which of the sets would be best for putting a lot of objects in quickly. You could compare them on your own, but I bet at 200 objects you won't notice any difference. Maybe with 2 million you'll see some performance difference.

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi

Sheriff

Originally posted by Stan James:

It's not clear which of the sets would be best for putting a lot of objects in quickly.

Theoretically it should be HashSet (if its initial size is choosen appropriately, so that it doesn't have to resize). HashSet has O(1) for insertion, TreeSet O(n log n) (if I remember correctly). That is, insertion time for a HashSet shouldn't increase with the number of objects at all.

I agree that with just 200 objects, it probably doesn't matter, anyway.

The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus

Sheriff

**[Ilja]: HashSet has O(1) for insertion, TreeSet O(n log n) (if I remember correctly).**

That's a bit off. Inserting one element into a HashSet is O(1) (on average), while inserting N of them is O(N). For a TreeSet, inserting one element is O(log N), and inserting N of them is O(N log N).

A TreeSet is generally slower than a HashSet, but the difference frequently is not a big deal In general, use a TreeSet if you need the elements to be sorted; use a HashSet if you don't. Or since the original poster was originally interested in a List, perhaps a LinkedHashSet would be the preferred choice if the insertion order is important. Slightly slower than a HashSet, but the performance is comparable.

And yeah, the performance probably doesn't really matter here. The important criteria should be: does Joe want something sorted (TreeSet), in insertion order (LinkedHashSet), or does it not matter (HashSet)?

[ March 11, 2006: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, *Twister*

Sheriff

Originally posted by Jim Yingst:

That's a bit off. Inserting one element into a HashSet is O(1) (on average), while inserting N of them is O(N). For a TreeSet, inserting one element is O(log N), and inserting N of them is O(N log N).

Ah, right - thank you very much for the correction!

Another thing to keep in mind is that the big O notation only becomes relevant for big n - for a small number of objects, it could actually be that an O(log n) implementation is faster than an O(1) implementation. In this case, it would - besides other things - depend on the implementation details of hashcode versus compareTo.

But again, this is rather academic - for just 200 objects you are unlikely to notice the difference, anyway.

The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus

http://www.jamonapi.com/ - a fast, free open source performance tuning api.

JavaRanch Performance FAQ