• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Overall performance of ArrayList vs. HashSet

 
Ranch Hand
Posts: 116
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi,

I have an xml object that contains node with multiple occurences. I am trying come up with a code that would remove such nodes of the xml based on certain conditions. These nodes are already been dumped into an array of Strings. I am capturing the indexes of this array into an ArrayList object which satisfies the conditions for removal from the array. I can put this array of strings into a ArrayList and do a iteration and do remove. Similarly can use HashSet to do the same. Wondering what would perform overall better in removing these elements? Or there is better way to do this in a totally different way?

Thanks

Tariq
 
Rancher
Posts: 5008
38
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Wondering what would perform overall better in removing these elements?


One way to compare different ways is to time them and check the differences.
 
Marshal
Posts: 79177
377
  • Likes 2
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Asking for performance differences between ArrayList and HashSet is like asking which is faster, a motorbike or a speedboat. On water the motorbike is faster and the speedboat is better on the roads . . . or maybe the other way round.

The two classes give completely different functionality and you need to read about the Set and List interfaces, in the API and Java Tutorials. In fact using a HashSet will lose all your duplicates on insertion.
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
First of all, as Campbell already says, an ArrayList and a HashSet are two different things. An ArrayList is an ordered list of elements, internally stored in an array, and a HashSet is an unordered collection, which does not allow duplicates.

You can't say which of the two is "overall faster". Some operations are faster on an ArrayList and some operations are faster on a HashSet, due to how these datastructures are organized internally. The API documentation gives some information about the performance of different operations.

The documentation for ArrayList says:

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.


And the documentation of HashSet says:

This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

 
Tariq Ahsan
Ranch Hand
Posts: 116
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks all for your responses to my sort of dumb question. I'll read into more on List and Set to getter understanding on the two.
Appreciated for your help.
 
Campbell Ritchie
Marshal
Posts: 79177
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Tariq Ahsan:
Thanks all for your responses to my sort of dumb question.

What was "dumb" about it?

You're welcome.
 
Ranch Hand
Posts: 378
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
A small addition..

The HashSet determines a duplicate by calling the equals method of the objects being added. Unless the equals method of the objects that is being added to the HashSet is properly overridden the HashSet will accept "duplicates".
 
Sheriff
Posts: 22783
131
Eclipse IDE Spring VI Editor Chrome Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
But then of course they aren't duplicates anymore, are they

But you're right, the notion of "duplicate" is determined by hashCode + equals for (Linked)HashMap and by compareTo (or the Comparator) for TreeMap.
 
Ranch Hand
Posts: 218
VI Editor Ruby Ubuntu
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Gamini Sirisena:
A small addition..

The HashSet determines a duplicate by calling the equals method of the objects being added. Unless the equals method of the objects that is being added to the HashSet is properly overridden the HashSet will accept "duplicates".



Considering the name of the class is HashSet, I would think it will rely more on hashCode() method. Of course the common best practice is where a class return true on equals() it should return the same value on hashCode(), so you can kind of say that..but no guarantee
[ September 02, 2008: Message edited by: Wirianto Djunaidi ]
 
Campbell Ritchie
Marshal
Posts: 79177
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Wirianto Djunaidi:
Considering the name of the class is HashSet, I would think it will rely more on hashCode() method. Of course the common best practice is where a class return true on equals() it should return the same value on hashCode(), so you can kind of say that..but no guarantee



Look at the API documentation, where it tells you how HashSet works (briefly).

If you are not guaranteeing that you get the same hashCode as other objects returning true on equals, you are programming wrongly. Read about hashCode here and notice it says "must" or "must consistently."
 
Gamini Sirisena
Ranch Hand
Posts: 378
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I knew someone was going to pick on that duplicate..

JSE 1.6 API doc says the following on the add method of the HashSet..

Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.

This does not talk about the hashCode.

However it is true that when you override equals, hashCode must return same value when equals returns true. (But hash codes need not return different values when equals returns false)
 
The problems of the world fade way as you eat a piece of pie. This tiny ad has never known problems:
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic