• Post Reply Bookmark Topic Watch Topic
  • New Topic

ArrayList -- remoAll()  RSS feed

 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,
I was wondering how slooooowly the implementation of the removeAll() for ArrayList was. Its really not normal...did someone else discover that?
They used the remove() method, who just move one by one the array's object, and this each time you remove one. I test It with an

where number is the number of elements inserted ( just integer ) randomly and a small Collection like 6 Object. Wow almost 2 minutes and i stoppped before the end...I made an beter one and its like dark and white. like 10000% faster. Im sure every other student can do much more beter...
Ok you could answer ArrayList isn't made for insert/delete but when its well implemented, it remains ok.
Sun should change the removeAll because its just method
arno
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Note that ArrayList just inherits a generic removeAll() from AbstractCollection; it's not written specifically for that class.

I don't really understand your test case because it's not explained clearly, and you didn't tell us your secret for making things go faster. I can imagine making things faster by a factor of N (here 6) or so, but anything else is going to either extra space proportional to the size of the ArrayList, or change the order of the elements in the ArrayList, neither of which are desirable properties, especially the second one. But I'm interested to hear what you did -- let's see!
 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,
sorry i was unclear...
here is my removeAll

and now the prog I used to try it :

So I hope its better now...;-)
ps :
Note that ArrayList just inherits a generic removeAll() from AbstractCollection; it's not written specifically for that class.

They should override It...



[ April 29, 2006: Message edited by: Arno Reper ]

[edited misplaced code tag which created overlong line - Jim]
[ April 30, 2006: Message edited by: Jim Yingst ]
 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
just try it you'll see the difference...
if you make a graphe, its like...(i can't explain in english so...try it)
 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here are the results :

===> 800000 elements Sun: 73500 (millis) vs Me : 125 (millis) , no comment...;-)

[ April 29, 2006: Message edited by: Arno Reper ]
[ April 29, 2006: Message edited by: Arno Reper ]
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The performance of your algorithm highly depends on this line:

if ( c.contains(elementData[lire]))

In your test, c always is an ArrayList with only six elements, and therefore contains() will be rather fast. What happens if you try it with a much bigger collection?
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Addendum:

You could try

Collection collect = creer(i);

for a "more fair" benchmark.
 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
why should I do that?
Collection collect = creer(i);
Both will take more time but mine 'll still be much more faster ( I think )

[ April 29, 2006: Message edited by: Arno Reper ]
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Arno Reper:
why should I do that?
Collection collect = creer(i);
Both will take more time but mine 'll still be much more faster ( I think )


I tried it, and yes, yours is still faster. I don't think that was totally obvious, though, because both algorithms aren't symmetric - they could have reacted very differently to changes in the sizes of both collections, and therefore it could have been that one algorithm performs better for one combination, and the other for another combination of input parameters.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The implementation in AbstractList requires the exact same number of c.contains() calls. The key difference is that Arno's implementation doesn't do an O(N) remove() for each and every item removed. Instead, each array element is copied either 0 or 1 times over the course of the whole algorithm. Makes sense this will always be faster or equal, not slower.

Then again, I'm not surprised no one at Sun previously bothered optimizing this for ArrayList, as removing elements isn't the sort of thing ArrayList does well anyway. I suspect that for most operations in which you want to do something like this, a Set of some sort would be much faster. LinkedHashSet if you want more List-like behavior in which order is somewhat significant. But there may be a few applications in which the performace characteristics of ArrayList are otherwise beneficial, which also would benefit from having a fast removeAll().
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Jim Yingst:
The implementation in AbstractList requires the exact same number of c.contains() calls.


Oops, correct. Somehow I did miss this.
 
Arno Reper
Ranch Hand
Posts: 286
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How could I send my version to Sun's Collection Developpers? So maybe it could help somebody...
Or to Josh Bloch...lol
arno
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I would go to Sun's bug database, search a bit to see if they already have any similar bugs/RFE's listed, and if not, submit a new RFE. (Since this isn't actually a bug, just unnecessarily poor performance.)

It would also be easy to make a version of this which is a static method which can operate on an existing ArrayList (not your own custom subclass) to achieve the same effect with similar performance. Such a method might be made available as part of Jakarta Commons or someplace similar.

Note that since the algorithm still relies heavily on contains() from the Collection c to be removed, it might be prudent to copy c into a HashSet (assuming c is not already a HashSet). That way you'd get excellent contains() performance even if the original collection does not offer it.
[ May 01, 2006: Message edited by: Jim Yingst ]
 
steve souza
Ranch Hand
Posts: 862
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
There are also already fast open source collections out there. So, I would check with them to make sure they haven't already solved the problem. Based on your interest maybe you could join one of these projects and contribute to them.

I'm sure this isn't all of them, but a good start.

http://java-source.net/open-source/collection-libraries

Here is the jakarta solution:
http://jakarta.apache.org/commons/collections/
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!