Win a copy of The Java Performance Companion this week in the Performance forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

How to do better than the retainAll method of the List or Set class?

 
Stephane Clinckart
Ranch Hand
Posts: 89
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

I want to improve a method I use.

Let me show you a part of the code...



Any Idea how to do better???

Thanks a lot.
 
ganesh roti
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

It depends what exactly you want to do and how you are filtering out. The same code i restructured as below and gave a major performance benfit. Try your luck..

public class ListUtilTest
{

private final Collection<String> multipleOf2 = new ArrayList<String>();
private final Collection<String> multipleOf3 = new ArrayList<String>();

private final Map<Long, String> multipleOf2_Map = new HashMap<Long, String>();
private final Map<Long, String> multipleOf3_Map = new HashMap<Long, String>();
private final Collection<String> commonCol = new ArrayList<String>();

public ListUtilTest()
{
initCollections();
}

private void initCollections()
{
for (long j = 1; j < 100000; j++)
{
if (j % 2 == 0)
{
//multipleOf2.add("" + j);
multipleOf2_Map.put(j, (""+j));
}
if (j % 3 == 0)
{
//multipleOf3.add("" + j);
multipleOf3_Map.put(j, (""+j));
if (multipleOf2_Map.containsKey(j))
{
commonCol.add("" + j);
}
}

}
}

public void testGetCommonMultipleOf2_3()
{
final ListUtil<String> listUtil = new ListUtil<String>();
//final Collection<String> commonElements = listUtil.getCommonElements(multipleOf2, multipleOf3);
System.out.println(commonCol.size());
}

public static void main(String[] args)
{
final long msSecConvFactor = 1000;
final long startTime = System.currentTimeMillis();


final ListUtilTest lUTest = new ListUtilTest();
lUTest.testGetCommonMultipleOf2_3();

final long endTime = System.currentTimeMillis();

final long diffSec = ((endTime - startTime) );
System.out.println("Time required: "+diffSec+" milliSec");


}
 
ganesh roti
Greenhorn
Posts: 17
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
But if you want to make generic solution by passing two collections to a set and expecting the comoon elements then your code is the most optimistic in that case ..
 
Stephane Clinckart
Ranch Hand
Posts: 89
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
ganesh roti wrote:But if you want to make generic solution by passing two collections to a set and expecting the comoon elements then your code is the most optimistic in that case ..


I want effectivly to have a generic code...

Another part of this class is:


It works... but it is not performant at all !!!

It take me 18 seconds to run the first sample !

I suppose it can be optimized !

This code (that don't do the same stuff) is optimized...
--> it take only half of a second with the same collections as in first sample:


Thanks for help.
 
R van Vliet
Ranch Hand
Posts: 144
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here you go, about 2,500 times faster on my computer and should be relatively generic :
 
D. Ogranos
Ranch Hand
Posts: 214
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You can also try this:


 
Saifuddin Merchant
Ranch Hand
Posts: 607
Firefox Browser Java Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
D. Ogranos wrote:You can also try this:




Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?
 
D. Ogranos
Ranch Hand
Posts: 214
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sam Mercs wrote:Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?


Actually its performance is better. The main trick is to put one of the lists into a (Hash)Set, so that the contains() method gets constant complexity. It might also be worth it to check which list is shorter and then put one or the other into the Set...I haven't tried around with that option.
Also the retainAll() seems to keep the list elements in order, while this method shuffles them around. So sorting may be neccessary.

Some quick tests compared to the original method with retainsAll() (with two lists of each 100 strings, and 10 common elements):
10.000 calls: 125ms vs. 203ms
250.000 calls: 1077ms vs. 4508ms
1.000.000 calls: 4009ms vs. 18079ms
 
Saifuddin Merchant
Ranch Hand
Posts: 607
Firefox Browser Java Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
D. Ogranos wrote:
Sam Mercs wrote:Similar to the collection interface retain all - except that its in reverse. I don't think it would give any better performance. The main bottleneck would be the 'contains' method that iterates over all the elements of 'elems' for each element of the 'first' set. Right?


The main trick is to put one of the lists into a (Hash)Set, so that the contains() method gets constant complexity. It might also be worth it to check which list is shorter and then put one or the other into the Set...I haven't tried around with that option.
Also the retainAll() seems to keep the list elements in order, while this method shuffles them around. So sorting may be neccessary.



The OP also does the same thing,

final Set<T> commonSet = new HashSet<T>(smallestList);
commonSet.retainAll(biggestList);

I don't quite understand why your implementation would give a better performance compared to the OP - as I think they do exactly the same thing. Maybe I should try running the code to figure it out. Will post if I do....

  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic