Win a copy of Murach's Python Programming this week in the Jython/Python forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

LinkedList/ArrayList: need a hybrid  RSS feed

 
John Smith
Ranch Hand
Posts: 2937
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
By profiling my app, I found that an enormous proportion of the time is spent by accessing the elements of a particular list, as in


I looked at the implementation, and sure enough, it was a large LinkedList which was accessed quite often. So, if I have a million elements in my list, and wanted to get the last element, it would take 1 million iterations to get to that element. Here is the corresponding Sun code:


So, I swapped LinkedList for ArrayList, and that problem was gone. However, the reason that I originally implemented it as a LinkedList was because I also needed to trim my arrays often, and the trimming needed to be done with the first elements in the list. What happens with ArrayList implementation is that whenever the first element is deleted, all the subsequent elements are shifted up, and it also takes a long time with ArrayList.

So, here is the dilemma: with an ArrayList, random access is fast and removing elements at the start of the list is extremely slow. And with a LinkedList, random access is extremely slow while removing elements at the start is fast. But I need both! Is anyone aware of some (perhaps custom) List implementation which accomplishes both purposes? Thanks.
 
Christian Gossart
Ranch Hand
Posts: 34
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

Did you check Commons Collections and especially TreeList ?
 
steve souza
Ranch Hand
Posts: 862
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I have never used it but trove is worth a look.

http://trove4j.sourceforge.net/
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The first question that occurs to me:

How often is the LinkedList content changed compared to the frequency of access?

If there are numerous accesses between changes, it would be reasonable to use the LinkedList toArray() method to create an array for access after every modification. Then the lookup would be practically instantaneous.

My instinct is that this would be worth the trouble if only a few dozen accesses occurred between modifications.

Bill
 
Pat Farrell
Rancher
Posts: 4678
7
Linux Mac OS X VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What do you mean you want to optimize 'both'?

Your usage should be biased towards one or that other, or at least have some characterists that bias the choice.

For example, if you need to insert values in the middle, but only rarely, then you can use a simple ArrayList with a copy/shuffle logic to do insertions in the middle. But if you need to add lots of records in the middle, that is a loser.

Have you considered using a HashMap with an Integer key?
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24215
37
Chrome Eclipse IDE Mac OS X
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Another thing to think about: are you sure you need random access? Are your get() calls mostly inside loops? Can you just iterate over the list instead (much cheaper!)
 
John Smith
Ranch Hand
Posts: 2937
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How often is the LinkedList content changed compared to the frequency of access? If there are numerous accesses between changes, it would be reasonable to use the LinkedList toArray() method to create an array for access after every modification. Then the lookup would be practically instantaneous.

The access/modification ratio is very large. The list is typically accessed hundreds of times before another element is added to the list. I suppose I could do toArray(), but that would take a lot of extra time and memory, since the list is huge. In fact, it's so huge that I need to trim it periodically to avoid running out of memory.

are you sure you need random access? Are your get() calls mostly inside loops? Can you just iterate over the list instead (much cheaper!)

Yes, the get() calls are mostly inside loops, and no, I have not thought about iterating. I'll try it to see how it changes profiling results.

Did you check Commons Collections and especially TreeList?

That sounds like what I was looking for. It says that all insertions and removals are O(log n), and get() is only slightly slower than ArrayList:

 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sounds like a variant of an ArrayList could be useful, where the first element of the list doesn't have to be the first element in the array. When you delete the first element, it would just remember that the first element now is at index 1 instead of index 0.

I don't know whether such an implementation already exists, but it doesn't sound too hard to implement.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Annoying buzzer noise
but that would take a lot of extra time and memory, since the list is huge.

Did you actually measure this. My money says it will be a LOT faster than you think. Take a look at the LinkedList actual code


Memory used will only be one reference per entry, it doesn't duplicate the object. I am betting you will find it very fast to convert and the speedup in lookup well worth the effort.

Bill
 
Mike Simmons
Ranch Hand
Posts: 3090
14
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
[John Smith]: I looked at the implementation, and sure enough, it was a large LinkedList which was accessed quite often. So, if I have a million elements in my list, and wanted to get the last element, it would take 1 million iterations to get to that element.

Not quite. If you get the last element, theLinkedList will traverse the list from the end of the list (the header has a link to the tail via the previous field) and so you can get to the end right away. It's when you access elements in the middle of the list that things get slow. The LinkedList will iterate from whichever end is closest, but if you're far from either end, it's slow.

If indeed most of your accesses are in loops, then using an Iterator on a LinkedList should be a huge improvement for you, as EFH says. I think it's likely that no further improvement will be needed, but just in case:

[Ilja]: Sounds like a variant of an ArrayList could be useful, where the first element of the list doesn't have to be the first element in the array. When you delete the first element, it would just remember that the first element now is at index 1 instead of index 0.

I agree; that's what I was thinking too. It would also be similar to a circular buffer. This would work well for inserts and deletes at either end, but not so well for inserts or deletes in the middle. Same as for an ArrayList, really.
[ June 26, 2008: Message edited by: Mike Simmons ]
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!