Granny's Programming Pearls
"inside of every large program is a small program struggling to get out"
JavaRanch.com/granny.jsp
Win a copy of Cross-Platform Desktop Applications: Using Node, Electron, and NW.js this week in the JavaScript forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

ArrayList searching and sorting  RSS feed

 
Jon Dornback
Ranch Hand
Posts: 137
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
i have a sticky performance question. here's the situation:
my jsp page uses a bean to connect to the database and then store the results in an ArrayList of Articles. Articles is my class that stores the title, body, database id, and order (which order to display the articles on the page). this is done for encapsulation purposes, so that anyone implementing the classes do not need to handle db connections directly. The articles in the arraylist are automatically stored in order of their rank, so that a single loop will print them out in the correct order. however, i also need to get a single article by it's db id.
the question is: would it be better to re-sort the arraylist based on the database id (requiring a comparator, etc), or simply do a linear search through the arraylist? in my situation, the arraylist will never be very big, but i'm interested in opinions on which approach is better for medium or huge lists as well.
thanks in advance,
Jon Dornback
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I would use two different collection-type objects to store references to the Articles, one for order by rank, and one for looking up by db id number. This may seem wasteful, but remember that each collection doesn't really store all the articles inside it - just a reference to each one. So if you have 1000 articles, you will have 1000 Article objects, and two collections containing a total of 2000 references to the Articles. (Not 2000 Article objects.)
For the rank attribute, it seems you need to maintain an order for iteration, but don't really need to look up by rank, so a List (ArrayList or LinkedList) might be best. If you need to look up by rank as well as keep articles in order, a TreeMap might be best. (Also good if you are adding and removing articles frequently - you don't need to re-sort the whole tree for each change.)
For the db id attribute, a HashMap is probably the best bet, as it generally offers the fastest lookup for larger collections - but it doesn't maintain order for iteration.
Naturally, the best choice for each collection will depend on how you use it. How often do you perform each type of operation, and which operations are most time-critical. Thomas Paul has written several good articles about this for our newsletter - check out this one in particular. Remember that the performance numbers can vary substantially depending on your application and environment, so what looks best on paper may not turn out to be best in practice - test for yourself.
 
Chris Smith
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This really depends on what kinds of stuff you're doing. If the search by database ID happens on average once every four or five searches, then you're probably better off simply doing a linear search when it needs to occur. It's also possible that you want to maintain a data structure that allows efficient searching by database ID.
As Jim mentioned, you can store data in multiple data structures simultaneously. There's no reason the records can't be placed into both a HashMap and an ArrayList. However, remember that the time required to build a HashMap will always be higher than the time required to do a linear search by database ID. The memory requirement will also be linear on size, where a search through an unsorted array is still constant space. So, before you start building data structures for that purpose, check that you're doing more than one search per query, on average.
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Chris Smith:
As Jim mentioned, you can store data in multiple data structures simultaneously. There's no reason the records can't be placed into both a HashMap and an ArrayList. However, remember that the time required to build a HashMap will always be higher than the time required to do a linear search by database ID. The memory requirement will also be linear on size, where a search through an unsorted array is still constant space.

Another drawback is the duplicated data: You now have to maintain two collections, to ensure consistency between them. So don't do this lightly, but only when you are sure that the improved performance will justify the increased maintenance costs.
 
Jon Dornback
Ranch Hand
Posts: 137
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
thanks for the help - great advice all around. i'm sticking with a linear search for retrieving by db id since there will only be about 20 elements at most in the ArrayList, and it will not be frequently called. it was just one of those interesting "what if this was huge...?" type questions that i wanted to look in to. thanks again.
Jon
 
Chad McGowan
Ranch Hand
Posts: 265
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you are using JDK 1.4, how about a LinkedHashMap? Use database Id as the key and insert the Articles in order of their rank. This way you maintain the original ordering and still have quick retreival using database Id.
(I haven't yet used 1.4, so there may be other considerations with this class, but it sounds like exactly what you need).
Chad
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!