• Post Reply Bookmark Topic Watch Topic
  • New Topic

How to search for index quickly

 
Tim Berett
Ranch Hand
Posts: 35
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I need to implement something like a running order list to store and retrieve unique Strings, e.g.

1, Orange
2, Apple
3, Pear
4, Durian

If I want to know the index for Pear, it should return me a 3. Subsequently, if I delete Apple, it should return a 2 and 3 when I search for Pear and Durian respectively. Any idea how I can implement a extremely fast algo to get back the indexes as I have millions for records? Thanks!
 
John Dell'Oso
Ranch Hand
Posts: 130
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Tim,

Have a look at the List type classes - ArrayList or LinkedList. These classes have the necessary functionality to suit your requirements. However, if you're talking about millions of "records", you may run into resource and/or performance issues. Without knowing all the details of what you're trying to do, this sort of requirement suggests the use of a database of some sort.

Regards,
JD
 
marc weber
Sheriff
Posts: 11343
Java Mac Safari
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I think this is a bit advanced for the beginners forum, so I'm promoting it to the intermediate forum. Please continue there.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What an interesting problem! Tell us a bit more about this list:
1. Does it start small and get added to as your program runs or does you program get handed a big initial list?
2. Does it get only deletions or are there also additions to the list?
3. From your example it looks like there is no ordering to the Strings - correct?
4. How frequent are changes to the list compared to queries for list position? In other words, will there be periods when the list is stable?
5. Are queries completely random or are there some lookups repeated frequently, justifying some sort of cache approach?

Bill
 
Nicholas Jordan
Ranch Hand
Posts: 1282
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This question strikes to the core of good practices for Performance.

Consider the following:



Mr.Brogden asks the full list of questions you will need to provide us information for before we can address the issue productively.
 
steve souza
Ranch Hand
Posts: 862
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You would have to answer bills questions to get a definitive answer, but if you only care about about the index number when you do a 'get' you may be able to do a binary search and count the steps that it takes to get to your destination. That assumes an ordered collection.

say your collection has 1,000,000 entries. You start by looking at element 500,000. If that is less than your value you would take element 750,000 (if less then you would take element 250,000), and keep halving the interval until you reach your destination. The number you land at is the correct number. If you delete or add any elements the number will adjust itself.
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by steve souza:
You would have to answer bills questions to get a definitive answer, but if you only care about about the index number when you do a 'get' you may be able to do a binary search and count the steps that it takes to get to your destination. That assumes an ordered collection.

say your collection has 1,000,000 entries. You start by looking at element 500,000. If that is less than your value you would take element 750,000 (if less then you would take element 250,000), and keep halving the interval until you reach your destination. The number you land at is the correct number. If you delete or add any elements the number will adjust itself.


Collections.binarySearch will do exactly that.
 
Nicholas Jordan
Ranch Hand
Posts: 1282
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Ilja Preuss:


Collections.binarySearch will do exactly that.


On an un-sorted Set or List ? I thought one had to use a Map or Hashtable or TreeMap to get really heavy reductions in search times. If not, then the choice of these as a data-storage model would be on the basis on needing lookup by key as opposed to direct access. Further, sorting by Natural Order would have to have some sort of internal override in Collections generally to avoid re-ordering of elements to achieve binary search capabilities. If not, then we have discovered a new type of wheel and should report the matter immediately to Krakatoa east of java

Most established texts on computer science render your statement adventurous.
{ my questioning of this is   NOT   to hijack the thread ... it is the central quesion of the original poster. }
[ July 29, 2007: Message edited by: Nicholas Jordan ]
 
rajesh bala
Ranch Hand
Posts: 66
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

1. How insert/delete heavy is your datastructure.

ArrayList is the best for getting the index for such operations. However, NOTE that in arraylist if you delete or insert an item in the middle, it has to move the entire set below using System.arrayCopy(). So this might turn out to be expensive. So if you could tell us how frequently you insert or delete in this index, it would be helpful.

~Rajesh.B
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Nicholas Jordan:

On an un-sorted Set or List?


I responded to steve, who specifically wrote


That assumes an ordered collection.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I keep hoping the original poster will come back and explain more about the problem
Possible approaches keep occuring to me but they all depend on the answers to the questions I originally posted.

Bill
 
Ilja Preuss
author
Sheriff
Posts: 14112
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by William Brogden:
I keep hoping the original poster will come back and explain more about the problem
Possible approaches keep occuring to me but they all depend on the answers to the questions I originally posted.


Ditto here! It sounds like an interesting problem, but all the speculation is kind of tiresome...
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!