programming forums Java Java JSRs Mobile Certification Databases Caching Books Engineering OS Languages Frameworks Products This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
Sheriffs:
Saloon Keepers:
Bartenders:

# Search Algorithm again

Ranch Hand
Posts: 196
I've scanned this entire board and have found no suggestion on any search algorithm other than Linear Search. (except indexing or caching that are both well beyond this assignment)
Has anyone implemented or may suggest search algorithm other than Linear Search???
Sun gives 23 points for it, this makes me think there might be something else besides linear we can do to improve search.
Thanks
P.S. The only thing I can think of is that if fields in DataInfo had been pre-sorted your search time would have been O(N) * O(log(M)), where N=# of records, M=# of fields. But since they are not sorted sorting them takes O(Mlog(M)) time so total time becomes O(N*(Mlog(M) + Clog(M))) (C=#of criteria) which in O notation is equal to O(N*Mlog(M)) which is still worse than O(M*N) that linear search gives you. Am I right?

Peter den Haan
author
Ranch Hand
Posts: 3252
The points are, I think, mainly awarded for the way you executed the parsing of the search string in criteriaFind and the appropriate delegation of the actual search to other methods. You are certainly not supposed to do anything other than linear searching (or let me put it differently: you won't lose points because your search is dumb).
Your estimates don't look correct. In criteriaFind you're giving the field names along with the values they're supposed to have, so M doesn't enter the equation at all (at most, the number of criteria fields does). In a real database, the optimiser would try to do the most discriminating search (field) first that had an index. The time taken would be O(log(N)) for a Bayer tree index, O(1) for a hash index. Provided the result set from this first field is small enough the time taken for the remainder of the search would be small.
- Peter

Ranch Hand
Posts: 196
Peter,
are you saying that Data (or subclass) should create the index of fields when it starts up?
Only then M does not enter the O-equation. That makes sense.

Peter den Haan
author
Ranch Hand
Posts: 3252
What I was trying to say is that, for the assignment, please don't bother. If anything Sun might mark your scores down because of the additional complexity introduced.
The discussion of performance estimates was a sideline only, and not relevant to the SCJD assignment. M doesn't enter the equation because you don't have to search through the fields at all. It would enter the equation if you could do a search like "any database field that reads 'foobar'". But criteriaFind() doesn't allow this.
- Peter

Ranch Hand
Posts: 196
OK, I hate to beat a dead horse, but I just dont get it.
You are saying you dont hae to go through all the fields to find the one that you need??? How are you going to find it then?
You can use Map to index fields, but in this case, you are just making the Map do the work, but you are still going through all fields to find the one you need.
If you had said that M doesnt enter equation because M doest grow I would have agreed, but I just dont understand what you mean.
[This message has been edited by Gennady Shapiro (edited November 16, 2001).]

Peter den Haan
author
Ranch Hand
Posts: 3252
When you parse the criteria, you can map the field names to field numbers (O(M)) (or if you wanted to tune the hell out of it, the offset inside the record). Then you can access the fields directly during the actual search (O(N)). Of course the O(M) time taken in the first part is utterly insignificant compared to the second.
- Peter