
Techniques For Paging/Caching Search Results

 
Sean MacLean
author
Ranch Hand
Posts: 621
Hello All! The discussion on Form Validation inspired me to ask for some opinions on 'paging' techniques. What I'm referring to here is extremely common, but for the life of me I can't seem to find any white papers, tutorials, or articles on the subject.
Here's the scenario:
1) a user sends a query (ie. search parameters ) to a servlet
2) the servlet returns a large list of results based on the search parameters
3) each result row is, in itself, a link to a details page
Now, here are some common features seen
1) These results (if there are a lot) are 'paged' [i.e. 1-10, 11-20, etc.] - I know we've all seen this - we're looking at it right now! But how do people implement it?
2) If you click on an individual result to see a detail page, you can often scan to the previous and next results of your custom search (i.e. your results are stored somewhere or regenerated)
Here are a few issues that I've run across
1) Should you store the search result (faster, but more resource-intensive) or re-do the search (sluggish for the end user, but easier to implement) whenever the user moves from one page to the next?
2) If you store the result, is using a serialized object stored in the session too much overhead (memory footprint, etc.)? Or maybe a search result bean (see the sketch below)?
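For concreteness, by 'search result bean' I mean something like this rough sketch (the class and field names are just hypothetical - the point is to store only ids and display strings, never the full rows):

import java.io.Serializable;
import java.util.List;

// Hypothetical bean: holds only the trimmed result rows plus the
// original query, so the session footprint stays small.
public class ResultPageBean implements Serializable {
    private final String query;       // the original search parameters
    private final List<String> ids;   // trimmed result rows (ids only)

    public ResultPageBean(String query, List<String> ids) {
        this.query = query;
        this.ids = ids;
    }
    public String getQuery()     { return query; }
    public List<String> getIds() { return ids; }
}

// In the servlet:
// session.setAttribute("result", new ResultPageBean(query, ids));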
I realize this is a rather large topic, but it would be great to hear how others have addressed this issue.
Sean

 
Frank Carver
Sheriff
Posts: 6920
I've seen systems which take either of the approaches you describe. In my case the design decision is usually based on the estimated size of the result set and the estimated number of concurrent users.
If you can fairly reliably assume that either the total result size or the total number of concurrent users will be small, then an in-memory cache is a good solution. Trim the data to its minimum representation (say an array of short Strings or numeric ids etc.), and pop it into the session, ideally with a fairly short timeout.
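Something along these lines, as a minimal sketch (runSearch, PAGE_SIZE, and the attribute name are just illustrative):

import java.util.List;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

// Minimal sketch of the session-cache approach: store only trimmed ids,
// set a fairly short timeout, and page over the cached list.
List<String> pageOfResults(HttpServletRequest request) {
    HttpSession session = request.getSession();
    List<String> ids = (List<String>) session.getAttribute("resultIds");
    if (ids == null) {
        ids = runSearch(request.getParameter("q")); // ids only, not full rows
        session.setAttribute("resultIds", ids);
        session.setMaxInactiveInterval(300);        // short timeout, in seconds
    }
    int page = Integer.parseInt(request.getParameter("page"));
    int from = Math.min(page * PAGE_SIZE, ids.size());
    int to = Math.min(from + PAGE_SIZE, ids.size());
    return ids.subList(from, to);                   // rows to render for this page
}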
For bigger datasets or busier sites it has really got to be the "go back to the database" approach. It works much better if you can ask the database to only retrieve (say) items 151 to 200, but not all databases support that sort of SQL. We use this approach here with Oracle and it works very well. In effect we are amortizing the cost of the query across several requests.
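With Oracle, the usual idiom is a pair of nested ROWNUM filters. A sketch, with made-up table and column names (the innermost query is the real search):

import java.sql.*;

// Ask the database for rows 151-200 only: the outer ROWNUM filter caps
// the result at the upper bound, the outermost WHERE trims the lower bound.
PreparedStatement ps = conn.prepareStatement(
    "SELECT id, title FROM " +
    "  (SELECT q.*, ROWNUM rn FROM " +
    "     (SELECT id, title FROM items ORDER BY title) q " +
    "   WHERE ROWNUM <= ?) " +
    "WHERE rn > ?");
ps.setInt(1, 200);  // upper bound, inclusive
ps.setInt(2, 150);  // lower bound, exclusive
ResultSet rs = ps.executeQuery();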
Theoretically there is the middle ground of a system which caches a certain amount of the data and goes back to the database if the data size or the user count increases beyond a threshold, but I've never seen an adaptive system like this. Please enlighten me if you have seen such a thing.
 
kiran kumar73
Greenhorn
Posts: 6
Hi!
I faced the same problem in my project. I went with re-doing the search - performing the query each time the user moves to the next page or jumps to a particular page number. I first thought of storing the search results in the session, but that became very difficult because the data is huge and we cannot predict the size of a search. So, to my knowledge, it is better to perform the search each time.
I hope this suggestion helps you solve the problem.
Bye
 
Sean MacLean
author
Ranch Hand
Posts: 621
Frank & Kiran, thanks for the great input!
I actually split the approach in one project. I stored a limited result in a session object. Within the class I created, only a small amount of data was serializable, so the footprint shouldn't be too large (I haven't got around to testing exactly how large, though). But I also included a 'pointer' to the current page. If the user exceeds the number of cached results in the searchResult class, then I re-do the search with functionality that is built into the searchResult class (so basically it stores a limited number of results and the original search parameters). The nice feature here is that you can use a global (global? pointer? - I sound like a C programmer!) variable to control how much you're caching. If you set it to 0, then you're simply doing the 'back to the DB' approach. Otherwise, you can vary the memory footprint in accordance with the user load (hey, what if you dynamically changed this variable to reflect the current load - wow, I'd better write that down).
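Here's a rough sketch of the idea (simplified and with illustrative names, not my actual class):

import java.io.Serializable;
import java.util.List;

// Split approach: keep only the first CACHE_SIZE ids plus the original
// parameters in the session; anything past the cache is fetched by
// re-running the query.
public abstract class SearchResult implements Serializable {
    // Global knob: 0 means "always go back to the database".
    public static int CACHE_SIZE = 100;

    private final String params;        // original search parameters
    private final List<String> cached;  // at most CACHE_SIZE ids

    protected SearchResult(String params, List<String> firstRows) {
        this.params = params;
        this.cached = firstRows;
    }

    public List<String> getPage(int from, int to) {
        if (to <= cached.size()) {
            return cached.subList(from, to);  // served from the session cache
        }
        return requery(params, from, to);     // fall back to the database
    }

    // Re-run the original search, fetching only rows from..to.
    protected abstract List<String> requery(String params, int from, int to);
}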
On another project, I handled the 'detail' previous and next by returning the prev/next ids when I retrieved the contents of the detail. My thinking was that, since you're going to the DB anyway, you might as well get these things while you're there. These seem to work, but they're all just things I figured out as the need arose. I'm interested to know if there is a standard approach - esp. since this is such common functionality. It still seems like a bit of a mystery to me, since some very large sites are so fast that you'd guess they are caching the results - perhaps I'm being fooled by good load-balancing and DB schemas.
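The 'fetch the neighbours while you're there' trick can be done in one query if your database supports analytic functions. A rough sketch (hypothetical table and column names; the search's own WHERE clause and ORDER BY would go inside the subquery):

import java.sql.*;

// Fetch the detail row together with its neighbouring ids using the
// LAG/LEAD window functions (Oracle 8i+ and most modern databases).
PreparedStatement ps = conn.prepareStatement(
    "SELECT id, title, body, prev_id, next_id FROM " +
    "  (SELECT id, title, body, " +
    "          LAG(id)  OVER (ORDER BY title) prev_id, " +
    "          LEAD(id) OVER (ORDER BY title) next_id " +
    "   FROM items) " +
    "WHERE id = ?");
ps.setLong(1, detailId);
ResultSet rs = ps.executeQuery();
// prev_id / next_id are NULL at the first and last rows of the ordered set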
Has anyone ever seen an article discussing this issue? Thanks again.
Sean

 