• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
  • Mikalai Zaikin

running a large number of "counting QL" on a noSQL data store. Algorithm or a Best practice needed

Posts: 2
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hello Everyone

before we start let me give you some information about our environment:
* it is written fully in Java/J2EE.
* it is developed to be deployed on GAE "Google App Engine"
* its GUI is developed by GWT.
* our problem is in a core development issue.

Here is my problem,
* i am building a web application where users "online" can search for listings in this website.
* first please open the web site careerbuilder.com and search for any keyword e.g. "Accounting".
* a page will be opened , [Narrow Search] has a way to allow you go to your target job easier "lets call this a filter" ,lots of jobs down there.
* search filter includes sub-filters [Category , Company , City , State ].
* each sub-filter has many cases or options. like for "State has (California ,Iowa , Kansas , ...etc)" beside each one of them is the number of jobs that matches your current filter/sub-filter selection. you will find it between brackets i.e. (23)

Now we want to allow this filter functionality and we want to make it fast.
making a count query for each sub-filter option is going to be an effective idea.

kindly keep in mind that:
* users can add/remove listing.
* also listings can expire.
* number of sub-filters are higher for us "can reach 20".
* each sub-filter has between 2 and 200 options.

we are searching for the best practice or a suggestion of an algorithm or whatever to solve this problem.

here are 2 options we have reached so far:
1) building a statistics table to save these results in it, then update it each time listings number is changed , also keep a nightly background job to recalculate results. and we can show number of results directly from this table.

2) build a tree data structure to be loaded on memory and saved in a table each time it is updated. this tree contains the resulting numbers of listings in each option of sub-filters.

even though i still think this is not enough !!!
can anyone suggest a better idea?
all comments, questions, suggestions are very welcomed.

Mohammad S.
    Bookmark Topic Watch Topic
  • New Topic