
Scheduled Servlet

 
Greenhorn
Posts: 3
I have a project that requires (I think) a servlet that runs more or less forever. Every night, say at midnight, it needs to fetch the last-modified date of each page on our intranet site and check whether any page is more than 30 days old. Is that possible? If so, how would it affect the performance of the web server (WebSphere 3.5), which also hosts other applications, and how could it be implemented?
Thanks in advance.
 
Author and all-around good cowpoke
Posts: 13078
You want a separate class (not a servlet) with its own Thread to do this. You can use a servlet to get it started and to check on its status, and maybe to pick up the report it generates.
To ensure that the impact on the server is minimal, just give the Thread the lowest priority.
Obviously this utility class should be designed with a "Singleton" pattern.
Bill
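A minimal sketch of the arrangement Bill describes, assuming a modern JVM (Java 8 or later, newer than anything WebSphere 3.5 shipped with). The class name PageAgeChecker and its methods are hypothetical, chosen only for illustration. A servlet's init() could call start(...), and its doGet() could report getStatus().

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: a singleton that owns one low-priority background thread.
final class PageAgeChecker {
    private static final PageAgeChecker INSTANCE = new PageAgeChecker();

    private final AtomicReference<String> status = new AtomicReference<>("idle");
    private Thread worker; // guarded by "this"

    private PageAgeChecker() {}

    static PageAgeChecker getInstance() {
        return INSTANCE;
    }

    /** Starts the job on a background thread; returns false if one is still running. */
    synchronized boolean start(Runnable job) {
        if (worker != null && worker.isAlive()) {
            return false;
        }
        status.set("running");
        worker = new Thread(() -> {
            try {
                job.run();
                status.set("finished");
            } catch (RuntimeException e) {
                status.set("failed: " + e);
            }
        }, "page-age-checker");
        worker.setPriority(Thread.MIN_PRIORITY); // keep the impact on other work low
        worker.setDaemon(true);                  // do not block server shutdown
        worker.start();
        return true;
    }

    /** Status string a servlet could display. */
    String getStatus() {
        return status.get();
    }

    /** Waits up to the given time for the current job; returns true if no job is running. */
    boolean awaitIdle(long millis) {
        Thread t;
        synchronized (this) {
            t = worker;
        }
        if (t == null) {
            return true;
        }
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

Thread priority is only a hint to the scheduler, so how much the lowest priority actually helps depends on the JVM and operating system.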
 
Sheriff
Posts: 7001
I think what you are describing is really a special case of a more general "web portal" or "smart cache". If you consider it in this light, there are a few important aspects of the problem that I think you should also be considering.
The first important issue is scaling the problem. You mention that your scheduled process should run at midnight, so it's probably not much use if it takes 12 hours to run!
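As a rough illustration of the "run at midnight" part, here is one way it might look on a modern JVM (Java 8 or later), using ScheduledExecutorService from the standard library. The class and method names are hypothetical and nothing here is specific to WebSphere.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for running a job once a night.
final class MidnightScheduler {

    /** Time remaining from "now" until the next local midnight. */
    static Duration untilNextMidnight(LocalDateTime now) {
        LocalDateTime nextMidnight = now.toLocalDate().plusDays(1).atStartOfDay();
        return Duration.between(now, nextMidnight);
    }

    /** Runs the job at the next midnight and then every 24 hours. */
    static ScheduledExecutorService scheduleNightly(Runnable job) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "nightly-check");
                    t.setDaemon(true); // do not keep the server alive on shutdown
                    return t;
                });
        long initialDelay = untilNextMidnight(LocalDateTime.now()).toMillis();
        long period = TimeUnit.DAYS.toMillis(1);
        scheduler.scheduleAtFixedRate(job, initialDelay, period, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A fixed 24-hour period drifts by an hour across daylight-saving changes; recomputing the delay after each run would avoid that. If one run takes longer than the period, scheduleAtFixedRate starts the next run late rather than overlapping it.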
To decide how to handle the fetching of the last-modified date, you should consider how many pages you have to examine, how your software knows which URLs to look at, how long each one might take to return the information, and what to do if any of the pages are unavailable or slow.
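For a single page, the fetch itself can be small. The sketch below assumes the intranet server sends a Last-Modified header and answers HEAD requests, neither of which the thread confirms. The class name, the 5-second timeouts, and the method names are illustrative choices only.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: reads one page's Last-Modified header and checks its age.
final class LastModifiedProbe {
    static final Duration MAX_AGE = Duration.ofDays(30);

    /** Returns the Last-Modified time, or null if the server did not send one. */
    static Instant fetchLastModified(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD"); // headers only, no page body
            conn.setConnectTimeout(5_000); // assumed limits so a dead server
            conn.setReadTimeout(5_000);    // cannot stall the whole run
            long millis = conn.getLastModified(); // 0 when the header is absent
            return millis == 0 ? null : Instant.ofEpochMilli(millis);
        } finally {
            conn.disconnect();
        }
    }

    /** True if lastModified is more than maxAge before now. */
    static boolean isOlderThan(Instant lastModified, Instant now, Duration maxAge) {
        return lastModified.isBefore(now.minus(maxAge));
    }
}
```

Dynamically generated pages often omit Last-Modified, so a null result needs its own handling (report it, or treat it as unknown) rather than being counted as old or fresh.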
Do you plan to keep a separate list of URLs to query? If so, how will that list be derived - manually, or via some sort of "web-spider"? If you don't have such a list, then your little "last modified" engine will also need to be responsible for fetching and parsing all the pages and extracting links to the others. This could really slow it down and add dangerous complexity.
However you design your fetching process, the majority of its time will probably be spent waiting for remote servers to return pages. To speed up the overall process you should consider running multiple "fetch" threads at once. Several threads waiting doesn't take much more CPU horsepower than one thread waiting.
Running multiple parallel threads really helps if any of the URLs are unavailable or slow. Waiting for a few potentially long timeouts can cause big problems in a single-threaded solution. Losing one thread for a bit while the others keep on working is only a minor issue.
I'm always wary of the Singleton pattern, as it can be very limiting if used carelessly. I would probably consider a more flexible solution (some sort of "fetcher factory", or a pool of worker threads maybe). And don't forget to make sure that whatever Collection class you are using to gather the results is thread-safe enough to allow multiple fetchers to populate it in parallel.
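One possible shape for the pool of worker threads and the thread-safe results collection, sketched with java.util.concurrent (Java 8 or later). The names ParallelFetcher and fetchAll are hypothetical, and the per-URL fetch is passed in as a function so the sketch does not assume how pages are actually retrieved.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper: runs one fetch per URL on a fixed pool of worker threads.
final class ParallelFetcher {

    /** Returns results keyed by URL; URLs that fail or return null are left out. */
    static <R> Map<String, R> fetchAll(List<String> urls,
                                       Function<String, R> fetchOne,
                                       int threads,
                                       long timeoutSeconds) {
        // ConcurrentHashMap lets several workers store results at the same time.
        Map<String, R> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String url : urls) {
            pool.execute(() -> {
                try {
                    R value = fetchOne.apply(url);
                    if (value != null) { // ConcurrentHashMap rejects null values
                        results.put(url, value);
                    }
                } catch (RuntimeException e) {
                    // One bad URL must not stop the others; it is simply absent.
                }
            });
        }
        pool.shutdown(); // accept no new work, let queued fetches finish
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // give up on fetches that are still stuck
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

With this layout a slow or unavailable page ties up one worker while the rest keep going, and the overall timeout bounds how long the nightly run can take.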
 