posted 15 years ago
When you say "HTML", do you actually mean static HTML files? If so, Apache Nutch, which crawls pages and builds a search index from them, might do the trick. If it's actually dynamic content (generated from a DB or a CMS, say), then indexing the underlying data with Lucene might be a good choice, or you could use a native search: a SELECT in the case of a DB, or the CMS's built-in search API.
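To give an idea of the "native search" option, here is a minimal sketch of a SELECT-based search, assuming the content lives in a relational table. It uses Python's built-in sqlite3 module purely for illustration; the `pages` table, its columns, and the helper names `build_demo_db` and `search_pages` are hypothetical, not something from your setup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'pages' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO pages (title, body) VALUES (?, ?)",
        [
            ("Home", "Welcome to the site"),
            ("Search tips", "How to search the archive"),
            ("Contact", "Send us a message"),
        ],
    )
    return conn


def search_pages(conn, term):
    """Return titles of pages whose title or body contains the term."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    pattern = f"%{escaped}%"
    # Bound parameters keep the user's term out of the SQL text itself.
    rows = conn.execute(
        "SELECT title FROM pages "
        "WHERE title LIKE ? ESCAPE '\\' OR body LIKE ? ESCAPE '\\' "
        "ORDER BY id",
        (pattern, pattern),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(search_pages(conn, "search"))
```

A LIKE query such as this scans every row and does no ranking or stemming, so it only suits small data sets. That limitation is the usual reason to move to a real index such as Lucene.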