
text extractors from web pages

Ali Khalfan
Ranch Hand
Posts: 129

I'm trying to implement a search engine over selected articles from all around the web. The thing is, to make the search work I'll need to extract the right content from each page's source: no JavaScript, no Flash, no header, no footer, etc. Just the actual sentences of the article.

Does anyone know of an API that could do this? I looked at LingPipe (and even used it), but it is fairly resource-hungry and takes up a lot of space.

So if I were to extract anything from this page, it should be just what I wrote and the replies, as well as the subject (not the header above, or the URL for the deer with one eye).
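
To make the requirement concrete, here is a rough sketch of the kind of filtering described above, using only java.util.regex. The class and method names (TextExtractor, extractText) are made up for illustration, and so is the list of tags treated as non-content. It is a crude regex pass, not a real HTML parser, so it will likely trip over malformed or deeply nested markup; a proper HTML parser library such as jsoup would probably be more robust.

```java
import java.util.regex.Pattern;

// Hypothetical helper: strips markup and obvious non-content blocks from HTML.
final class TextExtractor {
    // Elements whose whole content is assumed NOT to be article text.
    // This tag list is an assumption; adjust it for the pages being indexed.
    private static final Pattern HIDDEN_BLOCKS = Pattern.compile(
        "(?is)<(script|style|noscript|head|header|footer|nav)\\b[^>]*>.*?</\\1\\s*>");
    private static final Pattern COMMENTS = Pattern.compile("(?s)<!--.*?-->");
    private static final Pattern TAGS = Pattern.compile("<[^>]+>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String extractText(String html) {
        // 1. Drop HTML comments so markup inside them is not seen as content.
        String s = COMMENTS.matcher(html).replaceAll(" ");
        // 2. Drop script/style/header/footer blocks together with their content.
        s = HIDDEN_BLOCKS.matcher(s).replaceAll(" ");
        // 3. Drop the remaining tags but keep the text between them.
        s = TAGS.matcher(s).replaceAll(" ");
        // 4. Decode a handful of common entities (far from a complete list);
        //    &amp; goes last so "&amp;lt;" is not decoded twice.
        s = s.replace("&nbsp;", " ")
             .replace("&lt;", "<")
             .replace("&gt;", ">")
             .replace("&quot;", "\"")
             .replace("&amp;", "&");
        // 5. Collapse runs of whitespace into single spaces.
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }
}
```

Calling TextExtractor.extractText(pageSource) on a downloaded page would give one plain string that could be fed to the indexer. It only removes structural noise by tag name, so it cannot tell a sidebar in a plain div from the article body; that part would still need per-site rules or a smarter extractor.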