Can you give me direction on how to achieve this with the Hadoop ecosystem?

 
Baran Bismo
Greenhorn
Posts: 1
Hello fellows,
My company (an insurance firm) wants to stay informed about what is going on around the world as far as insurance is concerned. They want me to set up a system that gathers links to news articles from various online newspapers worldwide about "social insurance", "public health", "cancer" and some other keywords on a daily basis. We already have a 30-node Hadoop cluster set up. So far I have examined Nutch and Solr. My first question: do you think I can achieve this with these tools? https://wiki.apache.org/nutch/FrontPage
Also, when the system fetches a link, how will I know it is today's news? My boss wants fresh news brought in front of him and published daily. How can I differentiate yesterday's news from today's?
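To make the freshness requirement concrete, here is a minimal sketch of the kind of check being asked about. It assumes each fetched article has already been reduced to a record with a `url` and an ISO 8601 `published` timestamp (for example, taken from page metadata or a feed entry). The record layout and the function names `is_todays_news` and `keep_fresh` are hypothetical and are not part of Nutch or Solr.

```python
from datetime import date, datetime, timezone


def is_todays_news(published_iso, today=None):
    """Return True if the ISO 8601 timestamp falls on `today`, compared in UTC."""
    if today is None:
        today = datetime.now(timezone.utc).date()
    published = datetime.fromisoformat(published_iso)
    # Assumption: timestamps without an offset are treated as UTC.
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)
    return published.astimezone(timezone.utc).date() == today


def keep_fresh(articles, today=None):
    """Keep only records that carry a publication date equal to `today`."""
    return [
        a for a in articles
        if a.get("published") and is_todays_news(a["published"], today)
    ]


# Hypothetical sample records; a real crawl would supply these.
articles = [
    {"url": "http://example.com/a", "published": "2015-06-02T08:30:00+00:00"},
    {"url": "http://example.com/b", "published": "2015-06-01T23:10:00+00:00"},
    {"url": "http://example.com/c", "published": None},
]
fresh = keep_fresh(articles, today=date(2015, 6, 2))
```

Records without a usable publication date are dropped here; whether to drop them or fall back to the time of first fetch is a design choice that depends on how reliable the sources' metadata is.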
Can you give me some direction? Thanks in advance.
 
Rajesh Nagaraju
Ranch Hand
Posts: 63
Hi Baran,

How did you proceed?

Thanks and Regards
Rajesh Nagaraju
 