GeeCON Prague 2014*

Hadoop - Why should/shouldn't I pick it up?

Ashwin Sridhar
Ranch Hand

Joined: Jul 09, 2011
Posts: 272

Hi Alex,

Just when I was thinking about moving to a NoSQL database, along comes a promotion for a book on one of them. I couldn't ask for better timing.

I have the following questions for you:

  • How do you compare a NoSQL database like Hadoop with an RDBMS like Oracle? Is migration from an RDBMS to Hadoop explained in the book?
  • There are a number of NoSQL databases available. How should someone very new to NoSQL pick one? How does Hadoop prove its worth compared with Cassandra, MongoDB, Bigtable, etc.?
  • Which areas does Hadoop not yet address compared with an RDBMS?
  • Could you comment on the stability of the database, and the extent to which this is covered in your book?


Ashwin Sridhar
SCJP | SCWCD | OCA

    Alex Holmes
    Author
    Greenhorn

    Joined: Oct 19, 2012
    Posts: 21

    Hi Ashwin,

    I wouldn't put Hadoop in the same camp as NoSQL technologies. For the most part NoSQL technologies tend to be real-time, whereas Hadoop is batch-based and excels at ETL and data warehouse (DW) use cases. As for which NoSQL solution to pick, that's a tough choice, as there doesn't seem to be a clear winner in the marketplace at the moment. Having said that, Cassandra, MongoDB and HBase have distinctive traits which will likely push you to one of them based on how you intend to access your data. I'm not an expert on these systems so I won't attempt to push one over the other, but after you do some research I think it'll become apparent which one will work best for you.

    Relational systems are quite different from Hadoop, and not only from the real-time/batch perspective. Hadoop isn't a transactional system, but it was architected from the ground up to scale, so you can typically work with much larger data sets than you can with monolithic database systems. Hadoop is also great at joining structured and unstructured data together, and for data aggregations and summarizations. You can also use tools like Mahout for predictive analytics.
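
    To make the aggregation point a little more concrete, here is a minimal sketch of the map-then-reduce style of batch summarization, written in plain Java streams. It is not Hadoop API code and does not run on a cluster; the class name BatchAggregationSketch, the method countTokens, and the sample log lines are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: plain Java, no Hadoop dependency.
public class BatchAggregationSketch {

    // "Map" step: split every raw line into lowercase tokens.
    // "Reduce" step: group identical tokens and count each group.
    static Map<String, Long> countTokens(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.groupingBy(
                        token -> token,        // key to group by
                        TreeMap::new,          // sorted map for stable output
                        Collectors.counting()  // aggregate per key
                ));
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for a batch of unstructured log lines.
        List<String> lines = List.of(
                "GET /index.html",
                "GET /about.html",
                "POST /login");
        System.out.println(countTokens(lines));
    }
}
```

    A real Hadoop job would split the same two steps into separate mapper and reducer classes and spread the work over many machines, which is where the scaling described above comes from.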

    Hope this helps answer some of your questions.

    Thanks,
    Alex

    Author, Hadoop in Practice, http://www.manning.com/holmes/
    Blog at http://grepalex.com/