Prerequisites for learning Hadoop?

 
Ranch Hand
Posts: 35
Can someone tell me what the prerequisites are for learning Hadoop? I know Java and C++ well. Does knowledge of these, or of any other languages, help in learning Hadoop?

Thanks.
 
Bartender
Posts: 2407
From What Is Apache Hadoop:

Working directly with Java APIs can be tedious and error prone. It also restricts usage of Hadoop to Java programmers. Hadoop offers two solutions for making Hadoop programming easier.

Pig is a programming language that simplifies the common tasks of working with Hadoop: loading data, expressing transformations on the data, and storing the final results. Pig's built-in operations can make sense of semi-structured data, such as log files, and the language is extensible using Java to add support for custom data types and transformations.

Hive enables Hadoop to operate as a data warehouse. It superimposes structure on data in HDFS and then permits queries over the data using a familiar SQL-like syntax. As with Pig, Hive's core capabilities are extensible.
Choosing between Hive and Pig can be confusing. Hive is more suitable for data warehousing tasks, with predominantly static structure and the need for frequent analysis. Hive's closeness to SQL makes it an ideal point of integration between Hadoop and other business intelligence tools.


There are also plenty of video tutorials and other learning material on the Cloudera site.
 
Greenhorn
Posts: 14
Hadoop deals with the analysis of big data. Since the tool is built in Java, knowing an object-oriented programming language is an added advantage. Beyond that, one should be comfortable with the concepts of web analytics, data analysis, data warehousing, and distributed computing.
 
David Payne
Ranch Hand
Posts: 35

Sachin rakesh wrote: Hadoop deals with the analysis of big data. Since the tool is built in Java, knowing an object-oriented programming language is an added advantage. Beyond that, one should be comfortable with the concepts of web analytics, data analysis, data warehousing, and distributed computing.



Can you recommend some books on these subjects? Is there a Hadoop book that covers all of these topics?
 
Sachin rakesh
Greenhorn
Posts: 14
Try Hadoop: The Definitive Guide by Tom White. The book is aimed at newcomers to Hadoop. You can also reach Hadoop experts by posting your questions in a Hadoop forum.
Happy learning.
 
David Payne
Ranch Hand
Posts: 35

Sachin rakesh wrote: Try Hadoop: The Definitive Guide by Tom White. The book is aimed at newcomers to Hadoop. You can also reach Hadoop experts by posting your questions in a Hadoop forum.
Happy learning.



Right now, all I know is the Java part of Hadoop. Approximately how much time would it take to learn Hadoop and become proficient enough to do entry-level "company projects"?
 
David Payne
Ranch Hand
Posts: 35
bounce
 
Sheriff
Posts: 17644
David,

Please don't post your questions in multiple forums. This question is essentially a repeat of this: https://coderanch.com/t/584427/java/java/much-time-approx-it-learn
 
Greenhorn
Posts: 2
I want to learn Hadoop, but I don't know any object-oriented language such as Java or a .NET language. Is it necessary to learn one of these first? I'm planning to learn .NET first. Is that the right approach? Also, what should I do to learn data warehousing?
 
clojure forum advocate
Posts: 3479
Personally, I would say that you don't need to know an OOP language to start coding for Hadoop. In fact, not knowing one can be an advantage!

Why?

Because big data crunching is about processing massive streams of data: filtering, piping, and aggregating. Functional programming languages are a natural fit for this. In functional programming languages you work with data structures, lazy evaluation, and functions.

When you use an OOP language for big data work, you get the same feeling you get when trying to bridge the gap between a database and objects.
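As a rough illustration of the filter, pipe, and aggregate style described above, here is a minimal Python sketch built from lazy generators. It is not Hadoop code, and the names (`parse`, `only_server_errors`, `total_bytes_by_status`) and the "status bytes" log format are made up for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[str, int]


def parse(lines: Iterable[str]) -> Iterator[Record]:
    """Lazily turn 'status bytes' text lines into (status, bytes) pairs."""
    for line in lines:
        status, size = line.split()
        yield status, int(size)


def only_server_errors(records: Iterable[Record]) -> Iterator[Record]:
    """Lazily keep only records whose status code starts with '5'."""
    return (record for record in records if record[0].startswith("5"))


def total_bytes_by_status(records: Iterable[Record]) -> Dict[str, int]:
    """Aggregate step: sum the byte counts per status code."""
    totals: Counter = Counter()
    for status, size in records:
        totals[status] += size
    return dict(totals)


# Hypothetical sample input; nothing is read until the aggregate step pulls it.
log_lines = ["200 512", "500 128", "503 64", "500 32", "404 16"]
result = total_bytes_by_status(only_server_errors(parse(log_lines)))
print(result)
```

Each stage is a plain function over a stream of records, so stages can be chained or swapped without any class hierarchy, which is the point the post is making.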
 
Hussein Baghdadi
clojure forum advocate
Posts: 3479
That said, you can use Hadoop with many programming languages, not only Java.
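One way this works in practice is Hadoop Streaming, which lets any program that reads lines from standard input and writes tab-separated key/value lines to standard output act as a mapper or reducer. The sketch below is a word count in that style, written as plain Python functions so it can be tried without a cluster. The function names and the `run_locally` helper are hypothetical, and the `sorted` call only stands in for the sort that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby
from typing import Iterable, Iterator, List


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_pairs: Iterable[str]) -> Iterator[str]:
    """Sum the counts for each word; input must already be sorted by key."""
    keyed = (pair.split("\t", 1) for pair in sorted_pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines: Iterable[str]) -> List[str]:
    """Simulate map -> sort -> reduce in a single process (illustration only)."""
    return list(reducer(sorted(mapper(lines))))


counts = run_locally(["Hadoop runs MapReduce", "hadoop stores data in HDFS"])
for line in counts:
    print(line)
```

In a real streaming job the mapper and reducer would be separate scripts that read `sys.stdin` and print to standard output; keeping them as functions here just makes the logic easy to check locally.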
 
momin shakeeb
Greenhorn
Posts: 2
Thank you, Hussein Baghdadi, for your kind reply. I would also like to know how to learn the 'web analytics, data analysis, data warehousing, and distributed computing' that are said to be necessary for Hadoop. What should I do to learn these things? Does SQL Server 2008 cover any of them, or do I have to study Oracle DBA, or something else? I don't know.
 