You can follow the curriculum below to learn Hadoop:
1. Start with an introduction
Big Data, Hadoop, HDFS, YARN, the overall architecture, etc.
2. Set up your own local cluster on Ubuntu Server or CentOS.
You can run Hadoop in pseudo-distributed mode; a small HDFS sanity check in Java is sketched after this list.
3. Understand the Hadoop MapReduce framework
The different stages of a MapReduce job, writing MapReduce code (see the WordCount sketch after this list), use cases of MapReduce
4. Advanced MapReduce
Combiners and partitioners (a custom-partitioner sketch follows this list), map-side/reduce-side joins, using Writable and WritableComparable, etc.
5. Pig
Install Pig, learn Pig Latin: load files, process files, apply queries. Define Pig UDFs (a Java UDF sketch follows this list), Pig APIs, use cases, etc.
6. Apache Hive
Install Hive, learn HiveQL: create databases and tables, partitioned tables, joins, unions, grouping, SerDes, Hive UDFs (a Java UDF sketch follows this list), use cases, etc.
7. Data ingestion tools
Learn Flume (streaming data such as logs into Hadoop) and Sqoop (bulk transfer between relational databases and HDFS).
8. NoSQL databases
Basics and architecture of popular NoSQL databases such as MongoDB and HBase, use cases, etc.
9. Learn one NoSQL database in depth (an HBase Java client sketch follows this list).
10. ZooKeeper
Install ZooKeeper, learn the basics: the ZooKeeper data model, znode types, sequential znodes (see the sketch after this list), use cases, etc.
11. Project
Pick a dataset from the internet, apply the fundamentals you have learned, and produce some useful results.
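
Below are a few short Java sketches for the steps above. Each is a minimal sketch under stated assumptions, not a reference implementation; class names, paths, and table names are made up for illustration.

For step 2, a quick sanity check that your pseudo-distributed HDFS is reachable, using the HDFS FileSystem API. The fs.defaultFS address and the /user/hadoop path are assumptions; match them to your core-site.xml.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address for a pseudo-distributed setup; match your core-site.xml.
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fs = FileSystem.get(conf);

        // Create a directory and list its parent to confirm HDFS is reachable.
        Path dir = new Path("/user/hadoop/test");   // hypothetical path
        fs.mkdirs(dir);
        for (FileStatus status : fs.listStatus(dir.getParent())) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}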
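
For step 3, the classic WordCount job, which exercises every stage: input split, map, shuffle/sort, reduce, output.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map stage: emit (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce stage: sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // combiner: pre-aggregate on the map side
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Run it with two arguments: an HDFS input directory and an output directory that does not exist yet.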
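
For step 4, a custom partitioner. This sketch (the FirstLetterPartitioner name is made up) routes keys to reducers by their first letter instead of the default hash partitioning; it only lands each letter on its own reducer if you also run with a matching number of reduce tasks.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes keys to reducers by their first letter instead of the default
// hash partitioning, so output is grouped alphabetically across reducers.
public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (key.getLength() == 0) {
            return 0;
        }
        char first = Character.toLowerCase(key.toString().charAt(0));
        // Non-letters go to partition 0; 'a'..'z' spread over the available reducers.
        if (first < 'a' || first > 'z') {
            return 0;
        }
        return (first - 'a') % numPartitions;
    }
}
// Wire it into a job with: job.setPartitionerClass(FirstLetterPartitioner.class);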
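
For step 5, a minimal Pig EvalFunc UDF (the ToUpper name is made up). In a Pig script you would REGISTER the jar containing it and then call ToUpper(field) inside a FOREACH ... GENERATE.

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// A trivial EvalFunc that upper-cases its first argument.
public class ToUpper extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().toUpperCase();
    }
}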
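
For step 6, a minimal Hive UDF in the old UDF style (newer Hive versions prefer GenericUDF, but this style is the simpler one to learn first). The Strip name is made up; after adding the jar to the session, declare it with CREATE TEMPORARY FUNCTION before using it in a query.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// A trivial Hive UDF that strips leading/trailing whitespace.
public final class Strip extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim());
    }
}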
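
For step 9, if the database you pick is HBase, a minimal write/read with the HBase Java client. The "users" table and its "info" column family are assumptions; create them first in the HBase shell, and keep hbase-site.xml on the classpath so the client can find your ZooKeeper quorum.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseHello {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the cluster address.
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("users"))) {  // hypothetical table

            // Write one cell: row "row1", column family "info", qualifier "name".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Read it back.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println("name = " + Bytes.toString(name));
        }
    }
}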
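
For step 10, creating a persistent znode and a sequential znode with the ZooKeeper Java client. This assumes a local ZooKeeper on the default port and that /demo does not already exist (re-runs will need cleanup).

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeDemo {
    public static void main(String[] args) throws Exception {
        // Wait for the session to be established before issuing requests.
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // A plain persistent znode holding a small payload.
        zk.create("/demo", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // A sequential znode: ZooKeeper appends a monotonically increasing
        // counter, e.g. /demo/item-0000000000 -- the basis for queues and locks.
        String path = zk.create("/demo/item-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        System.out.println("created " + path);

        zk.close();
    }
}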