
Processing a file without proper line endings using Hadoop MapReduce

 
Satyaprakash Joshii
Ranch Hand
Posts: 200
A Hadoop Mapper reads each line as a value. I processed a file containing username, comments, etc., where each record (username, comments, etc.) was on a separate line. I processed it successfully, extracting the comments and doing the required manipulation.

Now I have to process a file in which each record is not on a separate line, i.e. the line breaks are irregular. Can you advise me how to process this file? Since the Hadoop Mapper reads each line as a value, how should the file be processed when there is no proper end of line?

Thanks.
 
Srinivas Mupparapu
Greenhorn
Posts: 14
Java
MapReduce uses a record reader behind the scenes, which by default reads one line at a time. You can override this behaviour with a custom record reader and take control of what constitutes a record. Look into the org.apache.hadoop.mapred.RecordReader interface. Several implementations of this interface are available out of the box.
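
To make "taking control of what constitutes a record" a bit more concrete, here is a rough plain-Java sketch of the splitting logic such a reader might contain. It is not a Hadoop class and implements no Hadoop interface. The class name DelimitedRecordReader and the "||" record terminator are made up for illustration, and it assumes the records in the file end with some fixed delimiter that contains no line break. In a real job this logic would sit inside a RecordReader implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration only: splits a character stream into records
 * on a fixed delimiter instead of on line breaks. Not part of Hadoop.
 */
final class DelimitedRecordReader {
    private final Reader in;
    private final String delimiter; // assumed record terminator, e.g. "||"

    DelimitedRecordReader(Reader in, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        this.in = in;
        this.delimiter = delimiter;
    }

    /** Returns the next record, or null once the input is exhausted. */
    String next() throws IOException {
        StringBuilder buf = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                continue; // drop carriage returns entirely
            }
            // A line break inside a record is treated as ordinary whitespace.
            buf.append(c == '\n' ? ' ' : (char) c);
            if (endsWith(buf, delimiter)) {
                buf.setLength(buf.length() - delimiter.length());
                return buf.toString().trim();
            }
        }
        // Trailing text without a closing delimiter still counts as a record.
        String tail = buf.toString().trim();
        return tail.isEmpty() ? null : tail;
    }

    private static boolean endsWith(StringBuilder sb, String suffix) {
        int start = sb.length() - suffix.length();
        return start >= 0 && sb.indexOf(suffix, start) == start;
    }

    /** Convenience helper: reads every record from the input into a list. */
    static List<String> readAll(Reader in, String delimiter) {
        DelimitedRecordReader reader = new DelimitedRecordReader(in, delimiter);
        List<String> records = new ArrayList<>();
        try {
            for (String r = reader.next(); r != null; r = reader.next()) {
                records.add(r);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

For example, calling DelimitedRecordReader.readAll(new java.io.StringReader("alice,first line\nsecond line||bob,short||"), "||") yields two records, "alice,first line second line" and "bob,short", regardless of where the line break falls. If your records do end with a fixed string, it may also be worth checking whether your Hadoop version supports the textinputformat.record.delimiter configuration property, which lets the stock line reader split on a custom delimiter without any custom code.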
 