
Hadoop key mismatch

 
Larry Homes
Greenhorn
Posts: 25
Hello,

I hope this is the correct forum for a Hadoop question.

I have a file with a bunch of lines like this:



It continues for all 50 states, then starts over with another word, like politics:30 Virginia ... etc.

I want to do a distributed sort on this using MapReduce. I know MapReduce sorts between the map and reduce stages, so I just want to emit from map, then from reduce, without any processing, but it is not working. Here are my map and reduce functions:



Here is my main class:



And here is the InputFormat class I wrote, since FileInputFormat would always fail:



Here is the error:




Thanks
 
Larry Homes
Greenhorn
Posts: 25
Thought I would post the solution I found. It was an incredibly dumb error on my part. In my main class, I named the Job instance sort, but when setting the map output key and value classes and the output key and value classes, I used the identifier job. That identifier belonged to a previous MapReduce job in the chain; I had copied and pasted the code without remembering to change the job identifier. As a result, sort never received those settings.
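For anyone who hits the same thing, here is a rough sketch of why the mix-up shows up as a key mismatch. It does not use Hadoop at all: FakeJob, collect and tryEmit are hypothetical stand-ins, the Long default is only an assumption standing in for Hadoop's fallback key class, and the error text is made up for illustration. The point is just that each job object carries its own settings, so a setter called on the wrong object leaves the job that actually runs on its defaults.

```java
// Simplified model of two chained jobs. This is NOT Hadoop's Job class;
// every name here is hypothetical and exists only for illustration.
class JobMixupSketch {

    static final class FakeJob {
        final String name;
        // Assumed default key type, standing in for the framework's fallback.
        Class<?> mapOutputKeyClass = Long.class;

        FakeJob(String name) {
            this.name = name;
        }

        void setMapOutputKeyClass(Class<?> cls) {
            this.mapOutputKeyClass = cls;
        }

        // Mimics a framework check performed when the mapper emits a key.
        void collect(Object key) {
            if (key.getClass() != mapOutputKeyClass) {
                throw new IllegalStateException(
                        "key type mismatch in job '" + name + "': expected "
                        + mapOutputKeyClass.getName() + ", got "
                        + key.getClass().getName());
            }
        }
    }

    // Returns null if the key is accepted, otherwise the error message.
    static String tryEmit(FakeJob job, Object key) {
        try {
            job.collect(key);
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob("earlier step");
        FakeJob sort = new FakeJob("sort step");

        // Bug: the copied setup line still targets the earlier job,
        // so 'sort' keeps its default key class.
        job.setMapOutputKeyClass(String.class);
        System.out.println("before fix: " + tryEmit(sort, "politics"));

        // Fix: configure the job that is actually going to run.
        sort.setMapOutputKeyClass(String.class);
        System.out.println("after fix: " + tryEmit(sort, "politics"));
    }
}
```

In a real driver, the equivalent check is to make sure every setter in the setup block is called on the same Job variable that gets submitted.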
 