


Why are there separate slots for map and reduce tasks?

Avinash Ga
Ranch Hand

Joined: Aug 13, 2011
Posts: 78

As per my (limited) knowledge of the MapReduce algorithm, I believe that within a job, reduce tasks start running only after all the map tasks (and the combiner tasks, if there are any) have finished. If there is no chance of a reduce task running while map tasks are still pending (correct me if I am wrong), why does the TaskTracker have separate, configurable slots for map and reduce tasks?

I have read that before starting a map task, a TaskTracker looks for a free map slot. If it finds one, it allocates that slot to the map task; if no map slots are free, it allocates one from the reduce slots. I just want to know why Hadoop has a configuration like this. Is the setting per job (which makes no sense to me, since reduce tasks cannot start before all map tasks complete; again, correct me if I am wrong) or per system (a system running many jobs, which makes some sense)?
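To make the question concrete, here is a toy model of a single TaskTracker with two fixed slot pools. It is only a sketch and not Hadoop code: the class `TaskTrackerSlots` and its methods are hypothetical names invented for illustration. In classic (MRv1) Hadoop the pool sizes correspond to the per-TaskTracker properties `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`, which are not mentioned in the post above. The sketch models only the separate pools; it does not model a map task falling back to a reduce slot.

```java
public class TaskTrackerSlots {
    enum TaskType { MAP, REDUCE }

    // Pool sizes are fixed per node in this toy model (hypothetical, not Hadoop's classes).
    private final int maxMapSlots;
    private final int maxReduceSlots;
    private int usedMapSlots;
    private int usedReduceSlots;

    TaskTrackerSlots(int maxMapSlots, int maxReduceSlots) {
        this.maxMapSlots = maxMapSlots;
        this.maxReduceSlots = maxReduceSlots;
    }

    // Try to occupy one slot of the requested type; returns false if that pool is full.
    boolean tryAcquire(TaskType type) {
        if (type == TaskType.MAP) {
            if (usedMapSlots == maxMapSlots) return false;
            usedMapSlots++;
            return true;
        }
        if (usedReduceSlots == maxReduceSlots) return false;
        usedReduceSlots++;
        return true;
    }

    // Free one slot of the given type when a task finishes.
    void release(TaskType type) {
        if (type == TaskType.MAP && usedMapSlots > 0) {
            usedMapSlots--;
        } else if (type == TaskType.REDUCE && usedReduceSlots > 0) {
            usedReduceSlots--;
        }
    }

    int freeSlots(TaskType type) {
        return type == TaskType.MAP
                ? maxMapSlots - usedMapSlots
                : maxReduceSlots - usedReduceSlots;
    }

    public static void main(String[] args) {
        TaskTrackerSlots tracker = new TaskTrackerSlots(2, 2);

        // Job A is still in its map phase and fills both map slots.
        tracker.tryAcquire(TaskType.MAP);
        tracker.tryAcquire(TaskType.MAP);
        boolean thirdMap = tracker.tryAcquire(TaskType.MAP);

        // Job B (a different job on the same node) is already reducing.
        boolean reduceOfJobB = tracker.tryAcquire(TaskType.REDUCE);

        System.out.println("third map task got a slot: " + thirdMap);      // false
        System.out.println("job B reduce got a slot: " + reduceOfJobB);    // true
        System.out.println("free reduce slots: "
                + tracker.freeSlots(TaskType.REDUCE));                     // 1
    }
}
```

In this model the pools belong to the node, not to a job, so a reduce task from one job can run while another job still has map tasks pending. That matches the "per system" reading in the question, under the assumptions stated above.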


Avinash G.A
OCP Java SE 6 Programmer, OCP Java EE 5 Web Component Developer, OCE Java EE 6 Web Services Developer, VMware Certified Core Spring 3.x Developer, EMC Proven Professional (ISM-V2)
 