MapReduce vs Distributed Task Queue

 
Zaharie Sergiu
Greenhorn
Posts: 1
Hello all,

Can someone explain the difference between these two concepts? When is it better to build a distributed system with MapReduce (Hadoop), and when with a distributed task queue (Celery)?
With respect to performance, load balancing, big data, scalability, reliability, availability, and efficiency, what are the advantages and drawbacks of each?

I am currently in the research phase of a project that consists of building a web-based distributed system. I have an initial text-mining application that I need to decompose so that I can integrate it with one of these two frameworks and make it distributed.


Thank you!
 
Mark Spritzler
ranger
Sheriff
Posts: 17278
OK, here is an answer that isn't really direct.

The answer:

It depends.

It depends on the task or process you are running. Some tasks that you want to run distributed fit well into MapReduce, and some don't. The typical Hadoop example of reading many files and counting words is a great fit for MapReduce. Building the index behind a search engine like Google is another. Handling events via messaging and processing the data as it arrives doesn't need MapReduce; distributed tasks would be a better fit there.
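To make the contrast concrete, here is a minimal sketch of the two shapes. It uses only the Python standard library, not Hadoop or Celery themselves, and every function name in it is hypothetical. The first half assumes the whole data set is available up front (word count); the second half assumes independent messages that workers pick up one at a time.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import queue
import threading


# --- MapReduce shape: the full input is known before the job starts ---

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    with ThreadPoolExecutor() as pool:
        # Each document is mapped independently of the others.
        mapped = list(pool.map(map_phase, documents))
    pairs = [pair for result in mapped for pair in result]
    return reduce_phase(shuffle(pairs))


# --- Task queue shape: independent messages, handled as they arrive ---

def worker(tasks, results):
    """Process messages one at a time until a None sentinel arrives."""
    while True:
        message = tasks.get()
        if message is None:
            break
        results.put((message["id"], len(message["text"].split())))


def process_events(messages, num_workers=2):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for message in messages:  # in a real system these would arrive over time
        tasks.put(message)
    for _ in threads:         # one sentinel per worker to shut it down
        tasks.put(None)
    for thread in threads:
        thread.join()
    return dict(results.get() for _ in range(len(messages)))
```

In the first half, no result exists until every mapper has finished and the reduce step has run over the grouped data. In the second half, each message produces its own result and no task depends on any other, which is the situation a task queue is designed for.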

It depends on the particular use case.

Mark
 