MapReduce vs Distributed Task Queue

 
Greenhorn
Posts: 1
Hello all,

Can someone explain the difference between these two concepts? When is it better to build a distributed system with MapReduce (Hadoop), and when with a distributed task queue (Celery)?
With respect to performance, load balancing, big data, scalability, reliability, availability, and efficiency, what are the advantages and drawbacks of each?

I am currently in the research phase of a project that consists of building a web-based distributed system. I have an initial piece of text-mining software that I need to decompose so that I can integrate it with one of these two frameworks and make it distributed.


Thank you!
 
ranger
Posts: 17347
OK, here is an answer that isn't really direct.

The answer:

It depends.

It depends on the task or process you are running. Some tasks that you want to run distributed fit well into MapReduce, and some don't. The typical Hadoop example of reading many files and counting words is a great fit for MapReduce. Getting results for a search like Google's is another great example for MapReduce. Handling events via messaging and processing the data as it arrives doesn't need MapReduce; distributed tasks would be better there.
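To make the two shapes concrete, here is a rough single-machine sketch in plain Python. It uses neither Hadoop nor Celery; the function names (map_count, reduce_counts, word_count, worker, process_events) are made up for illustration, and threads stand in for cluster nodes. The first half is the word-count case: a fixed batch of documents, mapped independently and then merged. The second half is the event case: independent jobs are put on a queue and workers pull them one at a time.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import queue
import threading

# MapReduce shape: a known batch of inputs, mapped in parallel, then merged.

def map_count(text):
    """Map step: count the words in one document."""
    return Counter(text.lower().split())

def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right

def word_count(documents):
    # Threads stand in for the nodes a real cluster would use.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(map_count, documents))
    return reduce(reduce_counts, partials, Counter())

# Task-queue shape: independent jobs arrive, workers pull and handle each one.

def worker(tasks, results):
    while True:
        event = tasks.get()
        if event is None:  # sentinel: no more work for this worker
            break
        # Hypothetical "processing": count the words in the event text.
        results.put((event, len(event.split())))

def process_events(events, n_workers=2):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for event in events:
        tasks.put(event)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    handled = {}
    while not results.empty():
        event, size = results.get()
        handled[event] = size
    return handled
```

Note that word_count only makes sense once every partial count is in, while each event in process_events is finished on its own with no merge step. That difference is usually what decides which model fits.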

It depends on the particular use case.

Mark
 