candidate for map reduce

 
Mohit Sinha
Ranch Hand
Posts: 125
Hi

I'd like your thoughts on whether the following is a good candidate for MapReduce. I currently have a standalone Java application. Its target users will invoke it by supplying input details through a front end that our client will develop; our part was only to develop the Java app.
The app interacts with the database through Hibernate, an object-relational mapping framework, and then performs some mathematical calculations.
There is now a new requirement for a second invocation mechanism: batch jobs. A file will contain 1000 or more invocations, and we have to execute the application once for each of them. With the current batch process, the invocations would run sequentially.
I recently read about Hadoop, an open source implementation of the MapReduce model. As I understand it, it divides the work into chunks of a predefined size and then runs tasks in parallel to process those chunks.

Do you think MapReduce-style execution could speed up the batch job? First of all, I'd like to know whether MapReduce is a feasible solution for this sort of problem.

Mohit
 
Ulf Dittmer
Rancher
Posts: 43081
It sounds like the individual invocations are independent of each other; is that the case? If so, Hadoop (or MapReduce) wouldn't add anything. You could just schedule the invocations on different machines and be done with it.
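To illustrate the "different machines" idea, here is a minimal sketch that deals the lines of the batch file out into one shard per machine. The class name `BatchSplitter`, the method `split`, and the sample job names are hypothetical, and how each shard actually reaches its machine (separate files, a scheduler, and so on) is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSplitter {

    // Deals the invocation lines out round-robin into one shard per machine.
    // Because the invocations are independent, any assignment works;
    // round-robin simply keeps the shard sizes within one of each other.
    static List<List<String>> split(List<String> lines, int machines) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < machines; i++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < lines.size(); i++) {
            shards.get(i % machines).add(lines.get(i));
        }
        return shards;
    }

    public static void main(String[] args) {
        // Hypothetical batch lines; the real ones would be read from the batch file.
        List<String> lines = Arrays.asList("job-a", "job-b", "job-c", "job-d", "job-e");
        System.out.println(split(lines, 2));
    }
}
```

Each machine would then run the existing sequential batch process over its own shard.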
 
Mohit Sinha
Ranch Hand
Posts: 125
Yes, your understanding is correct. Each task in the batch job is distinct, and the tasks are not interdependent. You suggested running the job across different machines, but if multiple machines are not available, can we achieve the same thing using Java threads?
Any insight on this would be helpful.
 
Ulf Dittmer
Rancher
Posts: 43081
You may see a speedup by using multiple threads or you may not; it depends a great deal on the problem at hand. If it's pure computation (with little or no I/O interspersed), it's unlikely to become noticeably faster through multithreading unless the machine has several cores to run the threads on.
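For trying this out on a single machine, here is a minimal sketch using a fixed-size thread pool from `java.util.concurrent`. The class `BatchRunner` and the method `runInvocation` are hypothetical stand-ins for the existing application, and the pool size is only a starting assumption to be tuned by measurement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchRunner {

    // Hypothetical stand-in for one invocation of the existing application.
    // The real version would do the Hibernate queries and the calculations.
    static long runInvocation(String inputLine) {
        return inputLine.length();
    }

    // Runs every line of the batch on a fixed-size pool.
    // invokeAll blocks until all tasks finish and returns the futures
    // in the same order as the submitted tasks.
    static List<Long> runBatch(List<String> lines, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (String line : lines) {
                tasks.add(() -> runInvocation(line));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // rethrows a task failure as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical batch lines; the real ones would be read from the batch file.
        List<String> lines = Arrays.asList("job-1", "job-22", "job-333");
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(runBatch(lines, threads));
    }
}
```

Timing the batch with a pool of 1 and then with larger pools would show whether threads help for this particular workload. Note that a Hibernate `Session` is not thread-safe, so each task would need to open its own session from the shared `SessionFactory`.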
 