candidate for map reduce

 
Mohit Sinha
Ranch Hand
Posts: 125
Hi

I wanted to get your thoughts on this. Recently I was looking at the open source map-reduce framework Hadoop. We currently have a standalone Java application. The target users of this app invoke it, providing input details through some front-end mechanism that our client will develop; our part is just to develop the Java app.
Our Java app basically does some mathematical calculations after interacting with the database through the OR mapping framework Hibernate.
Now there is a new requirement: our application will have another invocation mechanism (batch jobs). There will be a file containing 1000 or more invocations, and we have to execute the above-mentioned application that many times. With the current batch process, the application would be called sequentially, one invocation after another.
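To give an idea of the current setup, the sequential batch driver would look roughly like this (class and method names below are just placeholders for illustration, not our actual code):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class SequentialBatchRunner {

    public static void main(String[] args) throws IOException {
        // One invocation per line of the batch file
        try (BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = reader.readLine()) != null) {
                runCalculation(line); // one full run of the existing app per line
            }
        }
    }

    // Stand-in for the existing Hibernate-backed calculation
    static void runCalculation(String input) {
        System.out.println("Processing " + input);
    }
}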
As I mentioned, Hadoop is a map-reduce style open source implementation; it basically divides the work into chunks of a pre-defined size and then spawns workers to execute those chunks.

Do you think a map-reduce style of execution can speed up the batch job? First, I would like to know whether map reduce is even a feasible solution for this sort of problem.

Mohit
 
Ulf Dittmer
Rancher
Posts: 43016
76
It sounds like the individual invocations are independent of each other; is that the case? If so, Hadoop (or MapReduce) wouldn't add anything. You could just schedule the invocations on different machines and be done with it.
 
Mohit Sinha
Ranch Hand
Posts: 125
Yes, your understanding is correct. Each task in the batch job is distinct, and the tasks are not interdependent. You suggested running the job across different machines, but if multiple machines are not available, can we achieve the same thing using Java threading?
Any insight would be helpful.
 
Ulf Dittmer
Rancher
Posts: 43016
76
You may see a speedup by using multiple threads, or you may not; it depends a great deal on the problem at hand. If it's pure computation, with little or no I/O interspersed, it's unlikely to become noticeably faster through multithreading.
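If the individual runs do spend a fair amount of time waiting on the database, the simplest thing to try before reaching for Hadoop is a plain thread pool. A rough sketch, assuming each line of the batch file is one independent invocation (the pool size and the runCalculation stand-in are placeholders, not a definitive implementation):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelBatchRunner {

    public static void main(String[] args) throws IOException, InterruptedException {
        // One invocation per line of the batch file
        List<String> invocations = Files.readAllLines(Paths.get(args[0]));

        // Pool size is a tuning knob: for database-bound tasks it can exceed the
        // number of cores; for pure computation it usually should not.
        ExecutorService pool = Executors.newFixedThreadPool(8);

        for (final String input : invocations) {
            pool.submit(new Runnable() {
                public void run() {
                    runCalculation(input); // each task is independent of the others
                }
            });
        }

        pool.shutdown();                          // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.HOURS); // wait for the queue to drain
    }

    // Stand-in for the existing Hibernate-backed calculation
    static void runCalculation(String input) {
        System.out.println("Processing " + input);
    }
}

One caveat since Hibernate is involved: the SessionFactory is thread-safe and should be shared across threads, but each task needs to open and close its own Session.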
 