
What are the steps to convert a java program into multithreaded program?

 
Monica Shiralkar
Ranch Hand
Posts: 922
2
What are the steps to convert a Java program into a multithreaded program? I have a Java program. Since it is taking a long time to execute, I need to convert it into a multithreaded program. What steps should I follow?

thanks
 
Campbell Ritchie
Marshal
Posts: 56584
172
Start with a tutorial.

If the program is running slowly, threading might not make it run faster. Please supply more details.
 
fred rosenberger
lowercase baba
Bartender
Posts: 12565
49
Chrome Java Linux
Step 1 - Document what the performance requirements are. Be precise. Having a vague goal of "as fast as possible" is pointless.
Step 2 - Profile your application and figure out where it is ACTUALLY slow.
Step 3 - Determine what would be the best course of action. It may be threads. It may be a better search algorithm. It may be a better database schema.
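
For Step 2, a real profiler (VisualVM, Java Flight Recorder, or similar) is the better tool, but a rough first look can come from timing the phases by hand. The sketch below assumes a program that reads every file in a folder and then processes its contents. The class name `PhaseTimer` and the `extractInfo` placeholder are hypothetical, not taken from anyone's actual code. It keeps separate totals for time spent reading and time spent processing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Rough phase timer (hypothetical names): separates time spent reading
// files from time spent processing their contents.
class PhaseTimer {
    long readNanos;     // total time spent in file reads
    long processNanos;  // total time spent in extractInfo
    long filesSeen;     // number of regular files handled
    long checksum;      // keeps the processing result "used" so it is not optimized away

    // Placeholder for the real per-file work; here it just counts newlines.
    static int extractInfo(byte[] content) {
        int lines = 0;
        for (byte b : content) {
            if (b == '\n') {
                lines++;
            }
        }
        return lines;
    }

    void run(Path folder) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                long t0 = System.nanoTime();
                byte[] content = Files.readAllBytes(file);
                long t1 = System.nanoTime();
                checksum += extractInfo(content);
                long t2 = System.nanoTime();
                readNanos += t1 - t0;
                processNanos += t2 - t1;
                filesSeen++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        PhaseTimer timer = new PhaseTimer();
        timer.run(Paths.get(args.length > 0 ? args[0] : "."));
        System.out.printf("files=%d read=%d ms process=%d ms (checksum %d)%n",
                timer.filesSeen, timer.readNanos / 1_000_000,
                timer.processNanos / 1_000_000, timer.checksum);
    }
}
```

If nearly all of the time lands in the read total, the job is probably limited by the disk rather than the CPU, and that points toward Step 3 options other than threads.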

The point is, you don't know right now if threads are the right thing to do. Or at least, WE don't know.

Heck, without the performance specifications, you can't really say there even IS a problem.

 
Monica Shiralkar
Ranch Hand
Posts: 922
2
It reads lakhs (hundreds of thousands) of files from a folder, extracts some information from each, and writes it to a file. I have the program for it. It is reading records one by one and taking hours to complete. The acceptable time is much less, certainly not hours. What can be done?

thanks
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
About the only way to improve a single processor service would be:

1. A Thread to read each file into a custom object containing the file source information plus the byte[] array contents. Each object is placed in a Queue, and when the designed size of the queue is reached, this Thread waits. See ArrayBlockingQueue in the java.util.concurrent package.

2. A Thread to take the next object from the queue and parse out the results.
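
A minimal sketch of that two-thread arrangement, under some assumptions: the names `ReadParsePipeline`, `LoadedFile` and `parse` are made up for illustration, the "parse" step is a trivial placeholder, and the calling thread plays the role of the parsing thread. A sentinel object marks the end of the input so the parser knows when to stop.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ReadParsePipeline {
    // Custom object: where the bytes came from, plus the bytes themselves.
    static final class LoadedFile {
        final Path source;
        final byte[] content;

        LoadedFile(Path source, byte[] content) {
            this.source = source;
            this.content = content;
        }
    }

    // Sentinel placed on the queue after the last file (compared by identity).
    static final LoadedFile END = new LoadedFile(null, new byte[0]);

    // Placeholder for the real parsing logic.
    static String parse(LoadedFile file) {
        return file.source.getFileName() + ":" + file.content.length;
    }

    static List<String> run(List<Path> files, int queueCapacity) throws InterruptedException {
        BlockingQueue<LoadedFile> queue = new ArrayBlockingQueue<>(queueCapacity);

        Thread reader = new Thread(() -> {
            try {
                try {
                    for (Path p : files) {
                        // put() blocks while the queue is full, so the reader
                        // never gets more than queueCapacity files ahead.
                        queue.put(new LoadedFile(p, Files.readAllBytes(p)));
                    }
                } catch (IOException e) {
                    System.err.println("Read failed, stopping early: " + e);
                }
                queue.put(END); // signal completion even after a read failure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();

        // The calling thread acts as the parser.
        List<String> results = new ArrayList<>();
        for (LoadedFile f = queue.take(); f != END; f = queue.take()) {
            results.add(parse(f));
        }
        reader.join();
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<Path> files = new ArrayList<>();
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> dir = Files.newDirectoryStream(folder)) {
            for (Path p : dir) {
                if (Files.isRegularFile(p)) {
                    files.add(p);
                }
            }
        }
        run(files, 16).forEach(System.out::println);
    }
}
```

With one reader and one parser, results come out in the same order the files went in. The queue capacity (16 in `main`) is an arbitrary choice that caps how many file contents sit in memory at once.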

If multiple processors / multiple machines are available, perhaps you are in Hadoop territory or some other "grid" computing area.

Bill
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
Monica. Shiralkar wrote: It reads lakhs (hundreds of thousands) of files from a folder, extracts some information from each, and writes it to a file. I have the program for it. It is reading records one by one and taking hours to complete. The acceptable time is much less, certainly not hours. What can be done?


Have you tried getting faster disks? If the application is pinning the disks at 100% utilization, adding threads won't make the disks work faster.

Henry
 
Steve Luke
Bartender
Posts: 4181
22
IntelliJ IDE Java Python
To extend fred's steps:

Determining if the best course of action includes threads:
Step 3a: Determine which parts of the application can run in parallel to one another and which parts must be sequential
Step 3b: Determine synchronization points: those points where parallel tasks must be synchronized to protect data integrity
Step 3c: Compare the steps determined above with the parts that were determined to be actually slow
Step 3d: If step 3c determines the slow parts are parts that can be parallelized, then proceed, otherwise, stop considering threading

If you determine that you have a task that can be parallelized and that it is at least part of the application's bottleneck, then it is time to write up some test code to see if the task will actually benefit from threading. Isolate the operation into a standalone application, run it sequentially (current state), and time it. Break the sequential steps into logical units to be run as independent tasks. Then create a Thread Pool and put the tasks in the pool. Start with 1 thread in the pool, then slowly increase the number of threads, measuring performance at each step to see if you get a performance increase.
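
That experiment could be sketched roughly as follows. Everything here is an assumption for illustration: `PoolSizeExperiment` and `unitOfWork` are hypothetical names, and the CPU-only loop stands in for whatever the real logical unit of work is. For a meaningful result, replace it with the isolated operation and run against realistic data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizeExperiment {
    // Hypothetical stand-in for one independent unit of work.
    static long unitOfWork(int seed) {
        long acc = seed;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Runs taskCount units on a fixed pool and returns a combined result,
    // so every pool size can be checked for producing the same answer.
    static long runWithPool(int threads, int taskCount)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int seed = i;
                tasks.add(() -> unitOfWork(seed));
            }
            long sum = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                sum += result.get();
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int taskCount = 2_000;
        int maxThreads = Runtime.getRuntime().availableProcessors();
        // Start with one thread and step up, timing each pool size.
        for (int threads = 1; threads <= maxThreads; threads++) {
            long start = System.nanoTime();
            long sum = runWithPool(threads, taskCount);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + millis + " ms (checksum " + sum + ")");
        }
    }
}
```

The first iteration or two includes JVM warm-up, so it is worth repeating the whole loop a few times before trusting the numbers.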

With file manipulation you often won't, because of the limitations of file systems and disk access. But if you are using a multi-headed RAID, then maybe you will see a performance gain.

One thing I would suggest is to consider the sequential consistency of the data that you need to write to the output file. You probably only want one thread writing to the output file, and you probably want to ensure that the data is written in consistent chunks. So that probably means a dedicated queue feeding output to the writing thread.
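
A rough sketch of the dedicated-writer idea, with hypothetical names (`SingleWriter`, `outbox`) and made-up record text: several worker tasks each hand over complete records through a queue, and only one thread ever touches the output file. The queue is left unbounded here for simplicity, which is an assumption; a bounded queue would cap memory use but needs more careful error handling.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SingleWriter {
    // Unique sentinel object, compared by identity, that tells the writer to stop.
    static final String END = new String("END");

    static void run(Path output, int workerCount, int recordsPerWorker)
            throws InterruptedException {
        BlockingQueue<String> outbox = new LinkedBlockingQueue<>();

        // The only thread that writes to the output file.
        Thread writer = new Thread(() -> {
            try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
                for (String chunk = outbox.take(); chunk != END; chunk = outbox.take()) {
                    out.write(chunk); // each chunk is one complete record
                }
            } catch (IOException e) {
                System.err.println("Write failed: " + e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            final int id = w;
            workers.submit(() -> {
                for (int r = 0; r < recordsPerWorker; r++) {
                    // Build the whole record first, then hand it over in one put,
                    // so records from different workers never interleave mid-line.
                    outbox.put("worker " + id + " record " + r + System.lineSeparator());
                }
                return null; // makes this a Callable, so put() may throw
            });
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);

        outbox.put(END); // all workers are done, let the writer finish
        writer.join();
    }

    public static void main(String[] args) throws Exception {
        Path output = Files.createTempFile("single-writer", ".txt");
        run(output, 4, 10);
        System.out.println("Wrote " + Files.readAllLines(output, StandardCharsets.UTF_8).size()
                + " lines to " + output);
    }
}
```

Records from one worker stay in that worker's order, but the interleaving between workers depends on scheduling. If the output must follow a global order, the records would need a sequence number or a sorting step on top of this.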
 
Ulf Dittmer
Rancher
Posts: 42972
73
And to add to what Henry said: adding more threads that access the same disk more likely makes the program slower instead of faster, depending on how many heads the disk has.
 
Monica Shiralkar
Ranch Hand
Posts: 922
2
thanks all.

I am thinking of reading the small files and putting their data into one big file so that it takes less time. But that will still reduce the time only a little bit. Can I introduce multithreading after that, such that the first thread reads the first so many lines of the file, the next thread reads the next so many lines, and so on?
OR
Would it be better to have more than one file, with each thread reading one file? Is that feasible, or is the first option above feasible?
 
Joanne Neal
Rancher
Posts: 3742
16
I don't wish to be too blunt, but - have you actually read any of the replies so far ?
The basic gist of all of them is - we don't know if threads are the solution to your problem. You need to analyse your program in the various ways described and then decide if multithreading is the way forward or if one of the other suggested options would be better.
 
fred rosenberger
lowercase baba
Bartender
Posts: 12565
49
Chrome Java Linux
Threads only help you if you can be doing more than one thing at a time. For example, one thread is reading from disk while another is sending data out on a port.

If the real issue is that your disk is maxed out, threads will not help. Here's why:

single thread - disk is maxed out. You are getting data as fast as is possible.

multiple threads - disk is still maxed out, but now the program has to spend time managing threads. Further, suppose thread 'A' gets half its data, and then thread 'B' gets its turn at the disk. Now you have to wait for the head to move and the disk to rotate to the right spot to start B's read. It gets part of its data, then 'C' gets its turn, then you have to go back for 'A', and so on. So instead of getting the data as fast as possible, you are now spending a lot more time waiting for the disk head to position for each read.

Threads are not a panacea that can fix any/all slowness issues.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
Standing between your file read calls and physical movement of read heads on a hard disk we find:

1. Operating system buffers
2. Hard disk buffers - large amounts of memory and intelligent processors - see this Wikipedia article for example.

Thus the generalized comments you see above may or may not apply to your situation. The only solution is to experiment with realistic data.

Bill
 
Satyaprakash Joshii
Ranch Hand
Posts: 205
OK, not talking about threading, but I am thinking of reading the small files and putting the data of these files in one big file so that it takes less time (as it will not involve multiple reads and writes). Will this help?
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
Monica. Shiralkar wrote: I have a Java program. Since it is taking a long time to execute, I need to convert it into a multithreaded program.

Just to add to the weight of good advice you've already got - what on earth led you to the above conclusion?

Personally, multi-threading is one of the last things I would consider when trying to speed up a system, because:
(a) It's not simple.
(b) It's easy to get wrong.
(c) It's difficult to test.
(d) The simple act of synchronization tends to slow code down.
and I certainly wouldn't advocate it until I could demonstrate that it actually solved the problem.

That's not to say that you should never use it; but you need to be absolutely sure that it's what you want before you do.

So:
1. Why do you think that threading is the solution? - And be specific.
2. Is threading the only solution, or are there other alternatives?

HIH

Winston
 
Joanne Neal
Rancher
Posts: 3742
16
Satyaprakash Joshii wrote: OK, not talking about threading, but I am thinking of reading the small files and putting the data of these files in one big file so that it takes less time (as it will not involve multiple reads and writes). Will this help?

Are you working with Monica on the same project or are you hijacking this thread to ask a similar question about your own project ?
 
Campbell Ritchie
Marshal
Posts: 56584
172
Monica. Shiralkar wrote: . . . Can I introduce multithreading after that, such that the first thread reads the first so many lines of the file, the next thread reads the next so many lines, and so on? . . .
That looks like a recipe for disaster to me. How are you going to ensure all the lines are written to the large file in the correct order?

If you have performance problems, the file reading and writing is likely to be much slower than any processing in RAM. I suspect multi‑threaded multiple writing may actually be slower than single‑threading.