
ForkJoinPool exits without processing all elements in the parallel stream

 
Jose Marcano
Greenhorn
Posts: 5
I am executing a parallel stream over n elements; however, the ForkJoinPool finishes before all elements of my collection 'records' have been processed by the parallel stream.

Below my code:

Am I missing something about parallel streams? I was expecting the ForkJoinPool to shut down only after all elements in forEach(... are executed.
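The original snippet did not survive, so here is a minimal sketch of the pattern under discussion, not Jose's actual code; the names (`records`, `processAll`) and the pool size are assumptions. The final `join()` is what makes the main thread wait for all elements:

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelRecords {

    // Runs a parallel stream over 'size' record ids inside a custom
    // ForkJoinPool and returns how many ids were actually handled.
    static int processAll(int size) {
        List<Integer> records = IntStream.range(0, size).boxed().collect(Collectors.toList());
        AtomicInteger processed = new AtomicInteger();
        ForkJoinPool pool = new ForkJoinPool(4);
        // join() blocks until every element has been handled; without it,
        // main can return while the pool's daemon workers are still busy,
        // and the JVM exit cuts them off mid-stream.
        pool.submit(() -> records.parallelStream().forEach(id -> processed.incrementAndGet())).join();
        pool.shutdown();
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(processAll(2000)); // prints 2000
    }
}
```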

 
Liutauras Vilda
Marshal
Posts: 3829
Hi Jose Marcano,

Welcome, nice to see you at JavaRanch.

One piece of advice to remember: it should help you get quicker and better answers if you UseCodeTags (<- link) when you post your code, even if you post only a few lines.
The main reason to use them is that they make your code readable. See how much better it looks? I have added them for you this time.
 
Piet Souris
Rancher
Posts: 1783
hi Jose,

What follows is what I think is the case. I could be wrong, though; I don't
work with ForkJoinPools on a regular basis. In that case, some expert will
correct me.

Why do you think that the ForkJoinPool "closes" after all n elements of 'records' are handled?
As far as the Pool is concerned, it executes the Runnable that you supply.
The fact that that Runnable issues a parallel stream has nothing to do with the
Pool. If you want the Pool to do the parallel work, then you should send it
an appropriate ForkJoinTask. There is a very nice tutorial on this at Oracle, about
blurring an image.
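Piet's point about sending the pool an appropriate ForkJoinTask can be sketched with a small RecursiveAction in the style of Oracle's ForkBlur tutorial; the task, the threshold, and the squaring work here are illustrative assumptions:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class SquareAll {

    // A divide-and-conquer task: squares every element of the array in place.
    static class SquareTask extends RecursiveAction {
        static final int THRESHOLD = 1_000;
        final long[] data;
        final int from, to;

        SquareTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                for (int i = from; i < to; i++) {
                    data[i] = data[i] * data[i];
                }
            } else {
                int mid = (from + to) >>> 1;
                // invokeAll forks both halves and waits for them to finish
                invokeAll(new SquareTask(data, from, mid), new SquareTask(data, mid, to));
            }
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        // invoke() blocks until the whole task tree has completed,
        // so the pool really has done all the parallel work by this point.
        new ForkJoinPool().invoke(new SquareTask(data, 0, data.length));
        System.out.println(data[3]); // prints 9
    }
}
```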

Do you see all messages appearing?

Greetz,
Piet
 
Jose Marcano
Greenhorn
Posts: 5
Thanks Liutauras, I am new to the forum.

Thanks for your answer Piet, I will give you more information about my problem.

The collection has 2000 records; however, I see only 1900 ids processed in the logs. When I monitor the threads, the thread pool has completed, even though I still have more ids to process in my collection.

Another thing to consider: if my collection has 1000 records or fewer, the application works properly and all records are processed; however, beyond 1000 elements I see this behavior, as if it never finishes iterating over all the elements in the collection.

My service reads from the database using a connection pool of size 100, and writes a blob to the file system.
 
Stephan van Hulst
Saloon Keeper
Posts: 6984
Can you give us an SSCCE showing your problem?
 
Piet Souris
Rancher
Posts: 1783
I made myself an SSCCE, and I must say: Jose is right, something funny IS taking place.
My careful conclusion is that you should give the ForkJoinPool
enough time to do its job, and that holds for the Runnables too!
Here goes a very lengthy experiment.

My first attempt:
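The code block for this attempt was not preserved. A rough sketch of a first attempt that behaves like the output below, with hypothetical runnables K(0) and K(1) each walking a 10,000-element list, might look like this (the exact interleaving of the output varies per run):

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Experiment {

    // K(id): walks a 10,000-element list, reporting every 1,000 elements.
    static Runnable k(int id) {
        return () -> {
            System.out.println("Starting K(" + id + ")");
            List<Integer> list = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
            System.out.println("list length: " + list.size());
            for (int t = 0; t < list.size(); t++) {
                if (t % 1_000 == 0) System.out.println("busy with T(" + t + ")");
                // a tiny delay per element, so the work takes real time
                try { Thread.sleep(1); } catch (InterruptedException e) { break; }
            }
            System.out.println("Runnable K(" + id + ") finished");
        };
    }

    public static void main(String[] args) {
        ForkJoinPool fjp = new ForkJoinPool();
        System.out.println("Starting first job for fjp");
        fjp.execute(k(0));
        System.out.println("Starting second job for fjp");
        fjp.execute(k(1));
        System.out.println("fjp finished");
        System.out.println("fjp terminated? " + fjp.isTerminated());
        // main returns here; the pool's workers are daemon threads, so the
        // JVM exits and the two runnables may be cut off mid-list.
    }
}
```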

Output:
run:
Starting first job for fjp
Starting second job for fjp
Starting K(0)
list length: 10000
fjp finished
fjp terminated? false
busy with T(0)
BUILD SUCCESSFUL (total time: 2 seconds)

A second run gives:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
Starting K(0)
list length: 10000
busy with T(0)
fjp terminated? false
Starting K(1)
list length: 10000
busy with T(0)
BUILD SUCCESSFUL (total time: 0 seconds)

Might it have something to do with the fact that the program has finished?
I don't know. So, let's make sure that the program is NOT finished,
and see what happens.

So I made a new constructor:

Output:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
fjp terminated? false
Starting K(1)
list length: 10000
busy with T(0)
Starting K(0)
list length: 10000
busy with T(0)
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(5000)
busy with T(6000)
busy with T(7000)
busy with T(8000)
busy with T(9000)
Runnable K(1) finished
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(5000)
busy with T(6000)
busy with T(7000)
busy with T(8000)
busy with T(9000)
Runnable K(0) finished
BUILD SUCCESSFUL (total time: 1 minute 13 seconds)

That certainly seems to do the trick.

Another constructor:


Output:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
fjp terminated? false
Starting K(0)
list length: 10000
busy with T(0)
Starting K(1)
list length: 10000
busy with T(0)
BUILD SUCCESSFUL (total time: 1 second)

Or:

Output:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
fjp terminated? false
Starting K(1)
Starting K(0)
list length: 10000
list length: 10000
busy with T(0)
busy with T(0)
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(5000)
busy with T(6000)
busy with T(7000)
busy with T(8000)
busy with T(9000)
Runnable K(0) finished
BUILD SUCCESSFUL (total time: 2 seconds)

It just did not finish K(1)!

And finally (almost!):

Output:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
fjp terminated? false
Starting K(0)
list length: 10000
busy with T(0)
Starting K(1)
list length: 10000
busy with T(0)
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(5000)
busy with T(6000)
busy with T(7000)
busy with T(8000)
busy with T(9000)
Runnable K(1) finished
busy with T(1000)
busy with T(2000)
busy with T(3000)
busy with T(4000)
busy with T(5000)
busy with T(6000)
busy with T(7000)
busy with T(8000)
busy with T(9000)
Runnable K(0) finished
BUILD SUCCESSFUL (total time: 2 seconds)

And, finally, a parallel version:

Output:
run:
Starting first job for fjp
Starting second job for fjp
fjp finished
fjp terminated? false
Starting K(1)
list length: 10000
Starting K(0)
list length: 10000
busy with T(3000)
busy with T(4000)
busy with T(2000)
busy with T(9000)
busy with T(1000)
busy with T(0)
busy with T(8000)
busy with T(2000)
busy with T(7000)
busy with T(1000)
busy with T(0)
busy with T(9000)
busy with T(8000)
busy with T(7000)
busy with T(6000)
busy with T(5000)
Runnable K(0) finished
busy with T(5000)
busy with T(6000)
busy with T(4000)
busy with T(3000)
Runnable K(1) finished
BUILD SUCCESSFUL (total time: 2 seconds)


Pfffff.....

Greetz,
Piet
 
Jose Marcano
Greenhorn
Posts: 5
Hi Piet, thanks for your help.

I tried this:

ForkJoinPool, by default, uses daemon threads. In other words, in your code the program can exit before the task you submitted for execution has had time to complete. You'll have to wait/block until the task is complete.

I used a submit to get a ForkJoinTask.
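Jose's actual snippet was not preserved; a sketch of the submit-then-wait pattern he describes might look like the following, where the pool size and the per-id work are assumptions:

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SubmitAndJoin {
    public static void main(String[] args) {
        List<Integer> records = IntStream.range(0, 2_000).boxed().collect(Collectors.toList());
        ForkJoinPool pool = new ForkJoinPool(8);
        // submit(...) returns a ForkJoinTask handle...
        ForkJoinTask<?> task = pool.submit(() ->
                records.parallelStream().forEach(id -> { /* call the service for this id */ }));
        // ...and join() blocks the main thread until the whole stream is done,
        // so the JVM cannot exit while the daemon workers are still busy.
        task.join();
        System.out.println("all " + records.size() + " records processed");
    }
}
```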



But I still have the same problem. I am wondering if my problem is related to thread contention with the connection pool, or to one of the problems mentioned here:
http://zeroturnaround.com/rebellabs/java-parallel-streams-are-bad-for-your-health/

In my ideal world, my whole collection should be processed, so if I have 2000 ids, I should at least see the call to the method for all of them.

So my question is: how do I make the ForkJoinPool wait for all worker threads to process all tasks? And why does this not happen with a smaller number of ids in my collection?

regards,
Jose
 
Jose Marcano
Greenhorn
Posts: 5
I found the issue!!! The problem was a NullPointerException inside my service: the parallel stream stops processing when an error occurs, so I had to catch and/or avoid the NullPointerException in my code to make it work. Now I can execute a big collection in parallel with no ISSUE!!!

In conclusion, if an exception bubbles up to the parallel stream:



it is going to stop the processing of the collection "records".
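That conclusion can be sketched as follows (the records list and the service call are hypothetical): one null element aborts the whole parallel forEach unless the per-element work guards against it:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedStream {

    // Wrapping the per-element work in try/catch keeps one bad record
    // (here a null) from aborting the rest of the parallel stream.
    static int processAll(List<String> records) {
        AtomicInteger ok = new AtomicInteger();
        records.parallelStream().forEach(r -> {
            try {
                r.length();               // throws NullPointerException for a null record
                ok.incrementAndGet();
            } catch (NullPointerException e) {
                // log and skip the bad record instead of letting the
                // exception bubble up and kill the whole stream
            }
        });
        return ok.get();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", null, "c", "d");
        System.out.println(processAll(records)); // prints 3: the null is skipped
    }
}
```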

 
Piet Souris
Rancher
Posts: 1783
hi Jose,

I'm glad that you managed to solve your problem.

An error in a parallel stream: I had never given that a thought.
Interesting!

One question I had, though:

my only experience with an FJP is from the Oracle tutorial.
Is there any particular reason for you to use such an FJP?
As far as I know, and that was what my first reply was about,
the parallel stream issued by the Runnable is independent of
the FJP.

Greetz,
Piet
 