When utilizing all cores, speed of execution decreases significantly. Why?

 
Ranch Hand
Posts: 75
Something doesn't make sense in my code in terms of 'speed of execution'. The computer I'm using is a Power Mac with 16 GB RAM and 16 cores. The ReadTaskThread simply goes to the DB to retrieve information (a SELECT). This returns a list of items, and apparently it takes about the same time to fetch 10 items as 1000 items (give or take 2 seconds).

Problem 1: When I use nProcessors = 2 cores I get the best performance (execution in 1.2 min). When I use nProcessors = 3 cores or more, execution takes 7+ min (the worst).
Problem 2: When the list of items is 1000, processing in the 'consumer' takes some time, and as a result the producer has more time to fetch the next data from the DB. When the list of items is small, the delay is HUGE because the consumer is waiting on the producer to get the data.

Question 1: I was under the impression that the more processors are utilized, the better. Why is this not the case in the scenario I presented?
Question 2: Is newFixedThreadPool the right one to use?
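
Roughly, the shape of the code is this (a simplified sketch, not my exact classes; ReadTask stands in for my ReadTaskThread, and the DB call is faked):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FetchSketch {

    // Stand-in for the real ReadTaskThread: runs a SELECT and returns the rows.
    static class ReadTask implements Callable<List<String>> {
        public List<String> call() {
            return new ArrayList<String>();   // pretend DB result
        }
    }

    public static void main(String[] args) throws Exception {
        int nProcessors = 2;                  // 2 is fast here; 3 or more gets slow
        ExecutorService pool = Executors.newFixedThreadPool(nProcessors);

        List<Future<List<String>>> results = new ArrayList<Future<List<String>>>();
        for (int i = 0; i < 10; i++) {
            results.add(pool.submit(new ReadTask()));
        }
        for (Future<List<String>> f : results) {
            List<String> items = f.get();     // the consumer waits in submission order
            // ... process items ...
        }
        pool.shutdown();
    }
}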

Thank you for any pointers!


 
Ranch Hand
Posts: 67
When reading from disk, your processing power is probably not the bottleneck. To improve performance in this situation, you could add additional disks to allow parallel disk access.
 
author
Posts: 23951

Also, unless you are using an embedded DB, you may want to look at your database server too. Taking four times longer with one extra parallel request seems weird to me -- unless, of course, you are doing more work with the extra request.

Henry
 
Bartender
Posts: 4179
And I also don't think you are consuming the data very efficiently. You are consuming the data in the order in which the tasks were submitted, rather than the order in which they return. Perhaps you should change the way the tasks object is used. For example, make it a BlockingQueue implementation and pass it to both the producer and the consumer. The producer can then put items into the queue directly, and the consumer takes them in the order they become available, which may decrease the time spent waiting on a specific producer.
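
For example, a minimal sketch of that hand-off (Item, fetchBatch(), and process() are stand-ins, not your actual classes):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandOff {

    // Stand-in for one fetched row.
    static class Item {
        final int id;
        Item(int id) { this.id = id; }
    }

    // Poison pill: tells the consumer that the producer is finished.
    static final Item DONE = new Item(-1);

    // Stand-in for the real SELECT; an empty list means no more data.
    static List<Item> fetchBatch() { return new ArrayList<Item>(); }

    // Stand-in for the real per-item processing.
    static void process(Item item) { }

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: the producer blocks when it gets too far ahead,
        // and the consumer blocks only while the queue is actually empty.
        BlockingQueue<Item> queue = new ArrayBlockingQueue<>(100);

        Thread producer = new Thread(() -> {
            try {
                for (List<Item> batch = fetchBatch(); !batch.isEmpty(); batch = fetchBatch()) {
                    for (Item item : batch) {
                        queue.put(item);       // hand over each item as soon as it arrives
                    }
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (Item item = queue.take(); item != DONE; item = queue.take()) {
                    process(item);             // items are consumed in arrival order
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}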

One last thing... you might consider working on the balance between producers and consumers as well. Starting with your current thread pool, turn your consumer into a Runnable which gets executed in the pool (taking one thread away from the producers) and measure the results. Then add a second consumer, and a third, etc., and see if there is some balance which optimizes performance. If the consumer's job is processor-intensive, it could be that having multiple consumers running while each producer waits on the database provides better performance. And you don't necessarily have to limit yourself to 16 threads total: if your DB task is talking to a remote DB and the DB operation takes some time, then the producers aren't using the processors while they wait on the DB - a perfect time to let something else use the processor, like perhaps a consumer or another producer.
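
A rough harness for that experiment might look like this (produce() and consume() are stand-ins for the real DB fetch and item processing, and the 1-second poll is a crude end-of-data signal just for the sketch):

import java.util.concurrent.*;

public class BalanceExperiment {

    // Stand-in for the real DB fetch: pushes 1000 "rows" into the queue.
    static void produce(BlockingQueue<Integer> q) throws InterruptedException {
        for (int i = 0; i < 1000; i++) q.put(i);
    }

    // Stand-in for the real per-item processing.
    static void consume(BlockingQueue<Integer> q) throws InterruptedException {
        while (true) {
            Integer item = q.poll(1, TimeUnit.SECONDS);
            if (item == null) return;           // queue empty for 1 s: treat as done
            Math.sqrt(item);                    // pretend work
        }
    }

    // Runs one trial with the given balance and returns elapsed milliseconds.
    static long runTrial(int nProducers, int nConsumers) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nProducers + nConsumers);
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(100);
        long start = System.nanoTime();
        for (int i = 0; i < nProducers; i++) {
            pool.submit(() -> { try { produce(queue); } catch (InterruptedException ignored) {} });
        }
        for (int i = 0; i < nConsumers; i++) {
            pool.submit(() -> { try { consume(queue); } catch (InterruptedException ignored) {} });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // Keep the producer count fixed and sweep the consumer count,
        // comparing wall-clock times to find a good balance.
        for (int consumers = 1; consumers <= 4; consumers++) {
            System.out.println("2 producers, " + consumers + " consumers: "
                    + runTrial(2, consumers) + " ms");
        }
    }
}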
 
Greenhorn
Posts: 6
New object allocation can still be a bottleneck when many cores are in use. While with one or two cores it may not be important (and is even not recommended) to care very much about object reuse, with 10 or more cores "traditionally inappropriate" approaches like object pooling and reuse may help performance. It is best to allocate all needed objects outside the critical section and not allocate any new objects in the main loops. We were able to speed up some programs by several times after fixing these issues.
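
For example, a toy illustration of moving allocation out of the hot loop (made-up methods, not code from a real program):

public class ReuseSketch {

    // Allocates a fresh buffer on every call: fine on one or two cores, but
    // on many cores the shared allocation/GC work can start to dominate.
    static long sumAllocating(int[] src) {
        int[] copy = new int[src.length];      // new object on every call
        System.arraycopy(src, 0, copy, 0, src.length);
        long sum = 0;
        for (int v : copy) sum += v;
        return sum;
    }

    // Reuses a scratch buffer that was allocated once, outside the hot loop.
    static long sumReusing(int[] src, int[] scratch) {
        System.arraycopy(src, 0, scratch, 0, src.length);
        long sum = 0;
        for (int i = 0; i < src.length; i++) sum += scratch[i];
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        int[] scratch = new int[data.length];  // allocated once, before the loop
        long total = 0;
        for (int i = 0; i < 100_000; i++) {
            total += sumReusing(data, scratch); // no allocation inside the main loop
        }
        System.out.println(total);
    }
}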
 