
Floating point processing and speed

 
Mark Herschberg
Sheriff
Posts: 6037
My server is doing a bunch of floating point calculations (typically summing a list of double x double products: lots of sums and multiplications, far fewer divisions). We're finding performance issues, but our profiler can't give us the level of detail necessary to answer the question I have.
The server is a Dell OptiPlex GX240 with an Intel Pentium 4 at 2.4GHz and 1G of RAM. We're running Weblogic 7 (against an Oracle 9i DB on another machine) on top of Win2000.
I know that chips traditionally have a special floating point processor, and that it is usually not as integrated into the instruction path as the arithmetic processing unit. We take a performance hit both from the more complex calculations and from "travel time." (Maybe modern chip architectures have changed this and it is no longer true.)
Quantities are integral values (so could be longs). Prices need only two digits of precision (so could be floats). Now it's possible that a float multiplied by a long would overflow a non-double value. For this and a few other reasons, we made everything doubles. Now I'm thinking that maybe I should switch to longs. Because I only need two decimals of precision (maybe 3 if I need to consider rounding issues), I could simply treat my values as pennies instead of dollars.
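For illustration, here is a minimal sketch of the pennies idea. The class and method names (PennyMath, toCents, orderValueCents, format) are hypothetical and not from the application discussed here, and the sketch assumes Java 8 or later for Math.multiplyExact. It only shows the representation; it says nothing about speed.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper (not from the original application): prices kept as whole cents in a long. */
final class PennyMath {
    private PennyMath() {}

    /** Converts a decimal price string such as "19.99" to cents, rounding half up to 2 decimals. */
    static long toCents(String price) {
        return new BigDecimal(price)
                .setScale(2, RoundingMode.HALF_UP)
                .movePointRight(2)
                .longValueExact();
    }

    /** quantity * price in cents; throws ArithmeticException instead of silently overflowing. */
    static long orderValueCents(long quantity, long priceCents) {
        return Math.multiplyExact(quantity, priceCents);
    }

    /** Formats a cent amount as dollars with two decimals, for display only. */
    static String format(long cents) {
        return BigDecimal.valueOf(cents, 2).toPlainString();
    }

    public static void main(String[] args) {
        long priceCents = toCents("19.99");
        long value = orderValueCents(300, priceCents);
        System.out.println(format(value));
    }
}
```

Conversion to and from dollars happens only at the edges (input and display), so the arithmetic in between is plain long addition and multiplication.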
However, one worry I have is whether this would negatively impact performance, because calcs previously done in the floating point processor would now contend for time on the arithmetic processing unit, along with other instruction-based and general accounting needs.
Does anyone have any experience with something like this? I'd prefer not to have to modify all my code to test it (350+ classes). I'm also hesitant to simply make some test case, because I don't think I can easily replicate the loads on the two math processors.
Any thoughts, comments, or ideas?
--Mark
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
As for several of your past questions, Mark: you're not being ignored, we just don't have any specific info for you. Well, I don't anyway. Probably the info you need is too specific, and you won't really know until you test. Out of curiosity, how sure are you that numeric computations are a significant component of performance in the current code? I understand the profiler may not be able to break down the different parts of calculations for you, but can you at least verify that it's certain lines of code containing computations that are the problem? (At least, a notable part of the problem?) Or are these distributed among too many different classes to tell easily? It may be worthwhile to do some refactoring here to consolidate numerical computations into fewer classes. My gut feeling is you shouldn't have to change 350 classes to replace float with long, for example, and if you do, refactoring will probably benefit you in a number of ways. Though it's hard to say much more without knowing the details of what the application does, and there may be good reasons why it has to be done a certain way... Good luck.
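One way to picture that kind of consolidation is a small value class that is the only place that knows how amounts are stored. This is a sketch under assumptions: the Money class and its methods are hypothetical, nothing here comes from the actual application, and Math.addExact/multiplyExact assume Java 8 or later.

```java
/** Hypothetical value type: the only class that knows amounts are stored as long cents. */
final class Money {
    private final long cents;

    private Money(long cents) {
        this.cents = cents;
    }

    static Money ofCents(long cents) {
        return new Money(cents);
    }

    /** Sum of two amounts; throws ArithmeticException on overflow. */
    Money plus(Money other) {
        return new Money(Math.addExact(cents, other.cents));
    }

    /** Amount multiplied by an integral quantity; throws ArithmeticException on overflow. */
    Money times(long quantity) {
        return new Money(Math.multiplyExact(cents, quantity));
    }

    /** True if this amount is at least the required amount. */
    boolean covers(Money required) {
        return cents >= required.cents;
    }

    long cents() {
        return cents;
    }
}
```

Callers would write something like balance.covers(price.times(quantity)) and never touch the underlying long, so swapping the representation (double, long, or something else) becomes a change to one class rather than to every class that does arithmetic.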
 
Mark Herschberg
Sheriff
Posts: 6037
No problem. I figured these are tough, specific questions, requiring not just Java knowledge but probably computer, application, and possibly domain knowledge as well.
The order is hard to change. It works as follows:
1) The client creates an order object.
2) The order is passed to the server, which checks it.
3) The server then does a margin check to make sure there's enough money in the account.
4) The order is crossed.
5) Update objects are sent out to clients.
6) The clients render the data in the update objects.
So for each step, there are a few classes involved, plus for EJBs it means changing the interfaces. It's not changing all classes, but it's a few hours of work which may be a dead end, and which I had hoped to avoid by posting here.
You are right that I don't really know what the double vs. long performance is. The naive tests (see below) yield results I don't quite understand. For small sizes (e.g. 10000), doubles seem to take slightly longer than longs (sometimes they give the same time, sometimes the doubles take slightly longer). At large sizes (e.g. 1000000), doubles are faster. This may be due to my theory that longs share the pipeline, or maybe it's for other reasons. I just don't know (hence my posting here :-).
So I may just have to spend a good few hours trying it and seeing the numbers. :-p
--Mark

 
Rob Ross
Bartender
Posts: 2205
This was an interesting problem.
Here's what I found out.
First, try this version of your test:

Run it a few times. Notice that at the larger SIZE the times for ADDITION for double vs long are (generally) close, but the times for MULTIPLICATION for double vs long are much farther apart.
I always thought "integer arithmetic is faster than floating point." In *general* this is true. But I discovered that the way the Pentium implements its floating point routines is much more efficient than its integer routines with respect to multiplication. If you're doing addition with both floating point and integer you'll get about the same performance. Multiplication is faster in floating point, and (generally) division is faster in integer math.
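For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of a timing test. It is not the test code referred to in this thread; the class name ArithmeticTiming, the array contents, and the repetition counts are all assumptions, and it assumes Java 7 or later. Timings depend heavily on the JVM, the JIT, and the CPU, so treat any numbers it prints as rough indications at best.

```java
/** Hypothetical timing harness; names and sizes are illustrative, not from the thread. */
final class ArithmeticTiming {
    private ArithmeticTiming() {}

    /** Sum of quantity * price using long (integer) arithmetic. */
    static long sumLongProducts(long[] qty, long[] priceCents) {
        long sum = 0;
        for (int i = 0; i < qty.length; i++) {
            sum += qty[i] * priceCents[i];
        }
        return sum;
    }

    /** Sum of quantity * price using double (floating point) arithmetic. */
    static double sumDoubleProducts(double[] qty, double[] price) {
        double sum = 0.0;
        for (int i = 0; i < qty.length; i++) {
            sum += qty[i] * price[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        final int reps = 20;
        for (int size : new int[] {10_000, 1_000_000}) {
            long[] lq = new long[size];
            long[] lp = new long[size];
            double[] dq = new double[size];
            double[] dp = new double[size];
            for (int i = 0; i < size; i++) {
                lq[i] = i % 100 + 1;      // quantity
                lp[i] = 1999 + i % 7;     // price in cents
                dq[i] = lq[i];            // same quantity as a double
                dp[i] = lp[i] / 100.0;    // price in dollars
            }
            long longSink = 0;
            double doubleSink = 0.0;
            // Warm-up so the JIT has a chance to compile both loops before timing.
            for (int w = 0; w < 50; w++) {
                longSink += sumLongProducts(lq, lp);
                doubleSink += sumDoubleProducts(dq, dp);
            }
            long t0 = System.nanoTime();
            for (int r = 0; r < reps; r++) {
                longSink += sumLongProducts(lq, lp);
            }
            long t1 = System.nanoTime();
            for (int r = 0; r < reps; r++) {
                doubleSink += sumDoubleProducts(dq, dp);
            }
            long t2 = System.nanoTime();
            // Printing the sinks keeps the results live so the loops are not optimized away.
            System.out.printf("size=%d long=%d us double=%d us (sinks: %d, %.2f)%n",
                    size, (t1 - t0) / 1000, (t2 - t1) / 1000, longSink, doubleSink);
        }
    }
}
```

A test like this only exercises one tight loop in isolation, so it cannot reproduce the mixed load of a real server; it is a sanity check, not a substitute for profiling the production code.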
Comparison of Pentium Floating Point and Integer Speeds

Reference:
java glossary
IA-32 Intel Architecture Optimization Reference Manual, page 2-94, compiler coding rule #45
An example of the above being applied is:
http://www.azillionmonkeys.com/qed/amult.html

These are highly technical references and aren't something you ever have to really worry about when you're writing in a high-level language like Java, but occasionally it is helpful to know why things work the way they do!
I'd be curious to run this on a different CPU, like my Mac iBook and see what results I get there.
Hope this helps!
 
Mark Herschberg
Sheriff
Posts: 6037
THANK YOU ROB!
This was the type of stuff I was looking for. Between the numbers from the test code and the processing speed data you found, I feel comfortable going with the assumption that it's faster to keep the doubles than to switch to longs.
Sure, we're still ultimately speculating about what my production code does, but so far all information points in the same direction.
This is also good info to know generally. I really appreciate the research and references!
--Mark
 