BigDecimal: Usefulness vs Overkill

 
Mike Matthews
Ranch Hand
Posts: 49
1
While making a small program to aid me in my work, I stumbled upon one thorny issue that I don't know how to settle.

My program is supposed to operate on decimals because it deals with the volume of packaging. I learnt that what is normally simple arithmetic (100 cm × 75 cm × 60 cm = 0.45 m³) gives only an approximate value in Java (something like 0.44(9) m³ in this case). That's close enough that I probably shouldn't worry, but although the difference is barely visible (if at all) for one package, it will add up over thousands of packages. It'll be worse when I get to counting money.

I found two solutions:
1) Using integers and so-called decorative decimal point on display. I have a bad feeling about this, though.
2) Using BigDecimal.

Now, BigDecimal seems like a nice solution. I tried it and so far it looks okay. But just look at an excerpt from my program that I had to re-write (previously it looked perfectly fine as a single line):


I'm using a HashMap to store dimensions. I figured it'd make it easy for me to manage and display data later on.
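A minimal sketch of the kind of calculation being described, assuming dimensions are stored in centimetres and the volume is wanted in cubic metres. The class name, method name, and map keys are made up for illustration; they are not taken from Mike's program.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class MapVolumeSketch {
    // Cubic centimetres per cubic metre (100 cm x 100 cm x 100 cm).
    private static final BigDecimal CM3_PER_M3 = new BigDecimal("1000000");

    // Multiplies the three dimensions (in cm) and converts the result to cubic metres.
    static BigDecimal volumeInCubicMetres(Map<String, BigDecimal> dimensionsCm) {
        return dimensionsCm.get("length")
                .multiply(dimensionsCm.get("width"))
                .multiply(dimensionsCm.get("height"))
                .divide(CM3_PER_M3); // exact: dividing by a power of ten always terminates
    }

    public static void main(String[] args) {
        Map<String, BigDecimal> dims = new HashMap<>();
        dims.put("length", new BigDecimal("100"));
        dims.put("width", new BigDecimal("75"));
        dims.put("height", new BigDecimal("60"));
        System.out.println(volumeInCubicMetres(dims)); // prints 0.45
    }
}
```

With plain doubles the same formula would fit on one line, which is the trade-off the question is about.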

What do you think, Ranchers? Isn't using BigDecimals very heavy, and thus should be avoided at all costs?

Do you need more information? Maybe I should elaborate on my mystifying code?
 
Tony Docherty
Bartender
Posts: 3271
82
Mike Matthews wrote:I learnt that what is normally simple arithmetic (100 cm × 75 cm × 60 cm = 0.45 m³) gives only an approximate value in Java (something like 0.44(9) m³ in this case)

It actually gives 0.45 but I understand the point you are making.

Mike Matthews wrote:It'll be worse when I get to counting money.

No it won't, because you never ever use floats or doubles for monetary values.

Seriously though, before you go to all this trouble for your volume calculations, have you worked out the error in the worst-case scenario (i.e. if all the boxes are at minimum size and all the rounding is in the same direction)? If you have done this and the maximum error is not acceptable, then you need to move away from floating-point math. On the other hand, the maximum error may well be insignificant, and you can stick with the code you originally had.

Mike Matthews wrote:
I found two solutions:
1) Using integers and so-called decorative decimal point on display. I have a bad feeling about this, though.
2) Using BigDecimal.

These are two commonly used ways of handling such cases. Why do you have a bad feeling about using integers?

Mike Matthews wrote:What do you think, Ranchers? Isn't using BigDecimals very heavy, and thus should be avoided at all costs?

Your statement smacks of premature optimisation. If your problem dictates that you need the precision offered by BigDecimal, then use it. Later on, when you are testing with real-world scenarios, if there is a measurable performance issue, look at alternative approaches.
 
Campbell Ritchie
Marshal
Posts: 56518
172
No, BigDecimal is most certainly not overkill.

Why are you putting dimensions into a Map? Why do you not have a Box class with width height length fields? Why are you trying to calculate the volume of a box outside its class? Things like volume and surface area are attributes of the box and you would probably do better to give the class methods to calculate those things. If you make the box class immutable you can cache volume and area so you only have to calculate them once.
You should not be using
new BigDecimal("1000000")
but
private static final BigDecimal MILLION = new BigDecimal("1000000");
Then you can use MILLION and not need to create new instances all the time.

You have obviously already worked out how to use BigDecimal. There is nothing wrong with using it, nor with calculating your values with integer arithmetic, except that very large boxes might cause overflow. And I agree you do get some long lines with BigDecimal; I went back and broke the line so it is not too long to read, and you can see the correct way to break lines.
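The suggestions above could be combined roughly as follows. This is only a sketch under assumptions: the class is called `Box` as in the post, dimensions are taken to be in centimetres, and the field and method names are illustrative rather than anything from Mike's program.

```java
import java.math.BigDecimal;

// Immutable box: the fields never change, so the volume can be computed once and cached.
final class Box {
    // Shared constant instead of a new BigDecimal("1000000") on every call.
    private static final BigDecimal MILLION = new BigDecimal("1000000");

    private final BigDecimal widthCm;
    private final BigDecimal heightCm;
    private final BigDecimal lengthCm;
    private final BigDecimal volumeM3; // cached result in cubic metres

    Box(BigDecimal widthCm, BigDecimal heightCm, BigDecimal lengthCm) {
        this.widthCm = widthCm;
        this.heightCm = heightCm;
        this.lengthCm = lengthCm;
        this.volumeM3 = widthCm.multiply(heightCm)
                               .multiply(lengthCm)
                               .divide(MILLION); // cm^3 to m^3, always exact
    }

    BigDecimal getWidthCm()  { return widthCm; }
    BigDecimal getHeightCm() { return heightCm; }
    BigDecimal getLengthCm() { return lengthCm; }

    // The calculation lives inside the class; callers just ask for the volume.
    BigDecimal getVolume() { return volumeM3; }
}
```

Calling code then shrinks to something like `new Box(w, h, l).getVolume()`, which keeps the long BigDecimal chain in one place.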

I have a couple of old posts about BigDecimal but I think they are surplus to requirements just at the moment.
 
Junilu Lacar
Sheriff
Posts: 11476
180
I would try to hide that mess in an object, say Package:

I normally wouldn't use an abbreviated name like pkg but since package is a Java keyword, I can live with that name in this case.
 
Junilu Lacar
Sheriff
Posts: 11476
180
Campbell Ritchie wrote:
private static final BigDecimal MILLION = new BigDecimal("1000000");
Then you can use MILLION and not need to create new instances all the time.

And add that as a refactoring to the code I suggested
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 66304
152
Not overkill. I do lots of numeric calculations (weather data) and need the precision of BigDecimal. Yeah, the code for formulas is cumbersome beyond belief, but, as they say, "when it positively, absolutely, needs to get there overnight..." you use the right tool for the job.
 
Junilu Lacar
Sheriff
Posts: 11476
180
You'd probably add a getShippingCost to that class, too:
 
Campbell Ritchie
Marshal
Posts: 56518
172
Junilu Lacar wrote: . . . package is a Java keyword . . .
And Box is a Swing class, so we all seem to have names already used
 
Mike Matthews
Ranch Hand
Posts: 49
1
Wow, Ranchers, I cannot thank you enough. I love your ideas and will definitely use them.

Firstly, I think I will stick to BigDecimal, especially now that you've shown me some ways to make the code clearer. I feel silly that I haven't come up with those ideas myself. Simply ingenious!

I think I'll stick to BigDecimal because in the long run I plan to make calculations for quite big amounts of data. As you may have guessed the whole idea of Packaging is just a slice of the cake I'm baking. The plan is for it to basically substitute for what I'm doing in MS Excel right now—calculating bulk orders. Sometimes I need to make reports spanning over multiple orders.

I do have a Box class (actually called "Packaging"), which is a field in Product class. You made me wonder, though. Before, I kept dimensions in an array of variable length (there are various types of packagings, like carton or box, for which volume is calculated differently). Now I use a HashMap<String, BigDecimal> (although I'm considering enum Type instead of String) so I can easily read and access different dimensions. I feel I'm tangling it all up while I could achieve the same results in a more straightforward way...

By the way, I tried to get the same result I got before. I wish I knew how I'd done that. Now even if I use double values only, I get 0.45. That's curious...

P.S. Once again thanks to you all. Code Ranch is the place to be.
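One possible explanation for the 0.45 versus 0.44(9) puzzle, offered as a guess since the original code is not shown: the result can depend on the units used. With centimetres, every intermediate value is a whole number that a double holds exactly, and only the final division rounds. With metres, an input such as 0.6 already has no exact binary form, so the product may come out as something like 0.44999999999999996. The class and method names below are made up for illustration.

```java
class DoubleVolumeSketch {
    // Dimensions in centimetres: the product 450000.0 is exact, only the division rounds.
    static double volumeFromCentimetres(double lengthCm, double widthCm, double heightCm) {
        return lengthCm * widthCm * heightCm / 1_000_000.0;
    }

    // Dimensions in metres: inputs like 0.6 are already approximations before any arithmetic.
    static double volumeFromMetres(double lengthM, double widthM, double heightM) {
        return lengthM * widthM * heightM;
    }

    public static void main(String[] args) {
        System.out.println(volumeFromCentimetres(100.0, 75.0, 60.0)); // prints 0.45
        System.out.println(volumeFromMetres(1.0, 0.75, 0.6));         // may differ in the last digits
    }
}
```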
 
Mike Matthews
Ranch Hand
Posts: 49
1
Tony Docherty wrote:
Mike Matthews wrote:
I found two solutions:
1) Using integers and so-called decorative decimal point on display. I have a bad feeling about this, though.
2) Using BigDecimal.

These are two commonly used ways of handling such cases. Why do you have a bad feeling about using integers?

The thing is that the user (in most situations this will be me) might input a decimal. How to handle such situations? (I insist on displaying dimensions as cm.) I could multiply the input by 10 and cast the result to integer so from, say, 47.5 cm I'd get 475 mm which I'd store in the variable and operate on. What do you say?

Tony Docherty wrote:
Your statement smacks of premature optimisation. If your problem dictates that you need the precision offered by BigDecimal, then use it. Later on, when you are testing with real-world scenarios, if there is a measurable performance issue, look at alternative approaches.

Maybe I am a bit hasty, but I'm just being cautious. Why worry later and repair my code when I can write something correctly from the start, especially since I have your advice? :) What's more, why would I invent something that might have already been invented by someone else?
 
Campbell Ritchie
Marshal
Posts: 56518
172
Mike Matthews wrote: . . . I could multiply the input by 10 and cast the result to integer so from, say, 47.5 cm I'd get 475 mm . . ..
But if they enter 47.3cm, ten times that is not 473.0. The error is slight but it is there. Let's try it:-
System.out.println(new BigDecimal(47.3 * 10.0));
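The underlying issue is that 47.3 has no exact binary representation, so the double holds a nearby value before any multiplication happens. A small sketch (class and method names are illustrative) that uses the `BigDecimal(double)` constructor to reveal what is actually stored. Whether a particular product such as 47.3 * 10.0 happens to round back to a whole number varies from input to input, so it is not something to rely on.

```java
import java.math.BigDecimal;

class BinaryFractionSketch {
    // BigDecimal(double) converts the double's binary value exactly, digit for digit.
    static BigDecimal storedValue(double d) {
        return new BigDecimal(d);
    }

    public static void main(String[] args) {
        System.out.println(storedValue(47.3));          // a long expansion close to, but not equal to, 47.3
        System.out.println(new BigDecimal("47.3"));     // prints 47.3: the String constructor keeps the decimal digits
        System.out.println(storedValue(47.3 * 10.0));   // the value actually produced by the multiplication
    }
}
```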
 
Mike Matthews
Ranch Hand
Posts: 49
1
… so BigDecimal seems to be the best/only option where precision is needed?
 
Piet Souris
Master Rancher
Posts: 2041
75
Hi Mike,

Well, for a system that boils down to some simple short sums and multiplications,
I think that BigDecimals are overkill. Maybe for chaotic systems you need the
accuracy of BigDecimals; I can think of astronomy, but according to Bear Bibeault
also weather calculations.

At the office, we are calculating billions of financial numbers each month, using
plain old doubles and that is sufficient.

So, it boils down to what you think is acceptable. I tried to make an example:

And the outcomes are:

a = 0,1665
prod = 33.306.536,3669
sum = 33.306.536,4864
prod - sum = -0,1195
% error = -0,000000359 %

******* BigDecimals **************************
biga = 0.16653186
bignumber= 200.000.987
bigprod = 33.306.536,3669
bigsum = 33.306.536,3669
bigdif = 0

To me, doubles seem appropriate. But judge for yourself.
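A rough sketch of this kind of comparison, not Piet's actual code: add the same value many times and compare the running sum with a single multiplication, once with doubles and once with BigDecimal. The value 0.16653186 comes from the output above; the iteration count and all names are assumptions chosen to keep the run short.

```java
import java.math.BigDecimal;

class SumVersusProductSketch {
    // Repeated addition in double: each step can add a tiny rounding error.
    static double sumDouble(double a, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += a;
        }
        return sum;
    }

    // Repeated addition in BigDecimal: every step is exact.
    static BigDecimal sumBigDecimal(BigDecimal a, int n) {
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < n; i++) {
            sum = sum.add(a);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // assumed count, far smaller than the 200.000.987 shown above
        double a = 0.16653186;
        double prod = a * n;
        double sum = sumDouble(a, n);
        System.out.println("prod - sum (double)     = " + (prod - sum));

        BigDecimal bigA = new BigDecimal("0.16653186");
        BigDecimal bigProd = bigA.multiply(BigDecimal.valueOf(n));
        BigDecimal bigSum = sumBigDecimal(bigA, n);
        System.out.println("prod - sum (BigDecimal) = " + bigProd.subtract(bigSum));
    }
}
```

Printing the raw values rather than formatted ones keeps the two halves directly comparable.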

Greetz,
Piet
 
Tony Docherty
Bartender
Posts: 3271
82
Mike Matthews wrote:… so BigDecimal seems to be the best/only option where precision is needed?

No, the point Campbell was making is that if you want to use the integer approach, you can't read in the input as a decimal and then multiply it by a factor, as this may introduce a very small error. The user can still enter the value as a decimal, but you have to read the input in as a String and then manipulate the String to convert it to an integer value. For example, you could check that the input string has the correct number of digits after the decimal point, pad it with 0s if not, and then remove the decimal point.
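The string handling described above might look like the following sketch. It assumes input in centimetres with at most one decimal place, stored as whole millimetres; the class and method names are hypothetical.

```java
class FixedPointInputSketch {
    // Parses text such as "47.5" (centimetres) into whole millimetres using only
    // string handling, so the value never passes through a double.
    static long parseCentimetresAsMillimetres(String text) {
        String s = text.trim();
        int dot = s.indexOf('.');
        String whole = dot < 0 ? s : s.substring(0, dot);
        String fraction = dot < 0 ? "" : s.substring(dot + 1);

        if (fraction.length() > 1) {
            throw new IllegalArgumentException("More than one decimal place: " + text);
        }
        if (fraction.isEmpty()) {
            fraction = "0"; // pad: "47" becomes "470"
        }
        if (whole.isEmpty()) {
            whole = "0";    // allow input like ".5"
        }
        // Removing the decimal point is the same as multiplying by ten.
        return Long.parseLong(whole + fraction);
    }

    public static void main(String[] args) {
        System.out.println(parseCentimetresAsMillimetres("47.5")); // prints 475
        System.out.println(parseCentimetresAsMillimetres("47"));   // prints 470
    }
}
```

For display, the stored millimetres can be split back into `mm / 10` and `mm % 10` to show centimetres with one decimal place.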
 
Campbell Ritchie
Marshal
Posts: 56518
172
Since you are using DecimalFormat on the doubles and not on the BigDecimals, you are not comparing like with like.
 
Mike Matthews
Ranch Hand
Posts: 49
1
Tony Docherty wrote:
Mike Matthews wrote:… so BigDecimal seems to be the best/only option where precision is needed?

No, the point Campbell was making is that if you want to use the integer approach, you can't read in the input as a decimal and then multiply it by a factor, as this may introduce a very small error. The user can still enter the value as a decimal, but you have to read the input in as a String and then manipulate the String to convert it to an integer value. For example, you could check that the input string has the correct number of digits after the decimal point, pad it with 0s if not, and then remove the decimal point.

Good point. I see it now. But then, it seems like the error margin is small for doubles after all so I might as well try them out. Then, when I have feedback from larger data, I'll see if there's a need to swap to BigDecimal or integers. Integers seem like a better option so I might try that before BigDecimal if need be.

Campbell Ritchie wrote:Since you are using DecimalFormat on the doubles and not on the BigDecimals, you are not comparing like with like.

But the results are still correct, aren't they? Perhaps I am missing something?

Piet Souris wrote:

To me, doubles seem appropriate. But judge for yourself.

That they do, but I'll see if in my case the margin is also so small. I'll be doing quite some rounding up or down, which (if my thinking is right) might mess up the calculations.

I'm feeling less and less confused. I know which way to go from here. I'll try to give you some feedback as soon as I can. The discussion is still open so feel free to add as much as you wish. I'd be grateful for every little piece of insight.
 
Tony Docherty
Bartender
Posts: 3271
82
Mike Matthews wrote:
Good point. I see it now. But then, it seems like the error margin is small for doubles after all so I might as well try them out.

Which is why, in my very first post in this thread I said:

Tony Docherty wrote:Seriously though, before you go to all this trouble for your volume calculations, have you worked out the error in the worst-case scenario (i.e. if all the boxes are at minimum size and all the rounding is in the same direction)? If you have done this and the maximum error is not acceptable, then you need to move away from floating-point math. On the other hand, the maximum error may well be insignificant, and you can stick with the code you originally had.
 
Mike Matthews
Ranch Hand
Posts: 49
1
Tony Docherty wrote:
Mike Matthews wrote:
Good point. I see it now. But then, it seems like the error margin is small for doubles after all so I might as well try them out.

Which is why, in my very first post in this thread I said:

Tony Docherty wrote:Seriously though, before you go to all this trouble for your volume calculations, have you worked out the error in the worst-case scenario (i.e. if all the boxes are at minimum size and all the rounding is in the same direction)? If you have done this and the maximum error is not acceptable, then you need to move away from floating-point math. On the other hand, the maximum error may well be insignificant, and you can stick with the code you originally had.

Right, but at that point I thought I could write code that would give me perfect functionality at no extra cost without preliminary testing. Now it seems that there's no solution that works in each and every case. I need to work out code that serves its purpose in my program, even though in another program it could yield unsatisfactory results.
 
Mike Matthews
Ranch Hand
Posts: 49
1
I made two almost identical implementations of the Packaging class and its Tester class, the only difference being the types used: double in one version, BigDecimal in the other.

Then I ran a test for each. I created 1000 instances of the class and measured memory usage. (The Packaging class is a bit more complex than what we discussed here, by the way.) The results I got made me drop my jaw on the floor and be unable to lift it back up for another 5 minutes.

Memory used for allocation (double): 56 112
Memory used for allocation (BigDecimal): 322 144

Is it possible?

Tomorrow I'll try to time code execution.
 
Campbell Ritchie
Marshal
Posts: 56518
172
And why are you worrying about using 322kB memory? How much does memory cost nowadays?
 
Piet Souris
Master Rancher
Posts: 2041
75
That is one way to look at OP's efforts.

Another way is that OP is trying to collect data about the
questions that he has, so that he can make a well informed
decision.

Greetz,
Piet
 
Mike Matthews
Ranch Hand
Posts: 49
1
Campbell Ritchie wrote:And why are you worrying about using 322kB memory? How much does memory cost nowadays?

It's not the 322kB usage that I'm astonished at, it's the 266kB difference between the two.
 
Mike Matthews
Ranch Hand
Posts: 49
1
I checked the elapsed time for creating 1000 objects.

Double: 10 ms
BigDecimal: 40 ms
 
Junilu Lacar
Sheriff
Posts: 11476
180
Not really that surprising, Mike. A double is 64 bits while a BigDecimal is an object that has quite a number of internal fields. See http://stackoverflow.com/questions/2501176/java-bigdecimal-memory-usage
 
Derik Davenport
Ranch Hand
Posts: 92
Mike Matthews,
I would say that you have already answered your own question. Yes, BigDecimal is overkill for your situation because it makes your code harder to read, harder to understand, and harder to debug. Your task is to convert from cubic centimeters (an integer) to cubic meters which will probably not be an integer.
Your first solution was to use a float; but that gave you aesthetically displeasing rounding errors. Look at the Formatter class. The rounding errors of floating point arithmetic still exist, but they will never be visible to end user.
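As a small illustration of the formatting route (a sketch only; the class and method names are made up), `String.format` rounds the value for display, so a stored 0.44999999999999996 is shown as 0.45. `Locale.ROOT` is used here so the decimal separator is always a dot.

```java
import java.util.Locale;

class DisplayFormattingSketch {
    // Rounds to two decimal places for display only; the stored double is unchanged.
    static String formatCubicMetres(double volume) {
        return String.format(Locale.ROOT, "%.2f", volume);
    }

    public static void main(String[] args) {
        System.out.println(formatCubicMetres(0.44999999999999996)); // prints 0.45
        System.out.println(formatCubicMetres(0.45));                // prints 0.45
    }
}
```

This hides the error from the user but does not remove it, so sums over many values still carry whatever rounding has accumulated.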

If you don't like Formatter, you could consider fixed point arithmetic to avoid the rounding errors. http://en.wikipedia.org/wiki/Fixed-point_arithmetic

Is it ever necessary to use BigDecimal? Yes, whenever doing so makes your code easier to read. There has been an excellent discussion here about performance issues of BigDecimal versus other things. But I don't think you need to worry about that for a utility app like the one you are writing. Instead I would focus on clarity of code.


 
Mike Matthews
Ranch Hand
Posts: 49
1
Junilu Lacar wrote:Not really that surprising, Mike. A double is 64 bits while a BigDecimal is an object that has quite a number of internal fields. See http://stackoverflow.com/questions/2501176/java-bigdecimal-memory-usage

Yes, but to me it's still a discovery because it was the first time for me to actually measure that memory usage. I was expecting BigDecimal to render greater usage, but not so much.

Derik Davenport wrote:I would say that you have already answered your own question.

In a way, yes. Still I can't diminish everyone's contribution cos discussing the problem with myself wouldn't help much. I'll look at fixed-point arithmetic more closely later on. You're not the first person to suggest using integers, and I'm slowly leaning towards it. I'll be sure to run more tests as my code grows, but I might as well keep that feedback to myself. I don't think it'd be particularly interesting to most people here, and I don't want to look like a spammer.

P.S. Thanks for the link, Junilu. I'm getting so comfortable... just asking questions without looking for answers. Then again, I wanted to learn not only about memory usage, but about the best ratio between memory usage and usability in my specific case, so hopefully that's enough of an excuse.
 
Martin Vajsar
Sheriff
Posts: 3752
62
Mike Matthews wrote:… so BigDecimal seems to be the best/only option where precision is needed?

I would probably try to see whether I can do with longs first.

You need to track dimensions and volume. What is the required precision of the dimensions, and what is the maximal volume you might ever need to track?

In your first post, you mention dimensions specified in centimeters. I'll assume that under no circumstances would you need to express the dimensions with a greater precision than a millimeter.

So, assuming the basic unit is millimeter, the biggest volume we can express in a long is 2^63 cubic millimeters. This is about 9.22e18 mm^3, or, if my calculations aren't wrong, about 9 cubic kilometers. That's surely enough, isn't it?

I guess that most cargo is transported by ship. According to this article, the world container fleet capacity in 2012 was 15.4 million TEUs and growing fast. TEU is quite an inexact unit, but some conversion website helpfully converted that number to 0.5 cubic kilometer, give or take.

This is uncomfortably close to our theoretical limit. If there was a reserve of several orders of magnitude, I'd advise using longs as described to avoid the hassle with BigDecimal. But if you plan to track all packages of the largest shipping companies over several years, or perhaps do some statistics over the entire world's cargo operations, then I can imagine you might run into trouble with longs some time in the future.

Measuring dimensions in centimeters instead would save us about three orders of magnitude. That would make longs somewhat safer to use.
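The range estimate can be checked with a few lines. In this sketch (names are illustrative), volumes are held as cubic millimetres in a long, and `Math.multiplyExact` is used so that an overflow raises an exception instead of silently wrapping around.

```java
class LongRangeSketch {
    // 1 km = 10^6 mm, so 1 km^3 = 10^18 mm^3.
    static final long MM3_PER_KM3 = 1_000_000_000_000_000_000L;

    // Volume in cubic millimetres; throws ArithmeticException if the result does not fit in a long.
    static long volumeMm3(long lengthMm, long widthMm, long heightMm) {
        return Math.multiplyExact(Math.multiplyExact(lengthMm, widthMm), heightMm);
    }

    public static void main(String[] args) {
        // Largest volume a long can hold, in cubic kilometres: roughly 9.22.
        System.out.println(Long.MAX_VALUE / (double) MM3_PER_KM3);

        // The 100 cm x 75 cm x 60 cm box from the first post, in millimetres.
        System.out.println(volumeMm3(1000, 750, 600)); // prints 450000000
    }
}
```

Summing many volumes could use `Math.addExact` in the same way, so a running total that outgrows the long range is noticed immediately.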
 
Campbell Ritchie
Marshal
Posts: 56518
172
Martin Vajsar wrote: . . . the world container fleet capacity in 2012 was 15.4 million TEUs and growing fast. TEU is quite an inexact unit, . . .
TEU?

I thought it was TFU, which is Twenty‑foot‑unit: the capacity of a twenty-foot-long container (half the length of a railway carriage), using old‑fashioned British units rather than those new‑fangled French meters. Most containers are actually 40′ long, which is about 12m.
I presume that a ship with 4000TFU capacity will be able to carry 4000 20′ containers or 2000 40′ containers.
 
Martin Vajsar
Sheriff
Posts: 3752
62
Campbell Ritchie wrote:TEU?

Twenty-foot Equivalent Unit, according to Wikipedia.
 
Campbell Ritchie
Marshal
Posts: 56518
172
Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaah!
 
Don't get me started about those stupid light bulbs.