
Trying to understand difference between float vs double

 
Greenhorn
Posts: 16
Greetings, I read online that float is 32 bits and double is 64 bits. What does this mean? I know what a bit is, and I can convert binary to decimal, but what I don't understand is why the difference between the two matters.

I wrote this small piece of code to try to see the difference, but it didn't really help paint the picture for me:



Any guidance is appreciated
 
Saloon Keeper
Posts: 7488
That code doesn't really demonstrate anything. Try this to start:


Also read https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
 
Saloon Keeper
Posts: 14515
In integral types, the number of bits determines the length of the range of numbers you can save.

In floating point types, the number of bits determines the maximum precision of numbers you can save.

In short, double gives you less rounding error than float.
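To make that concrete, here is a minimal sketch (not code from this thread) that measures how far off float and double are when asked to store 0.1. The class and method names are made up for illustration, and 0.1 is just a convenient value that binary floating point cannot represent exactly.

```java
import java.math.BigDecimal;

public class FloatVsDoubleError {
    // The decimal value we are trying to store.
    static final BigDecimal TARGET = new BigDecimal("0.1");

    // Absolute error when 0.1 is stored in a 32-bit float.
    // new BigDecimal(double) exposes the exact binary value, and
    // widening a float to a double does not change the value.
    static BigDecimal floatError() {
        return new BigDecimal((double) 0.1f).subtract(TARGET).abs();
    }

    // Absolute error when 0.1 is stored in a 64-bit double.
    static BigDecimal doubleError() {
        return new BigDecimal(0.1).subtract(TARGET).abs();
    }

    public static void main(String[] args) {
        // Integral types: more bits means a wider range.
        System.out.println("byte max     = " + Byte.MAX_VALUE);
        System.out.println("int max      = " + Integer.MAX_VALUE);
        // Floating-point types: more bits means a smaller rounding error.
        System.out.println("float error  = " + floatError());
        System.out.println("double error = " + doubleError());
    }
}
```

The float error should come out around 1.5 billionths, while the double error is many orders of magnitude smaller.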
 
Tatenda Mawoneke
Greenhorn
Posts: 16

Stephan van Hulst wrote:In integral types, the number of bits determines the length of the range of numbers you can save.

In floating point types, the number of bits determines the maximum precision of numbers you can save.

In short, double gives you less rounding error than float.



Using Tim Moores' code above, is this what you meant by errors:



Like how the float value rounds to 6130.4346 instead of 6130.4347?
 
Stephan van Hulst
Saloon Keeper
Posts: 14515
Imagine you had a floating point data type that supports 2 significant decimal digits, and a data type that supports 4 significant decimal digits. We'll call them dec2 and dec4 respectively. Have a look at the following code:

This yields:

As you can see, both data types try to represent the assigned value as precisely as possible, but they can't do it exactly because they have a limited number of digits. The rounding errors are

  • 0.002345679, and
  • 0.000045679 respectively.


You'll see that dec4 has less rounding error, and that's because it's more precise.
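Java has no dec2 or dec4 type, but the idea can be sketched with BigDecimal and MathContext, which round to a chosen number of significant decimal digits. This is only an illustration: the input 0.122345679 is an assumed value, picked because it reproduces the two error figures listed above, and the class and method names are made up.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalPrecisionDemo {
    // Round a value to the given number of significant decimal digits
    // (MathContext(int) uses HALF_UP rounding).
    static BigDecimal toSignificantDigits(BigDecimal value, int digits) {
        return value.round(new MathContext(digits));
    }

    // Absolute difference between the original value and its rounded form.
    static BigDecimal roundingError(BigDecimal value, int digits) {
        return value.subtract(toSignificantDigits(value, digits)).abs();
    }

    public static void main(String[] args) {
        // Assumed input, not taken from the original post.
        BigDecimal value = new BigDecimal("0.122345679");

        System.out.println("dec2 = " + toSignificantDigits(value, 2)
                + ", error = " + roundingError(value, 2).toPlainString());
        System.out.println("dec4 = " + toSignificantDigits(value, 4)
                + ", error = " + roundingError(value, 4).toPlainString());
    }
}
```

With that input, two significant digits give 0.12 and four give 0.1223, so the errors are 0.002345679 and 0.000045679.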

    It's the same for float and double, but instead of 2 and 4 decimal digits, they use 23 and 52 binary digits. When it appears that one of them is rounding down a value when it should be rounding up, that's because the rounding operation operates on binary digits, not the decimal digits that you see.
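One way to see the binary rounding at work is to print the exact value a float really holds. The sketch below uses 6130.4347, the figure from the question earlier in the thread. The code that produced that figure isn't shown here, so treat this as an illustration rather than a reproduction of it. The class and method names are made up.

```java
import java.math.BigDecimal;

public class BinaryRoundingDemo {
    // Exact decimal expansion of the binary value a float actually stores.
    static BigDecimal exactValue(float f) {
        return new BigDecimal((double) f); // widening to double is lossless
    }

    // Exact decimal expansion of the binary value a double actually stores.
    static BigDecimal exactValue(double d) {
        return new BigDecimal(d);
    }

    public static void main(String[] args) {
        float f = 6130.4347f;
        double d = 6130.4347;

        System.out.println("float stores  " + exactValue(f));
        System.out.println("double stores " + exactValue(d));

        // Spacing between neighbouring floats at this magnitude.
        System.out.println("float step    " + exactValue(Math.ulp(f)));
        System.out.println("float below   " + exactValue(Math.nextDown(f)));
        System.out.println("float above   " + exactValue(Math.nextUp(f)));
    }
}
```

The float closest to 6130.4347 is exactly 6130.4345703125, because floats near that magnitude are spaced 2^-11 (about 0.00049) apart. Shown to four decimal places, that stored value is 6130.4346, which is consistent with what the question describes. Nothing was rounded the wrong way. The nearest available binary value simply sits slightly below the decimal you typed.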

    You might be wondering about the numbers 23 and 52. Don't float and double have 32 and 64 bits? Yes, but not all of their bits are used to represent digits. One bit is used to tell whether the number is negative, and the remaining bits are used to represent the exponent.

    Take the values 9877000 and 0.9877. One is represented as 9877*10^3 and the other as 9877*10^-4. Both values use the significand 9877, but the first uses the exponent 3 and the second uses the exponent -4.

    float has 8 bits to represent the exponent, and double has 11 bits.
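If you want to look at those fields yourself, here is a small sketch that splits a float into its sign, exponent and significand bits using Float.floatToIntBits. The class and method names and the sample value -6.5f are my own choices. Note that the real format uses powers of two rather than the powers of ten in the example above, and it stores the exponent with a bias of 127.

```java
public class FloatBits {
    // Top bit: 1 for negative numbers, 0 otherwise.
    static int signBit(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent, stored with a bias of 127.
    static int exponentBits(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Low 23 bits: the stored part of the significand.
    static int significandBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float value = -6.5f; // -1.625 * 2^2

        System.out.println("sign        = " + signBit(value));
        System.out.println("exponent    = " + exponentBits(value)
                + " (unbiased " + (exponentBits(value) - 127) + ")");
        System.out.println("significand = 0x"
                + Integer.toHexString(significandBits(value)));
    }
}
```

For -6.5f the sign bit is 1, the stored exponent is 129 (that is, 2 after removing the bias), and the significand bits are 0x500000, which encodes the .625 part of 1.625. The same approach works for double with Double.doubleToLongBits, 11 exponent bits and 52 significand bits.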
     
    Marshal
    Posts: 76888
    I know two versions of the answer to your question. The first: if your course has a “principles of computing” module, it should explain how IEEE 754 numbers work. There is an article on Wikipedia, but I found it incomprehensible. The only thing that enabled me to understand anything about IEEE 754 numbers was a half-hour explanation in a lecture, including the bit about how a double uses 52 bits of space to get 53 bits of precision.

    Even that isn't enough for all applications. If I look at my internet banking and find a difference between my “balance” and “available balance”, I can go into JShell, subtract the figures, and get a result like 18.48999999999997. Then I remember that I put £18.49 on the card at shop XXX yesterday.

    If you need to specify the kind of rounding, or you need precision, don't use float or double arithmetic. Use decimal arithmetic with one of the following classes: BigInteger (integers only) or BigDecimal (decimal fractions). Never use floating-point arithmetic for money.
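A minimal sketch of the difference, using made-up amounts rather than the bank figures above (the class and method names are also just for illustration):

```java
import java.math.BigDecimal;

public class MoneyDemo {
    // Adding two amounts with binary floating point.
    static double doubleSum() {
        return 0.10 + 0.20;
    }

    // Adding the same amounts with decimal arithmetic.
    // Build BigDecimal from a String, not from a double, so the
    // value is exactly what you typed.
    static BigDecimal decimalSum() {
        return new BigDecimal("0.10").add(new BigDecimal("0.20"));
    }

    public static void main(String[] args) {
        System.out.println("double:     " + doubleSum());
        System.out.println("BigDecimal: " + decimalSum());
    }
}
```

The double version prints 0.30000000000000004, while the BigDecimal version prints 0.30.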
     
    Campbell Ritchie
    Marshal
    Posts: 76888
    The second answer is much simpler:-

    You use doubles and you don't use floats.

    Not unless some API forces you to use a float.

     