Decimal vs binary floating point

Postby benhoyt » Sun Sep 29, 2013 5:33 pm

I have a heavy Python background and really like the looks of Cobra. It seems like a really nice language, and a good idea to build on .NET so you don't need to reinvent the standard libraries, GC, VM, etc.

One question I have is about the choice of "decimal by default". See http://cobra-language.com/docs/python/

I don't quite understand this. A friend and I were debating the other day about what the "decimal" type was for, whether for its accuracy or for its decimal-ness. I think in most cases (general calculations, statistics, number crunching, drawing Mandelbrot sets on the screen, etc.) users want fast binary floating point. But you should use decimal when the base-10 "decimal exactness" of the numbers is important. That's really only financial calculations, where, yes, you should be using decimal. In other cases the base-10-ness doesn't matter.

Decimals have accuracy issues too; it's just that the default precision is usually higher (in Python, the Decimal context defaults to 28 significant digits), whereas binary floating point gives closer to 16 significant digits (more precisely, 53 bits of precision).
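
For what it's worth, here's a quick Python sketch of those defaults (plain stdlib, nothing exotic; the digit counts come from CPython's decimal module and IEEE-754 doubles):

Code:
from decimal import Decimal, getcontext
import sys

print(getcontext().prec)        # 28 -- Decimal's default: 28 significant digits
print(sys.float_info.mant_dig)  # 53 -- bits in a float64 mantissa
print(sys.float_info.dig)       # 15 -- decimal digits a float64 holds reliably

# Decimal isn't immune to rounding either, once a value isn't representable:
print(Decimal(1) / Decimal(3) * 3)  # 0.9999999999999999999999999999, not 1
print(1 / 3.0 * 3)                  # 1.0 here, as it happens to round back exactly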

Am I totally off base here?

Relatedly, Python 2.7 has improved the repr() of floats so that this example is no longer valid:

Code:
>>> from __future__ import division
>>> 4 / 5  # normal division
0.80000000000000004  # <-- there is an extra '4' at the end


That now returns just "0.8" on Python 2.7.
benhoyt
 
Posts: 2

Re: Decimal vs binary floating point

Postby Charles » Sun Sep 29, 2013 7:42 pm

Thanks for your interest in Cobra.

First let me point out that you can override the default choice through the command line with -number or in source code with a directive:
@number float64

class X
    def foo(a as number, b as number) as number
        return a * b

I agree that there are a lot of applications that would prefer float64 or float32, especially when driven by libraries for graphics, science, etc.

Regarding my reasoning behind preferring decimal as the default: a lot of input values are human-based and therefore decimal-based, even outside financial applications. If you have a config value of "0.1" for "ten percent" then you immediately have a value that fits in neither float32 nor float64 with accuracy. And that's before you have even done any calculations.
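
To put that in Python terms (just a sketch with the stdlib decimal module, nothing Cobra-specific):

Code:
from decimal import Decimal

# A human-entered config value of "0.1":
f = float("0.1")     # the nearest float64 to 0.1, not 0.1 itself
d = Decimal("0.1")   # exactly 0.1

print(sum([f] * 10) == 1.0)            # False -- 0.9999999999999999
print(sum([d] * 10) == Decimal("1"))   # True -- exact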

In Python 2.7, is the fix that the type 'float' can now accurately store numbers like 1/10.0, or is the fix to tweak repr() to output what str() outputs? I ask because if the change was to repr(), that seems like a disingenuous fix: traditionally, str() shows "nice output" and repr() shows "technical output".

Also, I still get <type 'float'> in Python 2.7. Is 'float' still a 64-bit IEEE float, or is it a new type?

Is there a URL that describes what change they made?

TIA for any info/pointers.
Charles
 
Posts: 2515
Location: Los Angeles, CA

Re: Decimal vs binary floating point

Postby benhoyt » Mon Sep 30, 2013 2:36 pm

Charles wrote:If you have a config value of "0.1" for "ten percent" then you immediately have a value that fits in neither float32 nor float64 with accuracy. And that's before you have even done any calculations.


True. However, when using any floating point type (decimal or binary), it's almost always a good idea to round to N decimal places when you display or present values to the user, because 17 decimal places after any calculation is almost never what the user should see.
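
For example (a sketch; any of Python's formatting tools will do the same job):

Code:
x = 0.1 + 0.2
print(repr(x))             # 0.30000000000000004 -- the technical form
print("{:.2f}".format(x))  # 0.30 -- roughly what a user should actually see
print(round(x, 2))         # 0.3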

Charles wrote:In Python 2.7, is the fix that the type 'float' can now accurately store numbers like 1/10.0, or is the fix to tweak repr() to output what str() outputs? I ask because if the change was to repr(), that seems like a disingenuous fix: traditionally, str() shows "nice output" and repr() shows "technical output".


They definitely didn't change the float type itself -- that would be a very invasive change -- they changed just the repr() output to be smaller in many cases, but still 100% correct. I don't know how the Python 2.7 repr() is different from str(). However, they're not the same, and repr() is still "technical output". They've carefully designed it so that repr() produces the smallest string representation where float(repr(f)) will exactly equal f -- that's the guarantee.
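
You can check that guarantee yourself with a quick sketch (stdlib only; works on 2.7 and later):

Code:
import random

# float(repr(f)) == f is the contract, for any finite float f:
for _ in range(10000):
    f = random.uniform(-1e9, 1e9)
    assert float(repr(f)) == f

print(repr(0.1))        # '0.1' on 2.7+ (was '0.10000000000000001' on 2.6)
print(repr(0.1 + 0.2))  # '0.30000000000000004' -- longer only when needed to round-trip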

Yes, the "What's New in Python 2.7" is what you want to read. It's here: http://docs.python.org/dev/whatsnew/2.7.html -- search for repr(). Here's the relevant quote:

Related to this, the repr() of a floating-point number x now returns a result based on the shortest decimal string that’s guaranteed to round back to x under correct rounding (with round-half-to-even rounding mode). Previously it gave a string based on rounding x to 17 decimal digits.
benhoyt
 
Posts: 2

Re: Decimal vs binary floating point

Postby Charles » Sun Oct 20, 2013 10:07 am

Sorry, your post got held up in a moderation queue and I didn't see it.

benhoyt wrote:True. However, when using any floating point type (decimal or binary), it's almost always a good idea to round to N decimal places when you display or present values to the user, because 17 decimal places after any calculation is almost never what the user should see.


repr() isn't for users, it's for programmers. str() is for users.
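
As I understand the 2.7 behaviour, the two still differ for floats: str() rounds to about 12 significant digits while repr() now gives the shortest round-trip form. A quick sketch:

Code:
x = 0.1 + 0.2
print(str(x))   # 0.3                 -- "nice output" for users (Python 2.7)
print(repr(x))  # 0.30000000000000004 -- "technical output" for programmers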

benhoyt wrote:They definitely didn't change the float type itself -- that would be a very invasive change -- they changed just the repr() output to be smaller in many cases, but still 100% correct. I don't know how the Python 2.7 repr() is different from str(). However, they're not the same, and repr() is still "technical output". They've carefully designed it so that repr() produces the smallest string representation where float(repr(f)) will exactly equal f -- that's the guarantee.


If by "correct" you mean that you will get the value back with float(repr(f)) then I agree with you. But if by "correct" one would mean an accurate representation of what is stored in a base-2 floating point number, it is not. There is no "0.1" in float64 and as soon as you store "0.1" as a float64, you have introduced a small error in your program state. Whether that small error matters will be a function of your application and requirements.

After all these years it would be nice if computers just worked with the types of numbers that humans work with--and without the performance penalty. Other areas have improved, such as color, high resolution, voice recognition, storage, etc. Numerics are still quite primitive and "non-human" by comparison.
Charles
 
Posts: 2515
Location: Los Angeles, CA

