"Romanenko Mikhail" <[email protected]> writes:
testfloat=# update tftbl set f1=10.1 where f1=10 and f2=20;
UPDATE 1
testfloat=# update tftbl set f2=20.2 where f1=10.1 and f2=20;
UPDATE 0
The second update is failing to find any tuple that satisfies f1 = 10.1,
because f1 is a float4 variable whereas 10.1 is implicitly a float8
constant. 6.5 also treated 10.1 as float8, but managed to find equality
anyway.

I think this change in behavior is my fault :-(. About a year ago
I cleaned up some ugly coding in float.c and (without thinking about
it very carefully) changed float48eq and related operators so that
float4-vs-float8 comparisons are done in float8 arithmetic not float4.
The old code truncated the double input down to float and did a float
equality check, while the new code promotes the float input to double
and does a double-precision comparison.

This behavior is arguably more correct than the old way from a purely
mathematical point of view, but now that I think about it, it's not
clear that it's more useful than the old way. In particular, in an
example like the above, it's now impossible for any float4 value to be
considered exactly equal to the float8 constant 10.1, because the float4
value just hasn't got the right low-order bits after widening.

Perhaps the old way of considering equality only to float accuracy
is more useful, even though it opens us up to problems like overflow
errors in "float4var = 1e100". Comments anyone?

A general comment on your table design though: anyone who expects exact
equality tests on fractional float values to succeed is going to get
burnt sooner or later. If you must use this style of coding then
I recommend using numeric fields not float fields, and certainly not
float4 fields.

regards, tom lane

Search Discussions

  • Philip Warner at Aug 7, 2000 at 3:57 pm

    At 11:35 7/08/00 -0400, Tom Lane wrote:
    Perhaps the old way of considering equality only to float accuracy
    is more useful, even though it opens us up to problems like overflow
    errors in "float4var = 1e100". Comments anyone?
    The following frightened me a little:

    pjw=# select float4(10.1);
    float4
    --------
    10.1
    (1 row)

    pjw=# select float8(float4(10.1));
    float8
    ------------------
    10.1000003814697
    (1 row)


    I would have expected the latter to be at worst 10.10000000000000 +/-
    .00000000000001.

    Am I missing something?


    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.C.N. 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Thomas Lockhart at Aug 7, 2000 at 4:09 pm

    I would have expected the latter to be at worst 10.10000000000000 +/-
    .00000000000001.
    Am I missing something?
    Well, yes :)

    10.1 can't be represented exactly, so the float8 representation has bits
    set way down at the low end of the mantissa. When converting to float4
    those low bits get rounded up or down into the lowest bit of the float4
    representation. At that point, you have lost knowledge that this ever
    was supposed to be *exactly* 10.1. And when converting back to float8,
    that float4 low bit becomes a middle-range bit in the float8
    representation, with all the bits underneath that zeroed.

    Back in the old days, before printf() implementations settled down, you
    would be reminded of this any time you did anything, since just
    assigning 10.1 and then printing it out would give you some goofy
    10.099999999998 or 10.10000000001 (don't count the number of digits here
    too closely, they are only qualitatively correct).

    - Thomas
  • Philip Warner at Aug 7, 2000 at 4:28 pm

    At 16:12 7/08/00 +0000, Thomas Lockhart wrote:
    I would have expected the latter to be at worst 10.10000000000000 +/-
    .00000000000001.
    Am I missing something?
    10.1 can't be represented exactly, so the float8 representation has bits
    set way down at the low end of the mantissa. When converting to float4
    those low bits get rounded up or down into the lowest bit of the float4
    representation. At that point, you have lost knowledge that this ever
    was supposed to be *exactly* 10.1. And when converting back to float8,
    that float4 low bit becomes a middle-range bit in the float8
    representation, with all the bits underneath that zeroed.
    Now I understand, but it doesn't quite make sense given what was displayed.
    The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
    that there was a level of either accuracy or display rouding happening.
    When this value is converted to float8, I hoped that the result would be
    the same as:

    Cast( Cast(f4val as varchar(32)) as float8)

    Maybe this hope is naieve, but it it a lot more useful than the current
    situation. But now that I understand what is happening, I see that (short
    of varchar conversions!), it is probably quite hard to do since we can't
    tell the 'correct' value.


    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.C.N. 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Tom Lane at Aug 7, 2000 at 4:54 pm

    Philip Warner writes:
    Now I understand, but it doesn't quite make sense given what was displayed.
    The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
    that there was a level of either accuracy or display rouding happening.
    In float4-to-ASCII, yes. Modern printf implementations have some
    heuristics about the actual accuracy of float4 and float8 and where they
    ought to round off the printed result accordingly. But float4 to float8
    is normally done just by appending zeroes to the mantissa.

    I suppose we could implement the conversion as "float8in(float4out(x))"
    instead of "(double) x" but it'd be several orders of magnitude slower,
    as well as being *less* useful to those who know what they're doing with
    float math (since the result would actually be a less-precise conversion
    except in cases where the intended value has a short decimal
    representation).

    After thinking about it some more, I'm of the opinion that the current
    behavior of float48eq and friends is the right thing, and that people
    who expect 10.1 to be an exact value should be told to use type NUMERIC.
    We should not kluge up the behavior of the float operations to try to
    make it look like inexact values are exact. That will just cause
    failures in other situations.

    regards, tom lane
  • Philip Warner at Aug 8, 2000 at 3:38 am

    At 12:53 7/08/00 -0400, Tom Lane wrote:
    Philip Warner <[email protected]> writes:
    Now I understand, but it doesn't quite make sense given what was displayed.
    The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
    that there was a level of either accuracy or display rouding happening.
    I suppose we could implement the conversion as "float8in(float4out(x))"
    instead of "(double) x" but it'd be several orders of magnitude slower,
    as well as being *less* useful to those who know what they're doing with
    float math (since the result would actually be a less-precise conversion
    except in cases where the intended value has a short decimal
    representation).
    Would I be right in saying that "those who know what they're doing with
    float math" would totally avoid mixing float4 & float8? If so, then ISTM
    that any changes to float4/8 conversions should not affect them.

    There seem to be a few choices:

    - go with zero-extending the bits. This is easy, and what would be expected
    for normal float ops, at least by people who understand float implementations.

    - do an intermediate text or numeric conversion. This will produce more
    expected results, but at the expense of speed. If people complain about
    speed, then they can change all float types to matching precision, or use
    numeric data types.

    - take the code from 'printf' or 'numeric' and do the appropriate
    bit-munging to get the value to use in conversion. No idea if this would
    work, but it is probably better than doing a text conversion since we won't
    be at the mercy of the occasional C library that produces 10.1000001.

    Would it be worth having some kind of DB setting for how it handles
    float4/8 conversion? Or is it just too much work, when just using all
    float8 or numeric is an acceptable workaround?

    Do you know how fast 'numeric' is?

    That will just cause
    failures in other situations.
    If there are genuine failures that would be introduced, then clearly it's a
    bad idea. But, since it will only affect people who compare float4/8, it
    doesn't seem too likely to produce worse failures than the change you have
    already made. I ask this mainly out of curiosity - I assume there are more
    aspects to this issue that I have missed...


    Bye for now,

    Philip.




    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.C.N. 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Don Baccus at Aug 8, 2000 at 12:57 pm

    At 01:37 PM 8/8/00 +1000, Philip Warner wrote:
    - do an intermediate text or numeric conversion. This will produce more
    expected results
    By who? I'm serious. I sure wouldn't. I can't think of any language
    implementation that does this. The standard approach has the advantage
    of maintaining a defined significance. The approach you suggest doesn't,
    you're actually losing significance. It gives the illusion of increasing
    for the particular example you've chosen, but it is nothing but illusion.
    Would it be worth having some kind of DB setting for how it handles
    float4/8 conversion?
    Use type numeric when you need precise decimal results. Your suggested
    kludge won't give you what you want.
    Do you know how fast 'numeric' is?
    Not as fast as float by any means, but there's a reason why they exist
    in all languages which include the financial sphere in their presumed
    application space.

    The simplest thing is to realize that using float4 leaves you with
    just over 7 significant digits, and to only print out 7 digits.
    Then you'll get the answer you expect (10.100000).



    - Don Baccus, Portland OR <[email protected]>
    Nature photos, on-line guides, Pacific Northwest
    Rare Bird Alert Service and other goodies at
    http://donb.photo.net.
  • Philip Warner at Aug 8, 2000 at 1:53 pm

    At 05:50 8/08/00 -0700, Don Baccus wrote:
    The simplest thing is to realize that using float4 leaves you with
    just over 7 significant digits, and to only print out 7 digits.
    Then you'll get the answer you expect (10.100000).
    You may have missed the point; my suggestions are only aimed at changing
    the results of float4/float8 conversions & comparisons.

    My (very vague) recollections of this stuff is that the machine
    representation is only guaranteed to be within a certain machine/language
    accuracy, so the stored value is within +/-(machine error) of the 'real
    value'. Further, my recollection is that one or more bits are usually used
    to provide rounding information so that, eg., the 7 digit representations
    are consistent.

    Given this, I have assumed that printf etc use these least significant bits
    to determine the 'correct' representation. The idea is to do exactly the
    same in converting float4 to float8, so that:

    '4.1'::float4 = '4.1'::float8

    will be true.

    Maybe my recollection is false...




    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.B.N. 75 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Tom Lane at Aug 8, 2000 at 2:06 pm

    Philip Warner writes:
    Given this, I have assumed that printf etc use these least significant bits
    to determine the 'correct' representation.
    No. What float4-to-text really does is *discard* information, by
    rounding off the printed result to only 7 digits (when there are
    actually 7-and-change in there). This means values that are actually
    distinct float4 values may get printed as the same thing:

    regression=# select 1.234567 :: float4;
    ?column?
    ----------
    1.23457
    (1 row)

    regression=# select 1.234568 :: float4;
    ?column?
    ----------
    1.23457
    (1 row)

    regression=# select 1.234567 :: float4 = 1.234568 :: float4;
    ?column?
    ----------
    f
    (1 row)

    regression=# select 1.234567 :: float4 - 1.234568 :: float4;
    ?column?
    --------------
    -9.53674e-07
    (1 row)

    I don't much care for this behavior (since it means dump and reload of
    float columns is lossy), and I certainly won't hold still for
    introducing it into other operations on floats.

    regards, tom lane
  • Don Baccus at Aug 8, 2000 at 2:25 pm

    At 10:04 AM 8/8/00 -0400, Tom Lane wrote:
    Philip Warner <[email protected]> writes:
    Given this, I have assumed that printf etc use these least significant bits
    to determine the 'correct' representation.
    No. What float4-to-text really does is *discard* information, by
    rounding off the printed result to only 7 digits (when there are
    actually 7-and-change in there).
    Which is the standard approach for such conversion routines. If you
    keep generating digits they're just garbage anyway.

    As far as the rest, Phil, it is true that well-designed floating-point
    hardware such as that which follows the IEEE spec (well, at least
    better-designed compared to most of its predecessors) strictly specifies
    how extra rounding information is to be used during various mathematical
    operations (add, multiply, etc). This is done so that error due to rounding
    can be strictly bounded. The result (for float4) is a 24-bit mantissa
    with strictly defined significance.

    However, the float4 itself once stored only consists of that 24-bit
    mantissa. There's no way to know the history of how that 24th bit
    was generated, i.e. whether all the bits to the right were exactly
    zero or whether it was the result of rounding (or truncation if the
    user specified it and the hardware supports it).

    Kludging conversion by using decimal conversion will simply lose
    significance. In your 10.1 case you'll be happy because that 24th
    bit becomes zero.

    All you've accomplished, though, is to throw away (at least) one bit.
    Your float8 now has no more than 23 bits of significance rather than
    24. Repeat this process a few times and you could store the result
    in a boolean, in terms of the bits you could guarantee to be
    significant ...
    I don't much care for this behavior (since it means dump and reload of
    float columns is lossy),
    A good reason for binary backup programs!
    and I certainly won't hold still for
    introducing it into other operations on floats.
    No, it flies in the face of not only convention, but a lot of investigation
    into how to implement floating point arithmetic in a way that's useful
    to those who have to depend on the results having mathematically definable
    error bounds.



    - Don Baccus, Portland OR <[email protected]>
    Nature photos, on-line guides, Pacific Northwest
    Rare Bird Alert Service and other goodies at
    http://donb.photo.net.
  • Philip Warner at Aug 8, 2000 at 3:40 pm

    At 10:04 8/08/00 -0400, Tom Lane wrote:
    No. What float4-to-text really does is *discard* information, by
    rounding off the printed result to only 7 digits (when there are
    actually 7-and-change in there). This means values that are actually
    distinct float4 values may get printed as the same thing:
    Thanks guys for some remarkably patient explanations. I now know more than
    I want to know about float values.

    I don't much care for this behavior (since it means dump and reload of
    float columns is lossy), and I certainly won't hold still for
    introducing it into other operations on floats.
    This makes me think that some kind of binary dump in pg_dump is probably
    not a bad idea. Has anybody looked at doing a cross-platform binary COPY?
    Or some other way of representing base types - we have <type>in/out maybe
    <type>exp/imp (export/import) might be useful to get a portable, lossless
    representation.


    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.B.N. 75 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Don Baccus at Aug 8, 2000 at 5:36 pm

    At 01:40 AM 8/9/00 +1000, Philip Warner wrote:
    This makes me think that some kind of binary dump in pg_dump is probably
    not a bad idea. Has anybody looked at doing a cross-platform binary COPY?
    Or some other way of representing base types - we have <type>in/out maybe
    <type>exp/imp (export/import) might be useful to get a portable, lossless
    representation.
    Another way to do it is to dump/restore floats in hex, maintaining the
    actual binary values.

    Conversion to hex, unlike conversion to decimal, is exact (16 is a power
    of 2 while 10 is not, to add to your "more knowledge than you want" about
    floats!)



    - Don Baccus, Portland OR <[email protected]>
    Nature photos, on-line guides, Pacific Northwest
    Rare Bird Alert Service and other goodies at
    http://donb.photo.net.
  • Tom Lane at Aug 7, 2000 at 4:12 pm

    Philip Warner writes:
    pjw=# select float8(float4(10.1));
    float8
    ------------------
    10.1000003814697
    (1 row)
    I would have expected the latter to be at worst 10.10000000000000 +/-
    .00000000000001.
    float4 is good to about 7 decimal digits (24 mantissa bits) on
    IEEE-standard machines. Thus the above result is actually closer
    than you have any right to expect.

    Don't they teach people about float arithmetic in CS 101 anymore?

    regards, tom lane
  • Philip Warner at Aug 7, 2000 at 4:15 pm

    At 12:11 7/08/00 -0400, Tom Lane wrote:
    Philip Warner <[email protected]> writes:
    pjw=# select float8(float4(10.1));
    float8
    ------------------
    10.1000003814697
    (1 row)
    I would have expected the latter to be at worst 10.10000000000000 +/-
    .00000000000001.
    float4 is good to about 7 decimal digits (24 mantissa bits) on
    IEEE-standard machines. Thus the above result is actually closer
    than you have any right to expect.

    Don't they teach people about float arithmetic in CS 101 anymore?
    No idea. It's a couple of decades since I did it.

    I wasn't complaining about the float4 accuracy; I was complaining about the
    way it was converted to float8. It seems more intuitive to zero-extend base
    10 zeros...




    ----------------------------------------------------------------
    Philip Warner | __---_____
    Albatross Consulting Pty. Ltd. |----/ - \
    (A.C.N. 008 659 498) | /(@) ______---_
    Tel: (+61) 0500 83 82 81 | _________ \
    Fax: (+61) 0500 83 82 82 | ___________ |
    Http://www.rhyme.com.au | / \|
    --________--
    PGP key available upon request, | /
    and from pgp5.ai.mit.edu:11371 |/
  • Don Baccus at Aug 7, 2000 at 6:35 pm

    At 02:15 AM 8/8/00 +1000, Philip Warner wrote:
    I wasn't complaining about the float4 accuracy; I was complaining about the
    way it was converted to float8. It seems more intuitive to zero-extend base
    10 zeros...
    You're right, it absolutely is more intuitive - in a base-10 representation,
    i.e. NUMERIC. A float4 is a binary number, and the only zeros available
    are binary ones. The same is true of the non-zeros, for that matter. Not
    that it matters, adding zero to the right of any number doesn't add
    significance,
    something they pound into people's heads in the physical sciences.

    Doing a type conversion from float4 to float8 is in general not a safe thing
    to do, because you only can depend on 24 bits of mantissa significance
    afterwards anyway. One such conversion will propagate that lesser
    significance
    all throughout the expressions using it. Take great care when you do this.

    As Tom pointed out you're getting 8 digits of decimal significance in
    your example (10.100000) due to the particular number involved. You can
    only expect 24/log2(10) digits, which as he points out is just 7 digits plus
    change.

    The basic problem is that we evolved with 10 fingers, rather than 8 or 16 :)



    - Don Baccus, Portland OR <[email protected]>
    Nature photos, on-line guides, Pacific Northwest
    Rare Bird Alert Service and other goodies at
    http://donb.photo.net.
  • Thomas Lockhart at Aug 7, 2000 at 4:01 pm

    Perhaps the old way of considering equality only to float accuracy
    is more useful, even though it opens us up to problems like overflow
    errors in "float4var = 1e100". Comments anyone?
    I would not have anticipated this either. I agree that downconverting to
    float4 is the right solution.

    Possible overflow errors can be checked in advance using the macros or
    routines already there. This may be an example of why those could be A
    Good Thing in some instances.

    - Thomas
  • Tom Lane at Aug 7, 2000 at 6:59 pm

    <[email protected]> writes:
    Is there a reason we can't perform the conversion and then copy the
    low-order bits manually, with some bit-shifting and masking?
    *What* low-order bits? The fundamental problem is we don't have 'em.

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedAug 7, '00 at 3:36p
activeAug 8, '00 at 5:36p
posts17
users4
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2023 Grokbase