FAQ
% psql test1
Welcome to the POSTGRESQL interactive sql monitor:
Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]

type \? for help on slash commands
type \q to quit
type \g or terminate with semicolon to execute query
You are currently connected to the database: test1

test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
count| max| min| avg
------+-------+-----+----
677719|3075717|61854|-251
(1 row)

Overflow, perhaps?

Gene Sokolov.

Search Discussions

  • Thomas Lockhart at Jun 16, 1999 at 1:02 pm

    [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
    test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
    count| max| min| avg
    ------+-------+-----+----
    677719|3075717|61854|-251
    (1 row)
    Overflow, perhaps?
    Of course. These are integer fields? I've been considering changing
    all accumulators (and results) for integer aggregate functions to
    float8, but have not done so yet. I was sort of waiting for a v7.0
    release, but am not sure why...

    - Thomas

    --
    Thomas Lockhart lockhart@alumni.caltech.edu
    South Pasadena, California
  • Gene Sokolov at Jun 16, 1999 at 1:28 pm

    [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
    test1=> select count(*), max("ID"), min("ID"), avg("ID") from
    "ItemsBars";
    count| max| min| avg
    ------+-------+-----+----
    677719|3075717|61854|-251
    (1 row)
    Overflow, perhaps?
    Of course. These are integer fields? I've been considering changing
    Yes, the fields are int4
    all accumulators (and results) for integer aggregate functions to
    float8, but have not done so yet. I was sort of waiting for a v7.0
    release, but am not sure why...
    Float8 accumulator seems to be a good solution if AVG is limited to
    int/float types. I wonder if it could produce system dependency in AVG due
    to rounding errors. Some broader solution should be considered though if you
    want AVG to work on numeric/decimal as well.

    Gene Sokolov.
  • Thomas Lockhart at Jun 16, 1999 at 2:01 pm

    Float8 accumulator seems to be a good solution if AVG is limited to
    int/float types. I wonder if it could produce system dependency in AVG due
    to rounding errors. Some broader solution should be considered though if you
    want AVG to work on numeric/decimal as well.
    The implementation can be specified for each datatype individually, so
    that's not a problem. afaik the way numeric/decimal work it would be
    fine to use those types as their own accumulators. It's mostly the
    int2/int4/int8 types which are the problem, since they silently
    overflow (on most machines?).

    - Thomas

    --
    Thomas Lockhart lockhart@alumni.caltech.edu
    South Pasadena, California
  • Tom Lane at Jun 16, 1999 at 2:54 pm

    Thomas Lockhart writes:
    Some broader solution should be considered though if you
    want AVG to work on numeric/decimal as well.
    The implementation can be specified for each datatype individually,
    In the current implementation, each datatype does use its own type as
    the accumulator --- and also as the counter. float8 and numeric are
    fine, float4 is sort of OK (a float8 accumulator would be better for
    accuracy reasons), int4 loses, int2 loses *bad*.

    To fix it we'd need to invent operators that do the appropriate cross-
    data-type operations. For example, int4 avg using float8 accumulator
    would need "float8 + int4 yielding float8" and "float8 / int4 yielding
    int4", neither of which are to be found in pg_proc at the moment. But
    it's a straightforward thing to do.

    int8 is the only integer type that I wouldn't want to use a float8
    accumulator for. Maybe numeric would be the appropriate thing here,
    slow though it be.

    Note that switching over to float accumulation would *not* be real
    palatable until we have fixed the memory-leak issue. avg() on int4
    doesn't leak memory currently, but it would with a float accumulator...

    regards, tom lane
  • Jan Wieck at Jun 16, 1999 at 2:08 pm
    Thomas Lockhart wrote:
    [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
    test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
    count| max| min| avg
    ------+-------+-----+----
    677719|3075717|61854|-251
    (1 row)
    Overflow, perhaps?
    Of course. These are integer fields? I've been considering changing
    all accumulators (and results) for integer aggregate functions to
    float8, but have not done so yet. I was sort of waiting for a v7.0
    release, but am not sure why...
    Wouldn't it be better to use NUMERIC for the avg(int) state
    values? It will never loose any significant digit.


    Jan

    --

    #======================================================================#
    # It's easier to get forgiveness for being wrong than for being right. #
    # Let's break this rule - forgive me. #
    #========================================= wieck@debis.com (Jan Wieck) #
  • Thomas Lockhart at Jun 16, 1999 at 2:19 pm

    Of course. These are integer fields? I've been considering changing
    all accumulators (and results) for integer aggregate functions to
    float8, but have not done so yet. I was sort of waiting for a v7.0
    release, but am not sure why...
    Wouldn't it be better to use NUMERIC for the avg(int) state
    values? It will never loose any significant digit.
    Sure. It would be fast, right? avg(int) is likely to be used a lot,
    and should be as fast as possible.

    - Thomas

    --
    Thomas Lockhart lockhart@alumni.caltech.edu
    South Pasadena, California
  • Jan Wieck at Jun 16, 1999 at 11:22 pm
    Thomas Lockhart wrote:
    Of course. These are integer fields? I've been considering changing
    all accumulators (and results) for integer aggregate functions to
    float8, but have not done so yet. I was sort of waiting for a v7.0
    release, but am not sure why...
    Wouldn't it be better to use NUMERIC for the avg(int) state
    values? It will never loose any significant digit.
    Sure. It would be fast, right? avg(int) is likely to be used a lot,
    and should be as fast as possible.
    I think it would be fast enough, even if I have things in
    mind how to speed it up. But that would result in a total
    rewrite of NUMERIC from scratch.

    The only math function of NUMERIC which is time critical for
    AVG() is ADD. And even for int8 the number of digits it has
    to perform is relatively small. I expect the time spent on
    that is negligible compared to the heap scanning required to
    get all the values.


    Jan

    --

    #======================================================================#
    # It's easier to get forgiveness for being wrong than for being right. #
    # Let's break this rule - forgive me. #
    #========================================= wieck@debis.com (Jan Wieck) #
  • Tom Lane at Jun 16, 1999 at 2:39 pm

    "Gene Sokolov" <hook@aktrad.ru> writes:
    test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
    count| max| min| avg
    ------+-------+-----+----
    677719|3075717|61854|-251
    Overflow, perhaps?
    sum() and avg() for int fields use int accumulators. You might want
    to use avg(float8(field)) to get a less-likely-to-overflow result.

    Someday it'd be a good idea to revise the sum() and avg() aggregates
    to use float or numeric accumulators in all cases. This'd require
    inventing a few more cross-data-type operators...

    regards, tom lane
  • Jackson, DeJuan at Jun 16, 1999 at 3:22 pm
    What does the spec have to say? It bothers me somewhat that an AVG is
    expected to return an integer result at all. Isn't the Average of 1 and 2,
    1.5 not 1?

    just my $0.02,
    -DEJ
    -----Original Message-----
    From: Tom Lane [SMTP:tgl@sss.pgh.pa.us]
    Sent: Wednesday, June 16, 1999 9:52 AM
    To: Thomas Lockhart
    Cc: Gene Sokolov; pgsql-hackers@postgreSQL.org
    Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )

    Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
    Some broader solution should be considered though if you
    want AVG to work on numeric/decimal as well.
    The implementation can be specified for each datatype individually,
    In the current implementation, each datatype does use its own type as
    the accumulator --- and also as the counter. float8 and numeric are
    fine, float4 is sort of OK (a float8 accumulator would be better for
    accuracy reasons), int4 loses, int2 loses *bad*.

    To fix it we'd need to invent operators that do the appropriate cross-
    data-type operations. For example, int4 avg using float8 accumulator
    would need "float8 + int4 yielding float8" and "float8 / int4 yielding
    int4", neither of which are to be found in pg_proc at the moment. But
    it's a straightforward thing to do.

    int8 is the only integer type that I wouldn't want to use a float8
    accumulator for. Maybe numeric would be the appropriate thing here,
    slow though it be.

    Note that switching over to float accumulation would *not* be real
    palatable until we have fixed the memory-leak issue. avg() on int4
    doesn't leak memory currently, but it would with a float accumulator...

    regards, tom lane
  • Thomas Lockhart at Jun 16, 1999 at 3:28 pm

    What does the spec have to say? It bothers me somewhat that an AVG is
    expected to return an integer result at all. Isn't the Average of 1 and 2,
    1.5 not 1?
    Yeah, well, it's a holdover from the original Postgres code. We just
    haven't made an effort to change it yet, but it seems a good candidate
    for a makeover, no?

    I'm pretty sure that the spec would suggest a float8 return value for
    avg(int), but I haven't looked recently to refresh my memory.

    - Thomas

    --
    Thomas Lockhart lockhart@alumni.caltech.edu
    South Pasadena, California
  • Tom Lane at Jun 16, 1999 at 3:32 pm

    "Jackson, DeJuan" <djackson@cpsgroup.com> writes:
    What does the spec have to say? It bothers me somewhat that an AVG is
    expected to return an integer result at all. Isn't the Average of 1
    and 2, 1.5 not 1?
    That bothered me too. The draft spec that I have sez:

    b) If SUM is specified and DT is exact numeric with scale
    S, then the data type of the result is exact numeric with
    implementation-defined precision and scale S.

    c) If AVG is specified and DT is exact numeric, then the data
    type of the result is exact numeric with implementation-
    defined precision not less than the precision of DT and
    implementation-defined scale not less than the scale of DT.

    d) If DT is approximate numeric, then the data type of the
    result is approximate numeric with implementation-defined
    precision not less than the precision of DT.

    65)Subclause 6.5, "<set function specification>": The precision of
    the value derived from application of the SUM function to a data
    type of exact numeric is implementation-defined.

    66)Subclause 6.5, "<set function specification>": The precision and
    scale of the value derived from application of the AVG function
    to a data type of exact numeric is implementation-defined.

    67)Subclause 6.5, "<set function specification>": The preci-
    sion of the value derived from application of the SUM func-
    tion or AVG function to a data type of approximate numeric is
    implementation-defined.


    This would seem to give license for the result of AVG() on an int4 field
    to be NUMERIC with a fraction part, but not FLOAT. But I suspect we
    could get away with making it be FLOAT anyway. Anyone know what other
    databases do?

    regards, tom lane
  • Thomas Lockhart at Jun 16, 1999 at 3:45 pm

    This would seem to give license for the result of AVG() on an int4 field
    to be NUMERIC with a fraction part, but not FLOAT. But I suspect we
    could get away with making it be FLOAT anyway.
    Sure, that can't be worse in practice than what we do now. But it is
    interesting that we are currently SQL92 conforming (except for that
    nasty overflow business; they probably don't mention that ;).

    For int2/int4, we could bump the accumulator to int8 (certainly faster
    than our numeric implementation?), but there are a very few platforms
    which don't support int8 and we shouldn't break the aggregates for
    them. We could get around that by defining explicit routines to be
    used in the aggregates, and then having some #ifdef alternate code if
    int8 is not available...

    Tom, do you think that a hack in the aggregate support code which
    compares the pointer returned to the pointer input, then pfree'ing the
    input area if they differ, would fix the major leakage? We could even
    have a backend global variable which enables/disables the feature to
    allow performance tuning.

    - Thomas

    --
    Thomas Lockhart lockhart@alumni.caltech.edu
    South Pasadena, California
  • Tom Lane at Jun 16, 1999 at 4:09 pm

    Thomas Lockhart writes:
    For int2/int4, we could bump the accumulator to int8 (certainly faster
    than our numeric implementation?), but there are a very few platforms
    which don't support int8 and we shouldn't break the aggregates for
    them.
    Right, that's why I preferred the idea of using float8.

    Note that any reasonable floating-point implementation will deliver an
    exact result for the sum of integer inputs, up to the point at which the
    sum exceeds the number of mantissa bits in a float (2^52 or so in IEEE
    float8). After that you start to lose accuracy. Using int8 would give
    an exact sum up to 2^63, but if we want to start delivering a fractional
    average then float still looks like a better deal...
    Tom, do you think that a hack in the aggregate support code which
    compares the pointer returned to the pointer input, then pfree'ing the
    input area if they differ, would fix the major leakage?
    Yeah, that would probably work OK, although you'd have to be careful of
    the initial condition --- is the initial value always safely pfreeable?
    We could even have a backend global variable which enables/disables
    the feature to allow performance tuning.
    Seems unnecessary.

    regards, tom lane
  • José Soares at Jun 17, 1999 at 6:49 am
    PostgreSQL:
    ^^^^^^^^^^^
    prova=> select min(a), max(a), avg(a) from aa;
    min|max|avg
    ---+---+---
    1| 2| 1
    (1 row)


    informix:----------- hygea@hygea ------------ Press CTRL-W for Help --------

    ^^^^^^^^^
    (min) (max) (avg)

    1 2 1.50000000000000

    oracle:
    ^^^^^^^
    SQL> select min(a), max(a), avg(a) from aa;

    MIN(A) MAX(A) AVG(A)
    ---------- ---------- ----------
    1 2 1.5




    Tom Lane ha scritto:
    "Jackson, DeJuan" <djackson@cpsgroup.com> writes:
    What does the spec have to say? It bothers me somewhat that an AVG is
    expected to return an integer result at all. Isn't the Average of 1
    and 2, 1.5 not 1?
    That bothered me too. The draft spec that I have sez:

    b) If SUM is specified and DT is exact numeric with scale
    S, then the data type of the result is exact numeric with
    implementation-defined precision and scale S.

    c) If AVG is specified and DT is exact numeric, then the data
    type of the result is exact numeric with implementation-
    defined precision not less than the precision of DT and
    implementation-defined scale not less than the scale of DT.

    d) If DT is approximate numeric, then the data type of the
    result is approximate numeric with implementation-defined
    precision not less than the precision of DT.

    65)Subclause 6.5, "<set function specification>": The precision of
    the value derived from application of the SUM function to a data
    type of exact numeric is implementation-defined.

    66)Subclause 6.5, "<set function specification>": The precision and
    scale of the value derived from application of the AVG function
    to a data type of exact numeric is implementation-defined.

    67)Subclause 6.5, "<set function specification>": The preci-
    sion of the value derived from application of the SUM func-
    tion or AVG function to a data type of approximate numeric is
    implementation-defined.

    This would seem to give license for the result of AVG() on an int4 field
    to be NUMERIC with a fraction part, but not FLOAT. But I suspect we
    could get away with making it be FLOAT anyway. Anyone know what other
    databases do?

    regards, tom lane
    ______________________________________________________________
    PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc 2.7.2.3
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    Jose'
  • Vince Vielhaber at Jun 17, 1999 at 10:47 am

    On Wed, 16 Jun 1999, [iso-8859-1] José Soares wrote:

    PostgreSQL:
    ^^^^^^^^^^^
    prova=> select min(a), max(a), avg(a) from aa;
    min|max|avg
    ---+---+---
    1| 2| 1
    (1 row)
    Sybase - I'm guessing/ass-u-me ing it's around version 4.9

    1> select min(a), max(a), avg(a) from aa
    2> go

    ----------- ----------- -----------
    1 2 1

    (1 row affected)
    1>
    This would seem to give license for the result of AVG() on an int4 field
    to be NUMERIC with a fraction part, but not FLOAT. But I suspect we
    could get away with making it be FLOAT anyway. Anyone know what other
    databases do?

    Vince.
    --
    ==========================================================================
    Vince Vielhaber -- KA8CSH email: vev@michvhf.com flame-mail: /dev/null
    # include <std/disclaimers.h> TEAM-OS2
    Online Campground Directory http://www.camping-usa.com
    Online Giftshop Superstore http://www.cloudninegifts.com
    ==========================================================================

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedJun 16, '99 at 6:42a
activeJun 17, '99 at 10:47a
posts16
users7
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase