Shouldn't this work?

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be used
in an aggregate function

If I am asking for a specific column value, should I, technically
speaking, need to group by that column?

Search Discussions

  • Stephen Frost at Oct 17, 2006 at 4:34 pm

    * Mark Woodward (pgsql@mohawksoft.com) wrote:
    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Technically speaking, if you're asking for a specific tuple, should you
    be allowed to request an aggregation?

    Thanks,

    Stephen
  • D'Arcy J.M. Cain at Oct 17, 2006 at 4:46 pm

    On Tue, 17 Oct 2006 12:08:07 -0400 Stephen Frost wrote:
    * Mark Woodward (pgsql@mohawksoft.com) wrote:
    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Technically speaking, if you're asking for a specific tuple, should you
    be allowed to request an aggregation?
    One column value doesn't necessarily mean one tuple unless it has a
    unique index on that column.

    SELECT COUNT(*) FROM table WHERE field = 'value';

    That's perfectly reasonable. You don't need the GROUP BY clause.

    However, this doesn't sound like a hackers question. Next time, please
    ask on another list such as pgsql-sql or even pgsql-novice. You can
    review the mailing lists and their purpose at
    http://www.postgresql.org/community/lists/

    --
    D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
  • Shane Ambler at Oct 17, 2006 at 5:11 pm

    Stephen Frost wrote:
    * Mark Woodward (pgsql@mohawksoft.com) wrote:
    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Technically speaking, if you're asking for a specific tuple, should you
    be allowed to request an aggregation?
    Only with the assumption that the value in the where clause is for a
    unique column.

    If you want min(col2) and avg(col2) where col1=x you can get it without
    a group by, the same as if you put col1<x - if you want an aggregate of
    all records returned not the aggregate based on each value of col1.

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.


    --

    Shane Ambler
    Postgres@007Marketing.com

    Get Sheeky @ http://Sheeky.Biz
  • Mark L. Woodward at Oct 17, 2006 at 5:38 pm

    Stephen Frost wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query. This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
  • Nolan Cafferky at Oct 17, 2006 at 5:55 pm

    Mark Woodward wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    I still assert that there will always only be one row to this query. This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    I think your point is that every non-aggregate column in the results of
    the query also appears in the where clause and is given a single value
    there, so conceivably, an all-knowing, all-powerful postgres could
    recognize this and do the implied GROUP by on these columns.

    I'm not in a position to give a definitive answer on this, but I suspect
    that adjusting the query parser/planner to allow an implied GROUP BY
    either gets prohibitively complicated, or fits too much of a special
    case to be worth implementing.

    select
    ycis_id,
    some_other_id,
    min(tindex),
    avg(tindex)
    from
    y
    where
    ycis_id = 15
    group by
    some_other_id;


    Here, postgres would have to use the group by you specified, and also
    recognize the single-valued constant assigned to ycis_id. Maybe not too
    bad, but:

    select
    ycis_id,
    some_other_id,
    min(tindex),
    avg(tindex)
    from
    y
    where
    ycis_id = some_single_valued_constant(foo, bar)
    group by
    some_other_id;

    In this case, postgres doesn't know whether
    some_single_valued_constant() will really return the same single value
    for every tuple. Ultimately, as more complex queries are introduced, it
    would become a lot simpler for the query writer to just specify the
    group by columns instead of trying to guess it from the where clause.

    Final note: I could also see situations where an implied group by would
    silently allow a poorly written query to execute, instead of throwing an
    error that suggests to the query writer that they did something wrong.

    --
    Nolan Cafferky
    Software Developer
    IT Department
    RBS Interactive
    nolan.cafferky@rbsinteractive.com
  • Mark L. Woodward at Oct 17, 2006 at 6:40 pm

    Mark Woodward wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    I still assert that there will always only be one row to this query.
    This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it
    should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    I think your point is that every non-aggregate column in the results of
    the query also appears in the where clause and is given a single value
    there, so conceivably, an all-knowing, all-powerful postgres could
    recognize this and do the implied GROUP by on these columns.
    Not exactly.
    I'm not in a position to give a definitive answer on this, but I suspect
    that adjusting the query parser/planner to allow an implied GROUP BY
    either gets prohibitively complicated, or fits too much of a special
    case to be worth implementing.

    select
    ycis_id,
    some_other_id,
    min(tindex),
    avg(tindex)
    from
    y
    where
    ycis_id = 15
    group by
    some_other_id;
    This is not, in fact, like the example I gave and confuses the point I am
    trying to make.


    The original query:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ycis_id is unambiguous and MUST be only one value, there should be no
    requirement of grouping. In fact, a "group by" implies multiple result
    rows in an aggregate query.

    As I said in other branches of this thread, this isn't a SQL question, it
    is a question of whether or not the PostgreSQL parser is correct or not,
    and I do not believe that it is working correctly.
  • Andrew Dunstan at Oct 17, 2006 at 5:56 pm

    Mark Woodward wrote:
    Stephen Frost wrote:

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query. This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    AFAIK what you want is not per sql spec. What if you had instead written


    select ycis_id, min(tindex), avg(tindex) from y where frobnitz(ycis_id) = 15;


    ? I think you are expecting too much reasoning from the engine.

    cheers

    andrew
  • Mark L. Woodward at Oct 17, 2006 at 6:35 pm

    Mark Woodward wrote:
    Stephen Frost wrote:

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you
    will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query.
    This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it
    should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    AFAIK what you want is not per sql spec. What if you had instead written


    select ycis_id, min(tindex), avg(tindex) from y where frobnitz(ycis_id)
    = 15;


    ? I think you are expecting too much reasoning from the engine.
    Regardless, I can get the results I need and have already worked around
    this. The reason why I posted the question to hackers was that I think it
    is a bug.

    The output column "ycis_id" is unabiguously a single value with regards to
    the query. Shouldn't PostgreSQL "know" this? AFAIR, I think I've used this
    exact type of query before either on PostgreSQL or another system, maybe
    Oracle, and it did work.
  • Joe Sunday at Oct 17, 2006 at 8:54 pm

    On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:

    The output column "ycis_id" is unabiguously a single value with regards to
    the query. Shouldn't PostgreSQL "know" this? AFAIR, I think I've used this
    exact type of query before either on PostgreSQL or another system, maybe
    Oracle, and it did work.
    Doesn't work in Oracle 10g:

    SELECT ycis_id, tindex from x where ycis_id = 15;
    YCIS_ID TINDEX
    ======= ======
    15 10
    15 20

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
    ORA-00937: not a single-group group function

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY ycis_id;
    YCIS_ID MIN(TINDEX) AVG(TINDEX)
    ======= =========== ===========
    15 10 15

    --Joe

    --
    Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/
    Computer Science House, Rochester Inst. Of Technology
  • Mark L. Woodward at Oct 17, 2006 at 8:37 pm

    On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:
    The output column "ycis_id" is unabiguously a single value with regards
    to
    the query. Shouldn't PostgreSQL "know" this? AFAIR, I think I've used
    this
    exact type of query before either on PostgreSQL or another system, maybe
    Oracle, and it did work.
    Doesn't work in Oracle 10g:

    SELECT ycis_id, tindex from x where ycis_id = 15;
    YCIS_ID TINDEX
    ======= ======
    15 10
    15 20

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
    ORA-00937: not a single-group group function

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP
    BY ycis_id;
    YCIS_ID MIN(TINDEX) AVG(TINDEX)
    ======= =========== ===========
    15 10 15

    That's interesting. I am digging through the SQL99 spec, and am trying to
    find a definitive answer.
  • Shane Ambler at Oct 17, 2006 at 9:31 pm

    Joe Sunday wrote:
    On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:

    The output column "ycis_id" is unabiguously a single value with regards to
    the query. Shouldn't PostgreSQL "know" this? AFAIR, I think I've used this
    exact type of query before either on PostgreSQL or another system, maybe
    Oracle, and it did work.
    Doesn't work in Oracle 10g:

    SELECT ycis_id, tindex from x where ycis_id = 15;
    YCIS_ID TINDEX
    ======= ======
    15 10
    15 20

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
    ORA-00937: not a single-group group function

    SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY ycis_id;
    YCIS_ID MIN(TINDEX) AVG(TINDEX)
    ======= =========== ===========
    15 10 15

    --Joe
    MySQL reports -
    Mixing of GROUP columns (MIN(),MAX(),COUNT()...) with no GROUP columns
    is illegal if there is no GROUP BY clause


    I found one that actually returns the desired result - SQLite3.

    sqlite> select * from test;
    15|20
    15|10
    sqlite> select ycis_id,min(tindex),avg(tindex) from test where ycis_id=15;
    15|10|15
    sqlite>


    --

    Shane Ambler
    Postgres@007Marketing.com

    Get Sheeky @ http://Sheeky.Biz
  • Tom Lane at Oct 17, 2006 at 9:22 pm

    Andrew Dunstan writes:
    Mark Woodward wrote:
    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    AFAIK what you want is not per sql spec.
    It would in fact be a violation of spec. Consider the case where there
    are no rows matching 15. In this case

    select min(tindex), avg(tindex) from y where ycis_id = 15;

    will yield one row containing NULLs, whereas

    select min(tindex), avg(tindex) from y where ycis_id = 15 group by ycis_id;

    will yield no rows (because there are no groups). Therefore, if
    Postgres were to implicitly insert a GROUP BY to make it legal to
    reference ycis_id directly, we'd be changing the query behavior
    and breaking spec.

    regards, tom lane
  • Przemek at Oct 18, 2006 at 11:06 am

    Dnia 17-10-2006 o godz. 23:21 Tom Lane napisał(a):
    Andrew Dunstan <andrew@dunslane.net> writes:
    Mark Woodward wrote:
    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    AFAIK what you want is not per sql spec.
    It would in fact be a violation of spec. Consider the case where there
    are no rows matching 15. In this case

    select min(tindex), avg(tindex) from y where ycis_id = 15;

    will yield one row containing NULLs, whereas

    select min(tindex), avg(tindex) from y where ycis_id = 15 group by
    ycis_id;

    will yield no rows (because there are no groups). Therefore, if
    Postgres were to implicitly insert a GROUP BY to make it legal to
    reference ycis_id directly, we'd be changing the query behavior
    and breaking spec.

    regards, tom lane

    Hello

    IMHO:

    Also take into consider that - what should be expected behavior of this:

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = truncate(random()*100);

    Since result of comparission is not known on parsing phase it couldn't be done here if there is no other requirements on argum
  • Markus Schaber at Oct 17, 2006 at 6:00 pm
    Hi, Mark,

    Mark Woodward wrote:
    Stephen Frost wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query. This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    I think that it's a lack of special-casing the = operator. Imagine
    "where ycis_id>15" or "where ycis_id @|< $RECTANGLE" or other (probably
    user defined) operators on (probably user defined) datatypes.

    The parser has no real knowledge what the operators do, it simply
    requests one that returns a bool.

    One could make the parser to special case the = operator, and maybe some
    others, however I doubt it's worth the effort.


    HTH,
    Markus

    --
    Markus Schaber | Logical Tracking&Tracing International AG
    Dipl. Inf. | Software Development GIS

    Fight against software patents in Europe! www.ffii.org
    www.nosoftwarepatents.org
  • Mark L. Woodward at Oct 17, 2006 at 6:29 pm

    Hi, Mark,

    Mark Woodward wrote:
    Stephen Frost wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you
    will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query.
    This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it
    should
    not need to be grouped.

    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    I think that it's a lack of special-casing the = operator. Imagine
    "where ycis_id>15" or "where ycis_id @|< $RECTANGLE" or other (probably
    user defined) operators on (probably user defined) datatypes.

    The parser has no real knowledge what the operators do, it simply
    requests one that returns a bool.

    One could make the parser to special case the = operator, and maybe some
    others, however I doubt it's worth the effort.
    I understand the SQL, and this isn't a "sql" question else it would be on
    a different list, it is a PostgreSQL internals question and IMHO potential
    bug.

    The original query:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    Should NOT require a "group by" to get ycis_id in the results.
  • Markus Schaber at Oct 20, 2006 at 11:58 am
    Hi, Mark,

    Mark Woodward wrote:
    My question, is it a syntactic technicality that PostgreSQL asks for a
    "group by," or a bug in the parser?
    I think that it's a lack of special-casing the = operator. Imagine
    "where ycis_id>15" or "where ycis_id @|< $RECTANGLE" or other (probably
    user defined) operators on (probably user defined) datatypes.

    The parser has no real knowledge what the operators do, it simply
    requests one that returns a bool.

    One could make the parser to special case the = operator, and maybe some
    others, however I doubt it's worth the effort.
    I understand the SQL, and this isn't a "sql" question else it would be on
    a different list, it is a PostgreSQL internals question and IMHO potential
    bug.
    And that's why I talked about PostgreSQL internals.
    The original query:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    Should NOT require a "group by" to get ycis_id in the results.
    And, as I wrote, this is only possible when the query parser
    special-cases the = operator (compared to all other operators).

    HTH,
    Markus
    --
    Markus Schaber | Logical Tracking&Tracing International AG
    Dipl. Inf. | Software Development GIS

    Fight against software patents in Europe! www.ffii.org
    www.nosoftwarepatents.org
  • Shane Ambler at Oct 17, 2006 at 6:58 pm

    Mark Woodward wrote:
    Stephen Frost wrote:
    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
    But back to the query the issue comes in that the ycis_id value is
    included with the return values requested (a single row value with
    aggregate values that isn't grouped) - if ycis_id is not unique you will
    get x number of returned tuples with ycis_id=15 and the same min() and
    avg() values for each row.
    Removing the ycis_id after the select will return the aggregate values
    you want without the group by.
    I still assert that there will always only be one row to this query. This
    is an aggregate query, so all the rows with ycis_id = 15, will be
    aggregated. Since ycis_id is the identifying part of the query, it should
    not need to be grouped.
    SELECT ycis_id FROM table WHERE ycis_id=15; returns a single tuple when
    ycis_id is unique otherwise multiple tuples
    which means that SELECT ycis_id is technically defined as returning a
    multiple row tuple even if ycis_id is unique - the data in the tuple
    returned is data directly from one table row

    SELECT max(col2) FROM table WHERE ycis_id=15; returns an aggregate tuple

    SELECT ycis_id FROM table WHERE ycis_id=15 GROUP BY ycis_id; returns an
    aggregate tuple (aggregated with the GROUP BY clause making the ycis_id
    after the SELECT an aggregate as well)

    You can't have both a single tuple and an aggregate tuple returned in
    the one statement. If you want the column value of ycis_id in the
    results you need the group by to unify all returned results as being
    aggregates.


    --

    Shane Ambler
    Postgres@007Marketing.com

    Get Sheeky @ http://Sheeky.Biz
  • Markus Schaber at Oct 17, 2006 at 5:53 pm
    Hi, Mark,

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be used
    in an aggregate function

    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Try:

    SELECT 15 as ycis_id, min(tindex), avt(tindex) from y where ycis_id = 15;

    HTH,
    Markus
    --
    Markus Schaber | Logical Tracking&Tracing International AG
    Dipl. Inf. | Software Development GIS

    Fight against software patents in Europe! www.ffii.org
    www.nosoftwarepatents.org
  • Mark L. Woodward at Oct 17, 2006 at 6:31 pm

    Hi, Mark,

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be used
    in an aggregate function

    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Try:

    SELECT 15 as ycis_id, min(tindex), avt(tindex) from y where ycis_id = 15;
    This isn't a "SQL" question!!! This is a question of whether or not
    PostgreSQL is correct in requiring a "group by" in the query. I assert
    that since it is unabiguous as to what "ycis_id" should be, PostgreSQL
    should not require a grouping.
  • Andrew Dunstan at Oct 17, 2006 at 6:48 pm

    Mark Woodward wrote:
    Hi, Mark,

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be used
    in an aggregate function

    If I am asking for a specific column value, should I, technically
    speaking, need to group by that column?
    Try:

    SELECT 15 as ycis_id, min(tindex), avt(tindex) from y where ycis_id = 15;

    This isn't a "SQL" question!!! This is a question of whether or not
    PostgreSQL is correct in requiring a "group by" in the query. I assert
    that since it is unabiguous as to what "ycis_id" should be, PostgreSQL
    should not require a grouping.
    Of course it's an SQL question. How can you ask about the correctness of
    a piece of text which purports to be SQL and then say it isn't an SQL
    question?

    If you can point to a place in the spec or our docs that sanctions the
    usage you expect, then please do so, Until then I (and I suspect
    everyone else) will persist in saying it's not a bug.

    cheers

    andrew
  • Peter Eisentraut at Oct 17, 2006 at 7:19 pm

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be
    used in an aggregate function
    This would require a great deal of special-casing, in particular
    knowledge of the = operator, and then the restriction to a particular
    form of the WHERE clause. For overall consistency, I don't think this
    should be allowed.
  • Chris Campbell at Oct 17, 2006 at 8:21 pm

    On Oct 17, 2006, at 15:19, Peter Eisentraut wrote:

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be
    used in an aggregate function
    This would require a great deal of special-casing, in particular
    knowledge of the = operator, and then the restriction to a particular
    form of the WHERE clause. For overall consistency, I don't think this
    should be allowed.
    In this particular case, the client constructing the query *knows*
    the value of ycis_id (since the client is generating the "ycis_id =
    15" clause). It's technically just a waste of bandwidth and server
    resources to recalculate it. If you really want to replicate the
    output of the query you proposed, you could rewrite it on the client as:

    select 15 as ycis_id, min(tindex), avg(tindex) from y where
    ycis_id = 15;

    You could argue that the server should do this for you, but it seems
    ugly to do in the general case. And, like Peter points out, would
    need a lot of special-casing. I guess the parser could do it for
    expressions in the SELECT clause that exactly match expressions in
    the WHERE clause.

    Thanks!

    - Chris
  • Mark L. Woodward at Oct 17, 2006 at 8:41 pm

    On Oct 17, 2006, at 15:19, Peter Eisentraut wrote:
    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be
    used in an aggregate function
    This would require a great deal of special-casing, in particular
    knowledge of the = operator, and then the restriction to a particular
    form of the WHERE clause. For overall consistency, I don't think this
    should be allowed.
    In this particular case, the client constructing the query *knows*
    the value of ycis_id (since the client is generating the "ycis_id =
    15" clause). It's technically just a waste of bandwidth and server
    resources to recalculate it. If you really want to replicate the
    output of the query you proposed, you could rewrite it on the client as:

    select 15 as ycis_id, min(tindex), avg(tindex) from y where
    ycis_id = 15;

    You could argue that the server should do this for you, but it seems
    ugly to do in the general case. And, like Peter points out, would
    need a lot of special-casing. I guess the parser could do it for
    expressions in the SELECT clause that exactly match expressions in
    the WHERE clause.
    But, and here's the rub, which is the "correct" way to handle it? I'm
    looking through the SQL99 spec to see if I can find an answer.
  • Mark L. Woodward at Oct 17, 2006 at 8:39 pm

    Mark Woodward wrote:
    Shouldn't this work?

    select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

    ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be
    used in an aggregate function
    This would require a great deal of special-casing, in particular
    knowledge of the = operator, and then the restriction to a particular
    form of the WHERE clause. For overall consistency, I don't think this
    should be allowed.
    Well, this started out as a "huh, that's funny, that should work, is that
    a bug?" and is turning into a search through the SQL99 spec for a clear
    answer. I've already worked around it, but to me, at least, it seems it
    should work.
  • Martijn van Oosterhout at Oct 17, 2006 at 9:20 pm

    On Tue, Oct 17, 2006 at 04:45:49PM -0400, Mark Woodward wrote:
    Well, this started out as a "huh, that's funny, that should work, is that
    a bug?" and is turning into a search through the SQL99 spec for a clear
    answer. I've already worked around it, but to me, at least, it seems it
    should work.
    What you're asking for is difficult, not done by anyone else (so far
    demostrated) and not mandated by the spec, so I don't see how it could
    be construed a bug.

    As for the spec, this is what I have from SQL2003:

    7.12.15) If T is a grouped table, then let G be the set of grouping
    columns of T. In each <value expression> contained in <select list>,
    each column reference that references a column of T shall reference
    some column C that is functionally dependent on G or shall be
    contained in an aggregated argument of a <set function specification>
    whose aggregation query is QS.

    Which to me says that everything in the output is either grouped by or
    part of an aggregate. That together with a statement elsewhere saying
    that if no group by clause is present, GROUP BY () is implied seems to
    seal it for me.

    (BTW, the "functionally dependent" is new and postgresql only supports
    the older SQL standards where C has to actually be a grouping column).

    Have a nice day,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    From each according to his ability. To each according to his ability to litigate.

Related Discussions

People

Translate

site design / logo © 2022 Grokbase