Hopefully this is the right place for a few feature requests that
would address some of the things that I've noticed in postgres.

1. Just a minor annoyance, but why must subqueries in FROM clauses
have an alias? For instance suppose that I have an orders table, and
one of the fields is userid. The following is unambiguous and is
legal in Oracle:

SELECT order_count
, count(*) as people
FROM (
SELECT count(*) as order_count
FROM orders
GROUP BY userid
)
GROUP BY order_count

It annoys me that it isn't legal in postgres. (Yes, I know how to fix
the query. But it still is an annoyance, and it comes up fairly often
in reporting purposes.)

2. Why is 'non-integer constant in GROUP BY' an error?

I find it inconvenient. For reporting purposes I often have to
dynamically build queries in code. An easy way to do that is just
interpolate in a set of possible statements which will either be empty
strings or have trailing commas. But then I need this (possibly
empty) list to have a valid group by statement at the end. In Oracle
I used to just write it like this:

SELECT ...
GROUP BY $possible_term_1
$possible_term_2
$possible_term_3
'end of possible groupings'

In postgres I either have to use a different strategy to build up
these strings, or else use a more complicated term to finish that off.
This works for now:

case when true then true end

but I don't know whether some future version of postgres might break
my code by banning that as well.

3. How hard would it be to have postgres ignore aliases in group by
clauses? Per my comments above, I often build complex queries in
code. I can't easily use the shortcut of referring to the select
column by number because the position is hard to determine. So my
code has to copy the select terms. But I can't copy them exactly
because the select terms include lots of "...as foo" clauses that are
not allowed in a group by. So I have to store very similar terms to
use twice.

It would be nice if I could just make the group by look like the
select, and have the (obviously irrelevant) aliases just be ignored.

4) Items 2 and 3 would both be made irrelevant if postgres did
something that I'd really, really would like. Which is to assume that
a query without a group by clause, but with an aggregate function in
the select, should have an implicit group by clause where you group by
all non-aggregate functions in the select.

For example

SELECT foo, count(*)
FROM bar

would be processed as:

SELECT foo, count(*)
FROM bar
GROUP BY foo

If I write a query with an aggregate function in the select, better
than 95% of the time this is the group by clause that I want. (This
email has one of the few exceptions.) In the remaining cases I could
easily add the extra stuff in the group by to the select without
problems. Therefore if postgres could just insert the obvious group
by clause in, I would never again write the words "group by" when
working with postgres. And I predict that many other people would do
the same.

But it doesn't. So when working with postgres, just like every other
database that I've used, I have to constantly type in group by clauses
with entirely redundant information. (But they're not EXACTLY the
same as the select clauses that they are redundant with...)

Cheers,
Ben

Search Discussions

  • Josh Berkus at Aug 23, 2007 at 12:05 am
    Ben,

    pgsql-sql is probably the appropriate list for future queries of this
    nature.

    Note that the below is my personal opinion; each PG developer has their
    own.
    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias? For instance suppose that I have an orders table, and
    one of the fields is userid. The following is unambiguous and is
    legal in Oracle:
    I *think* the alias is a requirement of the SQL standard. Yes/No?
    2. Why is 'non-integer constant in GROUP BY' an error?
    Again, this needs to reference one of the SQL standards if you're
    interested in a change of behavior. If we're out of standards compliance,
    that's a strong argument. If we're in compliance, you have a pretty steep
    hurdle to justify new syntax.
    3. How hard would it be to have postgres ignore aliases in group by
    clauses?
    Unfortunately, I think this is also a SQL compliance issue. However, I'd
    be more liable to support your arguments for it; it's much more obviously
    useful functionality.
    4) Items 2 and 3 would both be made irrelevant if postgres did
    something that I'd really, really would like. Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you group by
    all non-aggregate functions in the select.
    In addition to SQL compliance issues, we're reluctant to do anything which
    makes implicit assumptions which could easily be wrong in PostgreSQL.
    Such shortcutting all to often leads to runaway queries or wrong data when
    the assumptions are incorrect. MySQL gives us lots of examples of what
    can happen if you do too many things for convenience and compromise
    correctness.

    --
    --Josh

    Josh Berkus
    PostgreSQL @ Sun
    San Francisco
  • Michael Glaesemann at Aug 23, 2007 at 12:17 am

    On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias?
    It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
    does?
    2. Why is 'non-integer constant in GROUP BY' an error?
    This works for now:

    case when true then true end

    but I don't know whether some future version of postgres might break
    my code by banning that as well.
    The PostgreSQL developers generally tries hard to preserve backwards
    compatibility, so I doubt the case expression as you have it would go
    away (though I'm kind of surprised it's allowed). Am I wrong in
    thinking that Oracle would accept the same format PostgreSQL does? In
    that case, couldn't you use whatever method works in PostgreSQL in
    Oracle? I haven't checked the SQL standard, but it seems unlikely
    it'd allow something like

    GROUP BY , , , ;

    AIUI, Integers are only allowed because the SQL standard explicitly
    allows you to refer to columns by the order they appear in the SELECT
    list. Otherwise the GROUP BY items need to be column names.

    Both 1 and 2 seem to me to be places where Oracle is likely deviating
    from the standard. If you're targeting Oracle, then using Oracle-
    specific syntax might be warranted. If you're hoping to target more
    than one possible backend, I'd think it be better to use more
    portable syntax (e.g., SQL-standard syntax) than expecting other
    DBMSs to follow another's deviations. That's not to say PostgreSQL
    does not have non-standard syntax: in places, it does. But it does
    try to hew very closely to the standard.

    Again, I wonder what EnterpriseDB does in this case?
    3. How hard would it be to have postgres ignore aliases in group by
    clauses? Per my comments above, I often build complex queries in
    code. I can't easily use the shortcut of referring to the select
    column by number because the position is hard to determine. So my
    code has to copy the select terms. But I can't copy them exactly
    because the select terms include lots of "...as foo" clauses that are
    not allowed in a group by. So I have to store very similar terms to
    use twice.
    Perhaps someone else knows what you're referring to here, but I'm
    having a hard time without an example. Here's what I *think* you're
    trying to say:

    test=# select * from observation;
    observation_id | record_id | score_id
    ----------------+-----------+----------
    3240 | 1 | 1
    3239 | 1 | 1
    3238 | 1 | 2
    3237 | 1 | 1
    2872 | 2 | 1
    2869 | 2 | 2
    2870 | 2 | 1
    2871 | 2 | 1
    3218 | 3 | 2
    3217 | 3 | 1
    (10 rows)

    test=# select record_id as foo, count(observation_id) from
    observation group by record_id;
    foo | count
    -----+-------
    3 | 2
    2 | 4
    1 | 4
    (3 rows)

    test=# select record_id as foo, count(observation_id) from
    observation group by foo;
    foo | count
    -----+-------
    3 | 2
    2 | 4
    1 | 4
    (3 rows)

    test=# select record_id as foo, count(observation_id) as bar from
    observation group by foo;
    foo | bar
    -----+-----
    3 | 2
    2 | 4
    1 | 4
    (3 rows)

    test=# select record_id as foo, count(observation_id) as bar from
    observation group by record_id;
    foo | bar
    -----+-----
    3 | 2
    2 | 4
    1 | 4
    (3 rows)

    test=# select version();

    version
    ------------------------------------------------------------------------
    ----------------------------------------------------------------------
    PostgreSQL 8.2.4 on powerpc-apple-darwin8.9.0, compiled by GCC
    powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc.
    build 5367)
    (1 row)

    I'm not getting an error in any permutation that I can think of. What
    am I missing?
    Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you group by
    all non-aggregate functions in the select.

    For example

    SELECT foo, count(*)
    FROM bar

    would be processed as:

    SELECT foo, count(*)
    FROM bar
    GROUP BY foo
    It's been discussed before. I don't believe it's been rejected out of
    hand (though you can check the archives), just that no one's gotten
    around to it. (Don't know what the SQL-spec says on this point.)

    I'm not trying to dismiss your points, just trying to address them.
    I'm interested to hear what others have to say.

    Michael Glaesemann
    grzm seespotcode net
  • Ben Tilly at Aug 23, 2007 at 1:36 am

    On 8/22/07, Michael Glaesemann wrote:
    On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias?
    It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
    does?
    I can well believe that the standard says that you must accept
    subqueries with aliases. But does it say you must reject subqueries
    without aliases? I strongly doubt that.

    I have no clue what EnterpriseDB does.
    2. Why is 'non-integer constant in GROUP BY' an error?
    This works for now:

    case when true then true end

    but I don't know whether some future version of postgres might break
    my code by banning that as well.
    The PostgreSQL developers generally tries hard to preserve backwards
    compatibility, so I doubt the case expression as you have it would go
    away (though I'm kind of surprised it's allowed). Am I wrong in
    thinking that Oracle would accept the same format PostgreSQL does? In
    The reason for my comparing to Oracle is that I used to work at an
    Oracle shop. I now work at a postgres shop. Portability is not my
    issue, just the annoyances that I experienced moving from one to the
    other.

    As for whether that case expression would go away, that it is allowed
    is such an obscure feature that I doubt anyone changing that code
    would notice if it was removed.
    that case, couldn't you use whatever method works in PostgreSQL in
    Oracle? I haven't checked the SQL standard, but it seems unlikely
    it'd allow something like

    GROUP BY , , , ;
    That's not what Oracle accepts that postgres does not. What Oracle accepts is:

    ...
    GROUP BY 'foo';
    AIUI, Integers are only allowed because the SQL standard explicitly
    allows you to refer to columns by the order they appear in the SELECT
    list. Otherwise the GROUP BY items need to be column names.
    Need to be?

    The SQL-92 standard is clear that you must accept a list of column
    names. It is also clear that a column name must be be of the form
    field or table.field. In no way, shape or form does that allow having
    terms like trim(foo.bar) in a group by.

    But every reasonable database that I know - including postgres - allows that.

    The standard very wisely does not forbid extensions. Every database
    has extensions. In some cases, such as allowing trim(foo.bar) in a
    group by clause, some extensions are so common as to be a standard.
    (I don't have a copy of any later standards so I don't know whether
    that has since been explicitly allowed.) Therefore the real question
    is how much farther than the standard you go.

    Postgres explicitly disallows a constant character expression. But it
    allows the constant case expression that I gave. It would be nice for
    me to not have to remember that very obscure and convoluted case.
    Both 1 and 2 seem to me to be places where Oracle is likely deviating
    from the standard. If you're targeting Oracle, then using Oracle-
    specific syntax might be warranted. If you're hoping to target more
    than one possible backend, I'd think it be better to use more
    portable syntax (e.g., SQL-standard syntax) than expecting other
    DBMSs to follow another's deviations. That's not to say PostgreSQL
    does not have non-standard syntax: in places, it does. But it does
    try to hew very closely to the standard.
    The queries that I'm writing are not hoping to target more than one
    database at one company.
    Again, I wonder what EnterpriseDB does in this case?
    No clue.
    3. How hard would it be to have postgres ignore aliases in group by
    clauses? Per my comments above, I often build complex queries in
    code. I can't easily use the shortcut of referring to the select
    column by number because the position is hard to determine. So my
    code has to copy the select terms. But I can't copy them exactly
    because the select terms include lots of "...as foo" clauses that are
    not allowed in a group by. So I have to store very similar terms to
    use twice.
    Perhaps someone else knows what you're referring to here, but I'm
    having a hard time without an example. Here's what I *think* you're
    trying to say:
    What I'm trying to say is that it would be convenient for me to be
    able to write:

    select bar as "baz"
    , count(*) as "some count"
    from foo
    group by bar as "baz"

    That's not allowed right now because as is not allowed in a group by statement.

    [...]
    Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you group by
    all non-aggregate functions in the select.

    For example

    SELECT foo, count(*)
    FROM bar

    would be processed as:

    SELECT foo, count(*)
    FROM bar
    GROUP BY foo
    It's been discussed before. I don't believe it's been rejected out of
    hand (though you can check the archives), just that no one's gotten
    around to it. (Don't know what the SQL-spec says on this point.)
    I don't know what the SQL spec says, but I know (having talked to
    other developers) that many people would find it very nice.
    I'm not trying to dismiss your points, just trying to address them.
    I'm interested to hear what others have to say.
    Cheers,
    Ben
  • Michael Glaesemann at Aug 23, 2007 at 3:51 am

    On Aug 22, 2007, at 20:36 , Ben Tilly wrote:
    On 8/22/07, Michael Glaesemann wrote:
    On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias?
    It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
    does?
    I can well believe that the standard says that you must accept
    subqueries with aliases. But does it say you must reject subqueries
    without aliases? I strongly doubt that.
    If I'm reading my draft copy of the SQL:2003 spec right (and there's
    a good chance that I'm not, as it's not the easiest document for me
    to parse), aliases *are* required.

    From 5WD-02-Foundation-2003-09 (the fifth working draft)
    7.5 <from clause>

    <from clause> ::= FROM <table reference list>
    <table reference list> ::=
    <table reference> [ { <comma> <table reference> }... ]
    7.6 <table reference>

    <table reference> ::=
    <table factor>
    <joined table>
    <table factor> ::= <table primary> [ <sample clause> ]
    <table primary> ::=
    <table or query name> [ [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ] ]
    <derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
    <lateral derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
    <collection derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
    <table function derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
    <only spec> [ [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ] ]
    <left paren> <joined table> <right paren>
    <derived table> ::= <table subquery>
    <correlation name> I believe is what we commonly refer to as an
    alias. I think <table or query name> is a table or view name, and
    <derived table> is a subquery. <correlation name> is optional for
    <table or query name> but not for <derived table>. The fact that the
    standard clearly makes it optional in some cases and required in
    others is pretty clear that <derived table> without a <correlation
    name> is to be rejected, don't you think?
    I have no clue what EnterpriseDB does.
    In case it wasn't clear, the reason I bring it up is that
    EnterpriseDB, while working from a PostgreSQL base, strives for
    Oracle compatibility.
    that case, couldn't you use whatever method works in PostgreSQL in
    Oracle? I haven't checked the SQL standard, but it seems unlikely
    it'd allow something like

    GROUP BY , , , ;
    That's not what Oracle accepts that postgres does not. What
    Oracle accepts is:

    ...
    GROUP BY 'foo';
    Thanks for clarifying. If it wasn't obvious by now, you know I'm not
    familiar with Oracle :)
    AIUI, Integers are only allowed because the SQL standard explicitly
    allows you to refer to columns by the order they appear in the SELECT
    list. Otherwise the GROUP BY items need to be column names.
    Need to be?

    The SQL-92 standard is clear that you must accept a list of column
    names. It is also clear that a column name must be be of the form
    field or table.field.
    The 2003 draft (same as above) seems to agree with the SQL92 standard:
    7.9 <group by clause>
    Format
    <group by clause> ::=
    GROUPBY [ <set quantifier> ] <grouping element list>
    <grouping element list> ::=
    <grouping element> [ { <comma> <grouping element> }... ]
    <grouping element> ::=
    <ordinary grouping set>
    <rollup list>
    <cube list>
    <grouping sets specification>
    <empty grouping set>
    <ordinary grouping set> ::=
    <grouping column reference>
    <left paren> <grouping column reference list><right paren>
    <grouping column reference> ::=
    <column reference> [ <collate clause> ]
    <grouping column reference list> ::=
    <grouping column reference> [ { <comma><grouping column
    reference> }... ]
    <rollup list> ::=
    ROLLUP<left paren> <ordinary grouping set list><right paren>
    <ordinary grouping set list> ::=
    <ordinary grouping set> [ { <comma> <ordinary grouping set> }... ]
    <cube list> ::=
    CUBE<left paren> <ordinary grouping set list><right paren>
    <grouping sets specification> ::=
    GROUPINGSETS <left paren> <grouping set list><right paren>
    <grouping set list> ::=
    <grouping set> [ { <comma> <grouping set> }... ]
    <grouping set> ::=
    <ordinary grouping set>
    <rollup list>
    <cube list>
    <grouping sets specification>
    <empty grouping set>
    <empty grouping set> ::= <left paren><right paren>
    6.7 <column reference>
    Format
    <column reference> ::=
    <basic identifier chain>
    MODULE<period> <qualified identifier><period> <column name>
    There'd have to be a pretty strong reason to extend this, more than
    just a convenience, I should think.
    In no way, shape or form does that allow having
    terms like trim(foo.bar) in a group by.

    But every reasonable database that I know - including postgres -
    allows that.
    Can you give an example of something like this working in PostgreSQL?
    I get an error when I try to use a text value in a GROUP BY clause.
    (Or are you referring specifically to the CASE expression corner case?)

    test=# select record_id as foo, count(observation_id) as bar from
    observation group by trim(observation.record_id::text);
    ERROR: column "observation.record_id" must appear in the GROUP BY
    clause or be used in an aggregate function
    test=# select record_id as foo, count(observation_id) as bar from
    observation group by trim(observation.record_id);
    ERROR: column "observation.record_id" must appear in the GROUP BY
    clause or be used in an aggregate function
    Postgres explicitly disallows a constant character expression. But it
    allows the constant case expression that I gave. It would be nice for
    me to not have to remember that very obscure and convoluted case.
    I agree, and would move that it should be disallowed if there isn't a
    reason for it to be maintained, for exactly the reason you give:
    there shouldn't be such convoluted, and obscure corner case.

    test=# select record_id as foo, count(observation_id) as bar from
    observation group by record_id, true;
    ERROR: non-integer constant in GROUP BY
    test=# select record_id as foo, count(observation_id) as bar from
    observation group by record_id, case when true then true end;
    What I'm trying to say is that it would be convenient for me to be
    able to write:

    select bar as "baz"
    , count(*) as "some count"
    from foo
    group by bar as "baz"

    That's not allowed right now because as is not allowed in a group
    by statement.
    Ah, I see. Thanks for clarifying.
    Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you
    group by
    all non-aggregate functions in the select.

    For example

    SELECT foo, count(*)
    FROM bar

    would be processed as:

    SELECT foo, count(*)
    FROM bar
    GROUP BY foo
    It's been discussed before. I don't believe it's been rejected out of
    hand (though you can check the archives), just that no one's gotten
    around to it. (Don't know what the SQL-spec says on this point.)
    I don't know what the SQL spec says, but I know (having talked to
    other developers) that many people would find it very nice.
    Since I had the spec open, I tried to look at this as well, though I
    must admit I found it very tough going.

    I think this is the key section:
    10.9 <aggregate function>

    ...

    Syntax Rules

    ...

    4) The argument source of an <aggregate function> is
    Case:
    a) If AF is immediately contained in a <set function
    specification>, then a table or group of a grouped table as
    specified in Subclause 7.10, “<having clause>”, and Subclause 7.12,
    “<query specification>”.

    b) Otherwise, the collection of rows in the current row's window
    frame defined by the window structure descriptor identified by the
    <window function> that simply contains AF, as defined in Subclause
    7.11, “<window clause>”.
    Now the <query specification> stuff is pretty dense, and the <window
    clause> is currently impenetrable for me, so I just looked at the
    <having clause> and <query specification> sections referenced above.
    7.10 <having clause>

    ...

    Syntax Rules

    1) Let HC be the <having clause>. Let TE be the <table expression>
    that immediately contains HC. If TE does not immediately contain a
    <group by clause>, then “GROUPBY ()” is implicit. Let T be the
    descriptor of the table defined by the <group by clause>GBC
    immediately contained in TE and let R be the result of GBC.
    So it looks like if there's no explicit GROUP BY, we're to assume
    GROUP BY (), at least in the case of HAVING.
    7.12 <query specification>

    ...

    Syntax Rules

    ...

    14) If <table expression> does not immediately contain a <group by
    clause> and <table expression> is simply contained in a <query
    expression> that is the aggregation query of some <set function
    specification>, then GROUP BY () is implicit.
    Again, assume GROUP BY () if no explicit GROUP BY with a <set
    function specification>, which I understand to be some form of
    aggregation.
    15) If T is a grouped table, then let G be the set of grouping
    columns of T. In each <value expression> contained in <select
    list> , each column reference that references a column of T shall
    reference some column C that is functionally dependent on G or
    shall be contained in an aggregated argument of a <set function
    specification> whose aggregation query is QS.
    And a definition of grouped table is found:
    4.14.2 Types of tables

    ...

    A grouped table is a set of groups derived during the evaluation of
    a <group by clause>. A group G is a collection of rows in which,
    for every grouping column GC, if the value of GC in some row is not
    distinct from GV, then the value of GC in every row is GV;
    moreover, if R1 is a row in group G1 of grouped table GT and R2 is
    a row in GT such that for every grouping column GC the value of GC
    in R1 is not distinct from the value of GC in R2, then R2 is in G1.
    Every row in GT is in exactly one group. A group may be considered
    as a table. Set functions operate on groups.
    So we're already talking about an explicit GROUP BY clause, and it
    doesn't seem to leave much wiggle room grouping columns that aren't
    explicitly listed. I'll readily admit I very easily could be reading
    this wrong or have missed a section of the spec that discusses this
    in more detail. (And that goes for any of my interpretations of the
    spec!)

    One place you and I differ is that I think you'd be in favor of many
    more extensions in cases where the SQL spec doesn't explicitly say
    "Thou shalt not do X". I'm a bit more conservative here: in my
    opinion, if the SQL speaks to a particular issue (for example, a
    WHERE clause), then you should pretty much keep to the letter of what
    it says. Extensions are more for areas where the standard is silent.
    For example, full text search will be added to PostgreSQL in 8.3.
    AFAIK, the SQL spec doesn't talk about full text search at all, so
    this is an area where extensions are a natural fit. Others, such as
    PostgreSQL's DISTINCT ON, is a much more difficult issue, and I'd
    suspect if it were proposed as a new feature now it would be
    rejected. However, it's been a part of PostgreSQL for a long time, so
    for historical reasons it remains. Likewise, I doubt "text" as a
    synonym for "varchar" would be accepted.

    Anyway, enough excursions into SpecLand for me. I'd welcome anyone
    else taking a crack at it.

    Michael Glaesemann
    grzm seespotcode net
  • Tom Lane at Aug 23, 2007 at 5:11 am

    Michael Glaesemann writes:
    On Aug 22, 2007, at 20:36 , Ben Tilly wrote:
    I can well believe that the standard says that you must accept
    subqueries with aliases. But does it say you must reject subqueries
    without aliases? I strongly doubt that.
    If I'm reading my draft copy of the SQL:2003 spec right (and there's
    a good chance that I'm not, as it's not the easiest document for me
    to parse), aliases *are* required.
    Yes, they are. It's been like that since SQL92 if not before (I've
    never seen a copy of SQL89). I was meaning to go look up whether
    SQL2003 had gotten any laxer, but thanks for saving me the trouble.

    The question for us is whether we should exceed the spec by allowing
    something it does not. The fundamental reason why not is that we would
    have to invent an alias for the unnamed sub-select. That has several
    drawbacks: more code to try to pick an unused alias, more bug reports
    from people whose queries conflicted with whatever we picked. In return
    for which, we're providing absolutely zip real increase in
    functionality, and we're encouraging people to write unportable
    SQL-spec-incompatible code. There is a future-proofing argument too:
    if the committee ever decides this is a good idea (which may be unlikely
    after 15 years, but you never know), they could well define the default
    alias in a different way than we had done, and then we are stuck behind
    the eight ball.

    One could also assume that if the SQL committee has not made this
    oh-so-obvious improvement, there is some darn good reason why not.
    I'm not privy to their minutes so I don't know what arguments were
    made there, but I think we can eliminate "they never thought of it"
    as an explanation. Three generations of the spec have been written
    specifically to exclude this.

    In short, lots of downsides here, and not a whole lot of upside.

    regards, tom lane
  • Gregory Stark at Aug 23, 2007 at 9:28 am

    "Tom Lane" <tgl@sss.pgh.pa.us> writes:

    In short, lots of downsides here, and not a whole lot of upside.
    I highly doubt the spec would ever conflict with allowing the user to elide
    the aliases given that Oracle (and others?) have always allowed this. Moreover
    if it's been 15 years without them adding it surely that argues we can be
    pretty sure they won't add them?

    This seems like a particularly petty case compared to a lot of other
    extensions we do allow. Surely allowing arbitrary expressions in GROUP BY is
    far more likely to conflict in the future given how it constrains our grammar.
    And in theory that provides no added functionality over aside from programmer
    convenience as well. There are tons of extensions to the spec in the Postgres
    grammar. This would be one of the simplest safest ones.

    The upside is the convenience which after all is the same upside as most of
    our spec grammar extensions. Many many programmers are accustomed to entering
    ad-hoc queries of this form and forcing them to enter an alias for no purpose
    is just silly pedanticism from their point of view. The portability of ad-hoc
    queries is meaningless and if you don't refer to the alias in the query then
    it's truly pointless.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Alvaro Herrera at Aug 23, 2007 at 2:51 pm

    Gregory Stark escribió:

    The upside is the convenience which after all is the same upside as most of
    our spec grammar extensions. Many many programmers are accustomed to entering
    ad-hoc queries of this form and forcing them to enter an alias for no purpose
    is just silly pedanticism from their point of view. The portability of ad-hoc
    queries is meaningless and if you don't refer to the alias in the query then
    it's truly pointless.
    So there's the compromise: allow not specifying an alias only if it's
    not used in the rest of the query at all, so the subquery would be
    effectively anonymous.

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    The PostgreSQL Company - Command Prompt, Inc.
  • Gregory Stark at Aug 23, 2007 at 4:38 pm

    "Alvaro Herrera" <alvherre@commandprompt.com> writes:

    Gregory Stark escribió:
    The upside is the convenience which after all is the same upside as most of
    our spec grammar extensions. Many many programmers are accustomed to entering
    ad-hoc queries of this form and forcing them to enter an alias for no purpose
    is just silly pedanticism from their point of view. The portability of ad-hoc
    queries is meaningless and if you don't refer to the alias in the query then
    it's truly pointless.
    So there's the compromise: allow not specifying an alias only if it's
    not used in the rest of the query at all, so the subquery would be
    effectively anonymous.
    If what's not used in the rest of the query? If you haven't specified the
    alias what would you use? Surely even if we did generate an alias name nobody
    would think the generated name was guaranteed to be stable and reference it?

    I think the compromise is to instead of generating aliases at all just use an
    alias like "*Anonymous Subquery*" and add a boolean flag indicating that that
    range table is anonymous and not a valid target for references. I started
    doing that a while back but got distracted (and discouraged since it seemed
    not to have widespread support).

    IMHO even generating non-anonymous aliases like "*Anonymous Subquery*1" would
    be fine but I'm happy to have a flag hiding them too.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Alvaro Herrera at Aug 23, 2007 at 4:55 pm

    Gregory Stark escribió:
    "Alvaro Herrera" <alvherre@commandprompt.com> writes:
    Gregory Stark escribió:
    The upside is the convenience which after all is the same upside as most of
    our spec grammar extensions. Many many programmers are accustomed to entering
    ad-hoc queries of this form and forcing them to enter an alias for no purpose
    is just silly pedanticism from their point of view. The portability of ad-hoc
    queries is meaningless and if you don't refer to the alias in the query then
    it's truly pointless.
    So there's the compromise: allow not specifying an alias only if it's
    not used in the rest of the query at all, so the subquery would be
    effectively anonymous.
    If what's not used in the rest of the query?
    The alias, of course. If you reread Tom's argument, his problem is that
    the alias we choose may collide with what an hypotetical future standard
    may define, so the users who start to depend on the names with choose
    automatically would be screwed.
    I think the compromise is to instead of generating aliases at all just use an
    alias like "*Anonymous Subquery*" and add a boolean flag indicating that that
    range table is anonymous and not a valid target for references. I started
    doing that a while back but got distracted (and discouraged since it seemed
    not to have widespread support).
    Hey, you are describing an implementation of my suggestion.

    --
    Alvaro Herrera http://www.advogato.org/person/alvherre
    A male gynecologist is like an auto mechanic who never owned a car.
    (Carrie Snow)
  • Josh Berkus at Aug 23, 2007 at 6:06 pm

    On Thursday 23 August 2007 09:55, Alvaro Herrera wrote:
    I think the compromise is to instead of generating aliases at all just
    use an alias like "*Anonymous Subquery*" and add a boolean flag
    indicating that that range table is anonymous and not a valid target for
    references. I started doing that a while back but got distracted (and
    discouraged since it seemed not to have widespread support).
    Hey, you are describing an implementation of my suggestion.
    I just don't really see the gain in doing this. Any violations of the SQL
    standard are bad things in and of themselves; they encourage people to write
    non-spec applications which are then hard to port/support alongside other
    databases. It's also bad PR for us.

    So if we're going to support a non-spec feature (gods know we have) then it's
    going to be because it offers *significant* extra functionality. SELECT
    DISTINCT ON (), LIMIT/OFFSET, custom data types, CREATE OPERATOR, RULES, etc.
    all give us stuff that would be *very* hard to do -- or even impossible --
    without these features.

    I just don't see the ability to omit the alias in a query with only one
    subquery (the only circumstances under which it would be safe to do so) as
    any significant gain in fuctionality.

    --
    Josh Berkus
    PostgreSQL @ Sun
    San Francisco
  • Tom Lane at Aug 23, 2007 at 7:25 pm

    Josh Berkus writes:
    I just don't see the ability to omit the alias in a query with only one
    subquery (the only circumstances under which it would be safe to do so) as
    any significant gain in fuctionality.
    Why do you think it'd be restricted to only one subquery?

    As long as you take care that the subquery's column names don't match
    any other ones in the query, you don't *need* an alias for it ---
    there'll be no need to qualify the column names. This extends just
    fine to multiple subqueries.

    (But I'm still bugged by the idea that there's some deep reason for the
    SQL committee to have disallowed this syntax. There's certainly not any
    obvious reason why they did so. Conceivably it was just politics ---
    some Big Vendor using a parser that couldn't deal with it --- but maybe
    there is something else, like a planned future extension that'd break
    this syntax.)

    regards, tom lane
  • Michael Glaesemann at Aug 23, 2007 at 7:39 pm

    On Aug 23, 2007, at 14:25 , Tom Lane wrote:

    Josh Berkus <josh@agliodbs.com> writes:
    I just don't see the ability to omit the alias in a query with
    only one
    subquery (the only circumstances under which it would be safe to
    do so) as
    any significant gain in fuctionality.
    Why do you think it'd be restricted to only one subquery?

    As long as you take care that the subquery's column names don't match
    any other ones in the query, you don't *need* an alias for it ---
    there'll be no need to qualify the column names. This extends just
    fine to multiple subqueries.
    How about something like gensym? One alias you could always use and
    be guaranteed it would give a unique value. Still provide the alias,
    but don't have to think about name collisions.

    Michael Glaesemann
    grzm seespotcode net
  • Alvaro Herrera at Aug 23, 2007 at 7:50 pm

    Michael Glaesemann wrote:
    On Aug 23, 2007, at 14:25 , Tom Lane wrote:

    Josh Berkus <josh@agliodbs.com> writes:
    I just don't see the ability to omit the alias in a query with only one
    subquery (the only circumstances under which it would be safe to do so)
    as
    any significant gain in fuctionality.
    Why do you think it'd be restricted to only one subquery?

    As long as you take care that the subquery's column names don't match
    any other ones in the query, you don't *need* an alias for it ---
    there'll be no need to qualify the column names. This extends just
    fine to multiple subqueries.
    How about something like gensym? One alias you could always use and be
    guaranteed it would give a unique value. Still provide the alias, but don't
    have to think about name collisions.
    It is dangerous to provide a synthetic name; if the standard ever gets
    modified to support alias-less subqueries, they would likely choose a
    different name-generating algorithm, and we would have a
    backward-compatibility problem.

    Or is that a backwards-compatibility problem? I remain unsure.

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Tom Lane at Aug 23, 2007 at 8:06 pm

    Alvaro Herrera writes:
    Michael Glaesemann wrote:
    How about something like gensym?
    It is dangerous to provide a synthetic name; if the standard ever gets
    modified to support alias-less subqueries, they would likely choose a
    different name-generating algorithm, and we would have a
    backward-compatibility problem.
    I concur. If we do this, the thing to do is have *no* alias, ie, there
    is no way to qualify the names of the subquery output columns. (IIRC
    unaliased joins already work approximately like that.) Then there's
    not anything to have a backward compatibility problem with.

    regards, tom lane
  • Josh Berkus at Aug 23, 2007 at 7:44 pm
    Tom,
    As long as you take care that the subquery's column names don't match
    any other ones in the query, you don't *need* an alias for it ---
    there'll be no need to qualify the column names. This extends just
    fine to multiple subqueries.
    Oh, hmmm. I was thinking this would somehow be more of an issue with
    anonymous subqueries, but it isn't really. You can still duplicate column
    names in named subqueries.

    So, never mind that.

    I still don't see it as terrifically useful functionality, given that it's
    just saving you 4 keystrokes ...

    --
    Josh Berkus
    PostgreSQL @ Sun
    San Francisco
  • Tom Lane at Aug 23, 2007 at 8:01 pm

    Josh Berkus writes:
    I still don't see it as terrifically useful functionality, given that it's
    just saving you 4 keystrokes ...
    Less than that, because the AS is optional. The only argument I've
    heard that carries much weight with me is that it eases porting from
    other DBMS's that allow this. Are there any others besides Oracle?

    In a quick check, mysql 5.0.45 hews to the straight and narrow path:

    mysql> create table t(f1 int);
    Query OK, 0 rows affected (0.01 sec)

    mysql> select * from (select f1 from t) x;
    Empty set (0.00 sec)

    mysql> select * from (select f1 from t) ;
    ERROR 1248 (42000): Every derived table must have its own alias
    mysql>

    so you don't get to point to them as one that we'd improve
    compatibility with.

    regards, tom lane
  • Josh Berkus at Aug 23, 2007 at 8:04 pm
    Tom,
    so you don't get to point to them as one that we'd improve
    compatibility with.
    Derby & DB2 are also strict.

    --
    Josh Berkus
    PostgreSQL @ Sun
    San Francisco
  • Kevin Grittner at Aug 23, 2007 at 8:22 pm
    On Thu, Aug 23, 2007 at 3:01 PM, in message <11856.1187899268@sss.pgh.pa.us>,
    Tom Lane wrote:
    The only argument I've
    heard that carries much weight with me is that it eases porting from
    other DBMS's that allow this. Are there any others besides Oracle?
    select * from (select f1 from t)
    In Sybase:

    com.sybase.jdbc2.jdbc.SybSQLException: The derived table expression is missing a correlation name. Check derived table syntax in the Reference Manual.
    Error code: 11753
    SQL state: ZZZZZ
  • Florian G. Pflug at Aug 23, 2007 at 9:12 pm

    Kevin Grittner wrote:
    On Thu, Aug 23, 2007 at 3:01 PM, in message <11856.1187899268@sss.pgh.pa.us>,
    Tom Lane wrote:
    The only argument I've
    heard that carries much weight with me is that it eases porting from
    other DBMS's that allow this. Are there any others besides Oracle?
    select * from (select f1 from t)
    In Sybase:

    com.sybase.jdbc2.jdbc.SybSQLException: The derived table expression is missing a correlation name. Check derived table syntax in the Reference Manual.
    Error code: 11753
    SQL state: ZZZZZ
    The really funny thing is that pgsql, mysql and at least sybase
    *explicitly* dissallow the no-alias case. Which shows that
    .) This seems to be common source of confusion and errors.
    .) Aliasless-Subqueries wouldn't lead to ambigous grammras in those databases.
    Otherwise, you'd expect to get some more generic syntax error, and not
    the very explicit "No alias, but expected one".

    I agree with Tom - knowing *why* the standard committee disallows that syntax -
    and why everybody except oracle chose to agree with it would be quite interesting.

    greetings, Florian Pflug
  • Peter Eisentraut at Aug 23, 2007 at 10:57 pm

    Florian G. Pflug wrote:
    knowing *why* the standard committee disallows that syntax -
    and why everybody except oracle chose to agree with it would be quite
    interesting.
    Consider that we rightfully disallow

    SELECT * FROM tab1, tab1;

    because there are two tables with the same name involved. This hints to
    the fact that the internal query processing needs to have a unique name
    for each table source. You could argue why that would have to be, but
    it seems quite useful. Otherwise you would also have to allow

    SELECT * FROM tab1 AS a, tab2 AS a;

    and that cannot possibly be good.

    If you believe that, then a table source without any name at all, such
    as one produced by subqueries, should also be disallowed. Requiring a
    unique name appears to be a stronger condition than requiring a name at
    all. This is actually false if you consider that SQL itself does
    distinguish between unique constraints and not-null constraints. But
    you don't really want to invent a logic by which nameless table sources
    are managed. Strict logic would probably require the results of all
    such queries to be unknown.

    Note that Oracle is notorious for mishandling null values in various
    ways. So it's not surprising that they mishandle "null" table names in
    what appears to be pretty much the same way.
  • Tom Lane at Aug 24, 2007 at 1:49 am

    Peter Eisentraut writes:
    Florian G. Pflug wrote:
    knowing *why* the standard committee disallows that syntax -
    and why everybody except oracle chose to agree with it would be quite
    interesting.
    ...
    If you believe that, then a table source without any name at all, such
    as one produced by subqueries, should also be disallowed.
    Good point. However, what I would kind of expect the standards
    committee to do is to specify that each such subquery has an
    automatically generated name assigned to it. Compare what SQL92 says
    about assigning aliases for SELECT output columns:

    a) If the i-th <derived column> in the <select list> specifies
    an <as clause> that contains a <column name> C, then the
    <column name> of the i-th column of the result is C.

    b) If the i-th <derived column> in the <select list> does not
    specify an <as clause> and the <value expression> of that
    <derived column> is a single <column reference>, then the
    <column name> of the i-th column of the result is C.

    c) Otherwise, the <column name> of the i-th column of the <query
    specification> is implementation-dependent and different
    from the <column name> of any column, other than itself, of
    a table referenced by any <table reference> contained in the
    SQL-statement.

    Implementation-dependent is a term of art here: it means that each DBMS
    can do whatever the heck it wants. (But note that we fail to meet (c)
    exactly, since we don't bother to generate names that are distinct ---
    but in practice no one seems to care about that.)

    If we could expect that any future relaxation of the spec would likewise
    specify generating implementation-dependent unique table aliases for
    subqueries, then we'd not be having this discussion. What troubles me
    most is why the spec hasn't had identical verbiage for table aliases as
    it does for column aliases, since day one. There's *something* going on
    there that we don't know about.

    (BTW, your argument would seem to also disallow alias-less JOIN tables,
    but the spec has allowed those all along.)

    regards, tom lane
  • Gregory Stark at Aug 24, 2007 at 11:51 am

    "Tom Lane" <tgl@sss.pgh.pa.us> writes:

    c) Otherwise, the <column name> of the i-th column of the <query
    specification> is implementation-dependent and different
    from the <column name> of any column, other than itself, of
    a table referenced by any <table reference> contained in the
    SQL-statement.

    Implementation-dependent is a term of art here: it means that each DBMS
    can do whatever the heck it wants. (But note that we fail to meet (c)
    exactly, since we don't bother to generate names that are distinct ---
    but in practice no one seems to care about that.)
    Actually I suspect there are people who get annoyed by it when they try to
    reference a column by name in a client driver like DBI which allows that.

    Note that if you use something like fetchrow_hashref it will actually condense
    out duplicate column names since it loads the row into a hash. So if you
    you're writing a program which just wants to dump the record without
    understanding it you probably load it into a hash and then dump the hash in
    key=>value form. And that will cause some columns to be dropped in the output.

    But those people probably just figure it was their own fault and put in
    aliases in their queries.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Tom Lane at Aug 24, 2007 at 3:13 pm

    Gregory Stark writes:
    "Tom Lane" <tgl@sss.pgh.pa.us> writes:
    ... note that we fail to meet (c)
    exactly, since we don't bother to generate names that are distinct ---
    but in practice no one seems to care about that.)
    Actually I suspect there are people who get annoyed by it when they try to
    reference a column by name in a client driver like DBI which allows that.
    Note that if you use something like fetchrow_hashref it will actually condense
    out duplicate column names since it loads the row into a hash. So if you
    you're writing a program which just wants to dump the record without
    understanding it you probably load it into a hash and then dump the hash in
    key=>value form. And that will cause some columns to be dropped in the output.
    But those people probably just figure it was their own fault and put in
    aliases in their queries.
    Well, if you're using client-side code that depends on access by name
    rather than field position, you definitely have to put in AS clauses.
    Even if we did generate distinct names, a client couldn't rely on
    knowing in advance what they'd be.

    regards, tom lane
  • Gregory Stark at Aug 24, 2007 at 3:52 pm

    "Tom Lane" <tgl@sss.pgh.pa.us> writes:

    Gregory Stark <stark@enterprisedb.com> writes:
    Note that if you use something like fetchrow_hashref it will actually condense
    out duplicate column names since it loads the row into a hash. So if you
    you're writing a program which just wants to dump the record without
    understanding it you probably load it into a hash and then dump the hash in
    key=>value form. And that will cause some columns to be dropped in the output.
    But those people probably just figure it was their own fault and put in
    aliases in their queries.
    Well, if you're using client-side code that depends on access by name
    rather than field position, you definitely have to put in AS clauses.
    Even if we did generate distinct names, a client couldn't rely on
    knowing in advance what they'd be.
    That's why I got tied up trying to describe a scenario where you wouldn't have
    to rely on knowing in advance what they would be. If you're running some kind
    of reporting tool it could let the user type in arbitrary queries and then
    look at what names are returned from the describe message to put in its column
    headings.

    If such a tool was written in perl using fetchrow_hashref I think it would end
    up receiving only one of each distinct column name. Whereas it should be able
    to depend on receiving all of them with distinct names, even if it won't know
    what those names will be in advance.

    If it's written to use arrays and column positions then it would still work
    properly though. And we haven't seen any real complaints.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Tom Lane at Aug 24, 2007 at 1:19 am

    "Florian G. Pflug" <fgp@phlo.org> writes:
    The really funny thing is that pgsql, mysql and at least sybase
    *explicitly* dissallow the no-alias case.
    I knew that pgsql does that, because I made sure it did ;-). But it is
    pretty interesting that these other DBMSes also go out of their way to
    produce a specific error. As you say, that suggests that they could
    have accepted it, if they'd wanted to, at least from the standpoint of
    not having a grammar structure problem.

    regards, tom lane
  • Bruce Momjian at Mar 12, 2008 at 5:21 pm
    FYI, we decided we didn't want this additional capability.

    ---------------------------------------------------------------------------

    Florian G. Pflug wrote:
    Kevin Grittner wrote:
    On Thu, Aug 23, 2007 at 3:01 PM, in message <11856.1187899268@sss.pgh.pa.us>,
    Tom Lane wrote:
    The only argument I've
    heard that carries much weight with me is that it eases porting from
    other DBMS's that allow this. Are there any others besides Oracle?
    select * from (select f1 from t)
    In Sybase:

    com.sybase.jdbc2.jdbc.SybSQLException: The derived table expression is missing a correlation name. Check derived table syntax in the Reference Manual.
    Error code: 11753
    SQL state: ZZZZZ
    The really funny thing is that pgsql, mysql and at least sybase
    *explicitly* dissallow the no-alias case. Which shows that
    .) This seems to be common source of confusion and errors.
    .) Aliasless-Subqueries wouldn't lead to ambigous grammras in those databases.
    Otherwise, you'd expect to get some more generic syntax error, and not
    the very explicit "No alias, but expected one".

    I agree with Tom - knowing *why* the standard committee disallows that syntax -
    and why everybody except oracle chose to agree with it would be quite interesting.

    greetings, Florian Pflug


    ---------------------------(end of broadcast)---------------------------
    TIP 5: don't forget to increase your free space map settings
    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://postgres.enterprisedb.com

    + If your life is a hard drive, Christ can be your backup. +
  • Zeugswetter Andreas ADI SD at Aug 25, 2007 at 7:06 pm
    I still don't see it as terrifically useful functionality, given
    that it's
    just saving you 4 keystrokes ...
    Less than that, because the AS is optional. The only argument I've
    heard that carries much weight with me is that it eases porting from
    other DBMS's that allow this. Are there any others besides Oracle?
    FWIW IBM implemented this syntax in Informix 11.1 Cheetah released in
    July 2007.
    They also allow to omit the alias, and it imho makes sense.

    Andreas
  • Tom Lane at Aug 23, 2007 at 4:50 pm

    Gregory Stark writes:
    I highly doubt the spec would ever conflict with allowing the user to elide
    the aliases given that Oracle (and others?) have always allowed this. Moreover
    if it's been 15 years without them adding it surely that argues we can be
    pretty sure they won't add them?
    The $64 question in my mind is exactly why hasn't the spec allowed this?
    It's clear that they have gone out of their way to not allow it, and
    I think it's unwise to say "oh let's do it" without understanding why not.
    This seems like a particularly petty case compared to a lot of other
    extensions we do allow.
    That's exactly the problem. Most of our other extensions are justified
    by some significant capability gain. This isn't --- it provides zero
    new functionality, and the "convenience" factor amounts to the saving of
    one keystroke (ok, maybe two if you insist on a space before the alias).
    Pretty weak argument...

    regards, tom lane
  • Chuck McDevitt at Aug 23, 2007 at 5:17 pm

    This seems like a particularly petty case compared to a lot of other
    extensions we do allow.
    That's exactly the problem. Most of our other extensions are justified
    by some significant capability gain. This isn't --- it provides zero
    new functionality, and the "convenience" factor amounts to the saving
    of
    one keystroke (ok, maybe two if you insist on a space before the
    alias).
    Pretty weak argument...

    regards, tom lane
    Tom, it isn't just a case of "convenience". When we are trying to
    convert users from another database (say Oracle for example) to
    PostgeSQL, one of the big stumbling blocks that slows down the work is
    all the little query changes that people have to make (people who might
    not have written the query as well), and it can require the review of
    hundreds or thousands of SQL scripts and applications. The harder it
    is, the more reluctant they are to convert.

    Sometimes supporting "de-facto" standards as well as official standards
    makes sense.
  • Joshua D. Drake at Aug 23, 2007 at 5:26 pm

    Chuck McDevitt wrote:

    Tom, it isn't just a case of "convenience". When we are trying to
    convert users from another database (say Oracle for example) to
    PostgeSQL, one of the big stumbling blocks that slows down the work is
    all the little query changes that people have to make (people who might
    not have written the query as well), and it can require the review of
    hundreds or thousands of SQL scripts and applications. The harder it
    is, the more reluctant they are to convert.

    Sometimes supporting "de-facto" standards as well as official standards
    makes sense.
    That is not the goal of the project.

    Sincerely,

    Joshua D. Drake


    ---------------------------(end of broadcast)---------------------------
    TIP 4: Have you searched our list archives?

    http://archives.postgresql.org

    - --

    === The PostgreSQL Company: Command Prompt, Inc. ===
    Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
    PostgreSQL solutions since 1997 http://www.commandprompt.com/
    UNIQUE NOT NULL
    Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
    PostgreSQL Replication: http://www.commandprompt.com/products/
  • Andrew Dunstan at Aug 23, 2007 at 5:26 pm

    Chuck McDevitt wrote:
    Sometimes supporting "de-facto" standards as well as official standards
    makes sense.

    On that basis we would support huge pieces of stuff that emulates MySQL
    too. Or perhaps you'd like us to follow Oracle's treatment of NULL. Who
    gets to choose what is the de facto standard we follow?

    cheers

    andrew
  • Chuck McDevitt at Aug 23, 2007 at 6:08 pm

    -----Original Message-----
    From: Andrew Dunstan
    Sent: Thursday, August 23, 2007 10:26 AM
    To: Chuck McDevitt
    Cc: Tom Lane; Gregory Stark; Michael Glaesemann; Ben Tilly; pgsql-
    hackers@postgresql.org
    Subject: Re: [HACKERS] SQL feature requests



    Chuck McDevitt wrote:
    Sometimes supporting "de-facto" standards as well as official standards
    makes sense.

    On that basis we would support huge pieces of stuff that emulates MySQL
    too. Or perhaps you'd like us to follow Oracle's treatment of NULL. Who
    gets to choose what is the de facto standard we follow?

    cheers

    Andrew
    You must be joking... PostgreSQL already has a huge amount of
    "non-standard" syntax and semantics (perhaps "extensions" is a better
    word?).
    Everything from non-standard cast operator, non-standard substr,
    non-standard trim, non standard group by semantics (allowing simple ints
    to mean column number)... Given a day, we could probably write down
    several pages of "non-standard" features of PGSQL.

    Who decides what de facto standards to support, and which not? The
    PostgreSQL community of course.

    In general, we wouldn't want to support any de facto standard that:

    1. Is supported only by one vendor
    2. Causes any standard SQL statement to fail, or return a different
    answer from the standard.

    The proposed change doesn't fail either of these.
  • Andrew Dunstan at Aug 23, 2007 at 6:53 pm

    Chuck McDevitt wrote:
    PostgreSQL already has a huge amount of
    "non-standard" syntax and semantics (perhaps "extensions" is a better
    word?).
    Everything from non-standard cast operator, non-standard substr,
    non-standard trim, non standard group by semantics (allowing simple ints
    to mean column number)... Given a day, we could probably write down
    several pages of "non-standard" features of PGSQL.

    Quite so, and I've perpetrated a few myself. But for the most part they
    are either there for legacy reasons or add significant extra functionality.

    I rather like Alvaro's compromise suggestion re aliases in fact. At
    least there seems to be a better case for that than for "group by 'blurfl'".

    But the argument that convinces me is not that it follows some de facto
    standard, but that it would add to clarity. Requiring an alias where it
    isn't used seems to me a piece of less than excellent design.

    cheers

    andrew
  • Bruce Momjian at Sep 14, 2007 at 4:03 am
    This has been saved for the 8.4 release:

    http://momjian.postgresql.org/cgi-bin/pgpatches_hold

    ---------------------------------------------------------------------------

    Andrew Dunstan wrote:

    Chuck McDevitt wrote:
    PostgreSQL already has a huge amount of
    "non-standard" syntax and semantics (perhaps "extensions" is a better
    word?).
    Everything from non-standard cast operator, non-standard substr,
    non-standard trim, non standard group by semantics (allowing simple ints
    to mean column number)... Given a day, we could probably write down
    several pages of "non-standard" features of PGSQL.

    Quite so, and I've perpetrated a few myself. But for the most part they
    are either there for legacy reasons or add significant extra functionality.

    I rather like Alvaro's compromise suggestion re aliases in fact. At
    least there seems to be a better case for that than for "group by 'blurfl'".

    But the argument that convinces me is not that it follows some de facto
    standard, but that it would add to clarity. Requiring an alias where it
    isn't used seems to me a piece of less than excellent design.

    cheers

    andrew

    ---------------------------(end of broadcast)---------------------------
    TIP 2: Don't 'kill -9' the postmaster
    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://www.enterprisedb.com

    + If your life is a hard drive, Christ can be your backup. +
  • Andrew Sullivan at Aug 24, 2007 at 2:29 pm

    On Thu, Aug 23, 2007 at 02:06:16PM -0400, Chuck McDevitt wrote:
    In general, we wouldn't want to support any de facto standard that:

    1. Is supported only by one vendor
    2. Causes any standard SQL statement to fail, or return a different
    answer from the standard.

    The proposed change doesn't fail either of these.
    From what I can see upthread, it fails 1 and possibly 2. Given that
    we don't seem to know _why_ it is forbidden, there could well be a
    case under 2 is a problem, and we haven't thought of it.

    A

    --
    Andrew Sullivan | ajs@crankycanuck.ca
    Everything that happens in the world happens at some place.
    --Jane Jacobs
  • Tom Lane at Aug 23, 2007 at 5:30 pm

    "Chuck McDevitt" <cmcdevitt@greenplum.com> writes:
    Tom, it isn't just a case of "convenience". When we are trying to
    convert users from another database (say Oracle for example) to
    PostgeSQL, one of the big stumbling blocks that slows down the work is
    all the little query changes that people have to make
    Well, if you're trying to sell it on the grounds of Oracle
    compatibility, then it should actually *be* Oracle compatible.
    What exactly do they do about the default-alias problem?

    regards, tom lane
  • Ben Tilly at Aug 23, 2007 at 5:55 pm

    On 8/23/07, Tom Lane wrote:
    "Chuck McDevitt" <cmcdevitt@greenplum.com> writes:
    Tom, it isn't just a case of "convenience". When we are trying to
    convert users from another database (say Oracle for example) to
    PostgeSQL, one of the big stumbling blocks that slows down the work is
    all the little query changes that people have to make
    Well, if you're trying to sell it on the grounds of Oracle
    compatibility, then it should actually *be* Oracle compatible.
    What exactly do they do about the default-alias problem?
    To the best of my knowledge such subqueries are completely anonymous.
    There is no way to explicitly refer to them unless you provide an
    alias. Which is exactly the solution that was proposed twice in this
    thread, and has the further benefit of being forwards compatible with
    any reasonable future standard.

    As verification I asked a certified Oracle DBA. His understanding is
    that Oracle may choose to rewrite the query for you or not. If it
    does not rewrite the query, then it has an internal identifier but
    there is no way you can get to it.

    Cheers,
    Ben
  • Brian Hurt at Aug 23, 2007 at 5:42 pm

    Chuck McDevitt wrote:
    Tom, it isn't just a case of "convenience". When we are trying to
    convert users from another database (say Oracle for example) to
    PostgeSQL, one of the big stumbling blocks that slows down the work is
    all the little query changes that people have to make (people who might
    not have written the query as well), and it can require the review of
    hundreds or thousands of SQL scripts and applications. The harder it
    is, the more reluctant they are to convert.

    Sometimes supporting "de-facto" standards as well as official standards
    makes sense.
    One of the ways "de-facto" standards are even better than standards is
    that there are even more of them. Which de-facto standard should we
    follow: Oracle, MySQL, or MS SQL Server? Note that following all of
    them is a bad idea, as one persons convience is another persons
    stumbling block.

    Brian
  • Gregory Stark at Aug 23, 2007 at 5:20 pm

    "Tom Lane" <tgl@sss.pgh.pa.us> writes:

    Gregory Stark <stark@enterprisedb.com> writes:
    This seems like a particularly petty case compared to a lot of other
    extensions we do allow.
    That's exactly the problem. Most of our other extensions are justified
    by some significant capability gain. This isn't --- it provides zero
    new functionality, and the "convenience" factor amounts to the saving of
    one keystroke (ok, maybe two if you insist on a space before the alias).
    Pretty weak argument...
    I think we're talking about two different things. I'm referring to this:

    postgres=# select * from (select * from test);
    ERROR: subquery in FROM must have an alias
    HINT: For example, FROM (SELECT ...) [AS] foo.

    I get this about once a day because I frequently type this ad-hoc query and
    always forget to put the alias in the first time:

    postgres=# select count(*),n from (select count(*) as n from test group by i) group by n;
    ERROR: subquery in FROM must have an alias
    HINT: For example, FROM (SELECT ...) [AS] foo.


    Not to say that four is a lot larger than one but if we're talking about two
    different things then we're not likely to reach any kind of agreement...

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Alvaro Herrera at Aug 23, 2007 at 5:38 pm

    Gregory Stark escribió:

    I get this about once a day because I frequently type this ad-hoc query and
    always forget to put the alias in the first time:

    postgres=# select count(*),n from (select count(*) as n from test group by i) group by n;
    ERROR: subquery in FROM must have an alias
    HINT: For example, FROM (SELECT ...) [AS] foo.
    Exactly one extra keystroke gets you a non-error answer:

    alvherre=# select count(*),n from (select count(*) as n from test group by i)a group by n;

    --
    Alvaro Herrera http://www.PlanetPostgreSQL.org/
    "Si quieres ser creativo, aprende el arte de perder el tiempo"
  • Ben Tilly at Aug 23, 2007 at 5:26 pm

    On 8/22/07, Michael Glaesemann wrote:
    On Aug 22, 2007, at 20:36 , Ben Tilly wrote:
    [...]
    I can well believe that the standard says that you must accept
    subqueries with aliases. But does it say you must reject subqueries
    without aliases? I strongly doubt that.
    If I'm reading my draft copy of the SQL:2003 spec right (and there's
    a good chance that I'm not, as it's not the easiest document for me
    to parse), aliases *are* required.
    Again, it lays out very carefully the syntax that must be valid. It
    does not say that extensions to that syntax are not valid. Every
    database implements such extensions.

    [...]
    I have no clue what EnterpriseDB does.
    In case it wasn't clear, the reason I bring it up is that
    EnterpriseDB, while working from a PostgreSQL base, strives for
    Oracle compatibility.
    I got the reference. But I don't know what EnterpriseDB does - I've
    never used it.

    [...]
    AIUI, Integers are only allowed because the SQL standard explicitly
    allows you to refer to columns by the order they appear in the SELECT
    list. Otherwise the GROUP BY items need to be column names.
    Need to be?

    The SQL-92 standard is clear that you must accept a list of column
    names. It is also clear that a column name must be be of the form
    field or table.field.
    The 2003 draft (same as above) seems to agree with the SQL92 standard:
    [ large snippet of the draft elided ]
    There'd have to be a pretty strong reason to extend this, more than
    just a convenience, I should think.
    It is already extended in postgres. For pretty good reasons.
    In no way, shape or form does that allow having
    terms like trim(foo.bar) in a group by.

    But every reasonable database that I know - including postgres -
    allows that.
    Can you give an example of something like this working in PostgreSQL?
    I get an error when I try to use a text value in a GROUP BY clause.
    (Or are you referring specifically to the CASE expression corner case?)
    This works in every reasonable database that I have tried it in:

    select trim(foo.bar), count(*)
    from foo
    group by trim(foo.bar)

    And yes, I have tried it in postgres.

    [...]
    Postgres explicitly disallows a constant character expression. But it
    allows the constant case expression that I gave. It would be nice for
    me to not have to remember that very obscure and convoluted case.
    I agree, and would move that it should be disallowed if there isn't a
    reason for it to be maintained, for exactly the reason you give:
    there shouldn't be such convoluted, and obscure corner case.
    There is an excellent reason to generally allow complex expressions in
    group by statements, and that reason is that many useful and
    reasonable queries won't work if you don't. Such as the one I gave
    above.

    [...]
    I don't know what the SQL spec says, but I know (having talked to
    other developers) that many people would find it very nice.
    Since I had the spec open, I tried to look at this as well, though I
    must admit I found it very tough going.

    I think this is the key section:
    10.9 <aggregate function>

    ...

    Syntax Rules

    ...

    4) The argument source of an <aggregate function> is
    Case:
    a) If AF is immediately contained in a <set function
    specification>, then a table or group of a grouped table as
    specified in Subclause 7.10, "<having clause>", and Subclause 7.12,
    "<query specification>".

    b) Otherwise, the collection of rows in the current row's window
    frame defined by the window structure descriptor identified by the
    <window function> that simply contains AF, as defined in Subclause
    7.11, "<window clause>".
    Now the <query specification> stuff is pretty dense, and the <window
    clause> is currently impenetrable for me, so I just looked at the
    <having clause> and <query specification> sections referenced above.
    I'm not surprised that the window clause section is impenetrable to
    you. Window clauses are part of the definition of analytic functions,
    which postgres does NOT yet implement. However they are on the todo
    list.

    Speaking personally, analytic functions are the single feature from
    Oracle that I've most missed when moving to postgres. I would be
    happy to explain what they are and how they should work either on the
    list or off to anyone who is interested in implementing them. However
    they are a topic for another thread, and probably for another week.
    (I'm out next week, and am loathe to open that can of worms just yet.)

    However I will note as a practical matter that implementing analytic
    functions will increase how often people will want to have anonymous
    subqueries in the from statement. (Because for a certain set of
    problems a useful technique is to do a group by query of an analytic
    query.)

    [...]
    A grouped table is a set of groups derived during the evaluation of
    a <group by clause>. A group G is a collection of rows in which,
    for every grouping column GC, if the value of GC in some row is not
    distinct from GV, then the value of GC in every row is GV;
    moreover, if R1 is a row in group G1 of grouped table GT and R2 is
    a row in GT such that for every grouping column GC the value of GC
    in R1 is not distinct from the value of GC in R2, then R2 is in G1.
    Every row in GT is in exactly one group. A group may be considered
    as a table. Set functions operate on groups.
    So we're already talking about an explicit GROUP BY clause, and it
    doesn't seem to leave much wiggle room grouping columns that aren't
    explicitly listed. I'll readily admit I very easily could be reading
    this wrong or have missed a section of the spec that discusses this
    in more detail. (And that goes for any of my interpretations of the
    spec!)
    Again, the spec specifies what must be accepted. It does not ban
    extensions. Postgres already implements the useful extension of
    accepting complicated expressions in a group by. It could implement
    more.
    One place you and I differ is that I think you'd be in favor of many
    more extensions in cases where the SQL spec doesn't explicitly say
    "Thou shalt not do X". I'm a bit more conservative here: in my
    opinion, if the SQL speaks to a particular issue (for example, a
    WHERE clause), then you should pretty much keep to the letter of what
    it says. Extensions are more for areas where the standard is silent.
    We definitely differ. To my eyes an extension that is well defined
    and clearly useful which is currently illegal deserves consideration.
    Utility is not the only concern, of course, you have to think about
    how well it fits with other features, its likelyhood of breaking in
    the future, etc.

    Inferring the group by statement that is likely to be meant the vast
    majority of the time is clearly useful. It would get rid of a lot of
    useless redundancy in a lot of people's queries. It may be that it is
    deemed too much work or too risky to implement.
    For example, full text search will be added to PostgreSQL in 8.3.
    AFAIK, the SQL spec doesn't talk about full text search at all, so
    this is an area where extensions are a natural fit. Others, such as
    PostgreSQL's DISTINCT ON, is a much more difficult issue, and I'd
    suspect if it were proposed as a new feature now it would be
    rejected. However, it's been a part of PostgreSQL for a long time, so
    for historical reasons it remains. Likewise, I doubt "text" as a
    synonym for "varchar" would be accepted.
    If we had this conversation in 15 years, I'm willing to bet that full
    text search would be in the spec and you'd be able to complain that
    something about how it was implemented was not quite standard but
    would have to remain for historical reasons...
    Anyway, enough excursions into SpecLand for me. I'd welcome anyone
    else taking a crack at it.
    Cheers,
    Ben
  • Michael Glaesemann at Aug 23, 2007 at 5:51 pm

    On Aug 23, 2007, at 12:25 , Ben Tilly wrote:

    It is already extended in postgres. For pretty good reasons.
    Yes, this particular point is moot. Whether or not the reasons are
    good is another thing and a difference of opinion.
    I'm not surprised that the window clause section is impenetrable to
    you. Window clauses are part of the definition of analytic functions,
    which postgres does NOT yet implement. However they are on the todo
    list.
    And being worked on, from what I gather. Just not for 8.3. So
    hopefully you won't have to wait too much longer.

    Michael Glaesemann
    grzm seespotcode net
  • Chuck McDevitt at Aug 23, 2007 at 6:59 am

    -----Original Message-----
    From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
    owner@postgresql.org] On Behalf Of Michael Glaesemann
    Sent: Wednesday, August 22, 2007 5:17 PM
    To: Ben Tilly
    Cc: pgsql-hackers@postgresql.org
    Subject: Re: [HACKERS] SQL feature requests

    On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias?
    It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
    does?
    2. Why is 'non-integer constant in GROUP BY' an error?
    This works for now:

    case when true then true end

    but I don't know whether some future version of postgres might break
    my code by banning that as well.
    1. The SQL standard requires an alias for the subquery, but many
    real-world SQL implementations relax this requirement in the case where
    it is unambiguous. The standard doesn't say you have to reject
    statements without the alias, it says only that you must accept the ones
    that do. PostgreSQL has many things in its SQL where we accept things
    that the standard doesn't require, and I don't see a good argument why
    it shouldn't allow this.

    2. The SQL standard only allows column names in group by lists, NOT
    expressions. PostgreSQL extends the standard by allowing some, but not
    all, expressions in the group by list (somewhat inconsistent, in my
    view). Expressions in the group by list is actually a quite common
    extension. But PostgreSQL also extends the standard by attaching
    special meaning to simple integer constants in the group by list, and
    treating them as column numbers from the select list. As far as I
    remember, the SQL standard only allows that on ORDER BY clauses.
    Anyway, it seems reasonable to allow expressions in the group by, such
    as:

    Select a*10, sum(b) from x group by a*10;

    But once you allow that, it seems like you should allow any expression,
    even degenerate ones such as
    select 'foo',sum(b) from x group by 'foo';
  • Chuck McDevitt at Aug 23, 2007 at 7:19 am

    -----Original Message-----
    From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
    owner@postgresql.org] On Behalf Of Chuck McDevitt
    Sent: Wednesday, August 22, 2007 11:53 PM
    To: Michael Glaesemann; Ben Tilly
    Cc: pgsql-hackers@postgresql.org
    Subject: Re: [HACKERS] SQL feature requests
    -----Original Message-----
    From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
    owner@postgresql.org] On Behalf Of Michael Glaesemann
    Sent: Wednesday, August 22, 2007 5:17 PM
    To: Ben Tilly
    Cc: pgsql-hackers@postgresql.org
    Subject: Re: [HACKERS] SQL feature requests

    On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias?
    It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
    does?
    2. Why is 'non-integer constant in GROUP BY' an error?
    This works for now:

    case when true then true end

    but I don't know whether some future version of postgres might
    break
    my code by banning that as well.
    1. The SQL standard requires an alias for the subquery, but many
    real-world SQL implementations relax this requirement in the case where
    it is unambiguous. The standard doesn't say you have to reject
    statements without the alias, it says only that you must accept the
    ones
    that do. PostgreSQL has many things in its SQL where we accept things
    that the standard doesn't require, and I don't see a good argument why
    it shouldn't allow this.

    2. The SQL standard only allows column names in group by lists, NOT
    expressions. PostgreSQL extends the standard by allowing some, but not
    all, expressions in the group by list (somewhat inconsistent, in my
    view). Expressions in the group by list is actually a quite common
    extension. But PostgreSQL also extends the standard by attaching
    special meaning to simple integer constants in the group by list, and
    treating them as column numbers from the select list. As far as I
    remember, the SQL standard only allows that on ORDER BY clauses.
    Anyway, it seems reasonable to allow expressions in the group by, such
    as:

    Select a*10, sum(b) from x group by a*10;

    But once you allow that, it seems like you should allow any
    expression,
    even degenerate ones such as
    select 'foo',sum(b) from x group by 'foo';
    Just wanted to point out that the group by thing is really just
    syntactic sugar.
    You can always get a SQL standard compliant system to accept the
    constants this way:

    Select z,sum(b) from (
    Select 'foo',b from x) a1 (z,b)
    Group by z;

    This will work (although with PostgreSQL, you might have to cast the
    string constant to text, because it is "unknown" datatype).
  • Gregory Stark at Aug 23, 2007 at 12:24 am

    "Ben Tilly" <btilly@gmail.com> writes:

    Hopefully this is the right place for a few feature requests that
    would address some of the things that I've noticed in postgres.

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias? For instance suppose that I have an orders table, and
    one of the fields is userid. The following is unambiguous and is
    legal in Oracle:
    Thank you, this is one of my top pet peeves but when I proposed changing it I
    was told nobody's complained. Now we have at least one user complaint, any
    others out there?
    2. Why is 'non-integer constant in GROUP BY' an error?
    Hm... I was a bit surprised by this warning myself. IIRC there was an
    implementation convenience issue.
    3. How hard would it be to have postgres ignore aliases in group by
    clauses?
    That sounds like a strange idea.
    4) Items 2 and 3 would both be made irrelevant if postgres did
    something that I'd really, really would like. Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you group by
    all non-aggregate functions in the select.

    For example

    SELECT foo, count(*)
    FROM bar

    would be processed as:

    SELECT foo, count(*)
    FROM bar
    GROUP BY foo
    I agree this would be convenient but it seems too scary to actually go
    anywhere. What would you group by in the case of:

    SELECT a+b, count(*) FROM bar

    Should it group by a,b or a+b ?

    Also, this might be a bit shocking for MySQL users who are accustomed to
    MySQL's non-standard extension for the same syntax. There it's treated as an
    assertion that the columns are equal for all records in a group or at least
    that it doesn't matter which such value is returned, effectively equivalent to
    our DISTINCT ON feature.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Ben Tilly at Aug 23, 2007 at 1:49 am

    On 8/22/07, Gregory Stark wrote:
    "Ben Tilly" <btilly@gmail.com> writes:
    Hopefully this is the right place for a few feature requests that
    would address some of the things that I've noticed in postgres.

    1. Just a minor annoyance, but why must subqueries in FROM clauses
    have an alias? For instance suppose that I have an orders table, and
    one of the fields is userid. The following is unambiguous and is
    legal in Oracle:
    Thank you, this is one of my top pet peeves but when I proposed changing it I
    was told nobody's complained. Now we have at least one user complaint, any
    others out there?
    Always happy to complain. :-)
    2. Why is 'non-integer constant in GROUP BY' an error?
    Hm... I was a bit surprised by this warning myself. IIRC there was an
    implementation convenience issue.
    If your implementation accepts:

    group by case when true then 'foo' end

    how much harder can it be to accept:

    group by 'foo'

    ?
    3. How hard would it be to have postgres ignore aliases in group by
    clauses?
    That sounds like a strange idea.
    It is a strange idea, but it makes dynamically building queries
    easier. Right now I'm following a strategy of storing what I'm going
    to insert in the select clause in one variable, and the group by
    clause in another. So I need 2 variables for each dynamic field that
    I might choose to group by and want to have a custom name for. With
    this change I would only need one variable.
    4) Items 2 and 3 would both be made irrelevant if postgres did
    something that I'd really, really would like. Which is to assume that
    a query without a group by clause, but with an aggregate function in
    the select, should have an implicit group by clause where you group by
    all non-aggregate functions in the select.

    For example

    SELECT foo, count(*)
    FROM bar

    would be processed as:

    SELECT foo, count(*)
    FROM bar
    GROUP BY foo
    I agree this would be convenient but it seems too scary to actually go
    anywhere. What would you group by in the case of:

    SELECT a+b, count(*) FROM bar

    Should it group by a,b or a+b ?
    It should group by a+b. Which is to say, every field in the select
    clause that currently triggers an error because it isn't in the group
    by clause.
    Also, this might be a bit shocking for MySQL users who are accustomed to
    MySQL's non-standard extension for the same syntax. There it's treated as an
    assertion that the columns are equal for all records in a group or at least
    that it doesn't matter which such value is returned, effectively equivalent to
    our DISTINCT ON feature.
    I don't mind shocking MySQL users. ;-)

    But seriously, if that objection is the barrier then I'd be happy to
    see it be something that is explicitly turned on in the query. For
    instance:

    select autogroup bar, count(*) from foo

    If that was available then I for one would type autogroup a lot more
    often than group by. After all autogroup is about as hard to type,
    and I don't have to type the redundant list of fields in the group by.

    Cheers,
    Ben
  • Florian G. Pflug at Aug 23, 2007 at 2:32 am

    Ben Tilly wrote:
    On 8/22/07, Gregory Stark wrote:
    "Ben Tilly" <btilly@gmail.com> writes:
    2. Why is 'non-integer constant in GROUP BY' an error?
    Hm... I was a bit surprised by this warning myself. IIRC there was an
    implementation convenience issue.
    If your implementation accepts:

    group by case when true then 'foo' end

    how much harder can it be to accept:

    group by 'foo'
    This is not about hardness of the implementation, but rather about
    non-confusing behaviour I think.

    AFAIK, "group by 1" means "group by the first selected column", not
    "group all rows together". But "group by 'foo'" would carry the second
    meaning - "group all rows together". This is so totally counter-intuitive,
    that it's not even funny...

    "group by case when true then 'foo' end" looks different enough compared to
    "group by 1" to make this less of a footgun.

    Seems that the "group by <integer>" syntax predates the appearance of
    aliases in the sql standard...

    greetings, Florian flug
  • Zeugswetter Andreas ADI SD at Aug 23, 2007 at 8:00 am

    how much harder can it be to accept:

    group by 'foo'
    Presumably you meant group by "foo".

    Imho pg should accept group by "foo". It could be part of a constant
    removal, that also takes burden off the sort.

    e.g. in "select x, count(*) from bar where x=5 group by x", x could be
    removed since it is constant.
    This is not about hardness of the implementation, but rather about
    non-confusing behaviour I think.

    AFAIK, "group by 1" means "group by the first selected column", not
    "group all rows together". But "group by 'foo'" would carry the second
    meaning - "group all rows together".
    Yes. I don't see the issue. 1 is imho sufficiently different even from
    "1".
    Pg is not alone in allowing column number in group by.

    Andreas
  • Gregory Stark at Aug 23, 2007 at 9:16 am

    "Zeugswetter Andreas ADI SD" <Andreas.Zeugswetter@s-itsolutions.at> writes:

    how much harder can it be to accept:

    group by 'foo'
    Presumably you meant group by "foo".
    No that's the whole point. He meant the constant string 'foo' not the column
    identifier "foo" which certainly should work now.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
  • Michael Glaesemann at Aug 23, 2007 at 4:28 am

    On Aug 22, 2007, at 20:49 , Ben Tilly wrote:

    If your implementation accepts:

    group by case when true then 'foo' end
    What would that mean? Regardless of whether or not it's accepted, it
    should have *some* meaning.

    It's not equivalent to GROUP BY "foo"

    test=# select record_id as foo
    , count(observation_id) as bar
    from observation
    group by case when true
    then 'foo'
    end;
    ERROR: column "observation.record_id" must appear in the GROUP BY
    clause or be used in an aggregate function
    test=# select record_id
    , count(observation_id) as bar
    from observation
    group by case when true
    then 'record_id'
    end;
    ERROR: column "observation.record_id" must appear in the GROUP BY
    clause or be used in an aggregate function

    *This* seems like a bug:
    test=# select record_id
    , count(observation_id) as bar
    from observation
    group by record_id
    , case when true
    then 'foo'
    end;
    record_id | bar
    -----------+-----
    1 | 4
    2 | 4
    3 | 2
    (3 rows)

    And for good measure:

    test=# select record_id
    , count(observation_id) as bar
    from observation
    group by case when true
    then record_id
    end;
    ERROR: column "observation.record_id" must appear in the GROUP BY
    clause or be used in an aggregate function

    Michael Glaesemann
    grzm seespotcode net

Related Discussions

People

Translate

site design / logo © 2021 Grokbase