FAQ
Merlin sent me a test case off-list for the problem mentioned here:
http://archives.postgresql.org/pgsql-bugs/2010-07/msg00025.php

After some investigation I was able to simplify it to the following
example using the regression database:

select
(select sq1) as qq1
from
(select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy
from int8_tbl) sq0
join
int4_tbl i4 on dummy = i4.f1;

The problem is that flatten_join_alias_vars() can push SubLink
expressions down into sub-selects, as in this example when it replaces
the "sq1" reference with the EXISTS() subexpression that was previously
pulled up by flattening sq0. But it fails to set the hasSubLinks flag
in the sub-Query, so subsequent processing doesn't think it needs to do
SS_process_sublinks within the sub-Query, and eventually we fail when we
come across the unprocessed SubLink. This bug goes clear back to 7.4.
Fortunately it's simple to fix.

What seems more interesting is that I initially had a hard time
reproducing the bug under different conditions, and didn't figure out
what was going on until I realized that I had used pg_dump to
consolidate the multiple files Merlin sent ... and *reloading pg_dump's
version of the views didn't exhibit the bug*. This is because pg_dump,
or more accurately ruleutils.c, has a habit of qualifying variable
references whether or not they were qualified in the original query.
If you turn the above example into a view and then dump it, you'll get

... (select sq0.sq1) as qq1 ...

and that doesn't tickle this bug. (That's because "sq0.sq1" isn't a
join alias Var, whereas unqualified "sq1" is.)

So the question that seems worth discussing is whether this difference
ought to be considered a bug in ruleutils. In theory it shouldn't
matter if pg_dump adds an "unnecessary" qualification, but here's an
example where it did matter. Do we care? People tend to assume that
dumping and reloading will make no change in the behavior of their
views, so this seems kind of scary to me. On the other hand, the
"extra" qualifications make view definitions a bit more robust in the
face of column additions, renamings, etc. So there's certainly a case
to be made that the dump behavior is preferable as-is.

Thoughts?

regards, tom lane

Search Discussions

  • Robert Haas at Jul 7, 2010 at 11:04 pm

    On Wed, Jul 7, 2010 at 6:55 PM, Tom Lane wrote:
    After some investigation I was able to simplify it to the following
    example using the regression database:

    select
    (select sq1) as qq1
    from
    (select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy
    from int8_tbl) sq0
    join
    int4_tbl i4 on dummy = i4.f1;

    [discussion of bug]

    What seems more interesting is that I initially had a hard time
    reproducing the bug under different conditions, and didn't figure out
    what was going on until I realized that I had used pg_dump to
    consolidate the multiple files Merlin sent ... and *reloading pg_dump's
    version of the views didn't exhibit the bug*.  This is because pg_dump,
    or more accurately ruleutils.c, has a habit of qualifying variable
    references whether or not they were qualified in the original query.
    If you turn the above example into a view and then dump it, you'll get

    ... (select sq0.sq1) as qq1 ...

    and that doesn't tickle this bug.  (That's because "sq0.sq1" isn't a
    join alias Var, whereas unqualified "sq1" is.)
    I'm lost. What's a join alias var?

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Tom Lane at Jul 7, 2010 at 11:43 pm

    Robert Haas writes:
    I'm lost. What's a join alias var?
    Suppose we have t1 with columns a,b,c and t2 with columns d,e,f, then
    consider

    select a from t1 join t2 on (whatever)
    select t1.a from t1 join t2 on (whatever)

    In the first case the parser generates a Var that references a column of
    the unnamed join's RTE; in the second case you get a Var that references
    t1 directly. These particular cases are semantically equivalent, but
    there are lots of other cases where it's important to draw the
    distinction. One interesting example is

    select x from (t1 join t2 on (whatever)) as j(x,y,z,xx,yy,zz)

    where per SQL spec it'd actually be illegal to write a (or t1.a) because
    the named join hides its components. But I think what forced us to have
    different representations is FULL JOIN USING. If you have

    select id from taba full join tabb using (id)

    then taba.id and tabb.id and the join's output variable id are all
    semantically different and *must* be given different representations at
    the Var level.

    Anyway, the way it works is that the parser generates "alias Vars" that
    refer to the join RTE, mainly because this makes life simpler for
    ruleutils. But the planner prefers to work with the "real" underlying
    columns whenever those are semantically equivalent, so it has a pass
    that does the replacement, and that's what's broken ...

    regards, tom lane
  • Robert Haas at Jul 8, 2010 at 1:04 am

    On Wed, Jul 7, 2010 at 7:43 PM, Tom Lane wrote:
    Anyway, the way it works is that the parser generates "alias Vars" that
    refer to the join RTE, mainly because this makes life simpler for
    ruleutils.  But the planner prefers to work with the "real" underlying
    columns whenever those are semantically equivalent, so it has a pass
    that does the replacement, and that's what's broken ...
    Well, +1 from me for leaving the ruleutils as-is. I don't think we
    should go out of our way to generate join alias vars just on the off
    chance that there's a bug in the translation from join alias vars to
    plain ol' vars, and I agree with your statement upthread that
    qualification makes things more robust.

    I like robust.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedJul 7, '10 at 10:56p
activeJul 8, '10 at 1:04a
posts4
users2
websitepostgresql.org...
irc#postgresql

2 users in discussion

Tom Lane: 2 posts Robert Haas: 2 posts

People

Translate

site design / logo © 2022 Grokbase