Tom Lane wrote:
Dennis Haney <davh@diku.dk> writes:

I saw it as though convert_IN_to_join rewrote the query from


select a.* from tenk1 a where a.unique1 in
(select c.thousand from tenk1 c where c.hundred = 99);


to


select a.* from tenk1 a, tenk1 c where a.unique1 = c.thousand AND
c.hundred = 99;


but after looking at it, I've reached the conclusion that the rewrite is
to this instead:


select a.* from tenk1 a, (select d.thousand from tenk1 d where
d.hundred = 99) as c where a.unique1 = c.thousand;
Right. We do that, and then subsequently pull_up_subqueries transforms
it to the other representation. The reason for this two-step approach
is that the intermediate form is still a useful improvement if the
subquery can't be pulled up for some reason (e.g., it's got grouping).
With improvement I can see that it can be materialized and thus used as
a normal table in the planner. Is there any additional reasons that I
can't see?
But this limited optimization makes me wonder, why the limitation to
optimizing '='?
And why must the lefthand of the sublink be a variable of the upper query?

except the subselect is added as a range table entry instead of a
subselect in the from-list (not that I understand this particular part,
do you mind explaining?).
Same thing. Every entry in the from-list will have both an RTE and an
entry in the join tree. This representation is partly historical
(before we had outer joins, there was only the range table and no join
tree at all), but it is convenient for many purposes.
Then I don't understand why it gives two different execution plans? And
the Query* is totally different for the two, eg. there is no RTE for the
subquery in the first query:

davh=# explain select a.* from test1 a, (select num from test1 where id = 2) as b where a.num = b.num;
QUERY PLAN
------------------------------------------------------------------------------------
Hash Join (cost=4.83..29.94 rows=11 width=8)
Hash Cond: ("outer".num = "inner".num)
-> Seq Scan on test1 a (cost=0.00..20.00 rows=1000 width=8)
-> Hash (cost=4.82..4.82 rows=2 width=4)
-> Index Scan using test1_pkey on test1 (cost=0.00..4.82 rows=2 width=4)
Index Cond: (id = 2)
(6 rows)

davh=# explain select a.* from test1 a where a.num in (select num from test1 where id = 2);
QUERY PLAN
------------------------------------------------------------------------------------
Hash IN Join (cost=4.83..28.75 rows=6 width=8)
Hash Cond: ("outer".num = "inner".num)
-> Seq Scan on test1 a (cost=0.00..20.00 rows=1000 width=8)
-> Hash (cost=4.82..4.82 rows=2 width=4)
-> Index Scan using test1_pkey on test1 (cost=0.00..4.82 rows=2 width=4)
Index Cond: (id = 2)
(6 rows)

PS: this is a bit off-topic for pgsql-general, please pursue it on
-hackers if you have more questions.
ok


--
Dennis

Search Discussions

Discussion Posts

Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 1 of 7 | next ›
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedJan 23, '04 at 6:37p
activeJan 27, '04 at 5:28p
posts7
users3
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase