FAQ
Hi,


I've got performance problem and while I'dont ready to describe it
I'd like to ask about strange explain:

tour=# explain analyze select * from tours where
( operator_id in (2,3,4,5,7) and type_id = 2 ) or
( operator_id = 8 and type_id=4 );

NOTICE: QUERY PLAN:

Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual time=0.26..0.26 rows=0 loops=1)
Total runtime: 0.45 msec

EXPLAIN

What does many 'type_idx' means ?

Theare are 2 indices - operator_idx and type_idx.


Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Search Discussions

  • Tom Lane at May 13, 2002 at 3:01 pm

    Oleg Bartunov writes:
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 ) or
    ( operator_id = 8 and type_id=4 );
    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual time=0.26..0.26 rows=0 loops=1)
    What does many 'type_idx' means ?
    Multiple indexscans.

    It looks to me like your WHERE clause is being flattened into

    ( operator_id = 2 and type_id=2 ) or
    ( operator_id = 3 and type_id=2 ) or
    ( operator_id = 4 and type_id=2 ) or
    ( operator_id = 5 and type_id=2 ) or
    ( operator_id = 7 and type_id=2 ) or
    ( operator_id = 8 and type_id=4 )

    and then it has a choice of repeated indexscans on operator_id or
    type_id. Depending on the selectivity stats it might pick either.
    You might find that a 2-column index on both would be a win.

    regards, tom lane
  • Oleg Bartunov at May 13, 2002 at 6:56 pm
    Thanks Tom,

    On Mon, 13 May 2002, Tom Lane wrote:

    Oleg Bartunov <oleg@sai.msu.su> writes:
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 ) or
    ( operator_id = 8 and type_id=4 );
    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual time=0.26..0.26 rows=0 loops=1)
    What does many 'type_idx' means ?
    Multiple indexscans.

    It looks to me like your WHERE clause is being flattened into

    ( operator_id = 2 and type_id=2 ) or
    ( operator_id = 3 and type_id=2 ) or
    ( operator_id = 4 and type_id=2 ) or
    ( operator_id = 5 and type_id=2 ) or
    ( operator_id = 7 and type_id=2 ) or
    ( operator_id = 8 and type_id=4 )
    this is what I assume.
    and then it has a choice of repeated indexscans on operator_id or
    type_id. Depending on the selectivity stats it might pick either.
    You might find that a 2-column index on both would be a win.
    Yes, we've went exactly this way.

    I'm very exited how planner could be smart. When I played with the query
    and specify different values of type_id I notice it's chose plans depends
    on is value exists or not.

    regards, tom lane
    Regards,
    Oleg
    _____________________________________________________________
    Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
    Sternberg Astronomical Institute, Moscow University (Russia)
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(095)939-16-83, +007(095)939-23-83
  • Rod Taylor at May 13, 2002 at 3:47 pm
    It appears it scanes the type_idx once per opereator. IN gets broken
    down into ORs

    Is this what the TODO entry 'Make IN / NOT IN have similar performance
    as EXISTS' means?
    --
    Rod
    ----- Original Message -----
    From: "Oleg Bartunov" <oleg@sai.msu.su>
    To: "Pgsql Hackers" <pgsql-hackers@postgresql.org>; "Tom Lane"
    <tgl@sss.pgh.pa.us>
    Sent: Monday, May 13, 2002 9:42 AM
    Subject: [HACKERS] strange explain

    Hi,


    I've got performance problem and while I'dont ready to describe it
    I'd like to ask about strange explain:

    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 ) or
    ( operator_id = 8 and type_id=4 );

    NOTICE: QUERY PLAN:

    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx,
    type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual
    time=0.26..0.26 rows=0 loops=1)
    Total runtime: 0.45 msec

    EXPLAIN

    What does many 'type_idx' means ?

    Theare are 2 indices - operator_idx and type_idx.


    Regards,
    Oleg
    _____________________________________________________________
    Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
    Sternberg Astronomical Institute, Moscow University (Russia)
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(095)939-16-83, +007(095)939-23-83


    ---------------------------(end of
    broadcast)---------------------------
    TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to
    majordomo@postgresql.org)
    >
  • Tom Lane at May 13, 2002 at 4:06 pm

    "Rod Taylor" <rbt@zort.ca> writes:
    Is this what the TODO entry 'Make IN / NOT IN have similar performance
    as EXISTS' means?
    No. The TODO item is talking about IN with a sub-SELECT, which is not
    optimized at all at the moment. IN with a list of scalar values is
    converted to ((x = value1) OR (x = value2) OR ...), which we can do
    something with.

    regards, tom lane
  • Oleg Bartunov at May 13, 2002 at 7:38 pm
    Tom,

    one more question.

    What's the difference for planner between 2 queries ?
    For the first query I have plain index scan, but multiple
    index scan for second.

    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 );
    NOTICE: QUERY PLAN:

    Index Scan using type_idx on tours (cost=0.00..2.03 rows=1 width=1091) (actual time=0.03..0.03 rows=0 loops=1)
    Total runtime: 0.16 msec

    EXPLAIN
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 4 ) or
    ( operator_id = 8 and type_id = 3);
    NOTICE: QUERY PLAN:

    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual time=0.27..0.27 rows=0 loops=1)
    Total runtime: 0.44 msec

    EXPLAIN



    On Mon, 13 May 2002, Tom Lane wrote:

    Oleg Bartunov <oleg@sai.msu.su> writes:
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 ) or
    ( operator_id = 8 and type_id=4 );
    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=0.00..12.25 rows=1 width=1091) (actual time=0.26..0.26 rows=0 loops=1)
    What does many 'type_idx' means ?
    Multiple indexscans.

    It looks to me like your WHERE clause is being flattened into

    ( operator_id = 2 and type_id=2 ) or
    ( operator_id = 3 and type_id=2 ) or
    ( operator_id = 4 and type_id=2 ) or
    ( operator_id = 5 and type_id=2 ) or
    ( operator_id = 7 and type_id=2 ) or
    ( operator_id = 8 and type_id=4 )

    and then it has a choice of repeated indexscans on operator_id or
    type_id. Depending on the selectivity stats it might pick either.
    You might find that a 2-column index on both would be a win.

    regards, tom lane

    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to majordomo@postgresql.org so that your
    message can get through to the mailing list cleanly
    Regards,
    Oleg
    _____________________________________________________________
    Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
    Sternberg Astronomical Institute, Moscow University (Russia)
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(095)939-16-83, +007(095)939-23-83
  • Tom Lane at May 13, 2002 at 7:09 pm

    Oleg Bartunov writes:
    What's the difference for planner between 2 queries ?
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 2 );
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 4 ) or
    ( operator_id = 8 and type_id = 3);
    The first one's already in normal form and doesn't need any more
    flattening. I believe the system will consider a multiple indexscan
    on operator_idx for it, but probably the cost estimator is concluding
    that that's a loser compared to one indexscan using type_id = 2.
    Without any info on the selectivity of these conditions it's hard to say
    whether that's a correct choice or not.

    regards, tom lane
  • Andreas Zeugswetter at May 14, 2002 at 6:39 am

    EXPLAIN
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 4 ) or
    ( operator_id = 8 and type_id = 3);
    NOTICE: QUERY PLAN:

    Index Scan using type_idx, type_idx, type_idx, type_idx, type_idx, type_idx on tours (cost=>
    0.00..12.25 rows=1 width=1091) (actual time=0.27..0.27 rows=0 loops=1)
    Total runtime: 0.44 msec
    Actually this plan looks very strange to me. One would expect it to only use
    type_idx twice (in lack of a better index (type_id, operator_id)).
    Fetch all rows with type_id=4 and then filter the result on operator_id in (...),
    then do type_id=3 and filter operator_id=8.
    Seems there is room for another performance improvement here :-)

    Andreas
  • Tom Lane at May 14, 2002 at 1:40 pm

    "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
    tour=# explain analyze select * from tours where
    ( operator_id in (2,3,4,5,7) and type_id = 4 ) or
    ( operator_id = 8 and type_id = 3);
    Actually this plan looks very strange to me. One would expect it to only use
    type_idx twice (in lack of a better index (type_id, operator_id)).
    Seems there is room for another performance improvement here :-)
    Yeah, this demonstrates that reducing the quals to canonical form isn't
    always the best thing to do.

    Or maybe we could just look for duplicate indexqual conditions at the
    end of the process?

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedMay 13, '02 at 2:17p
activeMay 14, '02 at 1:40p
posts9
users4
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase