When I do a query on a table with child tables on certain queries pg
uses indexes and on others it doesn't. Why does this happen? For example:


[local]:playpen=> explain analyze select * from vis where id > 10747 ;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Result (cost=4.29..115.11 rows=325 width=634) (actual
time=0.063..0.116 rows=5 loops=1)
-> Append (cost=4.29..115.11 rows=325 width=634) (actual
time=0.053..0.090 rows=5 loops=1)
-> Bitmap Heap Scan on vis (cost=4.29..23.11 rows=5
width=948) (actual time=0.051..0.058 rows=5 loops=1)
Recheck Cond: (id > 10747)
-> Bitmap Index Scan on vis_pkey (cost=0.00..4.29
rows=5 width=0) (actual time=0.037..0.037 rows=5 loops=1)
Index Cond: (id > 10747)
-> Seq Scan on vis_for_seg_1_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_4_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_66_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_69_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_79_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_80_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_82_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (id > 10747)
-> Seq Scan on vis_for_seg_87_2011_03 vis (cost=0.00..11.50
rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id > 10747)
Total runtime: 0.724 ms
(23 rows)

Time: 5.804 ms
[local]:playpen=> explain analyze select * from vis where id = 10747 ;

QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..74.41 rows=9 width=664) (actual time=0.060..0.503
rows=1 loops=1)
-> Append (cost=0.00..74.41 rows=9 width=664) (actual
time=0.053..0.493 rows=1 loops=1)
-> Index Scan using vis_pkey on vis (cost=0.00..8.27 rows=1
width=948) (actual time=0.051..0.055 rows=1 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_1_2011_03_pkey on
vis_for_seg_1_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.122..0.122 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_4_2011_03_pkey on
vis_for_seg_4_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.043..0.043 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_66_2011_03_pkey on
vis_for_seg_66_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.041..0.041 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_69_2011_03_pkey on
vis_for_seg_69_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.041..0.041 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_79_2011_03_pkey on
vis_for_seg_79_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.043..0.043 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_80_2011_03_pkey on
vis_for_seg_80_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.041..0.041 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_82_2011_03_pkey on
vis_for_seg_82_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.049..0.049 rows=0 loops=1)
Index Cond: (id = 10747)
-> Index Scan using vis_for_seg_87_2011_03_pkey on
vis_for_seg_87_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual
time=0.043..0.043 rows=0 loops=1)
Index Cond: (id = 10747)
Total runtime: 1.110 ms
(21 rows)

[local]:playpen=> select version();

version
------------------------------------------------------------------------------------------------------------
PostgreSQL 9.0.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-48), 32-bit
(1 row)

Search Discussions

  • Robert Haas at Apr 27, 2011 at 8:36 pm

    On Fri, Apr 1, 2011 at 2:41 AM, Joseph Shraibman wrote:
    When I do a query on a table with child tables on certain queries pg
    uses indexes and on others it doesn't. Why does this happen? For example:


    [local]:playpen=> explain analyze select * from vis where id > 10747 ;
    QUERY
    PLAN
    ----------------------------------------------------------------------------------------------------------------------------------------
    Result  (cost=4.29..115.11 rows=325 width=634) (actual
    time=0.063..0.116 rows=5 loops=1)
    ->  Append  (cost=4.29..115.11 rows=325 width=634) (actual
    time=0.053..0.090 rows=5 loops=1)
    ->  Bitmap Heap Scan on vis  (cost=4.29..23.11 rows=5
    width=948) (actual time=0.051..0.058 rows=5 loops=1)
    Recheck Cond: (id > 10747)
    ->  Bitmap Index Scan on vis_pkey  (cost=0.00..4.29
    rows=5 width=0) (actual time=0.037..0.037 rows=5 loops=1)
    Index Cond: (id > 10747)
    ->  Seq Scan on vis_for_seg_1_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_4_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_66_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_69_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_79_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_80_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_82_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1)
    Filter: (id > 10747)
    ->  Seq Scan on vis_for_seg_87_2011_03 vis  (cost=0.00..11.50
    rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1)
    Filter: (id > 10747)
    Total runtime: 0.724 ms
    (23 rows)

    Time: 5.804 ms
    [local]:playpen=> explain analyze select * from vis where id = 10747 ;

    QUERY
    PLAN

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Result  (cost=0.00..74.41 rows=9 width=664) (actual time=0.060..0.503
    rows=1 loops=1)
    ->  Append  (cost=0.00..74.41 rows=9 width=664) (actual
    time=0.053..0.493 rows=1 loops=1)
    ->  Index Scan using vis_pkey on vis  (cost=0.00..8.27 rows=1
    width=948) (actual time=0.051..0.055 rows=1 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_1_2011_03_pkey on
    vis_for_seg_1_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.122..0.122 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_4_2011_03_pkey on
    vis_for_seg_4_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.043..0.043 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_66_2011_03_pkey on
    vis_for_seg_66_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.041..0.041 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_69_2011_03_pkey on
    vis_for_seg_69_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.041..0.041 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_79_2011_03_pkey on
    vis_for_seg_79_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.043..0.043 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_80_2011_03_pkey on
    vis_for_seg_80_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.041..0.041 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_82_2011_03_pkey on
    vis_for_seg_82_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.049..0.049 rows=0 loops=1)
    Index Cond: (id = 10747)
    ->  Index Scan using vis_for_seg_87_2011_03_pkey on
    vis_for_seg_87_2011_03 vis  (cost=0.00..8.27 rows=1 width=629) (actual
    time=0.043..0.043 rows=0 loops=1)
    Index Cond: (id = 10747)
    Total runtime: 1.110 ms
    (21 rows)

    [local]:playpen=> select version();

    version
    ------------------------------------------------------------------------------------------------------------
    PostgreSQL 9.0.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
    20080704 (Red Hat 4.1.2-48), 32-bit
    (1 row)
    In the first case, PostgreSQL evidently thinks that using the indexes
    will be slower than just ignoring them. You could find out whether
    it's right by trying it with enable_seqscan=off.

    If it turns out that using the indexes really is better, then you
    probably want to adjust random_page_cost and seq_page_cost. The
    defaults assume a mostly-not-cached database, so if your database is
    heavily or completely cached you might need significantly lower
    values.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Joseph Shraibman at Apr 27, 2011 at 9:42 pm

    On 04/27/2011 04:32 PM, Robert Haas wrote:
    In the first case, PostgreSQL evidently thinks that using the indexes
    will be slower than just ignoring them. You could find out whether
    it's right by trying it with enable_seqscan=off.
    My point is that this is just a problem with inherited tables. It
    should be obvious to postgres that few rows are being returned, but in
    the inherited tables case it doesn't use indexes. This was just an
    example. In a 52 gig table I have a "select id from table limit 1 order
    by id desc" returns instantly, but as soon as you declare a child table
    it tries to seq scan all the tables.
  • Samuel Gendler at Apr 28, 2011 at 12:59 am

    On Wed, Apr 27, 2011 at 2:11 PM, Joseph Shraibman wrote:
    On 04/27/2011 04:32 PM, Robert Haas wrote:
    In the first case, PostgreSQL evidently thinks that using the indexes
    will be slower than just ignoring them. You could find out whether
    it's right by trying it with enable_seqscan=off.
    My point is that this is just a problem with inherited tables. It
    should be obvious to postgres that few rows are being returned, but in
    the inherited tables case it doesn't use indexes. This was just an
    example. In a 52 gig table I have a "select id from table limit 1 order
    by id desc" returns instantly, but as soon as you declare a child table
    it tries to seq scan all the tables.
    If I'm understanding correctly, this kind of obviates the utility of
    partitioning if you structure a warehouse in a traditional manner. Assuming
    a fact table partitioned by time, but with foreign keys to a time dimension,
    it is now not possible to gain any advantage from the partitioning if
    selecting on columns in the time dimension.

    "select * from fact_table f join time_dimension t on f.time_id = t.time_id
    where t.quarter=3 and t.year = 2010" will scan all partitions of the fact
    table despite the fact that all of the rows would come from 3 partitions,
    assuming a partitioning schema that uses one partition for each month.

    I use a time id that is calculable from the from the timestamp so it doesn't
    need to be looked up, and partitioning on time_id directly is easy enough to
    handle, but if I'm understanding the problem, it sounds like nothing short
    of computing the appropriate time ids before issuing the query and then
    including a 'where f.time_id between x and y' clause to the query will
    result in the partitions being correctly excluded. Is that what people are
    doing to solve this problem? The alternative is to leave a timestamp column
    in the fact table (something I tend to do since it makes typing ad-hoc
    queries in psql much easier) and partition on that column and then always
    include a where clause for that column that is at least as large as the
    requested row range. Both result in fairly ugly queries, though I can
    certainly see how I might structure my code to always build queries which
    adhere to this.

    I'm just in the process of designing a star schema for a project and was
    intending to use exactly the structure I described at the top of the email.
    Is there a postgres best-practices for solving this problem? There's no way
    I can get away without partitioning. I'm looking at a worst case table of
    100,000 rows being written every 5 minutes, 24x7 - 29 million rows per day,
    a billion rows per month - with most queries running over a single month or
    comparing same months from differing years and quarters - so a month based
    partitioning. Normal case is closer to 10K rows per 5 minutes.

    Suggestions?

    --sam
  • Greg Smith at Apr 28, 2011 at 2:18 am

    Joseph Shraibman wrote:
    In a 52 gig table I have a "select id from table limit 1 order
    by id desc" returns instantly, but as soon as you declare a child table
    it tries to seq scan all the tables.
    This is probably the limitation that's fixed in PostgreSQL 9.1 by this
    commit (following a few others leading up to it):
    http://archives.postgresql.org/pgsql-committers/2010-11/msg00028.php

    There was a good example showing what didn't work as expected before
    (along with an earlier patch that didn't everything the larger 9.1
    improvement does) at
    http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php ;
    "ORDER BY x DESC LIMIT 1" returns the same things as MAX(x).

    It's a pretty serious issue with the partitioning in earlier versions.
    I know of multiple people, myself included, who have been compelled to
    apply this change to an earlier version of PostgreSQL to make larger
    partitioned databases work correctly. The other option is to manually
    decompose the queries into ones that target each of the child tables
    individually, then combine the results, which is no fun either.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
    "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
  • Robert Haas at Apr 29, 2011 at 10:53 pm

    On Apr 27, 2011, at 11:11 PM, Joseph Shraibman wrote:
    On 04/27/2011 04:32 PM, Robert Haas wrote:
    In the first case, PostgreSQL evidently thinks that using the indexes
    will be slower than just ignoring them. You could find out whether
    it's right by trying it with enable_seqscan=off.
    My point is that this is just a problem with inherited tables. It
    should be obvious to postgres that few rows are being returned, but in
    the inherited tables case it doesn't use indexes. This was just an
    example. In a 52 gig table I have a "select id from table limit 1 order
    by id desc" returns instantly, but as soon as you declare a child table
    it tries to seq scan all the tables.
    Oh, sorry, I must have misunderstood. As Greg says, this is fixed in 9.1.

    ...Robert

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedApr 1, '11 at 12:53a
activeApr 29, '11 at 10:53p
posts6
users4
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase