
planner costs in "warm cache" tests

Jesper Krogh
May 30, 2010 at 5:41 pm
Hi.

I'm trying to get the planner to do roughly the correct thing
when choosing between index scans on btree indices and
bitmap heap scans.

There have been several things going on in parallel. One
is that the statistics data is off:
http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/141420/focus=141735

The other is that the cost estimates (number of pages
to read) are inaccurate on gin indices:
http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php
for which a solution is also coming that I'm testing out.

I was trying to nullify the problem with the wrongly estimated number
of pages to read and see if "the rest" seems to work as expected.

The theory was that if I set "seq_page_cost" and "random_page_cost" to
something "really low" (0 is not permitted) and ran tests
on a fully cached query (where both costs indeed are "really low"),
then the "cheapest" plan should indeed also be the fastest one.
Let me know if the logic is flawed.
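The reasoning can be sketched with a toy additive cost model. This is my own illustration, not PostgreSQL's actual planner code; plan_cost and all the row/page counts below are made-up stand-ins:

```python
# Toy sketch: a plan's cost is page fetches weighted by page costs plus
# per-tuple and per-operator CPU charges. With page costs driven toward
# zero, the total collapses to the CPU terms, so the "cheapest" plan
# should be the one doing the least CPU work.
def plan_cost(seq_pages, random_pages, tuples_processed, operator_evals,
              seq_page_cost=1.0, random_page_cost=4.0,
              cpu_tuple_cost=0.01, cpu_operator_cost=0.0025):
    """Crude additive model in the spirit of the planner's (not its real code)."""
    return (seq_pages * seq_page_cost
            + random_pages * random_page_cost
            + tuples_processed * cpu_tuple_cost
            + operator_evals * cpu_operator_cost)

# Hypothetical plan A: walk an ordered index, filtering every row visited.
# Hypothetical plan B: bitmap-scan the matches, then sort the smaller set.
a = plan_cost(0, 50000, 50000, 50000, seq_page_cost=1e-5, random_page_cost=1e-5)
b = plan_cost(2000, 0, 28000, 28000, seq_page_cost=1e-5, random_page_cost=1e-5)
# With page costs at 1e-5, B comes out cheaper purely on CPU terms.
```

The expectation under test is that, with page costs nullified, the cost ordering of the plans should track their CPU work, and hence their warm-cache runtime.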

The test dataset is 1365462 documents, running pg9.0b1; both queries are run
twice to confirm that the data actually is fully cached as expected.

testdb=# set seq_page_cost = 0.00001;
SET
testdb=# set random_page_cost = 0.00001;
SET
testdb=# set enable_indexscan = on;
SET
testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..432.82 rows=200 width=11) (actual time=831.456..2167.302 rows=200 loops=1)
   ->  Index Scan using ref_acc_idx on reference  (cost=0.00..61408.12 rows=28376 width=11) (actual time=831.451..2166.434 rows=200 loops=1)
         Filter: (document_tsvector @@ to_tsquery('literature'::text))
 Total runtime: 2167.982 ms
(4 rows)

testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..432.82 rows=200 width=11) (actual time=842.990..2187.393 rows=200 loops=1)
   ->  Index Scan using ref_acc_idx on reference  (cost=0.00..61408.12 rows=28376 width=11) (actual time=842.984..2186.540 rows=200 loops=1)
         Filter: (document_tsvector @@ to_tsquery('literature'::text))
 Total runtime: 2188.083 ms
(4 rows)

testdb=# set enable_indexscan = off;
SET
testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;

                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2510.68..2511.18 rows=200 width=11) (actual time=270.016..270.918 rows=200 loops=1)
   ->  Sort  (cost=2510.68..2581.62 rows=28376 width=11) (actual time=270.011..270.321 rows=200 loops=1)
         Sort Key: accession_number
         Sort Method:  top-N heapsort  Memory: 34kB
         ->  Bitmap Heap Scan on reference  (cost=219.94..1284.29 rows=28376 width=11) (actual time=13.897..216.700 rows=21613 loops=1)
               Recheck Cond: (document_tsvector @@ to_tsquery('literature'::text))
               ->  Bitmap Index Scan on reference_fts_idx  (cost=0.00..212.85 rows=28376 width=0) (actual time=10.053..10.053 rows=21613 loops=1)
                     Index Cond: (document_tsvector @@ to_tsquery('literature'::text))
 Total runtime: 271.323 ms
(9 rows)

testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;

                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2510.68..2511.18 rows=200 width=11) (actual time=269.881..270.782 rows=200 loops=1)
   ->  Sort  (cost=2510.68..2581.62 rows=28376 width=11) (actual time=269.876..270.182 rows=200 loops=1)
         Sort Key: accession_number
         Sort Method:  top-N heapsort  Memory: 34kB
         ->  Bitmap Heap Scan on reference  (cost=219.94..1284.29 rows=28376 width=11) (actual time=14.113..216.173 rows=21613 loops=1)
               Recheck Cond: (document_tsvector @@ to_tsquery('literature'::text))
               ->  Bitmap Index Scan on reference_fts_idx  (cost=0.00..212.85 rows=28376 width=0) (actual time=10.360..10.360 rows=21613 loops=1)
                     Index Cond: (document_tsvector @@ to_tsquery('literature'::text))
 Total runtime: 271.533 ms
(9 rows)


So in the situation where I have tried to "nullify" the actual
disc cost, hopefully leaving only the cpu and other costs, and running
the query in fully cached mode (two runs), the bitmap heap scan
is still hugely favorable in actual runtime (which isn't that much of a
surprise), but it seems strange that the index scan is still favored
in the cost calculations?

I have tried altering the cost of ts_match_vq, but even setting it to
1000 does not change the overall picture.
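For reference, the knob being turned looks like this. This is a sketch of the syntax I used; ts_match_vq is the support function behind @@ for tsvector, but verify the exact signature in pg_proc for your version:

```sql
-- Raise the assumed per-call CPU cost of the @@ support function.
-- COST is expressed in multiples of cpu_operator_cost.
ALTER FUNCTION ts_match_vq(tsvector, tsquery) COST 1000;
```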

Is the approach simply too naive?

--
Jesper

  • Tom Lane at May 30, 2010 at 6:35 pm

    Jesper Krogh writes:
    testdb=# set seq_page_cost = 0.00001;
    SET
    testdb=# set random_page_cost = 0.00001;
    SET
    Well, hmm, I really doubt that that represents reality either. A page
    access is by no means "free" even when the page is already in cache.
    I don't recall anyone suggesting that you set these numbers to less
    than perhaps 0.01.

    In the case at hand, the problem is that the planner is preferring
    an indexscan over an after-the-fact sort to obtain the specified result
    ordering. Making page fetches look too cheap definitely plays into
    that. There may also be a statistical problem, if the location of the
    desired records isn't independent of the accession_number ordering, but
    you're not doing yourself any favors by pushing the planner cost
    parameters several orders of magnitude outside the design envelope.

    regards, tom lane
  • Jesper Krogh at May 31, 2010 at 6:48 pm

    On 2010-05-30 20:34, Tom Lane wrote:
    Jesper Krogh<jesper@krogh.cc> writes:
    testdb=# set seq_page_cost = 0.00001;
    SET
    testdb=# set random_page_cost = 0.00001;
    SET
    Well, hmm, I really doubt that that represents reality either. A page
    access is by no means "free" even when the page is already in cache.
    I don't recall anyone suggesting that you set these numbers to less
    than perhaps 0.01.
    Thank you for the prompt response. Is it a "false assumption" that the
    cost should, in some metric comparable between different plans, be a
    measurement of actual run-time in a dead-disk run?

    It should most likely match a typical workload situation, but that
    is really hard to tell anything about, so my "feeling" would be that the
    dead-disk case is the one closest?

    --
    Jesper
  • Tom Lane at May 31, 2010 at 7:55 pm

    Jesper Krogh writes:
    On 2010-05-30 20:34, Tom Lane wrote:
    Well, hmm, I really doubt that that represents reality either. A page
    access is by no means "free" even when the page is already in cache.
    I don't recall anyone suggesting that you set these numbers to less
    than perhaps 0.01.
    Thank you for the prompt response. Is it a "false assumption" that the
    cost should in some metric between different plans be a measurement
    of actual run-time in a dead-disk run?
    Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
    are intended to model the non-cached state where most page fetches
    actually do require a disk access. They are definitely too large
    relative to the cpu_xxx_cost parameters when you have a fully-cached
    database, but what I've seen people recommending for that condition
    is to set them both to the same value in the vicinity of 0.1 or 0.01
    or so. If it's only mostly cached you might try intermediate settings.

    regards, tom lane
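    The advice above translates to something like the following. The exact
    values are only the ballpark figures mentioned, not tuned numbers:

    ```sql
    -- Starting point for a fully cached database, per the advice above.
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;
    -- If the database is only mostly cached, try values between these
    -- and the defaults (seq_page_cost = 1, random_page_cost = 4).
    ```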
  • Scott Carey at Jun 1, 2010 at 7:15 am
    It is still best to have random_page_cost slightly larger (~50%) than seq_page_cost, because even when entirely in RAM, sequential reads are faster than random reads. Today's CPUs do memory prefetching on sequential access. Perhaps try something like 0.3 and 0.2, or half that. You still don't want it to gratuitously scan a lot of RAM -- reading a page is not free and can kick out other pages from shared_buffers.
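    As a sketch, the suggested ratio would look like this (illustrative
    values only; note the objection in the next reply):

    ```sql
    -- Keep random slightly above sequential even for in-RAM data.
    SET seq_page_cost = 0.2;
    SET random_page_cost = 0.3;
    ```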

    On May 31, 2010, at 12:55 PM, Tom Lane wrote:

    Jesper Krogh <jesper@krogh.cc> writes:
    On 2010-05-30 20:34, Tom Lane wrote:
    Well, hmm, I really doubt that that represents reality either. A page
    access is by no means "free" even when the page is already in cache.
    I don't recall anyone suggesting that you set these numbers to less
    than perhaps 0.01.
    Thank you for the prompt response. Is it a "false assumption" that the
    cost should in some metric between different plans be a measurement
    of actual run-time in a dead-disk run?
    Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
    are intended to model the non-cached state where most page fetches
    actually do require a disk access. They are definitely too large
    relative to the cpu_xxx_cost parameters when you have a fully-cached
    database, but what I've seen people recommending for that condition
    is to set them both to the same value in the vicinity of 0.1 or 0.01
    or so. If it's only mostly cached you might try intermediate settings.

    regards, tom lane

    --
    Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
    To make changes to your subscription:
    http://www.postgresql.org/mailpref/pgsql-performance
  • Tom Lane at Jun 1, 2010 at 2:05 pm

    Scott Carey writes:
    It is still best to have random_page_cost to be slightly larger (~50%)
    than sequential_page_cost, because even when entirely in RAM,
    sequential reads are faster than random reads. Today's CPU's do
    memory prefetching on sequential access.
    Do you have any actual evidence of that? Because I don't believe it.
    Neither PG nor any kernel that I've ever heard of makes any effort to
    ensure that logically sequential blocks occupy physically sequential
    buffers, so even if the CPU tries to do some prefetching, it's not
    going to help at all.

    Now, if the database isn't entirely cached, then indeed it's a good
    idea to keep random_page_cost higher than seq_page_cost. But that's
    because of the actual disk fetches, not anything that happens in RAM.

    regards, tom lane
  • Robert Haas at Jun 5, 2010 at 1:29 am

    On Mon, May 31, 2010 at 3:55 PM, Tom Lane wrote:
    Jesper Krogh <jesper@krogh.cc> writes:
    On 2010-05-30 20:34, Tom Lane wrote:
    Well, hmm, I really doubt that that represents reality either.  A page
    access is by no means "free" even when the page is already in cache.
    I don't recall anyone suggesting that you set these numbers to less
    than perhaps 0.01.
    Thank you for the prompt response. Is it a "false assumption" that the
    cost should in some metric between different plans be a measurement
    of actual run-time in a dead-disk run?
    Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
    are intended to model the non-cached state where most page fetches
    actually do require a disk access.  They are definitely too large
    relative to the cpu_xxx_cost parameters when you have a fully-cached
    database, but what I've seen people recommending for that condition
    is to set them both to the same value in the vicinity of 0.1 or 0.01
    or so.  If it's only mostly cached you might try intermediate settings.
    I have had to set it as low as .005 to get the right things to happen.
    Could have been a fluke, I suppose.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
