Hi all. I have a strange (and serious) problem with an application
ported from postgres 8.0 to 8.1.

The old installation is postgresql-8.0.4-2.FC4.1 running on Fedora 4,
the new one is postgresql-8.1.4-1.FC5.1 running on Fedora 5.

Some queries are now _very_ slow. I've found some significant differences
between the query plans.

As an example, the query is:

select count(*) from orario_ap join registrazioni using(treno, data)
join personale using(personale_id) join ruoli using(ruolo_id)
where data=today_orario();

orario_ap is a view.

On 8.0 the query runs in 138.146 ms
On 8.1 the query runs in 6761.112 ms
On 8.1 with nested loops disabled: 63.184 ms

This is not the only query affected.

Two notes: please CC answers directly to me, and please excuse my
English, which is still in alpha.


On 8.0 the plan is:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_orario();

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1106.77..1106.77 rows=1 width=0) (actual time=137.786..137.787 rows=1 loops=1)
   ->  Merge Join  (cost=1088.96..1105.66 rows=444 width=0) (actual time=124.173..137.190 rows=349 loops=1)
         Merge Cond: (("outer".tipo_treno = "inner".tipo_treno) AND ("outer".num_treno = "inner".num_treno) AND ("outer".orario = "inner".orario))
         ->  Sort  (cost=574.10..575.09 rows=395 width=26) (actual time=97.647..98.010 rows=349 loops=1)
               Sort Key: o1.tipo_treno, o1.num_treno, o1.orario
               ->  Hash Join  (cost=28.45..557.06 rows=395 width=26) (actual time=35.326..93.415 rows=349 loops=1)
                     Hash Cond: ("outer".ruolo_id = "inner".ruolo_id)
                     ->  Hash Join  (cost=27.41..550.10 rows=395 width=30) (actual time=12.827..69.411 rows=349 loops=1)
                           Hash Cond: ("outer".personale_id = "inner".personale_id)
                           ->  Hash Join  (cost=12.85..529.61 rows=395 width=34) (actual time=10.453..65.365 rows=349 loops=1)
                                 Hash Cond: ("outer".treno = "inner".treno)
                                 ->  Seq Scan on orario o1  (cost=0.00..504.38 rows=843 width=33) (actual time=3.691..57.487 rows=797 loops=1)
                                       Filter: ((seq_fermata = 1) AND (data = date((now() - '02:00:00'::interval))))
                                 ->  Hash  (cost=11.98..11.98 rows=349 width=19) (actual time=2.665..2.665 rows=0 loops=1)
                                       ->  Seq Scan on registrazioni  (cost=0.00..11.98 rows=349 width=19) (actual time=0.029..2.042 rows=349 loops=1)
                                             Filter: (date((now() - '02:00:00'::interval)) = data)
                           ->  Hash  (cost=12.85..12.85 rows=685 width=4) (actual time=2.350..2.350 rows=0 loops=1)
                                 ->  Seq Scan on personale  (cost=0.00..12.85 rows=685 width=4) (actual time=0.005..1.350 rows=685 loops=1)
                     ->  Hash  (cost=1.03..1.03 rows=3 width=4) (actual time=22.479..22.479 rows=0 loops=1)
                           ->  Seq Scan on ruoli  (cost=0.00..1.03 rows=3 width=4) (actual time=22.461..22.468 rows=3 loops=1)
         ->  Sort  (cost=514.86..516.94 rows=831 width=26) (actual time=26.493..27.490 rows=949 loops=1)
               Sort Key: o2.tipo_treno, o2.num_treno, o2.orario
               ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=831 width=26) (actual time=0.056..17.398 rows=797 loops=1)
                     Filter: ((orario_partenza IS NULL) AND (date((now() - '02:00:00'::interval)) = data))
 Total runtime: 138.146 ms



On a standard 8.1 the plan is:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_orario();

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=500.45..500.46 rows=1 width=0) (actual time=6760.876..6760.877 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..500.44 rows=1 width=0) (actual time=5.915..6759.550 rows=349 loops=1)
         Join Filter: (("outer".orario = "inner".orario) AND ("outer".num_treno = "inner".num_treno) AND ("outer".tipo_treno = "inner".tipo_treno))
         ->  Nested Loop  (cost=0.00..25.87 rows=1 width=72) (actual time=0.124..42.617 rows=349 loops=1)
               ->  Nested Loop  (cost=0.00..20.15 rows=1 width=76) (actual time=0.106..34.330 rows=349 loops=1)
                     ->  Nested Loop  (cost=0.00..14.12 rows=1 width=40) (actual time=0.045..12.037 rows=349 loops=1)
                           Join Filter: ("outer".ruolo_id = "inner".ruolo_id)
                           ->  Seq Scan on registrazioni  (cost=0.00..11.98 rows=2 width=44) (actual time=0.025..2.315 rows=349 loops=1)
                                 Filter: (date((now() - '02:00:00'::interval)) = data)
                           ->  Seq Scan on ruoli  (cost=0.00..1.03 rows=3 width=4) (actual time=0.003..0.009 rows=3 loops=349)
                     ->  Index Scan using orario_pkey on orario o1  (cost=0.00..6.02 rows=1 width=104) (actual time=0.053..0.056 rows=1 loops=349)
                           Index Cond: ((o1.treno = "outer".treno) AND (o1.seq_fermata = 1))
                           Filter: (data = date((now() - '02:00:00'::interval)))
               ->  Index Scan using personale_pkey on personale  (cost=0.00..5.71 rows=1 width=4) (actual time=0.013..0.017 rows=1 loops=349)
                     Index Cond: ("outer".personale_id = personale.personale_id)
         ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=1 width=72) (actual time=0.030..17.784 rows=797 loops=349)
               Filter: ((orario_partenza IS NULL) AND (date((now() - '02:00:00'::interval)) = data))
 Total runtime: 6761.112 ms


On 8.1 with nested loops disabled:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_orario();

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=802.82..802.83 rows=1 width=0) (actual time=62.309..62.310 rows=1 loops=1)
   ->  Hash Join  (cost=328.23..802.82 rows=1 width=0) (actual time=44.443..61.867 rows=349 loops=1)
         Hash Cond: (("outer".orario = "inner".orario) AND ("outer".num_treno = "inner".num_treno) AND ("outer".tipo_treno = "inner".tipo_treno))
         ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=1 width=72) (actual time=0.068..16.558 rows=797 loops=1)
               Filter: ((orario_partenza IS NULL) AND (date((now() - '02:00:00'::interval)) = data))
         ->  Hash  (cost=328.22..328.22 rows=1 width=72) (actual time=38.479..38.479 rows=349 loops=1)
               ->  Hash Join  (cost=29.33..328.22 rows=1 width=72) (actual time=6.700..37.530 rows=349 loops=1)
                     Hash Cond: ("outer".treno = "inner".treno)
                     ->  Index Scan using orario_pkey on orario o1  (cost=0.00..298.88 rows=1 width=104) (actual time=0.069..29.033 rows=797 loops=1)
                           Index Cond: (seq_fermata = 1)
                           Filter: (data = date((now() - '02:00:00'::interval)))
                     ->  Hash  (cost=29.32..29.32 rows=1 width=36) (actual time=6.595..6.595 rows=349 loops=1)
                           ->  Hash Join  (cost=13.04..29.32 rows=1 width=36) (actual time=3.361..5.887 rows=349 loops=1)
                                 Hash Cond: ("outer".personale_id = "inner".personale_id)
                                 ->  Seq Scan on personale  (cost=0.00..12.85 rows=685 width=4) (actual time=0.013..1.098 rows=685 loops=1)
                                 ->  Hash  (cost=13.04..13.04 rows=1 width=40) (actual time=3.301..3.301 rows=349 loops=1)
                                       ->  Hash Join  (cost=1.04..13.04 rows=1 width=40) (actual time=0.090..2.602 rows=349 loops=1)
                                             Hash Cond: ("outer".ruolo_id = "inner".ruolo_id)
                                             ->  Seq Scan on registrazioni  (cost=0.00..11.98 rows=2 width=44) (actual time=0.025..1.465 rows=349 loops=1)
                                                   Filter: (date((now() - '02:00:00'::interval)) = data)
                                             ->  Hash  (cost=1.03..1.03 rows=3 width=4) (actual time=0.040..0.040 rows=3 loops=1)
                                                   ->  Seq Scan on ruoli  (cost=0.00..1.03 rows=3 width=4) (actual time=0.014..0.025 rows=3 loops=1)
 Total runtime: 63.184 ms


Regards,
Gabriele


  • Michael Fuhr at Jul 15, 2006 at 7:02 pm

    On Sat, Jul 15, 2006 at 04:14:11PM +0200, Gabriele Turchi wrote:
    Hi all. I have a strange (and serious) problem with an application
    ported from postgres 8.0 to 8.1.

    The old installation is postgresql-8.0.4-2.FC4.1 running on Fedora 4,
    the new one is postgresql-8.1.4-1.FC5.1 running on Fedora 5.

    Some queries are now _very_ slow. I've found some significant
    differences between the query plans.

    Have you run ANALYZE in 8.1? Some of the row count estimates in
    the 8.1 plan differ significantly from the actual number of rows
    returned, while in the 8.0 plan the estimates are accurate. For
    example, in one case the 8.0 plan shows 349 rows estimated, 349
    rows returned:

    -> Seq Scan on registrazioni (cost=0.00..11.98 rows=349 width=19) (actual time=0.029..2.042 rows=349 loops=1)
    Filter: (date((now() - '02:00:00'::interval)) = data)

    but the 8.1 plan shows 2 rows estimated, 349 rows returned:

    -> Seq Scan on registrazioni (cost=0.00..11.98 rows=2 width=44) (actual time=0.025..2.315 rows=349 loops=1)
    Filter: (date((now() - '02:00:00'::interval)) = data)

    This suggests that the 8.1 statistics are out of date, possibly
    because ANALYZE or VACUUM ANALYZE hasn't been run since the data
    was loaded. Try running ANALYZE in 8.1 and post the new plans if
    that doesn't help.
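
In psql this is a one-liner; a bare ANALYZE with no table name updates the statistics for every table in the database:

```sql
-- Refresh planner statistics for just the problem table...
ANALYZE registrazioni;

-- ...or for the whole database
ANALYZE;
```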

    --
    Michael Fuhr
  • Gabriele Turchi at Jul 15, 2006 at 7:55 pm

    On Sat, 15/07/2006 at 13:02 -0600, Michael Fuhr wrote:
    On Sat, Jul 15, 2006 at 04:14:11PM +0200, Gabriele Turchi wrote:
    Hi all. I have a strange (and serious) problem with an application
    ported from postgres 8.0 to 8.1.

    The old installation is postgresql-8.0.4-2.FC4.1 running on Fedora 4,
    the new one is postgresql-8.1.4-1.FC5.1 running on Fedora 5.

    Some queries are now _very_ slow. I've found some significant
    differences between the query plans.

    Have you run ANALYZE in 8.1? Some of the row count estimates in
    the 8.1 plan differ significantly from the actual number of rows
    returned, while in the 8.0 plan the estimates are accurate. For

    Running ANALYZE really changed the plan; now it is as fast as before
    (8.0).

    On the production system a VACUUM FULL ANALYZE is run every morning
    after a clean-up, when the "registrazioni" table is empty. During the
    day this table fills up (about 500 records per day), and apparently
    performance falls off very quickly. This behaviour has not changed
    between the old and the new installation.

    Can you suggest an easy way to collect and keep these statistics up
    to date in a very low-impact way?

    I'm stunned that such a small difference in record count can cause
    such a big difference in execution time...
    example, in one case the 8.0 plan shows 349 rows estimated, 349
    rows returned:

    -> Seq Scan on registrazioni (cost=0.00..11.98 rows=349 width=19) (actual time=0.029..2.042 rows=349 loops=1)
    Filter: (date((now() - '02:00:00'::interval)) = data)

    but the 8.1 plan shows 2 rows estimated, 349 rows returned:

    -> Seq Scan on registrazioni (cost=0.00..11.98 rows=2 width=44) (actual time=0.025..2.315 rows=349 loops=1)
    Filter: (date((now() - '02:00:00'::interval)) = data)

    This suggests that the 8.1 statistics are out of date, possibly
    because ANALYZE or VACUUM ANALYZE hasn't been run since the data
    was loaded. Try running ANALYZE in 8.1 and post the new plans if
    that doesn't help.
    Thank you very much,
    Gabriele
  • Joe Conway at Jul 15, 2006 at 8:04 pm

    Gabriele Turchi wrote:
    Running ANALYZE really changed the plan; now it is as fast as before
    (8.0).

    On the production system a VACUUM FULL ANALYZE is run every morning
    after a clean-up, when the "registrazioni" table is empty. During the
    day this table fills up (about 500 records per day), and apparently
    performance falls off very quickly. This behaviour has not changed
    between the old and the new installation.

    Can you suggest an easy way to collect and keep these statistics up
    to date in a very low-impact way?
    Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
    during the day. This will only update the statistics, and should be very
    low impact.
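
A sketch of that approach as a crontab fragment (the database name railcomm04 comes from the prompts above; the hours and connection details are assumptions to adapt):

```shell
# crontab: hourly during working hours, refresh stats on the
# fast-growing table (assumes psql can connect non-interactively)
0 7-22 * * *  psql -d railcomm04 -c 'ANALYZE registrazioni;'
```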

    HTH,

    Joe
  • Gabriele Turchi at Jul 15, 2006 at 8:23 pm

    On Sat, 15/07/2006 at 13:04 -0700, Joe Conway wrote:
    Gabriele Turchi wrote:
    Running ANALYZE really changed the plan; now it is as fast as before
    (8.0).

    On the production system a VACUUM FULL ANALYZE is run every morning
    after a clean-up, when the "registrazioni" table is empty. During the
    day this table fills up (about 500 records per day), and apparently
    performance falls off very quickly. This behaviour has not changed
    between the old and the new installation.

    Can you suggest an easy way to collect and keep these statistics up
    to date in a very low-impact way?
    Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
    during the day. This will only update the statistics, and should be very
    low impact.

    This is my "solution" too... but is it enough? Or is there a better
    way to do it? If the best case is 50 times faster than the worst
    case, how far can performance fall during the hour (50-100 records
    inserted into "registrazioni") before the next "ANALYZE" is run? On
    the other hand, can running ANALYZE more frequently hurt the overall
    performance?

    Can such a big difference in Postgres performance be considered a
    bug, or over-optimization in the planner? And why is the 8.0 version
    (at least apparently) not affected?
    HTH,

    Joe

    ---------------------------(end of broadcast)---------------------------
    TIP 6: explain analyze is your friend
    Thank you all very much,
    Gabriele
  • Joe Conway at Jul 16, 2006 at 6:08 pm

    Gabriele Turchi wrote:
    On Sat, 15/07/2006 at 13:04 -0700, Joe Conway wrote:
    Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
    during the day. This will only update the statistics, and should be very
    low impact.

    This is my "solution" too... but is it enough? Or is there a better
    way to do it? If the best case is 50 times faster than the worst
    case, how far can performance fall during the hour (50-100 records
    inserted into "registrazioni") before the next "ANALYZE" is run? On
    the other hand, can running ANALYZE more frequently hurt the overall
    performance?
    One thing I noticed is that in both plans there is a seq scan on
    registrazioni. Given that performance degrades so quickly as records are
    inserted into registrazioni, I'm wondering if you're missing an index.
    What indexes do you have on registrazioni?

    Joe
  • Gabriele Turchi at Jul 18, 2006 at 11:29 am

    On Sun, 16/07/2006 at 11:08 -0700, Joe Conway wrote:
    Gabriele Turchi wrote:
    On Sat, 15/07/2006 at 13:04 -0700, Joe Conway wrote:
    Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
    during the day. This will only update the statistics, and should be very
    low impact.

    This is my "solution" too... but is it enough? Or is there a better
    way to do it? If the best case is 50 times faster than the worst
    case, how far can performance fall during the hour (50-100 records
    inserted into "registrazioni") before the next "ANALYZE" is run? On
    the other hand, can running ANALYZE more frequently hurt the overall
    performance?
    One thing I noticed is that in both plans there is a seq scan on
    registrazioni. Given that performance degrades so quickly as records are
    inserted into registrazioni, I'm wondering if you're missing an index.
    What indexes do you have on registrazioni?

    Joe

    None. The application was not fine-tuned, because the original
    performance (under 8.0) was "more than enough". I thought that
    creating an index on a table with no more than a few hundred records
    would not be useful...
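
For reference, the filter the plans apply to "registrazioni" is on the "data" column, so the index Joe is hinting at would presumably be something like this (the index name is illustrative):

```sql
-- Hypothetical index matching the per-day filter seen in the plans:
-- Filter: (date((now() - '02:00:00'::interval)) = data)
CREATE INDEX registrazioni_data_idx ON registrazioni (data);
```

With only a few hundred rows the planner may still prefer a sequential scan, but the index costs little and covers the case where the table grows.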

    My biggest doubt, anyway, is about the very big difference between
    the plans in 8.0 and 8.1 under the same conditions.

    Thank you,
    Gabriele
  • Chris Hoover at Jul 17, 2006 at 8:19 pm
    Just turn on autovacuum on your 8.1 database. You can tune the vacuum
    and autovacuum parameters to minimize the impact on your system. This
    is the optimal route, since PG will then maintain the tables for you
    as needed.
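
A sketch of the postgresql.conf settings involved (parameter names as in 8.1, where autovacuum is off by default and depends on the row-level statistics collector; the values here are only illustrative):

```
# postgresql.conf -- 8.1: autovacuum requires the stats collector
stats_start_collector = on
stats_row_level = on

autovacuum = on
autovacuum_naptime = 60                 # seconds between checks
autovacuum_analyze_threshold = 50       # min tuple changes before ANALYZE
autovacuum_analyze_scale_factor = 0.1   # plus 10% of the table size
```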

    HTH,

    Chris
    On 7/15/06, Gabriele Turchi wrote:

    On Sat, 15/07/2006 at 13:04 -0700, Joe Conway wrote:
    Gabriele Turchi wrote:
    Running ANALYZE really changed the plan; now it is as fast as before
    (8.0).

    On the production system a VACUUM FULL ANALYZE is run every morning
    after a clean-up, when the "registrazioni" table is empty. During the
    day this table fills up (about 500 records per day), and apparently
    performance falls off very quickly. This behaviour has not changed
    between the old and the new installation.

    Can you suggest an easy way to collect and keep these statistics up
    to date in a very low-impact way?
    Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
    during the day. This will only update the statistics, and should be very
    low impact.

    This is my "solution" too... but is it enough? Or is there a better
    way to do it? If the best case is 50 times faster than the worst
    case, how far can performance fall during the hour (50-100 records
    inserted into "registrazioni") before the next "ANALYZE" is run? On
    the other hand, can running ANALYZE more frequently hurt the overall
    performance?

    Can such a big difference in Postgres performance be considered a
    bug, or over-optimization in the planner? And why is the 8.0 version
    (at least apparently) not affected?
    HTH,

    Joe

    Thank you all very much,
    Gabriele




Discussion Overview
group: pgsql-performance
categories: postgresql
posted: Jul 15, '06 at 2:14p
active: Jul 18, '06 at 11:29a
posts: 8
users: 4
website: postgresql.org
irc: #postgresql

People

Translate

site design / logo © 2022 Grokbase