Hi all!

I have a strange behavior with this query:

SELECT c.id_contenido,p.fecha_publicacion,c.titulo_esp,c.activo,c.activo,s.label_esp as label_sbc,p.orden
,p.tapa_spc,p.tapa_cat,p.tapa_principal,p.id_publicacion,ca.label_esp as label_cat,sp.label_esp as label_spc
FROM cont_contenido c ,cont_publicacion p ,cont_sbc s ,cont_cat ca ,cont_spc sp
WHERE c.id_instalacion = 2
AND s.id_instalacion = 2
AND p.id_instalacion = 2
AND c.id_contenido = p.id_contenido
AND c.id_sbc = s.id_sbc
--AND (c.activo = 'S' or c.activo = 's')
--AND (s.activo = 'S' or s.activo = 's')
AND upper(c.activo) = 'S'
AND upper(s.activo) = 'S'
AND ca.id_instalacion = 2
AND sp.id_instalacion = 2
AND ca.id_cat = s.id_cat
AND sp.id_spc = ca.id_spc
ORDER BY sp.label_esp ,ca.label_esp ,p.orden

This is the execution plan:
Sort (cost=128.81..128.83 rows=5 width=189)
Sort Key: sp.label_esp, ca.label_esp, p.orden
-> Nested Loop (cost=0.00..128.76 rows=5 width=189)
Join Filter: ("outer".id_contenido = "inner".id_contenido)
-> Nested Loop (cost=0.00..24.70 rows=1 width=134)
Join Filter: ("inner".id_spc = "outer".id_spc)
-> Nested Loop (cost=0.00..22.46 rows=1 width=111)
-> Nested Loop (cost=0.00..6.89 rows=1 width=68)
Join Filter: ("inner".id_cat = "outer".id_cat)
-> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=1 width=35)
Filter: ((id_instalacion = 2::numeric) AND (upper((activo)::text) = 'S'::text))
-> Seq Scan on cont_cat ca (cost=0.00..2.31 rows=11 width=33)
Filter: (id_instalacion = 2::numeric)
-> Index Scan using cont_cont_cont_sbc_fk_i on cont_contenido c (cost=0.00..15.56 rows=1 width=43)
Index Cond: ((c.id_instalacion = 2::numeric) AND (c.id_sbc = "outer".id_sbc))
Filter: (upper((activo)::text) = 'S'::text)
-> Seq Scan on cont_spc sp (cost=0.00..2.16 rows=6 width=23)
Filter: (id_instalacion = 2::numeric)
-> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
Filter: (id_instalacion = 2::numeric)

If I replace both "uppers" with "...= 'S' or ...= 's'":

SELECT c.id_contenido,p.fecha_publicacion,c.titulo_esp,c.activo,c.activo,s.label_esp as label_sbc,p.orden
,p.tapa_spc,p.tapa_cat,p.tapa_principal,p.id_publicacion,ca.label_esp as label_cat,sp.label_esp as label_spc
FROM cont_contenido c ,cont_publicacion p ,cont_sbc s ,cont_cat ca ,cont_spc sp
WHERE c.id_instalacion = 2
AND s.id_instalacion = 2
AND p.id_instalacion = 2
AND c.id_contenido = p.id_contenido
AND c.id_sbc = s.id_sbc
AND (c.activo = 'S' or c.activo = 's')
AND (s.activo = 'S' or s.activo = 's')
AND ca.id_instalacion = 2
AND sp.id_instalacion = 2
AND ca.id_cat = s.id_cat
AND sp.id_spc = ca.id_spc
ORDER BY sp.label_esp ,ca.label_esp ,p.orden

This is the Execution plan:

Sort (cost=193.98..194.62 rows=256 width=189)
Sort Key: sp.label_esp, ca.label_esp, p.orden
-> Merge Join (cost=178.07..183.75 rows=256 width=189)
Merge Cond: ("outer".id_contenido = "inner".id_contenido)
-> Sort (cost=60.11..60.25 rows=56 width=134)
Sort Key: c.id_contenido
-> Merge Join (cost=57.31..58.50 rows=56 width=134)
Merge Cond: ("outer".id_sbc = "inner".id_sbc)
-> Sort (cost=10.60..10.64 rows=15 width=91)
Sort Key: s.id_sbc
-> Merge Join (cost=10.00..10.32 rows=15 width=91)
Merge Cond: ("outer".id_cat = "inner".id_cat)
-> Sort (cost=5.10..5.12 rows=10 width=56)
Sort Key: ca.id_cat
-> Merge Join (cost=4.74..4.94 rows=10 width=56)
Merge Cond: ("outer".id_spc = "inner".id_spc)
-> Sort (cost=2.50..2.53 rows=11 width=33)
Sort Key: ca.id_spc
-> Seq Scan on cont_cat ca (cost=0.00..2.31 rows=11 width=33)
Filter: (id_instalacion = 2::numeric)
-> Sort (cost=2.24..2.26 rows=6 width=23)
Sort Key: sp.id_spc
-> Seq Scan on cont_spc sp (cost=0.00..2.16 rows=6 width=23)
Filter: (id_instalacion = 2::numeric)
-> Sort (cost=4.90..4.96 rows=21 width=35)
Sort Key: s.id_cat
-> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=21 width=35)
Filter: ((id_instalacion = 2::numeric) AND ((activo = 'S'::character varying) OR (activo = 's'::character varying)))
-> Sort (cost=46.70..46.94 rows=93 width=43)
Sort Key: c.id_sbc
-> Seq Scan on cont_contenido c (cost=0.00..43.66 rows=93 width=43)
Filter: ((id_instalacion = 2::numeric) AND ((activo = 'S'::character varying) OR (activo = 's'::character varying)))
-> Sort (cost=117.96..119.06 rows=442 width=55)
Sort Key: p.id_contenido
-> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
Filter: (id_instalacion = 2::numeric)


The question is, why the query with the worst execution plan (most expensive, the second) runs faster the query with the better execution plan?
First Query: 10 runs, avg: 8 sec.
Second Query: 10 runs, avg: 1.8 sec.

I see a fail on the "best" exec plan, the rows I get are around 430, so the first EP expect only 5 rows and the second EP expect 256.

I run 7.3.2 over Solaris.
I did "vacuum full analyze" before

Thanks in advance!


Fernando.-

Search Discussions

  • Szűcs Gábor at Jun 26, 2003 at 10:30 am
    Fernando,

    1. Try EXPLAIN ANALYZE. Cost alone isn't an absolute measure. I think it's
    only to see which parts of the query are expected to be slowest. However,
    EXP ANA will give you exact times in msec (which effectively means it
    executes the query).

    2. I think calling upper() for each row costs more than direct comparison,
    but not sure

    3. Notice that there are seq scans with filter conditions like
    "id_instalacion = 2::numeric"
    Do you have indices on id_instalacion, which seems to be a numeric field?
    if so, try casting the constant expressions in the query to numeric so that
    postgresql may find the index. If you don't have such indices, it may be
    worth to create them. (I guess you only have it on the table aliased with c,
    since it does an index scan there.

    4. another guess may be indices on (id_instalacion, activo), or, if activo
    has few possible values (for example, it may be only one of three letters,
    say, 'S', 'A' or 'K'), partial indices like:

    CREATE INDEX cont_sbc_id_ins_S ON cont_sbc (id_instalacion)
    WHERE activo in ('S', 's');
    CREATE INDEX cont_sbc_id_ins_A ON cont_sbc (id_instalacion)
    WHERE activo in ('A', 'a');
    CREATE INDEX cont_sbc_id_ins_K ON cont_sbc (id_instalacion)
    WHERE activo in ('K', 'k');

    G.
    ------------------------------- cut here -------------------------------
    WHERE c.id_instalacion = 2
    AND s.id_instalacion = 2
    AND p.id_instalacion = 2
    ...

    -> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=1 width=35)
    Filter: ((id_instalacion = 2::numeric)
    AND (upper((activo)::text) = 'S'::text))
    -> Index Scan using cont_cont_cont_sbc_fk_i on cont_contenido c
    (cost=0.00..15.56 rows=1 width=43)
    Index Cond: ((c.id_instalacion = 2::numeric)
    AND (c.id_sbc = "outer".id_sbc))
    Filter: (upper((activo)::text) = 'S'::text)
    -> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
    Filter: (id_instalacion = 2::numeric)
  • Fernando Papa at Jun 26, 2003 at 1:33 pm

    -----Mensaje original-----
    De: SZUCS Gábor
    Enviado el: jueves, 26 de junio de 2003 7:31
    Para: pgsql-performance@postgresql.org
    Asunto: Re: [PERFORM] Similar querys, better execution time
    on worst execution plan


    Fernando,

    1. Try EXPLAIN ANALYZE. Cost alone isn't an absolute measure.
    I think it's only to see which parts of the query are
    expected to be slowest. However, EXP ANA will give you exact
    times in msec (which effectively means it executes the query).
    Ok, yes, I did only explay because I run several times the query and get avg. run time. but it's true, it's better to do EXP ANA.
    2. I think calling upper() for each row costs more than
    direct comparison, but not sure
    It's the only answer than I can found... maybe do a lot of uppers and then compare will be too much than compare with 2 conditions...
    3. Notice that there are seq scans with filter conditions like
    "id_instalacion = 2::numeric"
    Do you have indices on id_instalacion, which seems to be a
    numeric field? if so, try casting the constant expressions in
    the query to numeric so that postgresql may find the index.
    If you don't have such indices, it may be worth to create
    them. (I guess you only have it on the table aliased with c,
    since it does an index scan there.
    Yes, we have index on id_instalacion, but now we have only one instalation, so the content of these field, in the 99% of the rows, it's 2. I think in this case it's ok to choose seq scan.
    4. another guess may be indices on (id_instalacion, activo),
    or, if activo has few possible values (for example, it may be
    only one of three letters, say, 'S', 'A' or 'K'), partial
    indices like:

    CREATE INDEX cont_sbc_id_ins_S ON cont_sbc (id_instalacion)
    WHERE activo in ('S', 's');
    CREATE INDEX cont_sbc_id_ins_A ON cont_sbc (id_instalacion)
    WHERE activo in ('A', 'a');
    CREATE INDEX cont_sbc_id_ins_K ON cont_sbc (id_instalacion)
    WHERE activo in ('K', 'k');
    I need to recheck about the "quality" of "active" field. Really I don't know if I found a lot of 'S', a lot of 'N', maybe we will have 50%/50% of 'S' or 'N'. This will be important to define index.

    Thanks for your answer.
  • Szűcs Gábor at Jun 26, 2003 at 3:06 pm
    *happy* :)))

    G.
    ------------------------------- cut here -------------------------------
    ----- Original Message -----
    From: "Fernando Papa" <fpapa@claxson.com>
    Sent: Thursday, June 26, 2003 3:33 PM


    I need to recheck about the "quality" of "active" field. Really I don't know
    if I found a lot of 'S', a lot of 'N', maybe we will have 50%/50% of 'S' or
    'N'. This will be important to define index.

    Thanks for your answer.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedJun 25, '03 at 7:26p
activeJun 26, '03 at 3:06p
posts4
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Fernando Papa: 2 posts Szűcs Gábor: 2 posts

People

Translate

site design / logo © 2022 Grokbase