I've a problem that might be a bug in the core system (hashjoins) or with ltree using gist-index, but I fail miserable to produce a useful testcase (using 8.1, worked in 8.0):

A query produces wrong (=0) results, when a different plan is enforced, I get a merge-join plan that looks similar, but produces the correct result (=16 rows).

I can post a queryplan, but cannot post the data itself since it's confidental (though I might be able to randomize some data and construct a self contained case, but this would take quite some time).


The working case is:
set enable_hashjoin to off;
Seq Scan on foo1 cost=0.00..423583.57 rows=10810 width=4) (actual time=675.422..706.815 rows=16 loops=1)
Filter: (subplan)
SubPlan
-> Merge Join (cost=19.49..19.55 rows=1 width=0) (actual time=0.028..0.028 rows=0 loops=21619)
Merge Cond: ("outer".str_id = "inner".id)
-> Sort (cost=6.49..6.50 rows=5 width=4) (actual time=0.023..0.023 rows=0 loops=21619)
Sort Key: bz.str_id
-> Bitmap Heap Scan on foo2 bz (cost=2.02..6.43 rows=5 width=4) (actual time=0.012..0.012 rows=0 loops=21619)
Recheck Cond: (bid = $0)
-> Bitmap Index Scan on foo2_bid_key1 (cost=0.00..2.02 rows=5 width=0) (actual time=0.009..0.009 rows=0 loops=21619)
Index Cond: (bid = $0)
-> Sort (cost=13.00..13.01 rows=6 width=4) (actual time=0.002..0.003 rows=1 loops=136)
Sort Key: str.id
-> Bitmap Heap Scan on structure str (cost=2.02..12.92 rows=6 width=4) (actual time=0.095..0.097 rows=1 loops=1)
Recheck Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
-> Bitmap Index Scan on str_uk4 (cost=0.00..2.02 rows=6 width=0) (actual time=0.086..0.086 rows=1 loops=1)
Index Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
Total runtime: 707.019 ms

16 rows...


The failing case is:
set enable_hashjoin to on;
Seq Scan on foo1 cost=0.00..421679.00 rows=10810 width=4) (actual time=154.663..154.663 rows=0 loops=1)
Filter: (subplan)
SubPlan
-> Hash Join (cost=8.47..19.46 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=21619)
Hash Cond: ("outer".id = "inner".str_id)
-> Bitmap Heap Scan on structure str (cost=2.02..12.92 rows=6 width=4) (actual time=0.100..30.095 rows=1 loops=1)
Recheck Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
-> Bitmap Index Scan on str_uk4 (cost=0.00..2.02 rows=6 width=0) (actual time=0.090..0.090 rows=1 loops=1)
Index Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
-> Hash (cost=6.43..6.43 rows=5 width=4) (actual time=0.032..0.032 rows=0 loops=1)
-> Bitmap Heap Scan on foo2 bz (cost=2.02..6.43 rows=5 width=4) (actual time=0.025..0.025 rows=0 loops=1)
Recheck Cond: (bid = $0)
-> Bitmap Index Scan on foo2_bid_key1 (cost=0.00..2.02 rows=5 width=0) (actual time=0.021..0.021 rows=0 loops=1)
Index Cond: (bid = $0)
Total runtime: 154.862 ms
No rows....

The query itself is quite simple:
select foo1.id
from foo1
where
foo1.datloesch is null
and exists (select 1
from foo2 bz,
structure str
where bz.bid=foo1.id
and str.id = bz.str_id
and str.path ~ '*.2330676.*'
);

The path field is an "ltree" column, with an GIST index on it.


Any ideas what I could try to track this down?

Best regards,
Mario Weilguni

Search Discussions

  • Christopher Kings-Lynne at Nov 28, 2005 at 1:12 pm
    The path field is an "ltree" column, with an GIST index on it.
    Something to do with bitmap indexscans on lossy indexes?

    Chris
  • Mario Weilguni at Nov 28, 2005 at 1:21 pm

    Am Montag, 28. November 2005 14:12 schrieb Christopher Kings-Lynne:
    The path field is an "ltree" column, with an GIST index on it.
    Something to do with bitmap indexscans on lossy indexes?

    Chris
    I doubt that, "set enable_bitmapscan to off" produces the wrong result as
    well.

    Best regards
    Mario
  • Tom Lane at Nov 28, 2005 at 3:11 pm

    Mario Weilguni writes:
    The failing case is:
    ...
    SubPlan
    -> Hash Join (cost=8.47..19.46 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=21619)
    Hash Cond: ("outer".id = "inner".str_id)
    -> Bitmap Heap Scan on structure str (cost=2.02..12.92 rows=6 width=4) (actual time=0.100..30.095 rows=1 loops=1)
    Recheck Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
    -> Bitmap Index Scan on str_uk4 (cost=0.00..2.02 rows=6 width=0) (actual time=0.090..0.090 rows=1 loops=1)
    Index Cond: (path ~ '142.2330445.2330598.2330676.*'::lquery)
    -> Hash (cost=6.43..6.43 rows=5 width=4) (actual time=0.032..0.032 rows=0 loops=1)
    -> Bitmap Heap Scan on foo2 bz (cost=2.02..6.43 rows=5 width=4) (actual time=0.025..0.025 rows=0 loops=1)
    Recheck Cond: (bid = $0)
    -> Bitmap Index Scan on foo2_bid_key1 (cost=0.00..2.02 rows=5 width=0) (actual time=0.021..0.021 rows=0 loops=1)
    Index Cond: (bid = $0)
    Hmm, I wonder why the hash join's input nodes are showing "loops=1" ...
    the hash depends on the subplan parameter $0 so it needs to be
    re-evaluated each time through. It looks like that's not happening.
    Do you have the corresponding results from 8.0 --- if so, what do
    the loop counts look like?

    regards, tom lane
  • Mario Weilguni at Nov 28, 2005 at 4:31 pm
    Thanks Tom for you quick answer!

    No, I'm using 8.1.0, and tried it on different machines, always the same results.

    SELECT version();
    PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.4 20040623 (Gentoo Hardened Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)

    Best regards,
    Mario Weilguni


    icomedias - Digitale Kommunikation
    ------------------------------------------------------------------------
    Mario Weilguni, Forschung und Entwicklung
    mario.weilguni@icomedias.com, http://www.icomedias.com/

    icomedias Österreich Systemhaus GmbH:
    8020 Graz, Entenplatz 1
    Tel: +43 (316) 721.671-272, Fax: -103

    icomedias Deutschland Systemhaus GmbH:
    10969 Berlin, Alexandrinenstraße 2-3
    Tel: +49 (30) 695.399-272, Fax: -103

    -----Original Message-----
    From: Tom Lane
    Sent: Monday, November 28, 2005 5:20 PM
    To: Mario Weilguni
    Cc: Mario Weilguni; pgsql-hackers@postgresql.org
    Subject: Re: [HACKERS] Getting different number of results when using hashjoin on/off

    "Mario Weilguni" <mario.weilguni@icomedias.com> writes:
    Yes. This is from a 8.0.3 (with slightly older and different data,
    resulting in only 9 rows, but the rest is the same):
    Yeah, that looks more reasonable.

    I tried to reproduce this, without any luck:

    regression=# explain analyze select count(*) from tenk1 a where exists (select 1 from tenk1 b, tenk1 c where b.unique1=c.unique2 and b.hundred in (4,5) and c.hundred=a.hundred);
    QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    Aggregate (cost=3879742.37..3879742.38 rows=1 width=0) (actual time=46579.077..46579.082 rows=1 loops=1)
    -> Seq Scan on tenk1 a (cost=0.00..3879729.87 rows=5000 width=0) (actual time=5.129..46528.208 rows=8500 loops=1)
    Filter: (subplan)
    SubPlan
    -> Hash Join (cost=229.20..546.66 rows=2 width=0) (actual time=4.569..4.569 rows=1 loops=10000)
    Hash Cond: ("outer".unique1 = "inner".unique2)
    -> Bitmap Heap Scan on tenk1 b (cost=4.69..321.15 rows=196 width=4) (actual time=0.947..1.698 rows=90 loops=10000)
    Recheck Cond: ((hundred = 4) OR (hundred = 5))
    -> BitmapOr (cost=4.69..4.69 rows=197 width=0) (actual time=0.544..0.544 rows=0 loops=10000)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.34 rows=98 width=0) (actual time=0.271..0.271 rows=100 loops=10000)
    Index Cond: (hundred = 4)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.34 rows=98 width=0) (actual time=0.262..0.262 rows=100 loops=10000)
    Index Cond: (hundred = 5)
    -> Hash (cost=224.26..224.26 rows=100 width=4) (actual time=2.370..2.370 rows=100 loops=10000)
    -> Bitmap Heap Scan on tenk1 c (cost=2.35..224.26 rows=100 width=4) (actual time=0.492..1.616 rows=100 loops=10000)
    Recheck Cond: (hundred = $0)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.35 rows=100 width=0) (actual time=0.278..0.278 rows=100 loops=10000)
    Index Cond: (hundred = $0)
    Total runtime: 46584.654 ms
    (19 rows)

    (I'm not bothering with setting up an ltree index, since the question
    of what index is being used shouldn't affect hashjoin's decision to
    rescan or not.)

    That's using 8.1 branch CVS tip, but there aren't any related bug fixes
    since 8.1 release. We did have several bug fixes in the hash join code
    during the 8.1 beta cycle though ... is it possible you are really
    running an 8.1 beta and not 8.1.0?

    regards, tom lane
  • Tom Lane at Nov 28, 2005 at 5:08 pm

    "Mario Weilguni" <mario.weilguni@icomedias.com> writes:
    No, I'm using 8.1.0, and tried it on different machines, always the same results.
    I see it, I think: the recent changes to avoid work when one or the
    other side of the hash join is empty would exit the hash join leaving
    a state that confused ExecReScanHashJoin() into thinking it didn't
    have to do anything. Try the attached patch.

    regards, tom lane


    Index: src/backend/executor/nodeHashjoin.c
    ===================================================================
    RCS file: /cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
    retrieving revision 1.75.2.1
    diff -c -r1.75.2.1 nodeHashjoin.c
    *** src/backend/executor/nodeHashjoin.c 22 Nov 2005 18:23:09 -0000 1.75.2.1
    --- src/backend/executor/nodeHashjoin.c 28 Nov 2005 17:04:43 -0000
    ***************
    *** 152,163 ****
    * outer join, we can quit without scanning the outer relation.
    */
    if (hashtable->totalTuples == 0 && node->js.jointype != JOIN_LEFT)
    - {
    - ExecHashTableDestroy(hashtable);
    - node->hj_HashTable = NULL;
    - node->hj_FirstOuterTupleSlot = NULL;
    return NULL;
    - }

    /*
    * need to remember whether nbatch has increased since we began
    --- 152,158 ----
    ***************
    *** 487,493 ****
    {
    ExecHashTableDestroy(node->hj_HashTable);
    node->hj_HashTable = NULL;
    - node->hj_FirstOuterTupleSlot = NULL;
    }

    /*
    --- 482,487 ----
    ***************
    *** 805,841 ****
    ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt)
    {
    /*
    - * If we haven't yet built the hash table then we can just return; nothing
    - * done yet, so nothing to undo.
    - */
    - if (node->hj_HashTable == NULL)
    - return;
    -
    - /*
    * In a multi-batch join, we currently have to do rescans the hard way,
    * primarily because batch temp files may have already been released. But
    * if it's a single-batch join, and there is no parameter change for the
    * inner subnode, then we can just re-use the existing hash table without
    * rebuilding it.
    */
    ! if (node->hj_HashTable->nbatch == 1 &&
    ! ((PlanState *) node)->righttree->chgParam == NULL)
    ! {
    ! /* okay to reuse the hash table; needn't rescan inner, either */
    ! }
    ! else
    {
    ! /* must destroy and rebuild hash table */
    ! ExecHashTableDestroy(node->hj_HashTable);
    ! node->hj_HashTable = NULL;
    ! node->hj_FirstOuterTupleSlot = NULL;

    ! /*
    ! * if chgParam of subnode is not null then plan will be re-scanned by
    ! * first ExecProcNode.
    ! */
    ! if (((PlanState *) node)->righttree->chgParam == NULL)
    ! ExecReScan(((PlanState *) node)->righttree, exprCtxt);
    }

    /* Always reset intra-tuple state */
    --- 799,830 ----
    ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt)
    {
    /*
    * In a multi-batch join, we currently have to do rescans the hard way,
    * primarily because batch temp files may have already been released. But
    * if it's a single-batch join, and there is no parameter change for the
    * inner subnode, then we can just re-use the existing hash table without
    * rebuilding it.
    */
    ! if (node->hj_HashTable != NULL)
    {
    ! if (node->hj_HashTable->nbatch == 1 &&
    ! ((PlanState *) node)->righttree->chgParam == NULL)
    ! {
    ! /* okay to reuse the hash table; needn't rescan inner, either */
    ! }
    ! else
    ! {
    ! /* must destroy and rebuild hash table */
    ! ExecHashTableDestroy(node->hj_HashTable);
    ! node->hj_HashTable = NULL;

    ! /*
    ! * if chgParam of subnode is not null then plan will be re-scanned
    ! * by first ExecProcNode.
    ! */
    ! if (((PlanState *) node)->righttree->chgParam == NULL)
    ! ExecReScan(((PlanState *) node)->righttree, exprCtxt);
    ! }
    }

    /* Always reset intra-tuple state */
    ***************
    *** 847,852 ****
    --- 836,842 ----
    node->js.ps.ps_TupFromTlist = false;
    node->hj_NeedNewOuter = true;
    node->hj_MatchedOuter = false;
    + node->hj_FirstOuterTupleSlot = NULL;

    /*
    * if chgParam of subnode is not null then plan will be re-scanned by
  • Mario Weilguni at Nov 28, 2005 at 5:54 pm
    Yes. This is from a 8.0.3 (with slightly older and different data,
    resulting in only 9 rows, but the rest is the same):

    Index Scan using ben_uk3 on foo1 ben (cost=0.00..73867.23 rows=863
    width=27) (actual time=38.591..501.839 rows=9 loops=1)
    Filter: (subplan)
    SubPlan
    -> Hash Join (cost=14.25..42.53 rows=1 width=0) (actual
    time=0.284..0.284 rows=0 loops=1725)
    Hash Cond: ("outer".id = "inner".str_id)
    -> Index Scan using str_uk4 on structure str
    (cost=0.00..27.91 rows=13 width=4) (actual time=0.765..4.043 rows=1
    loops=112)
    Index Cond: (path ~ '*.2330676.*'::lquery)
    -> Hash (cost=14.23..14.23 rows=10 width=4) (actual
    time=0.012..0.012 rows=0 loops=1725)
    -> Index Scan using foo2_ben_id_key1 on foo2 bz
    (cost=0.00..14.23 rows=10 width=4) (actual time=0.008..0.009 rows=1
    loops=1725)
    Index Cond: (ben_id = $0)
    Total runtime: 501.980 ms

    Best regards

    P.s. sorry for the stupid quoting, I've to use Outlook....


    Mario Weilguni <mweilguni@sime.com> writes:
    The failing case is:
    ...
    SubPlan
    -> Hash Join (cost=8.47..19.46 rows=1 width=0) (actual
    time=0.004..0.004 rows=0 loops=21619)
    Hash Cond: ("outer".id = "inner".str_id)
    -> Bitmap Heap Scan on structure str (cost=2.02..12.92
    rows=6 width=4) (actual time=0.100..30.095 rows=1 loops=1)
    Recheck Cond: (path ~
    '142.2330445.2330598.2330676.*'::lquery)
    -> Bitmap Index Scan on str_uk4 (cost=0.00..2.02
    rows=6 width=0) (actual time=0.090..0.090 rows=1 loops=1)
    Index Cond: (path ~
    '142.2330445.2330598.2330676.*'::lquery)
    -> Hash (cost=6.43..6.43 rows=5 width=4) (actual
    time=0.032..0.032 rows=0 loops=1)
    -> Bitmap Heap Scan on foo2 bz (cost=2.02..6.43
    rows=5 width=4) (actual time=0.025..0.025 rows=0 loops=1)
    Recheck Cond: (bid = $0)
    -> Bitmap Index Scan on foo2_bid_key1
    (cost=0.00..2.02 rows=5 width=0) (actual time=0.021..0.021 rows=0
    loops=1)
    Index Cond: (bid = $0)
    Hmm, I wonder why the hash join's input nodes are showing "loops=1" ...
    the hash depends on the subplan parameter $0 so it needs to be
    re-evaluated each time through. It looks like that's not happening.
    Do you have the corresponding results from 8.0 --- if so, what do
    the loop counts look like?

    ---------------------------(end of broadcast)---------------------------
    TIP 1: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to majordomo@postgresql.org so that your
    message can get through to the mailing list cleanly
  • Tom Lane at Nov 28, 2005 at 4:20 pm

    "Mario Weilguni" <mario.weilguni@icomedias.com> writes:
    Yes. This is from a 8.0.3 (with slightly older and different data,
    resulting in only 9 rows, but the rest is the same):
    Yeah, that looks more reasonable.

    I tried to reproduce this, without any luck:

    regression=# explain analyze select count(*) from tenk1 a where exists (select 1 from tenk1 b, tenk1 c where b.unique1=c.unique2 and b.hundred in (4,5) and c.hundred=a.hundred);
    QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    Aggregate (cost=3879742.37..3879742.38 rows=1 width=0) (actual time=46579.077..46579.082 rows=1 loops=1)
    -> Seq Scan on tenk1 a (cost=0.00..3879729.87 rows=5000 width=0) (actual time=5.129..46528.208 rows=8500 loops=1)
    Filter: (subplan)
    SubPlan
    -> Hash Join (cost=229.20..546.66 rows=2 width=0) (actual time=4.569..4.569 rows=1 loops=10000)
    Hash Cond: ("outer".unique1 = "inner".unique2)
    -> Bitmap Heap Scan on tenk1 b (cost=4.69..321.15 rows=196 width=4) (actual time=0.947..1.698 rows=90 loops=10000)
    Recheck Cond: ((hundred = 4) OR (hundred = 5))
    -> BitmapOr (cost=4.69..4.69 rows=197 width=0) (actual time=0.544..0.544 rows=0 loops=10000)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.34 rows=98 width=0) (actual time=0.271..0.271 rows=100 loops=10000)
    Index Cond: (hundred = 4)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.34 rows=98 width=0) (actual time=0.262..0.262 rows=100 loops=10000)
    Index Cond: (hundred = 5)
    -> Hash (cost=224.26..224.26 rows=100 width=4) (actual time=2.370..2.370 rows=100 loops=10000)
    -> Bitmap Heap Scan on tenk1 c (cost=2.35..224.26 rows=100 width=4) (actual time=0.492..1.616 rows=100 loops=10000)
    Recheck Cond: (hundred = $0)
    -> Bitmap Index Scan on tenk1_hundred (cost=0.00..2.35 rows=100 width=0) (actual time=0.278..0.278 rows=100 loops=10000)
    Index Cond: (hundred = $0)
    Total runtime: 46584.654 ms
    (19 rows)

    (I'm not bothering with setting up an ltree index, since the question
    of what index is being used shouldn't affect hashjoin's decision to
    rescan or not.)

    That's using 8.1 branch CVS tip, but there aren't any related bug fixes
    since 8.1 release. We did have several bug fixes in the hash join code
    during the 8.1 beta cycle though ... is it possible you are really
    running an 8.1 beta and not 8.1.0?

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedNov 28, '05 at 1:00p
activeNov 28, '05 at 5:54p
posts8
users4
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase