I am evaluating postgres 9 to migrate away from Oracle. The following query
runs too slow, also please find the explain plan:

****************************************************************
explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS
ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
ORIGIN.DEPTH,ORIGIN.EVTYPE,
ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR, ORIGIN.CONTRIBUTOR OCONTRIBUTOR,
MAGNITUDE.ID AS MAGID,
MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE
from event.event left join event.origin on event.id=origin.eventid left join
event.magnitude on origin.id=event.magnitude.origin_id
WHERE EXISTS(select origin_id from event.magnitude where
magnitude.magnitude>=7.2 and origin.id=origin_id)
order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE desc,EVENT.ID
,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID


"Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual
time=17791.557..17799.092 rows=5517 loops=1)"
" -> Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual
time=17791.556..17792.220 rows=5517 loops=1)"
" Sort Key: origin."time", event.magnitude.magnitude, event.id,
event.preferred_origin_id, origin.id, event.contributor, origin.latitude,
origin.longitude, origin.depth, origin.evtype, origin.catalog,
origin.author, origin.contributor, event.magnitude.id, event.magnitude.type"
" Sort Method: quicksort Memory: 968kB"
" -> Nested Loop Left Join (cost=34642.50..739506.42 rows=15039
width=80) (actual time=6.927..17769.788 rows=5517 loops=1)"
" -> Hash Semi Join (cost=34642.50..723750.23 rows=14382
width=62) (actual time=6.912..17744.858 rows=2246 loops=1)"
" Hash Cond: (origin.id = event.magnitude.origin_id)"
" -> Merge Left Join (cost=0.00..641544.72 rows=6133105
width=62) (actual time=0.036..16221.008 rows=6133105 loops=1)"
" Merge Cond: (event.id = origin.eventid)"
" -> Index Scan using event_key_index on event
(cost=0.00..163046.53 rows=3272228 width=12) (actual time=0.017..1243.616
rows=3276192 loops=1)"
" -> Index Scan using origin_fk_index on origin
(cost=0.00..393653.81 rows=6133105 width=54) (actual time=0.013..3033.657
rows=6133105 loops=1)"
" -> Hash (cost=34462.73..34462.73 rows=14382 width=4)
(actual time=6.668..6.668 rows=3198 loops=1)"
" Buckets: 2048 Batches: 1 Memory Usage: 113kB"
" -> Bitmap Heap Scan on magnitude
(cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414
rows=3198 loops=1)"
" Recheck Cond: (magnitude >= 7.2)"
" -> Bitmap Index Scan on mag_index
(cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198
loops=1)"
" Index Cond: (magnitude >= 7.2)"
" -> Index Scan using mag_fkey_index on magnitude
(cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2
loops=2246)"
" Index Cond: (origin.id = event.magnitude.origin_id)"
"Total runtime: 17799.669 ms"
****************************************************************

This query runs in Oracle in 1 second while takes 16 seconds in postgres,
The difference tells me that I am doing something wrong somewhere. This is
a new installation on a local Mac machine with 12G of RAM.

I have:
effective_cache_size=4096MB
shared_buffer=2048MB
work_mem=100MB

Search Discussions

  • Scott Marlowe at Jan 28, 2011 at 8:34 pm

    On Fri, Jan 28, 2011 at 10:30 AM, yazan suleiman wrote:
    I am evaluating postgres 9 to migrate away from Oracle.  The following query
    runs too slow, also please find the explain plan:
    ****************************************************************
    explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS
    ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
    EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
    ORIGIN.DEPTH,ORIGIN.EVTYPE,
    "Total runtime: 17799.669 ms"
    ****************************************************************
    This query runs in Oracle in 1 second while takes 16 seconds in postgres,
    The difference tells me that I am doing something wrong somewhere.  This is
    a new installation on a local Mac machine with 12G of RAM.
    Try turning it into a group by instead of a distinct. i.e.

    select a,b,c,d from xyz group by a,b,c,d

    and see if it's faster. There is some poor performance on large data
    sets for distinct. Don't know if they got fixed in 9.0 or not, if not
    then definitely try a group by and see.
  • Kenneth Marshall at Jan 28, 2011 at 8:50 pm

    On Fri, Jan 28, 2011 at 09:30:19AM -0800, yazan suleiman wrote:
    I am evaluating postgres 9 to migrate away from Oracle. The following query
    runs too slow, also please find the explain plan:

    ****************************************************************
    explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS
    ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
    EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
    ORIGIN.DEPTH,ORIGIN.EVTYPE,
    ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR, ORIGIN.CONTRIBUTOR OCONTRIBUTOR,
    MAGNITUDE.ID AS MAGID,
    MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE
    from event.event left join event.origin on event.id=origin.eventid left join
    event.magnitude on origin.id=event.magnitude.origin_id
    WHERE EXISTS(select origin_id from event.magnitude where
    magnitude.magnitude>=7.2 and origin.id=origin_id)
    order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE desc,EVENT.ID
    ,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID


    "Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual
    time=17791.557..17799.092 rows=5517 loops=1)"
    " -> Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual
    time=17791.556..17792.220 rows=5517 loops=1)"
    " Sort Key: origin."time", event.magnitude.magnitude, event.id,
    event.preferred_origin_id, origin.id, event.contributor, origin.latitude,
    origin.longitude, origin.depth, origin.evtype, origin.catalog,
    origin.author, origin.contributor, event.magnitude.id, event.magnitude.type"
    " Sort Method: quicksort Memory: 968kB"
    " -> Nested Loop Left Join (cost=34642.50..739506.42 rows=15039
    width=80) (actual time=6.927..17769.788 rows=5517 loops=1)"
    " -> Hash Semi Join (cost=34642.50..723750.23 rows=14382
    width=62) (actual time=6.912..17744.858 rows=2246 loops=1)"
    " Hash Cond: (origin.id = event.magnitude.origin_id)"
    " -> Merge Left Join (cost=0.00..641544.72 rows=6133105
    width=62) (actual time=0.036..16221.008 rows=6133105 loops=1)"
    " Merge Cond: (event.id = origin.eventid)"
    " -> Index Scan using event_key_index on event
    (cost=0.00..163046.53 rows=3272228 width=12) (actual time=0.017..1243.616
    rows=3276192 loops=1)"
    " -> Index Scan using origin_fk_index on origin
    (cost=0.00..393653.81 rows=6133105 width=54) (actual time=0.013..3033.657
    rows=6133105 loops=1)"
    " -> Hash (cost=34462.73..34462.73 rows=14382 width=4)
    (actual time=6.668..6.668 rows=3198 loops=1)"
    " Buckets: 2048 Batches: 1 Memory Usage: 113kB"
    " -> Bitmap Heap Scan on magnitude
    (cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414
    rows=3198 loops=1)"
    " Recheck Cond: (magnitude >= 7.2)"
    " -> Bitmap Index Scan on mag_index
    (cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198
    loops=1)"
    " Index Cond: (magnitude >= 7.2)"
    " -> Index Scan using mag_fkey_index on magnitude
    (cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2
    loops=2246)"
    " Index Cond: (origin.id = event.magnitude.origin_id)"
    "Total runtime: 17799.669 ms"
    ****************************************************************

    This query runs in Oracle in 1 second while takes 16 seconds in postgres,
    The difference tells me that I am doing something wrong somewhere. This is
    a new installation on a local Mac machine with 12G of RAM.

    I have:
    effective_cache_size=4096MB
    shared_buffer=2048MB
    work_mem=100MB
    It sounds like the queries are not doing the same thing. What is
    the schema/index definition for Oracle versus PostgreSQL?

    Ken
  • Yazan suleiman at Jan 28, 2011 at 9:01 pm
    They have the same indexes, foreign keys are indexed in addition to the
    search values like magnitude. Distinct does nothing to speed up the query.
    If I remove the select in the where clause the time goes down to 98 ms:

    select DISTINCT EVENT.ID, ORIGIN.ID AS ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS
    PREFERRED_ORIGIN, EVENT.CONTRIBUTOR,
    ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE, ORIGIN.DEPTH,ORIGIN.EVTYPE,
    ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR,
    ORIGIN.CONTRIBUTOR OCONTRIBUTOR,MAGNITUDE.ID AS
    MAGID,MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE
    from event.event left join event.origin on event.id=origin.eventid left join
    event.magnitude on origin.id=event.magnitude.origin_id
    WHERE magnitude.magnitude>=7.2 order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE
    desc,EVENT.ID,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID

    The new query returns 4000 rows, so the result is still big. I am not sure
    if I am answering your question, but I don't have access to generate ddl
    from Oracle. Thanks for the reply.
    On Fri, Jan 28, 2011 at 12:50 PM, Kenneth Marshall wrote:
    On Fri, Jan 28, 2011 at 09:30:19AM -0800, yazan suleiman wrote:
    I am evaluating postgres 9 to migrate away from Oracle. The following query
    runs too slow, also please find the explain plan:

    ****************************************************************
    explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS
    ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
    EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
    ORIGIN.DEPTH,ORIGIN.EVTYPE,
    ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR, ORIGIN.CONTRIBUTOR OCONTRIBUTOR,
    MAGNITUDE.ID AS MAGID,
    MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE
    from event.event left join event.origin on event.id=origin.eventid left join
    event.magnitude on origin.id=event.magnitude.origin_id
    WHERE EXISTS(select origin_id from event.magnitude where
    magnitude.magnitude>=7.2 and origin.id=origin_id)
    order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE desc,EVENT.ID
    ,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID


    "Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual
    time=17791.557..17799.092 rows=5517 loops=1)"
    " -> Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual
    time=17791.556..17792.220 rows=5517 loops=1)"
    " Sort Key: origin."time", event.magnitude.magnitude, event.id,
    event.preferred_origin_id, origin.id, event.contributor,
    origin.latitude,
    origin.longitude, origin.depth, origin.evtype, origin.catalog,
    origin.author, origin.contributor, event.magnitude.id,
    event.magnitude.type"
    " Sort Method: quicksort Memory: 968kB"
    " -> Nested Loop Left Join (cost=34642.50..739506.42 rows=15039
    width=80) (actual time=6.927..17769.788 rows=5517 loops=1)"
    " -> Hash Semi Join (cost=34642.50..723750.23 rows=14382
    width=62) (actual time=6.912..17744.858 rows=2246 loops=1)"
    " Hash Cond: (origin.id = event.magnitude.origin_id)"
    " -> Merge Left Join (cost=0.00..641544.72
    rows=6133105
    width=62) (actual time=0.036..16221.008 rows=6133105 loops=1)"
    " Merge Cond: (event.id = origin.eventid)"
    " -> Index Scan using event_key_index on event
    (cost=0.00..163046.53 rows=3272228 width=12) (actual
    time=0.017..1243.616
    rows=3276192 loops=1)"
    " -> Index Scan using origin_fk_index on origin
    (cost=0.00..393653.81 rows=6133105 width=54) (actual
    time=0.013..3033.657
    rows=6133105 loops=1)"
    " -> Hash (cost=34462.73..34462.73 rows=14382 width=4)
    (actual time=6.668..6.668 rows=3198 loops=1)"
    " Buckets: 2048 Batches: 1 Memory Usage: 113kB"
    " -> Bitmap Heap Scan on magnitude
    (cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414
    rows=3198 loops=1)"
    " Recheck Cond: (magnitude >= 7.2)"
    " -> Bitmap Index Scan on mag_index
    (cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198
    loops=1)"
    " Index Cond: (magnitude >= 7.2)"
    " -> Index Scan using mag_fkey_index on magnitude
    (cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2
    loops=2246)"
    " Index Cond: (origin.id =
    event.magnitude.origin_id)"
    "Total runtime: 17799.669 ms"
    ****************************************************************

    This query runs in Oracle in 1 second while takes 16 seconds in postgres,
    The difference tells me that I am doing something wrong somewhere. This is
    a new installation on a local Mac machine with 12G of RAM.

    I have:
    effective_cache_size=4096MB
    shared_buffer=2048MB
    work_mem=100MB
    It sounds like the queries are not doing the same thing. What is
    the schema/index definition for Oracle versus PostgreSQL?

    Ken
  • Andres Freund at Jan 28, 2011 at 9:19 pm

    On Friday, January 28, 2011 06:30:19 PM yazan suleiman wrote:
    I am evaluating postgres 9 to migrate away from Oracle. The following
    query runs too slow, also please find the explain plan:
    First:

    explain analyze
    SELECT DISTINCT
    EVENT.ID
    ,ORIGIN.ID AS ORIGINID
    ,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN
    ,EVENT.CONTRIBUTOR
    ,ORIGIN.TIME
    ,ORIGIN.LATITUDE
    ,ORIGIN.LONGITUDE
    ,ORIGIN.DEPTH
    ,ORIGIN.EVTYPE
    ,ORIGIN.CATALOG
    ,ORIGIN.AUTHOR OAUTHOR
    ,ORIGIN.CONTRIBUTOR OCONTRIBUTOR
    ,MAGNITUDE.ID AS MAGID
    ,MAGNITUDE.MAGNITUDE
    ,MAGNITUDE.TYPE AS MAGTYPE
    FROM
    event.event
    left join event.origin on event.id = origin.eventid
    left join event.magnitude on origin.id = event.magnitude.origin_id
    WHERE
    EXISTS(
    select origin_id
    from event.magnitude
    where magnitude.magnitude >= 7.2 and origin.id = origin_id
    )
    order by
    ORIGIN.TIME desc
    ,MAGNITUDE.MAGNITUDE desc
    ,EVENT.ID
    ,EVENT.PREFERRED_ORIGIN_ID
    ,ORIGIN.ID

    I am honestly stumped if anybody can figure something sensible out of the
    original formatting of the query...

    What happens if you change the
    left join event.origin on event.id = origin.eventid
    into
    join event.origin on event.id = origin.eventid
    ?

    The EXISTS() requires that origin is not null anyway. (Not sure why the
    planner doesn't recognize that though).

    Andres
  • Yazan suleiman at Jan 28, 2011 at 9:34 pm
    OK, that did it. Time is now 315 ms. I am so exited working with
    postgres. I really apologize for the format, my first time posting on the
    list. That does not justify it though. Really thanks.
    On Fri, Jan 28, 2011 at 1:19 PM, Andres Freund wrote:
    On Friday, January 28, 2011 06:30:19 PM yazan suleiman wrote:
    I am evaluating postgres 9 to migrate away from Oracle. The following
    query runs too slow, also please find the explain plan:
    First:

    explain analyze
    SELECT DISTINCT
    EVENT.ID
    ,ORIGIN.ID AS ORIGINID
    ,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN
    ,EVENT.CONTRIBUTOR
    ,ORIGIN.TIME
    ,ORIGIN.LATITUDE
    ,ORIGIN.LONGITUDE
    ,ORIGIN.DEPTH
    ,ORIGIN.EVTYPE
    ,ORIGIN.CATALOG
    ,ORIGIN.AUTHOR OAUTHOR
    ,ORIGIN.CONTRIBUTOR OCONTRIBUTOR
    ,MAGNITUDE.ID AS MAGID
    ,MAGNITUDE.MAGNITUDE
    ,MAGNITUDE.TYPE AS MAGTYPE
    FROM
    event.event
    left join event.origin on event.id = origin.eventid
    left join event.magnitude on origin.id = event.magnitude.origin_id
    WHERE
    EXISTS(
    select origin_id
    from event.magnitude
    where magnitude.magnitude >= 7.2 and origin.id = origin_id
    )
    order by
    ORIGIN.TIME desc
    ,MAGNITUDE.MAGNITUDE desc
    ,EVENT.ID
    ,EVENT.PREFERRED_ORIGIN_ID
    ,ORIGIN.ID

    I am honestly stumped if anybody can figure something sensible out of the
    original formatting of the query...

    What happens if you change the
    left join event.origin on event.id = origin.eventid
    into
    join event.origin on event.id = origin.eventid
    ?

    The EXISTS() requires that origin is not null anyway. (Not sure why the
    planner doesn't recognize that though).

    Andres
  • Tom Lane at Jan 30, 2011 at 10:22 pm

    Andres Freund writes:
    What happens if you change the
    left join event.origin on event.id = origin.eventid
    into
    join event.origin on event.id = origin.eventid
    ?
    The EXISTS() requires that origin is not null anyway. (Not sure why the
    planner doesn't recognize that though).
    Sloppy thinking in reduce_outer_joins() is why. Fixed now:
    http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=1df57f63f3f60c684aa8918910ac410e9c780713

    regards, tom lane
  • Andres Freund at Jan 30, 2011 at 10:43 pm

    On Sunday 30 January 2011 23:18:15 Tom Lane wrote:
    Andres Freund <andres@anarazel.de> writes:
    What happens if you change the

    left join event.origin on event.id = origin.eventid

    into

    join event.origin on event.id = origin.eventid

    ?

    The EXISTS() requires that origin is not null anyway. (Not sure why the
    planner doesn't recognize that though).
    Sloppy thinking in reduce_outer_joins() is why.
    Wow. Nice one, thanks.

    Andres
  • Ross J. Reedstrom at Feb 1, 2011 at 5:24 pm

    On Sun, Jan 30, 2011 at 05:18:15PM -0500, Tom Lane wrote:
    Andres Freund <andres@anarazel.de> writes:
    What happens if you change the
    left join event.origin on event.id = origin.eventid
    into
    join event.origin on event.id = origin.eventid
    ?
    The EXISTS() requires that origin is not null anyway. (Not sure why the
    planner doesn't recognize that though).
    Sloppy thinking in reduce_outer_joins() is why. Fixed now:
    http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=1df57f63f3f60c684aa8918910ac410e9c780713

    regards, tom lane
    This is one of the reasons I love open source in general, and PostgreSQL
    in particular: Tom has the bandwidth to notice these kinds of
    workarounds being discussed on support lists, and turn them immediately
    into improvements in the planner. Partly because (I assume, based on
    the commit message) Andres's parenthetical comment red-flagged it for
    him, since he knew he could trust Andres's opinion that there was
    probably a planner improvement hiding here. Amazing!

    Ross
    --
    Ross Reedstrom, Ph.D. reedstrm@rice.edu
    Systems Engineer & Admin, Research Scientist phone: 713-348-6166
    Connexions http://cnx.org fax: 713-348-3665
    Rice University MS-375, Houston, TX 77005
    GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedJan 28, '11 at 5:30p
activeFeb 1, '11 at 5:24p
posts9
users6
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase