While investigating a client problem I just observed that pg_dump takes
a surprisingly large amount of time to dump a schema with a large number
of views. The client's hardware is quite spiffy, and yet pg_dump is
taking many minutes to dump a schema with some 35,000 views. Here's a
simple test case:

create schema views;
do 'begin for i in 1 .. 10000 loop execute $$create view views.v_$$ ||
i ||$$ as select current_date as d, current_timestamp as ts,
$_$a$_$::text || n as t, n from generate_series(1,5) as n$$; end
loop; end;';


On my modest hardware this database took 4m18.864s for pg_dump to run.
Should we be looking at replacing the retail operations which consume
most of this time with something that runs faster?
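
To make "retail operations" concrete: pg_dump appears to fetch each view
definition with its own query (see the STATEMENT in the error log further
down). A batched equivalent, sketched here purely for illustration and not
something pg_dump issues today, might look roughly like this:

-- one round trip per view, as pg_dump does now ('<oid>' is a placeholder):
select pg_catalog.pg_get_viewdef('<oid>'::pg_catalog.oid) as viewdef;

-- a hypothetical batched form:
select c.oid, pg_catalog.pg_get_viewdef(c.oid) as viewdef
from pg_catalog.pg_class c
where c.relkind = 'v';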

There is also this gem of behaviour, which is where I started:

p1                          p2
begin;
drop view foo;
                            pg_dump
commit;
                            boom.

with this error:

2011-10-01 16:38:20 EDT [27084] 30063 ERROR: could not open
relation with OID 133640
2011-10-01 16:38:20 EDT [27084] 30064 STATEMENT: SELECT
pg_catalog.pg_get_viewdef('133640'::pg_catalog.oid) AS viewdef

Of course, this isn't caused by having a large catalog, but it's
terrible nevertheless. I'm not sure what to do about it.
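
For anyone wanting to reproduce the failure against the test schema above,
the interleaving looks roughly like this (a sketch; the view name is just an
example):

-- session 1:
begin;
drop view views.v_1;   -- hold the transaction open

-- session 2: start pg_dump now; it lists relations under its own MVCC
-- snapshot, so it still believes views.v_1 exists

-- session 1:
commit;                -- the DROP becomes visible to catalog lookups

-- session 2: pg_dump's next per-view query then fails as in the log above:
-- SELECT pg_catalog.pg_get_viewdef('<oid>'::pg_catalog.oid) AS viewdef
-- ERROR:  could not open relation with OID <oid>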

cheers

andrew

  • Joe Abbate at Oct 1, 2011 at 10:51 pm

    On 10/01/2011 05:08 PM, Andrew Dunstan wrote:
    While investigating a client problem I just observed that pg_dump takes
    a surprisingly large amount of time to dump a schema with a large number
    of views. The client's hardware is quite spiffy, and yet pg_dump is
    taking many minutes to dump a schema with some 35,000 views. Here's a
    simple test case:

    create schema views;
    do 'begin for i in 1 .. 10000 loop execute $$create view views.v_$$ ||
    i ||$$ as select current_date as d, current_timestamp as ts,
    $_$a$_$::text || n as t, n from generate_series(1,5) as n$$; end
    loop; end;';


    On my modest hardware this database took 4m18.864s for pg_dump to run.
    Should we be looking at replacing the retail operations which consume
    most of this time with something that runs faster?
    How modest? Was there anything else in the database? I tried with 9000
    views (because I didn't want to bother increasing
    max_locks_per_transaction), and pg_dump ran in less than 10 seconds
    (8.991s), redirecting plain-text output to a file (this is on a Core i5).
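
    For reference, max_locks_per_transaction defaults to 64 and can only be
    raised by editing postgresql.conf and restarting the server; roughly:

    -- check the current value
    show max_locks_per_transaction;
    -- then, in postgresql.conf (restart required), e.g.:
    --   max_locks_per_transaction = 256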
    There is also this gem of behaviour, which is where I started:

    p1                          p2
    begin;
    drop view foo;
                                pg_dump
    commit;
                                boom.

    with this error:

    2011-10-01 16:38:20 EDT [27084] 30063 ERROR: could not open
    relation with OID 133640
    2011-10-01 16:38:20 EDT [27084] 30064 STATEMENT: SELECT
    pg_catalog.pg_get_viewdef('133640'::pg_catalog.oid) AS viewdef

    Of course, this isn't caused by having a large catalog, but it's
    terrible nevertheless. I'm not sure what to do about it.
    Couldn't you run pg_dump with --lock-wait-timeout?

    Joe
  • Andrew Dunstan at Oct 2, 2011 at 1:46 am

    On 10/01/2011 05:48 PM, Joe Abbate wrote:
    On 10/01/2011 05:08 PM, Andrew Dunstan wrote:
    While investigating a client problem I just observed that pg_dump takes
    a surprisingly large amount of time to dump a schema with a large number
    of views. The client's hardware is quite spiffy, and yet pg_dump is
    taking many minutes to dump a schema with some 35,000 views. Here's a
    simple test case:

    create schema views;
    do 'begin for i in 1 .. 10000 loop execute $$create view views.v_$$ ||
    i ||$$ as select current_date as d, current_timestamp as ts,
    $_$a$_$::text || n as t, n from generate_series(1,5) as n$$; end
    loop; end;';


    On my modest hardware this database took 4m18.864s for pg_dump to run.
    Should we be looking at replacing the retail operations which consume
    most of this time with something that runs faster?
    How modest? Was there anything else in the database? I tried with 9000
    views (because I didn't want to bother increasing
    max_locks_per_transaction), and pg_dump ran in less than 10 seconds
    (8.991s), redirecting plain-text output to a file (this is on a Core i5).
    Yeah, it must be pretty modest :-) On more powerful h/w I get the same.
    I need to dig further into why it's taking so long to dump my client's
    schema on server class hardware.
    There is also this gem of behaviour, which is where I started:

    p1                          p2
    begin;
    drop view foo;
                                pg_dump
    commit;
                                boom.

    with this error:

    2011-10-01 16:38:20 EDT [27084] 30063 ERROR: could not open
    relation with OID 133640
    2011-10-01 16:38:20 EDT [27084] 30064 STATEMENT: SELECT
    pg_catalog.pg_get_viewdef('133640'::pg_catalog.oid) AS viewdef

    Of course, this isn't caused by having a large catalog, but it's
    terrible nevertheless. I'm not sure what to do about it.
    Couldn't you run pg_dump with --lock-wait-timeout?
    How would that help? This isn't a lock failure.


    cheers

    andrew
  • Joe Abbate at Oct 2, 2011 at 9:16 pm
    Hi Andrew,
    On 10/01/2011 09:46 PM, Andrew Dunstan wrote:

    On 10/01/2011 05:48 PM, Joe Abbate wrote:
    On 10/01/2011 05:08 PM, Andrew Dunstan wrote:
    There is also this gem of behaviour, which is where I started:

    p1                          p2
    begin;
    drop view foo;
                                pg_dump
    commit;
                                boom.

    with this error:

    2011-10-01 16:38:20 EDT [27084] 30063 ERROR: could not open
    relation with OID 133640
    2011-10-01 16:38:20 EDT [27084] 30064 STATEMENT: SELECT
    pg_catalog.pg_get_viewdef('133640'::pg_catalog.oid) AS viewdef

    Of course, this isn't caused by having a large catalog, but it's
    terrible nevertheless. I'm not sure what to do about it.
    Couldn't you run pg_dump with --lock-wait-timeout?
    How would that help? This isn't a lock failure.
    I misinterpreted the error. I have confirmed the behavior you see. I'm
    somewhat surprised there appears to be no ability to lock a database
    exclusively for something like pg_dump/pg_restore, so that it isn't
    surprised by concurrent activity against the catalogs. I'm guessing the
    assumption is that MVCC will take care of that?

    Joe
  • Florian Pflug at Oct 3, 2011 at 8:31 am

    On Oct2, 2011, at 23:15 , Joe Abbate wrote:
    I'm
    somewhat surprised there appears to be no ability to lock a database
    exclusively for something like pg_dump/pg_restore, so that it isn't
    surprised by concurrent activity against the catalogs. I'm guessing the
    assumption is that MVCC will take care of that?
    I think the hope is more that it will, one day. Currently, postgres internally
    accesses the catalog with SnapshotNow, not with a MVCC snapshot. This is
    necessary to ensure, for example, that rows inserted into a table also get
    inserted into a newly created index. This wouldn't affect pg_dump if it only
    accessed the catalog via SQL, but it doesn't. pg_dump also depends on some server-side
    functions to do its work, and since these functions in turn use SnapshotNow-based
    internal backend functions, pg_dump essentially uses a mix of SnapshotNow and
    its transaction's MVCC snapshot.
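
    A rough SQL-level illustration of that mix, using the test schema from
    upthread (the view name is just an example; the middle step runs in a
    second session):

    -- session A, simulating pg_dump:
    begin transaction isolation level repeatable read;
    select oid from pg_class where relname = 'v_1';  -- establishes the snapshot

    -- session B, meanwhile:  drop view views.v_1;

    -- session A again:
    select oid from pg_class where relname = 'v_1';  -- still visible under the
                                                     -- transaction's MVCC snapshot
    select pg_get_viewdef((select oid from pg_class where relname = 'v_1'));
    -- on the servers discussed here that fails, because the function looks the
    -- relation up with SnapshotNow:
    --   ERROR:  could not open relation with OID ...
    commit;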

    There has been talk about reducing the use of SnapshotNow for catalog access,
    but AFAIK there's no concrete proposal, and certainly no patch, available.

    best regards,
    Florian Pflug
  • Robert Haas at Oct 3, 2011 at 3:04 pm

    On Sat, Oct 1, 2011 at 9:46 PM, Andrew Dunstan wrote:
    How would that help? This isn't a lock failure.
    It is, rather, a failure to lock. Currently, LOCK TABLE only works on
    tables, and pg_dump only applies it to tables. If the offending
    object had been a table rather than a view, pg_dump would (I believe)
    have blocked trying to obtain an AccessShareLock against the existing
    AccessExclusiveLock. We talked about allowing locks on other types of
    relations, but due to some bad syntax choices in the past it's not
    completely obvious how to shoehorn that in.
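
    Concretely, the statement pg_dump issues for each table, and the same
    attempt against a view, look roughly like this (object names are
    placeholders):

    begin;
    -- what pg_dump does for tables; this would block behind the concurrent
    -- AccessExclusiveLock taken by DROP:
    lock table public.some_table in access share mode;
    -- for a view, the same statement is rejected on the servers discussed
    -- here, roughly: ERROR: "v_1" is not a table
    lock table views.v_1 in access share mode;
    rollback;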

    http://archives.postgresql.org/pgsql-hackers/2011-06/msg00119.php

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Oct 3, 2011 at 3:21 pm

    Robert Haas writes:
    On Sat, Oct 1, 2011 at 9:46 PM, Andrew Dunstan wrote:
    How would that help? This isn't a lock failure.
    It is, rather, a failure to lock. Currently, LOCK TABLE only works on
    tables, and pg_dump only applies it to tables. If the offending
    object had been a table rather than a view, pg_dump would (I believe)
    have blocked trying to obtain an AccessShareLock against the existing
    AccessExclusiveLock.
    Yeah, and it would still have failed once the lock was released.

    Short of providing some sort of global DDL-blocking lock (with attendant
    performance consequences) it's not clear how to create an entirely
    bulletproof solution here. This isn't a new problem --- we've been
    aware of pg_dump's limitations in this area for many years.

    regards, tom lane
  • Tom Lane at Oct 3, 2011 at 4:47 am

    Andrew Dunstan writes:
    While investigating a client problem I just observed that pg_dump takes
    a surprisingly large amount of time to dump a schema with a large number
    of views. The client's hardware is quite spiffy, and yet pg_dump is
    taking many minutes to dump a schema with some 35,000 views. Here's a
    simple test case:
    create schema views;
    do 'begin for i in 1 .. 10000 loop execute $$create view views.v_$$ ||
    i ||$$ as select current_date as d, current_timestamp as ts,
    $_$a$_$::text || n as t, n from generate_series(1,5) as n$$; end
    loop; end;';
    On my modest hardware this database took 4m18.864s for pg_dump to run.
    It takes about that on my machine too ... with --enable-cassert.
    oprofile said that 90% of the runtime was going into AllocSetCheck,
    so I rebuilt without cassert, and the runtime dropped to 16 seconds.
    What were you testing?

    (Without cassert, it looks like LockReassignCurrentOwner is the next
    biggest time sink; I'm wondering if there's some sort of O(N^2) behavior
    in there.)

    regards, tom lane
  • Alvaro Herrera at Oct 3, 2011 at 3:38 pm

    Excerpts from Tom Lane's message of Mon Oct 03 01:47:18 -0300 2011:

    (Without cassert, it looks like LockReassignCurrentOwner is the next
    biggest time sink; I'm wondering if there's some sort of O(N^2) behavior
    in there.)
    That seems fishy. Even if there weren't quadratic behavior, should this
    be called at all? AFAIK it should only be used in cases where there are
    subtransactions at work, and I don't think pg_dump uses them.

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Tom Lane at Oct 3, 2011 at 4:02 pm

    Alvaro Herrera writes:
    Excerpts from Tom Lane's message of Mon Oct 03 01:47:18 -0300 2011:
    (Without cassert, it looks like LockReassignCurrentOwner is the next
    biggest time sink; I'm wondering if there's some sort of O(N^2) behavior
    in there.)
    That seems fishy. Even if there weren't quadratic behavior, should this
    be called at all? AFAIK it should only be used in cases where there are
    subtransactions at work, and I don't think pg_dump uses them.
    I wondered that too, but the calls are legit --- they're coming from
    PortalDrop.

    It appears that most of the calls don't actually have anything to do,
    but they're iterating through a rather large local lock table to find
    that out. We probably ought to think of a way to avoid that. The trick
    is to not make performance worse for typical small transactions that
    aren't holding many locks (which I think was the design center for this
    to begin with).

    regards, tom lane
  • Andrew Dunstan at Oct 3, 2011 at 3:39 pm

    On 10/03/2011 12:47 AM, Tom Lane wrote:
    Andrew Dunstan <andrew@dunslane.net> writes:
    While investigating a client problem I just observed that pg_dump takes
    a surprisingly large amount of time to dump a schema with a large number
    of views. The client's hardware is quite spiffy, and yet pg_dump is
    taking many minutes to dump a schema with some 35,000 views. Here's a
    simple test case:
    create schema views;
    do 'begin for i in 1 .. 10000 loop execute $$create view views.v_$$ ||
    i ||$$ as select current_date as d, current_timestamp as ts,
    $_$a$_$::text || n as t, n from generate_series(1,5) as n$$; end
    loop; end;';
    On my modest hardware this database took 4m18.864s for pg_dump to run.
    It takes about that on my machine too ... with --enable-cassert.
    oprofile said that 90% of the runtime was going into AllocSetCheck,
    so I rebuilt without cassert, and the runtime dropped to 16 seconds.
    What were you testing?
    Yeah, you're right, that must have been it. That's a big hit; I didn't
    realise cassert was so heavy. (Note to self: test with production build
    settings.) I don't seem to be batting 1000 ...
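
    As a quick sanity check that a given server is a production-style build
    (i.e. not compiled with --enable-cassert), one can ask it directly:

    show debug_assertions;   -- 'on' means the build has assertions enabled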

    I still need to get to the bottom of why the client's machine is taking
    so long.

    I do notice that we seem to be doing a lot of repetitive tasks, e.g.
    calling pg_format_type() over and over for the same arguments. Would we
    be better off caching that?

    cheers

    andrew
