FAQ
I enclose a complete patch for avoiding WAL usage for CREATE TABLE AS
SELECT, when not in archive mode (PITR). The main use case for this is
large BI environments that create summary tables or prejoined tables,
though there are many general applications.

There is no user interface for this. The speed gain is automatic, when
archiving is not enabled.

This contains all the lower level machinery required to do the same
thing for COPY, as discussed on hackers. The machinery includes some
additional freespace thinkery, aimed mainly at the forthcoming COPY
patch, which solely needs to be integrated with Alon's work.

Patch is diff -c format, compiles and make checks on cvstip as of now.

No performance tests *on this patch*, though the general principle has
already been proven via a similar prototype patch not published on list.

Best Regards, Simon Riggs

Search Discussions

  • Bruce Momjian at Jun 4, 2005 at 7:46 pm

    Simon Riggs wrote:
    I enclose a complete patch for avoiding WAL usage for CREATE TABLE AS
    SELECT, when not in archive mode (PITR). The main use case for this is
    large BI environments that create summary tables or prejoined tables,
    though there are many general applications.

    There is no user interface for this. The speed gain is automatic, when
    archiving is not enabled.
    Could we do your NOLOGGING automatically in COPY if we test to see if
    anyone else is connected to our current database? I would _love_ to see
    pg_dump loads use this automatically, without having to add clauses to
    pg_dump output.

    I think we decided we can't do it automatically for all zero-row COPYs
    because of locking concerns.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Neil Conway at Jun 5, 2005 at 12:21 am

    Bruce Momjian wrote:
    Could we do your NOLOGGING automatically in COPY if we test to see if
    anyone else is connected to our current database?
    That seems pretty fragile -- what happens if someone connects after the
    COPY has started? Considering that many COPY operations can take many
    minutes or hours, I don't think it is wise to make assumptions based on
    the initial state of the system.
    I would _love_ to see pg_dump loads use this automatically, without
    having to add clauses to pg_dump output.
    What's wrong with adding clauses to the pg_dump output?

    -Neil
  • Bruce Momjian at Jun 5, 2005 at 12:23 am

    Neil Conway wrote:
    Bruce Momjian wrote:
    Could we do your NOLOGGING automatically in COPY if we test to see if
    anyone else is connected to our current database?
    That seems pretty fragile -- what happens if someone connects after the
    COPY has started? Considering that many COPY operations can take many
    minutes or hours, I don't think it is wise to make assumptions based on
    the initial state of the system.
    I would _love_ to see pg_dump loads use this automatically, without
    having to add clauses to pg_dump output.
    What's wrong with adding clauses to the pg_dump output?
    Well, it isn't going to help us for 8.1 because 8.0 will not have it,
    and if we add the clause we make loading the data into previous releases
    harder.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Neil Conway at Jun 5, 2005 at 12:30 am

    Bruce Momjian wrote:
    Well, it isn't going to help us for 8.1 because 8.0 will not have it,
    and if we add the clause we make loading the data into previous releases
    harder.
    pg_dump output in general is not compatible with prior releases. It
    would be a nice feature to have, but until we have it, I don't see that
    changing or not changing the COPY syntax will make a major difference to
    dump backward compatibility.

    -Neil
  • Russell Smith at Jun 5, 2005 at 4:51 am

    On Sun, 5 Jun 2005 10:29 am, Neil Conway wrote:
    Bruce Momjian wrote:
    Well, it isn't going to help us for 8.1 because 8.0 will not have it,
    and if we add the clause we make loading the data into previous releases
    harder.
    pg_dump output in general is not compatible with prior releases. It
    would be a nice feature to have, but until we have it, I don't see that
    changing or not changing the COPY syntax will make a major difference to
    dump backward compatibility.
    Don't we usually suggest using the new pg_dump to dump the old database anyway?

    If that's the case, then we just add the locking options in there. Otherwise, yes you are
    stuck with the original locking mechanism. But if people are smart and want faster loading
    they will play with sed and friends to make it work.

    Even if people for 8.1 just get the supposed 500% speed increase because of a better parser,
    lots of people will be happy.

    Regards

    Russell Smith
  • Simon Riggs at Jun 5, 2005 at 8:11 am

    On Sun, 2005-06-05 at 10:20 +1000, Neil Conway wrote:
    Bruce Momjian wrote:
    Could we do your NOLOGGING automatically in COPY if we test to see if
    anyone else is connected to our current database?
    Remember that this patch doe NOT yet handle COPY, but that is planned...
    That seems pretty fragile -- what happens if someone connects after the
    COPY has started? Considering that many COPY operations can take many
    minutes or hours, I don't think it is wise to make assumptions based on
    the initial state of the system.
    Agreed.

    Best Regards, Simon Riggs
  • Bruce Momjian at Jun 20, 2005 at 6:41 pm
    Tom has applied this patch. Thanks.

    ---------------------------------------------------------------------------

    Simon Riggs wrote:
    I enclose a complete patch for avoiding WAL usage for CREATE TABLE AS
    SELECT, when not in archive mode (PITR). The main use case for this is
    large BI environments that create summary tables or prejoined tables,
    though there are many general applications.

    There is no user interface for this. The speed gain is automatic, when
    archiving is not enabled.

    This contains all the lower level machinery required to do the same
    thing for COPY, as discussed on hackers. The machinery includes some
    additional freespace thinkery, aimed mainly at the forthcoming COPY
    patch, which solely needs to be integrated with Alon's work.

    Patch is diff -c format, compiles and make checks on cvstip as of now.

    No performance tests *on this patch*, though the general principle has
    already been proven via a similar prototype patch not published on list.

    Best Regards, Simon Riggs
    [ Attachment, skipping... ]
    ---------------------------(end of broadcast)---------------------------
    TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Tom Lane at Jun 20, 2005 at 6:50 pm

    Simon Riggs writes:
    I enclose a complete patch for avoiding WAL usage for CREATE TABLE AS
    SELECT, when not in archive mode (PITR). The main use case for this is
    large BI environments that create summary tables or prejoined tables,
    though there are many general applications.
    Applied after heavy corrections --- there were a number of things wrong
    with this "simple" patch, starting with having gotten the tests
    backwards :-(, and extending to not having actually flushed the data
    before commit (smgrimmedsync isn't enough, you have to
    FlushRelationBuffers).

    A consideration we had all missed in the original discussions is that
    if the transaction doesn't emit any WAL records at all,
    RecordTransactionCommit will think that it need not WAL-log the
    transaction commit, leading to the possibility that the commit is lost
    even though all the data is preserved :-(

    This is not a hazard for CREATE TABLE AS, since it will certainly have
    emitted WAL records while creating the table's catalog entries. It will
    be a very real hazard for COPY however. The cleanest solution I can
    think of is that the COPY code should emit a WAL record for the first
    tuple copied in, but not for later ones. To this end, I separated the
    "use_wal" and "use_fsm" aspects of what the patch was doing.

    I didn't apply the freespace.c changes either; that struck me as a
    serious kluge with no real benefit. We can just omit updating the FSM's
    running average, if it even has one. (ISTM there's a reasonable
    argument to be made that the tuple sizes during CREATE/COPY might not be
    representative of later requests anyway.)

    Patch as applied is attached.

    regards, tom lane
  • Simon Riggs at Jun 20, 2005 at 8:56 pm

    On Mon, 2005-06-20 at 14:50 -0400, Tom Lane wrote:
    Simon Riggs <simon@2ndquadrant.com> writes:
    I enclose a complete patch for avoiding WAL usage for CREATE TABLE AS
    SELECT, when not in archive mode (PITR). The main use case for this is
    large BI environments that create summary tables or prejoined tables,
    though there are many general applications.
    Applied Thanks
    after heavy corrections --- there were a number of things wrong
    with this "simple" patch, starting with having gotten the tests
    backwards :-(
    Sorry, I thought I had corrected that error before submission. I was
    aware that I had made that error earlier.
    and extending to not having actually flushed the data
    before commit (smgrimmedsync isn't enough, you have to
    FlushRelationBuffers).
    I followed the logic as seen in nbtsort.c as you suggested. That code
    doesn't perform a FlushRelationBuffers and it looks like I fooled myself
    into thinking the CTAS/SELECT INTO case was also in local.

    Perhaps we should be building CTAS/SELECT INTO in local buffers anyway?
    It looks like we could save time by avoiding shared_buffers completely
    and build up a whole page before writing it anywhere. (But thats a story
    for another day).

    Perhaps this is also related to metapage errors, since the metapage is
    always the last page to be written?
    A consideration we had all missed in the original discussions is that
    if the transaction doesn't emit any WAL records at all,
    RecordTransactionCommit will think that it need not WAL-log the
    transaction commit, leading to the possibility that the commit is lost
    even though all the data is preserved :-(
    This is not a hazard for CREATE TABLE AS, since it will certainly have
    emitted WAL records while creating the table's catalog entries. It will
    be a very real hazard for COPY however.
    OK, but I haven't written that patch yet!
    The cleanest solution I can
    think of is that the COPY code should emit a WAL record for the first
    tuple copied in, but not for later ones. To this end, I separated the
    "use_wal" and "use_fsm" aspects of what the patch was doing.
    Not very clean, but will do as you suggest.
    I didn't apply the freespace.c changes either; that struck me as a
    serious kluge with no real benefit. We can just omit updating the FSM's
    running average, if it even has one. (ISTM there's a reasonable
    argument to be made that the tuple sizes during CREATE/COPY might not be
    representative of later requests anyway.)
    I was striving for completeness only. I was doubtful about that part of
    the patch, but thought I'd add that rather than have you say I hadn't
    thought about the FSM avg_request_size.

    I put those changes in mainly for COPY. If you don't make any request at
    all to FSM then a relation never gets to the MRU relation FSM list. I
    agree that it is not strictly necessary, but leaving it off would be a
    change in behaviour, since COPY did previously cause the relation to get
    to the MRU. That could be a problem, since a relation might not then be
    allocated any FSM pages following a vacuum.

    Best Regards, Simon Riggs
  • Alvaro Herrera at Jun 20, 2005 at 9:10 pm

    On Mon, Jun 20, 2005 at 09:55:12PM +0100, Simon Riggs wrote:

    I put those changes in mainly for COPY. If you don't make any request at
    all to FSM then a relation never gets to the MRU relation FSM list. I
    agree that it is not strictly necessary, but leaving it off would be a
    change in behaviour, since COPY did previously cause the relation to get
    to the MRU. That could be a problem, since a relation might not then be
    allocated any FSM pages following a vacuum.
    Is that a problem? If the pages don't fit in FSM, then maybe the system
    is misconfigured anyway. The person running the DW should just increase
    the FSM settings, which is hardly a costly thing because it uses so
    little memory.

    --
    Alvaro Herrera (<alvherre[a]surnet.cl>)
    "No renuncies a nada. No te aferres a nada."
  • Simon Riggs at Jun 20, 2005 at 9:19 pm

    On Mon, 2005-06-20 at 17:09 -0400, Alvaro Herrera wrote:
    On Mon, Jun 20, 2005 at 09:55:12PM +0100, Simon Riggs wrote:

    I put those changes in mainly for COPY. If you don't make any request at
    all to FSM then a relation never gets to the MRU relation FSM list. I
    agree that it is not strictly necessary, but leaving it off would be a
    change in behaviour, since COPY did previously cause the relation to get
    to the MRU. That could be a problem, since a relation might not then be
    allocated any FSM pages following a vacuum.
    Is that a problem?
    Not for me, but I wanted to explain the change in behaviour that
    implies.
    If the pages don't fit in FSM, then maybe the system
    is misconfigured anyway. The person running the DW should just increase
    the FSM settings, which is hardly a costly thing because it uses so
    little memory.
    If you aren't on the relation list you don't get any more pages than the
    minimum. No matter how many fsm_pages you allocate. If fsm_pages covers
    everything, then you are right, there is no problem.

    Best Regards, Simon Riggs

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-patches @
categoriespostgresql
postedJun 3, '05 at 10:11p
activeJun 20, '05 at 9:19p
posts12
users6
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase