Hello,

I am having some quite serious problems dumping and restoring large databases (>4 GB of
dump).

I had 5 databases; 4 dumped OK, but the 5th and largest failed to dump: I was unable to
make a dump in the default 'tar' format. I got this message:

pg_dump: [tar archiver] archive member too large for tar format

I got over this issue by using the 'custom' format.
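
In practice that meant switching the invocation roughly as follows (a sketch; the
database name is a placeholder):

# 'tar' format -- failed with the message above
pg_dump --format tar --file db.tar.dump dbname

# 'custom' format -- dumped without error
pg_dump --format custom --file db.dump dbname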

Unfortunately, I was later only able to restore 3 of the 5 databases -- both of the
dumps that would exceed 4 GB in the 'tar' format failed.

/var/tmp# ls -l dumps/
total 16294020
-rw-r--r-- 1 root root 742611968 2010-10-16 20:36 archiv1.dump
-rw-r--r-- 1 root root 317352448 2010-10-16 20:37 archiv2.dump
-rw-r--r-- 1 root root 1137477632 2010-10-16 20:41 archiv3.dump
-rw-r--r-- 1 root root 3712833536 2010-10-16 20:57 archiv4.dump
-rw-r--r-- 1 root root 8735429632 2010-10-16 21:35 archiv5.dump
-rw-r--r-- 1 root root 1253325453 2010-10-16 22:59 bb.dump

archiv1-4 restored OK; archiv5 -- the largest, in the 'tar' format -- failed, and
bb.dump, which is in the 'custom' format, failed too.

I got these messages:

for archiv5 in the 'tar' format:

pg_restore: [tar archivář] nalezena poškozená tar hlavička v STEX
(předpokládáno 100, vypočteno 34044) pozice souboru 7750193152

sorry, it's in my native locale, but it says "found a corrupted tar header in STEX
(expected 100, calculated 34044), file position 7750193152".

for the bb.dump in the 'custom' format:

pg_restore: [vlastní archivář] unexpected end of file

'vlastní archivář' is again my locale; the English message reads "custom archiver".

Later I tried to use the -I and -i switches of pg_restore to restore the data that sit
in the archive behind the table that failed to restore, but I got the same error
message.

The dump was created with postgresql-8.3 8.3.3-1~bpo40+1 from Debian backports. I tried
restoring on that version, then with postgresql-8.4 8.4.5-1~bpo50+1 from Debian
backports, and finally with the 64-bit postgresql-8.4 8.4.5-0ubuntu10.10. No change;
still the same error messages.

I welcome any help and/or hints on this issue as I need to dump and restore several
large databases.

Regards,
--
Mgr. Martin Povolný, soLNet, s.r.o.,
+420777714458, martin.povolny@solnet.cz


  • Lst_hoe02 at Oct 26, 2010 at 7:31 pm
  • Samuel Stearns at Oct 26, 2010 at 9:51 pm
    You can also try piping the dump through gzip and then restoring using cat:

    pg_dumpall | gzip > db.out-`date +\%Y\%m\%d\%H`.gz

    cat db.out-`date +\%Y\%m\%d\%H`.gz | gunzip | psql template1

    Sam



  • Jehan-Guillaume (ioguix) de Rorthais at Oct 26, 2010 at 10:22 pm
    Or even compress AND split it!
    pg_dump -Fc dbname | split -b 1G - dump_dbname

    and restore:
    cat dump_dbname* | pg_restore -d dbname

    or
    cat dump_dbname* | pg_restore | psql dbname

  • Mark at Oct 27, 2010 at 2:41 am
    A long time ago (8.1.11, IIRC) we got much better speed by not using pg_dump's compression flag and instead piping to gzip (or, better yet, something like pbzip2 or pigz, though I haven't used those).


    I think there was a thread about this that had a test case and numbers.


    IIRC it's because the compression flag further bottlenecks a single core. With a pipe, the compression can be done on another core (or cores, if using pbzip2 or pigz) and throughput will be faster.
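
    A minimal sketch of the two variants (pigz shown as the parallel drop-in; database and file names are placeholders):

    # compression done inside pg_dump, on its single core
    pg_dump -Fc -Z 6 -f db.dump dbname

    # compression in a separate process, free to run on another core
    pg_dump -Fc -Z 0 dbname | gzip > db.dump.gz

    # or with a parallel compressor
    pg_dump -Fc -Z 0 dbname | pigz > db.dump.gz

    # (gunzip the file back before passing it to pg_restore)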




    On the restore side hopefully people are now able to use parallel restore to improve things when reloading.
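
    For example, something like this (assumes a custom-format archive and pg_restore 8.4 or newer; -j sets the number of parallel jobs):

    pg_restore -j 4 -d dbname db.dump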


    Just my thoughts,



    ~mark


  • Jehan-Guillaume (ioguix) de Rorthais at Oct 27, 2010 at 9:43 am

    On 27/10/2010 04:41, mark wrote:
    A long time ago (8.1.11, IIRC) we got much better speed by not using pg_dump's compression flag and instead piping to gzip (or, better yet, something like pbzip2 or pigz, though I haven't used those).

    I think there was a thread about this that had a test case and numbers.

    IIRC it's because the compression flag further bottlenecks a single core. With a pipe, the compression can be done on another core (or cores, if using pbzip2 or pigz) and throughput will be faster.
    Exactly. And it actually depends on the use case: if you need speed, your
    solution makes total sense indeed.

    If you can spend some more time using -Fc, then you'll be able to play
    with the -l and -L switches of pg_restore to decide what should be
    restored or not (with more control than -n, -t, etc. -- think of
    excluding Slony stuff, for instance), as sketched below.
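
    A minimal sketch of that workflow (the list-file name is arbitrary):

    # write the archive's table of contents to a list file
    pg_restore -l db.dump > db.list

    # edit db.list, commenting out (leading ';') the entries to skip,
    # then restore only what remains listed
    pg_restore -L db.list -d dbname db.dump
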
    On the restore side hopefully people are now able to use parallel restore to improve things when reloading.

    +1


  • Dimitri Fontaine at Oct 31, 2010 at 6:46 pm

    "Jehan-Guillaume (ioguix) de Rorthais" <ioguix@free.fr> writes:
    If you can spend some more time using -Fc, then you'll be able to play
    with the -l and -L switches of pg_restore to decide what should be
    restored or not (with more control than -n, -t, etc. -- think of
    excluding Slony stuff, for instance)
    See also pg_staging, which implements some commands for pg_restore
    filtering by schema. You can use that feature on its own.

    http://github.com/dimitri/pg_staging
    http://tapoueh.org/articles/blog/_pg_staging's_bird_view.html
    http://tapoueh.org/pgstaging.html

    OK, for now check the docs; later I'll try to blog about some features
    in pg_staging that you can use on their own, and maybe more about
    getting up to speed with pg_staging.

    From the commands docs:

    catalog::

    Show the filtered out catalog we'll give to +pg_restore -L+.

    From the configuration (INI) file docs:

    schemas::

    List of schema (comma separated) to restore. If present, any schema
    not listed here nor in +schemas_nodata+ will get filtered out from
    the +pg_restore+ catalog.

    schemas_nodata::

    List of schema (comma separated) to restore without content. The
    +pg_restore+ catalog +TABLE DATA+ sections will get filtered out.

    From the code comments:

    http://github.com/dimitri/pg_staging/blob/master/pgstaging/restore.py

    ##
    # In the catalog, we have such TRIGGER lines:
    #
    # 6236; 2620 15995620 TRIGGER jdb www_to_reporting_logger webadmin
    #
    # The TRIGGER code could depend on a procedure hosted in a schema that
    # we filter out. In this case, we want to also filter out the TRIGGER
    # itself.
    #
    #CREATE TRIGGER www_to_reporting_logger
    #AFTER INSERT OR DELETE OR UPDATE ON daily_journal
    #FOR EACH ROW
    #EXECUTE PROCEDURE pgq.logtriga('www_to_reporting', 'kkvvvvvvvvv', 'jdb.daily_journal');
    #
    # get_trigger_funcs will return a dict of
    # {'trigger_name': ['procedure']}

    Regards,
    --
    Dimitri Fontaine
    http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
  • Tom Lane at Oct 27, 2010 at 12:13 am

    Martin Povolny <martin.povolny@solnet.cz> writes:
    I had 5 databases; 4 dumped OK, but the 5th and largest failed to dump: I was unable to
    make a dump in the default 'tar' format. I got this message:
    pg_dump: [tar archiver] archive member too large for tar format
    This is expected: tar format has a documented limit of 8GB per table.
    (BTW, tar is not the "default" nor the recommended format, in part
    because of that limitation. The custom format is preferred unless
    you really *need* to manipulate the dump files with "tar" for some
    reason.)
    for the bb.dump in the 'custom' format:
    pg_restore: [vlastní archivář] unexpected end of file
    Hm, that's weird. I can't think of any explanation other than the dump
    file somehow getting corrupted. Do you get sane-looking output if you
    run "pg_restore -l bb.dump"?

    regards, tom lane
  • Martin Povolny at Oct 27, 2010 at 9:00 am

    On 27.10.2010, tgl@sss.pgh.pa.us wrote:
    Martin Povolny <martin.povolny@solnet.cz> writes:
    I had 5 databases; 4 dumped OK, but the 5th and largest failed to dump:
    I was unable to
    make a dump in the default 'tar' format. I got this message:
    pg_dump: [tar archiver] archive member too large for tar format
    This is expected: tar format has a documented limit of 8GB per table.
    (BTW, tar is not the "default" nor the recommended format, in part
    because of that limitation. The custom format is preferred unless
    you really *need* to manipulate the dump files with "tar" for some
    reason.)
    OK, I get it: don't use the 'tar' format. I will not.
    As to hitting the limit of 8 GB per table -- I have one really large table.
    But if I dump the table separately, I get:
    pg_dump --verbose --host localhost --username bb --create --format tar \
        --file archiv5-process.dump --table process archiv5

    -rw-r--r-- 1 root root 4879763968 2010-10-27 10:15 archiv5-process.dump

    In other words: I am sure I did not hit the 8 GB per-table limit, but I am over 4 GB
    for that table.

    The 'process' table is the largest and is also the one where restore fails in both
    cases (tar format and custom format).
    for the bb.dump in the 'custom' format:
    pg_restore: [vlastní archivář] unexpected end of file
    Hm, that's weird. I can't think of any explanation other than the dump
    file somehow getting corrupted. Do you get sane-looking output if you
    run "pg_restore -l bb.dump"?
    Sure, I ran pg_restore -l into a file and did not get any errors.
    Then I commented out the already-restored entries and tried restoring the tables
    that come after the table 'process'.
    But I got the same error message :-(
    Like this:
    $ /usr/lib/postgresql/8.4/bin/pg_restore -l bb.dump > bb.list
    # then edit bb.list, commenting out lines before and including table 'process',
    saving into bb.list-post-process
    $ /usr/lib/postgresql/8.4/bin/pg_restore --verbose --use-list bb.list-post-process
    bb.dump > bb-list-restore.sql
    pg_restore: restoring data for table "process_internet"
    pg_restore: [custom archiver] unexpected end of file
    pg_restore: *** aborted because of error
    As for splitting the dump, as suggested earlier in this thread: I am sure my system
    can handle files over 4 GB, and I don't understand how splitting the output of
    pg_dump would prevent the failure. But I can try that too.

    Also I did not try the '-F plain' dump format.

    I stopped using the plain format in the past because I was getting output as if
    I had used --inserts although I had not, and I don't see any option for pg_dump
    that would force the use of COPY for dumping data. But that was several PostgreSQL
    versions back, and I have not tried it since.
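
    If I retry the plain format, the invocation would be something like this (as far
    as I know, COPY is the default for data there, and INSERTs are emitted only with
    --inserts or --column-inserts):

    pg_dump --format plain dbname > db.sql
    psql -f db.sql dbname    # restore by replaying the SQL script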

    Many thanks for your time and tips!

    --
    Mgr. Martin Povolný, soLNet, s.r.o.,
    +420777714458, martin.povolny@solnet.cz
