Bosco Rama writes:
> If 'standard_conforming_strings = on' is set in our DB (which is required for
> our app) then the piped restore method (e.g. pg_restore -O backup.dat | psql)
> results in the large objects being corrupted.
> All servers and client tools involved are PG 8.4.6 on Ubuntu Server 10.04.1 LTS
> with all current updates applied.
I've been able to replicate this in 8.4; it doesn't happen in 9.0
(but probably does in all 8.x versions).

The problem is that pg_dump (or in this case really pg_restore) is
relying on libpq's PQescapeBytea() to format the bytea literal that
will be given as argument to lowrite() during the restore. When
pg_dump is producing SQL directly, or when pg_restore is connected
to a database, PQescapeBytea() mooches the standard_conforming_strings
value from the active libpq connection and gets the right answer.
In the single case where pg_restore is producing SQL without ever
opening a database connection, PQescapeBytea doesn't know what to do
and defaults to the old non-standard-strings behavior. Unfortunately,
pg_restore has already set standard_conforming_strings = on earlier in the
script (if it was set in the original source database), so you get the
wrong thing: the extra backslashes survive into the restored data.
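
For illustration, here is a quick libpq test program (a sketch of mine, not
anything in pg_dump; the connection string is a placeholder). The expected
output in the comments assumes an 8.4-era libpq and, for the second call, a
server with standard_conforming_strings = on:

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        const unsigned char blob[] = {0x01, 0x00, 0x02};
        size_t          outlen;
        unsigned char  *esc;

        /* With no connection ever opened, PQescapeBytea() falls back to
         * the historical behavior and doubles every backslash. */
        esc = PQescapeBytea(blob, sizeof(blob), &outlen);
        printf("no connection:   '%s'\n", (char *) esc);   /* \\001\\000\\002 */
        PQfreemem(esc);

        /* PQescapeByteaConn() learns standard_conforming_strings from the
         * live connection and escapes accordingly. */
        PGconn *conn = PQconnectdb("dbname=postgres");
        if (PQstatus(conn) == CONNECTION_OK)
        {
            esc = PQescapeByteaConn(conn, blob, sizeof(blob), &outlen);
            printf("with connection: '%s'\n", (char *) esc);  /* \001\000\002 */
            PQfreemem(esc);
        }
        PQfinish(conn);
        return 0;
    }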

The bottom line is that pg_dump can't depend on libpq's PQescapeBytea,
but needs its own copy. We have in fact done that as of 9.0, which is
what I was vaguely remembering:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_0_BR [b1732111f] 2009-08-04 21:56:09 +0000

Fix pg_dump to do the right thing when escaping the contents of large objects.

The previous implementation got it right in most cases but failed in one:
if you pg_dump into an archive with standard_conforming_strings enabled, then
pg_restore to a script file (not directly to a database), the script will set
standard_conforming_strings = on but then emit large object data as
nonstandardly-escaped strings.

At the moment the code is made to emit hex-format bytea strings when dumping
to a script file. We might want to change to old-style escaping for backwards
compatibility, but that would be slower and bulkier. If we do, it's just a
matter of reimplementing appendByteaLiteral().

This has been broken for a long time, but given the lack of field complaints
I'm not going to worry about back-patching.
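
For reference, a rough sketch (not the committed 9.0 code) of what a
hex-format emitter along the lines of appendByteaLiteral() has to produce;
with hex format the only quoting decision left is whether the single
leading backslash must be doubled for a non-standard-strings script:

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical stand-in for 9.0's appendByteaLiteral(): write a
     * hex-format bytea literal such as '\x01de' (scs = on) or
     * '\\x01de' (scs = off, where the string parser eats one backslash). */
    static void
    emit_bytea_hex(FILE *out, const unsigned char *data, size_t len,
                   int std_strings)
    {
        static const char hextab[] = "0123456789abcdef";
        size_t  i;

        fputs(std_strings ? "'\\x" : "'\\\\x", out);
        for (i = 0; i < len; i++)
        {
            fputc(hextab[data[i] >> 4], out);
            fputc(hextab[data[i] & 0xf], out);
        }
        fputc('\'', out);
    }

Called with the bytes {0x01, 0xde}, this writes '\x01de' or '\\x01de'
respectively.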

I'm not sure whether this new complaint is enough reason to reconsider
back-patching. We cannot just backport the 9.0 patch, since it assumes
it can do bytea hex output --- we'd need to emit old style escaped
output instead. So it's a bit of work, and more to the point would
involve pushing poorly-tested code into stable branches. I doubt it
would go wrong, but in the worst-case scenario we might create failures
for blob-restore cases that work now.
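
To make the alternative concrete, here is a rough sketch (untested against
any branch) of the old-style escaping a back-patched version would need to
emit; every backslash in the bytea escape format gets doubled again when
standard_conforming_strings is off:

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical escape-format (pre-hex) bytea literal writer.
     * std_strings says whether the restore script has already issued
     * SET standard_conforming_strings = on. */
    static void
    emit_bytea_escape(FILE *out, const unsigned char *data, size_t len,
                      int std_strings)
    {
        size_t  i;

        fputc('\'', out);
        for (i = 0; i < len; i++)
        {
            unsigned char c = data[i];

            if (c == '\\')
                /* bytea wants \\ ; doubled again when backslash is still
                 * a string-literal escape character */
                fputs(std_strings ? "\\\\" : "\\\\\\\\", out);
            else if (c < 0x20 || c > 0x7e || c == '\'')
                /* octal escape: \ooo, or \\ooo when scs is off */
                fprintf(out, std_strings ? "\\%03o" : "\\\\%03o", c);
            else
                fputc(c, out);
        }
        fputc('\'', out);
    }

For the bytes {0x01, 0x5c, 0x61} this writes '\001\\a' with
standard_conforming_strings on, and '\\001\\\\a' without it --- exactly the
distinction that the connectionless PQescapeBytea() cannot make.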

So I'm not sure whether to fix it, or leave it as a known failure case
in old branches. Comments?

regards, tom lane

  • Vivek Khera at Jan 21, 2011 at 1:47 pm

    On Thu, Jan 20, 2011 at 6:14 PM, Tom Lane wrote:
    > So I'm not sure whether to fix it, or leave it as a known failure case
    > in old branches.  Comments?

    Since there is a workaround, I think it is best to document it and
    leave it as-is.
  • Bosco Rama at Jan 21, 2011 at 5:44 pm

    Tom Lane wrote:
    > So I'm not sure whether to fix it, or leave it as a known failure case
    > in old branches. Comments?

    I understand the reluctance to fool with stable code. I have zero insight
    into your installed versions distribution and backward compatibility needs
    so any comment I may have here is purely selfish.

    As an end user there is one area of the DB that I want to work correctly
    100% of the time and that is the dump/restore tool(s). If it's not going
    to work under certain circumstances it should at least tell me so and
    fail. I don't think having the tool appear to work while corrupting the
    data (even if documented as doing so) is a viable method of operation.

    Just my $0.02

    Bosco.
  • Robert Haas at Jan 21, 2011 at 5:55 pm

    On Fri, Jan 21, 2011 at 12:44 PM, Bosco Rama wrote:
    > Tom Lane wrote:
    >> So I'm not sure whether to fix it, or leave it as a known failure case
    >> in old branches.  Comments?
    >
    > I understand the reluctance to fool with stable code.  I have zero insight
    > into your installed versions distribution and backward compatibility needs
    > so any comment I may have here is purely selfish.
    >
    > As an end user there is one area of the DB that I want to work correctly
    > 100% of the time and that is the dump/restore tool(s).  If it's not going
    > to work under certain circumstances it should at least tell me so and
    > fail.  I don't think having the tool appear to work while corrupting the
    > data (even if documented as doing so) is a viable method of operation.

    Yeah, I lean toward saying we should back-patch this. It'll be
    lightly tested, but it's a pretty confined change, so it's unlikely to
    break anything else. ISTM the worst-case scenario is that it takes
    two minor releases to get it right, and even that seems fairly
    unlikely.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Jan 21, 2011 at 7:28 pm

    Robert Haas writes:
    > On Fri, Jan 21, 2011 at 12:44 PM, Bosco Rama wrote:
    >> Tom Lane wrote:
    >>> So I'm not sure whether to fix it, or leave it as a known failure case
    >>> in old branches.  Comments?
    >>
    >> As an end user there is one area of the DB that I want to work correctly
    >> 100% of the time and that is the dump/restore tool(s).
    >
    > Yeah, I lean toward saying we should back-patch this.

    Fair enough, I'll go do it. I just wanted to hear at least one other
    person opine that it was worth taking some risk for.

    regards, tom lane
