FAQ
Hello,

As a one-off attempt to change a large table's 'bytea' column to
'text' with minimal I/O (where the 'bytea' contents is already valid
UTF8 and the database encoding is also UTF8, and the column is not
part of any index or anything involving collation), how unsafe is the
following?

UPDATE pg_attribute SET atttypid='text'::regtype::oid WHERE
attrelid=('schema_name.table_name')::regclass AND attname='col_name'
AND atttypid='bytea'::regtype::oid;

Additionally, if the 'bytea' happenned to also explicitly contain a
trailing NUL prior to the "conversion" (i.e. after the hack, the last
byte in the 'text' value would be NUL), would there be any obvious
problems with the above hack?

Thanks,
V.

Search Discussions

  • Noah Misch at May 11, 2011 at 8:11 am

    On Tue, May 10, 2011 at 04:31:37PM -0400, Vlad Romascanu wrote:
    As a one-off attempt to change a large table's 'bytea' column to
    'text' with minimal I/O (where the 'bytea' contents is already valid
    UTF8 and the database encoding is also UTF8, and the column is not
    part of any index or anything involving collation), how unsafe is the
    following?

    UPDATE pg_attribute SET atttypid='text'::regtype::oid WHERE
    attrelid=('schema_name.table_name')::regclass AND attname='col_name'
    AND atttypid='bytea'::regtype::oid;
    Do a "LOCK TABLE schema_name.table_name" earlier in the same transaction.

    Any of the following potentially complicates things:
    - data is not valid text in the server encoding, including NUL bytes
    - use in an index (you mentioned this does not apply)
    - use in a CHECK or FOREIGN KEY constraint
    - default expression on the column
    - views or rules referencing the column
    - referenced directly in a trigger definition (UPDATE OF col_name, WHEN ...)

    A useful, though not foolproof, way to check whether you've snagged on any of
    those is to take schema dumps of a test database after (a) doing the change
    normally and (b) doing it this way, then compare.
    Additionally, if the 'bytea' happenned to also explicitly contain a
    trailing NUL prior to the "conversion" (i.e. after the hack, the last
    byte in the 'text' value would be NUL), would there be any obvious
    problems with the above hack?
    Yes; this will break things in the general case. Two text datums that differ
    only in the presence of this trailing NUL will compare as unequal by texteq().
    The octet_length() function will include the NUL byte in its count. If you use
    the column in very restricted ways, you might end up okay. Note that you could
    avoid these problems by updating the bytea columns to remove these NUL bytes
    before making the catalog update. It could still be much I/O, but at least you
    wouldn't have a table lock while doing it.

    nm

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedMay 10, '11 at 8:38p
activeMay 11, '11 at 8:11a
posts2
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Noah Misch: 1 post Vlad Romascanu: 1 post

People

Translate

site design / logo © 2021 Grokbase