So, we are still having an issue with this and I thought I'd throw this
out to the list to see if I'm missing something. Basically, we have
identified the tables/fields we need to convert. I'm running the
following perl code against the fields and re-inserting the 'fixed' code
into the field:

$data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8))
|| (ord($1) == 11)
|| ((ord($1) >= 13) && (ord($1) <= 31))
|| ((ord($1) >= 127)) ? "" : $1/egs;

This appears to be working, as a large number of records are cleaned.
The problem is that somehow it's not fixing data that contains the hex
value 0xbd: when I attempt to dump this database and create a new one
with the UTF8 encoding, I get the following error:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 5246; 0 4978675 TABLE
DATA cust postgres
pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence
for encoding "UTF8": 0xbd

As I see it, the Perl code above should catch this '0xbd' character, but
somehow it is finding its way through.
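
To show what I expect the filter to do, here is a minimal standalone
sketch. It assumes the field value is a raw byte string, and it
restates the substitution above as a negated character class that
keeps only tab, newline, form feed and printable ASCII. The name
strip_bytes and the sample string are made up for illustration and
are not part of our actual conversion script.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative only): remove every character
# except tab (0x09), newline (0x0a), form feed (0x0c) and printable
# ASCII (0x20-0x7e). This is the same set the substitution above keeps.
sub strip_bytes {
    my ($data) = @_;
    $data =~ s/[^\x09\x0a\x0c\x20-\x7e]//g;
    return $data;
}

# Made-up sample containing the 0xbd byte from the error plus a control byte.
my $sample = "1\xbd cups\x07";
my $clean  = strip_bytes($sample);

print "0xbd is gone\n" if index($clean, "\xbd") < 0;
```

If a check like this passes, the filter itself handles 0xbd, which
would suggest the offending byte lives in a table or field the script
never touched.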

Any insights would be greatly appreciated.

Until later, Geoffrey

"I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them."
- Thomas Jefferson

Discussion Overview
group: pgsql-general @
posted: Mar 29, '11 at 7:38p
active: Mar 29, '11 at 7:38p

1 user in discussion

Geoffrey Myers: 1 post


