We saw some strangeness when we started investigating Unicode, too.
The first question I have, though, may seem non-intuitive: what is
the NLS_LANG parameter on the client side, as viewed from the UTF
database?
Here's what we learned: if the client and the database are using
the same NLS_LANG value, then Oracle presumes the data to be
correct and does no checking. We learned this when we found 8-bit
extended ASCII characters in a US7ASCII database; how the heck
did they get in there? From a client that was also using US7ASCII
as its NLS_LANG.
When we changed the client characterset to WE8ISO8859, we
found that Oracle zeroed the high-order bit and US7ASCII-ized
the characters on the way in.
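To see why that mangles the data, here's a minimal sketch in plain Java (no database involved) of what zeroing the high-order bit does to an ISO-8859-1 byte. The character values are real; the masking is only an illustration of the conversion described above, not Oracle's actual code path:

```java
public class HighBitDemo {
    public static void main(String[] args) {
        int we = 0xE9;             // e-acute in ISO-8859-1 (WE8ISO8859P1)
        int stripped = we & 0x7F;  // zero the high-order bit
        // The accented character collapses onto an unrelated ASCII one
        System.out.println((char) stripped);  // prints "i" (0x69)
    }
}
```

So the "US7ASCII-ized" result isn't even a plausible unaccented substitute; it's whatever 7-bit character happens to share the low seven bits.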
You might try setting your client to a WE characterset and see if
you can get the characters loaded correctly. We haven't tried
this ourselves, but it might work.
Sent: Friday, September 17, 2004 5:58 PM
Subject: UTF character set application problem
We're having a problem with character sets. Recently we switched our
database to UTF, and now names containing accented characters, etc.,
generate errors when we try to insert them.
The data originates from a database that uses a Western European
character set. We expected that, UTF being a superset, there would be
no problems switching. However, after a lot of testing, we found that
UTF is not compatible with WE characters: if the data originates as
WE, you must either store it in a WE database or do an explicit
translation to UTF.
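For what it's worth, the explicit translation step can be sketched in plain Java, outside JDBC. The Java charset names ISO_8859_1 and UTF_8 stand in for Oracle's WE and UTF labels here; which WE variant is actually in play is an assumption:

```java
import java.nio.charset.StandardCharsets;

public class ExplicitTranslation {
    public static void main(String[] args) {
        byte[] weBytes = { (byte) 0xE9 };  // e-acute as one ISO-8859-1 byte
        // Decode with the charset the data was born in...
        String decoded = new String(weBytes, StandardCharsets.ISO_8859_1);
        // ...then re-encode as UTF-8 before inserting into the UTF database
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        // One WE byte becomes the two-byte UTF-8 sequence C3 A9
        System.out.printf("%02X %02X%n", utf8[0] & 0xFF, utf8[1] & 0xFF);
    }
}
```

The point is that the same character occupies one byte in WE and two bytes in UTF-8, so raw WE bytes handed to a UTF database are not valid UTF-8 and something has to do this re-encoding.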
This is counter-intuitive to me, but this is my first experience
with different character sets.
The application is in Java using thin JDBC drivers and no conversion
functions. We created a very simple test program to prove out this
behavior.
We've tested this on 9iR1, 9iR2, and 8i and it works the same.
Has anyone else encountered this? Is it just a misconception on my
part? Or have I overlooked something?