FAQ
Hi,

I have a quick question that doesn't seem to be answered in the
documentation. We have a database that has a database character set of
WE8ISO8859P1 and a national character set of UTF-16. Currently we are not
using any NCHAR, NVARCHAR2 or NCLOB columns, but would like to soon.
However, we'd like to use UTF-8 instead of UTF-16. Since there should be
no conversion necessary, is there any reason why I can't just alter the
national character set to UTF-8? Or am I missing something?

While we are on the subject, can anyone recommend for/against UTF-8 or
UTF-16. I have read the pros and cons in the manual, but actual
experiences could change my mind.

Thanks,

Alan

Alan Aschenbrenner
Oracle DBA
IHS Group
alan.aschenbrenner_at_ihs.com

--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author:
INET: alan.aschenbrenner_at_ihs.com

Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------

To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).

Search Discussions

  • Faan DeSwardt at Nov 4, 2003 at 12:44 am
    Alan,

    Unfortunately UTF-8 and UTF-16 is in fact different in their binary representations and hence you will have to perform a full fledged conversion to map the UTF-16 characters to their UTF-8 binary representation. You can use the dump() SQL function to see this for yourself if you have both an UTF-8 and UTF-16 database. Just store the same string data in both and dump the binary representation to see the difference. Additionally, if you try to just alter the character set you will receive an error back from Oracle that UTF-16 is not an _exact_ subset of UTF-8.

    The reason for the difference is that UTF-16 always uses 2 bytes for encoding Unicode but UTF-8 can use up to 3 bytes in the extreme case and hence use completely different encoding schemes to both support Unicode. I know this sounds confusing at times but Unicode is the standard and UTF-8 and UTF-16 are some of the encoding schemes implemented to support different versions of Unicode.

    Referring to your second question about UTF-8 vs. UTF-16, I would suggest UTF8 for both the national character set and the database character set in a mixed 9i and pre-9i environment. If using a pure 9i (in the ideal world!) environment then I would use AL16UTF16 for the national character set and AL32UTF8 for the database character set. The reason being that Oracle will only be supporting and extending these 2 Unicode encodings going forward and you may save yourself another database character set conversion in the near future.

    Another consideration would be the intensity of string manipulation in your application e.g. string searches, sub stringing, string lengths, etc. The more intense and frequent these operations are the more you would want to go with UTF-16 as this uses a fixed 2 bytes for encoding Unicode and is far more efficient at it than the variable byte length UTF-8 character set. Unfortunately you will have to load test your application to get accurate numbers but I have seen 30% longer response times on intense string manipulation operations (especially wildcard searches e.g. %DE%) after converting an Oracle 8.1.7 database from US7ASCII to UTF8. I know this is not comparing apples with apples, but gives and idea of what the extra byte and linked list/vector lookup can do to performance.

    HTH,

    Faan

    -----Original Message-----
    Sent: Monday, November 03, 2003 2:20 PM
    To: Multiple recipients of list ORACLE-L

    Hi,

    I have a quick question that doesn't seem to be answered in the
    documentation. We have a database that has a database character set of
    WE8ISO8859P1 and a national character set of UTF-16. Currently we are not
    using any NCHAR, NVARCHAR2 or NCLOB columns, but would like to soon.
    However, we'd like to use UTF-8 instead of UTF-16. Since there should be
    no conversion necessary, is there any reason why I can't just alter the
    national character set to UTF-8? Or am I missing something?

    While we are on the subject, can anyone recommend for/against UTF-8 or
    UTF-16. I have read the pros and cons in the manual, but actual
    experiences could change my mind.

    Thanks,

    Alan

    Alan Aschenbrenner
    Oracle DBA
    IHS Group
    alan.aschenbrenner_at_ihs.com

    --
    Please see the official ORACLE-L FAQ: http://www.orafaq.net
    --
    Author:
    INET: alan.aschenbrenner_at_ihs.com

    Fat City Network Services -- 858-538-5051 http://www.fatcity.com
    San Diego, California -- Mailing list and web hosting services
    ---------------------------------------------------------------------

    To REMOVE yourself from this mailing list, send an E-Mail message
    to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
    the message BODY, include a line containing: UNSUB ORACLE-L
    (or the name of mailing list you want to be removed from). You may
    also send the HELP command for other information (like subscribing).

    --
    Please see the official ORACLE-L FAQ: http://www.orafaq.net
    --
    Author: Faan DeSwardt
    INET: Faan_D1_at_VERIFONE.com

    Fat City Network Services -- 858-538-5051 http://www.fatcity.com
    San Diego, California -- Mailing list and web hosting services
    ---------------------------------------------------------------------

    To REMOVE yourself from this mailing list, send an E-Mail message
    to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
    the message BODY, include a line containing: UNSUB ORACLE-L
    (or the name of mailing list you want to be removed from). You may
    also send the HELP command for other information (like subscribing).

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouporacle-l @
categoriesoracle
postedNov 3, '03 at 10:19p
activeNov 4, '03 at 12:44a
posts2
users2
websiteoracle.com

People

Translate

site design / logo © 2022 Grokbase