Hi,

I installed postgres 8.0 for windows on my win xp (Simplified Chinese
version). The encoding is unicode. When I set pgsql client encoding to
gb18030, I could insert Chinese text from the command line to
postgres.

However, I could not get the sort order of Chinese varchar field to
work properly.

What I tried are as follows:

1) installed postgres for windows and used the "C" locale.
2) installed postgres for windows and used the "Chinese, PRC" locale.

Again, in both cases, the backend encoding is unicode.

The other interesting thing I observed was, when setting to the "C"
locale, the following sql worked fine:
select * from user where name = 'xxxxx';
xxxxx is a Chinese text
However, if I set the locale to "Chinese, PRC" during installation,
the above select did not get the any matching rows, where it should
have got. In this case, the following worked fine:
select * from user where name ilike 'xxxxx%';

Could anyone let me know the best practice for using postgres to store
Chinese text? (This should be the same problem I guess, for using
postgres to store other languages than English.)

Thanks a lot,

Jian

Search Discussions

  • Magnus Hagander at Mar 28, 2005 at 4:51 pm

    Hi,

    I installed postgres 8.0 for windows on my win xp (Simplified Chinese
    version). The encoding is unicode. When I set pgsql client encoding to
    gb18030, I could insert Chinese text from the command line to
    postgres.

    However, I could not get the sort order of Chinese varchar field to
    work properly.

    What I tried are as follows:

    1) installed postgres for windows and used the "C" locale.
    2) installed postgres for windows and used the "Chinese, PRC" locale.

    Again, in both cases, the backend encoding is unicode.
    Sorting is not currently supported in the UNICODE/UTF-8 encoding on
    Win32. Sorry. You need to pick a specific non-unicode encoding/locale
    combination. (Or run the server on a platform that supports it, if that
    is an option)

    //Magnus
  • Tatsuo Ishii at Mar 29, 2005 at 1:26 am

    Hi,

    I installed postgres 8.0 for windows on my win xp (Simplified Chinese
    version). The encoding is unicode. When I set pgsql client encoding to
    gb18030, I could insert Chinese text from the command line to
    postgres.

    However, I could not get the sort order of Chinese varchar field to
    work properly.

    What I tried are as follows:

    1) installed postgres for windows and used the "C" locale.
    2) installed postgres for windows and used the "Chinese, PRC" locale.

    Again, in both cases, the backend encoding is unicode.

    The other interesting thing I observed was, when setting to the "C"
    locale, the following sql worked fine:
    select * from user where name = 'xxxxx';
    xxxxx is a Chinese text
    However, if I set the locale to "Chinese, PRC" during installation,
    the above select did not get the any matching rows, where it should
    have got. In this case, the following worked fine:
    select * from user where name ilike 'xxxxx%';

    Could anyone let me know the best practice for using postgres to store
    Chinese text? (This should be the same problem I guess, for using
    postgres to store other languages than English.)
    I would suggest to use UNICODE/C locale combo. On most systems the
    locale database for multibyte encodings are broken as far as I
    know. For the sorting problem, probably you could get the right sort
    order by using convert. i.e.

    SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character, utf_8_to_gb_18030);

    If above does not work, you cannot get the right sort order even if
    you use GB18030 anyway.
    --
    Tatsuo Ishii
  • Jian chen at Mar 29, 2005 at 6:00 am
    Great, that works out fine!

    So, the SQL I tested with is:
    select * from mytable order by convert(name, 'utf8', 'gb18030');

    It produces the correct output.

    Thanks Tatsuo!

    Jian

    On Tue, 29 Mar 2005 10:25:58 +0900 (JST), Tatsuo Ishii
    wrote:
    Hi,

    I installed postgres 8.0 for windows on my win xp (Simplified Chinese
    version). The encoding is unicode. When I set pgsql client encoding to
    gb18030, I could insert Chinese text from the command line to
    postgres.

    However, I could not get the sort order of Chinese varchar field to
    work properly.

    What I tried are as follows:

    1) installed postgres for windows and used the "C" locale.
    2) installed postgres for windows and used the "Chinese, PRC" locale.

    Again, in both cases, the backend encoding is unicode.

    The other interesting thing I observed was, when setting to the "C"
    locale, the following sql worked fine:
    select * from user where name = 'xxxxx';
    xxxxx is a Chinese text
    However, if I set the locale to "Chinese, PRC" during installation,
    the above select did not get the any matching rows, where it should
    have got. In this case, the following worked fine:
    select * from user where name ilike 'xxxxx%';

    Could anyone let me know the best practice for using postgres to store
    Chinese text? (This should be the same problem I guess, for using
    postgres to store other languages than English.)
    I would suggest to use UNICODE/C locale combo. On most systems the
    locale database for multibyte encodings are broken as far as I
    know. For the sorting problem, probably you could get the right sort
    order by using convert. i.e.

    SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character, utf_8_to_gb_18030);

    If above does not work, you cannot get the right sort order even if
    you use GB18030 anyway.
    --
    Tatsuo Ishii
  • Tatsuo Ishii at Mar 29, 2005 at 7:39 am

    Great, that works out fine!

    So, the SQL I tested with is:
    select * from mytable order by convert(name, 'utf8', 'gb18030');
    Sorry, what I wanted to say was:

    SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character using utf_8_to_gb_18030);

    Of course your example is fine too (actually they are internally
    identical).
    --
    Tatsuo Ishii
    It produces the correct output.

    Thanks Tatsuo!

    Jian

    On Tue, 29 Mar 2005 10:25:58 +0900 (JST), Tatsuo Ishii
    wrote:
    Hi,

    I installed postgres 8.0 for windows on my win xp (Simplified Chinese
    version). The encoding is unicode. When I set pgsql client encoding to
    gb18030, I could insert Chinese text from the command line to
    postgres.

    However, I could not get the sort order of Chinese varchar field to
    work properly.

    What I tried are as follows:

    1) installed postgres for windows and used the "C" locale.
    2) installed postgres for windows and used the "Chinese, PRC" locale.

    Again, in both cases, the backend encoding is unicode.

    The other interesting thing I observed was, when setting to the "C"
    locale, the following sql worked fine:
    select * from user where name = 'xxxxx';
    xxxxx is a Chinese text
    However, if I set the locale to "Chinese, PRC" during installation,
    the above select did not get the any matching rows, where it should
    have got. In this case, the following worked fine:
    select * from user where name ilike 'xxxxx%';

    Could anyone let me know the best practice for using postgres to store
    Chinese text? (This should be the same problem I guess, for using
    postgres to store other languages than English.)
    I would suggest to use UNICODE/C locale combo. On most systems the
    locale database for multibyte encodings are broken as far as I
    know. For the sorting problem, probably you could get the right sort
    order by using convert. i.e.

    SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character, utf_8_to_gb_18030);

    If above does not work, you cannot get the right sort order even if
    you use GB18030 anyway.
    --
    Tatsuo Ishii
    ---------------------------(end of broadcast)---------------------------
    TIP 6: Have you searched our list archives?

    http://archives.postgresql.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedMar 28, '05 at 4:37p
activeMar 29, '05 at 7:39a
posts5
users3
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2023 Grokbase