Hello,

When running the installer for the free edition, this error is recorded in
the log file init-embedded-b.log:
"initdb: encoding mismatch
The encoding you selected (UTF8) and the encoding that the
selected locale uses (LATIN1) do not match. This would lead to
misbehavior in various character string processing functions.
Rerun initdb and either do not specify an encoding explicitly,
or choose a matching combination."

Is there a manual way to ensure this doesn't occur? I have already tried
running initdb (successfully) with UTF8 encoding and locale as the
cloudera-scm user, but when I run the installer afterwards it still
fails, and the log file it points to is empty.

  • Vikas Singh at Sep 27, 2012 at 4:08 pm
    Please set the default locale of the OS to one that supports UTF-8.
    Check the documentation of the distro you are using to see how to
    change the default locale.

    - Vikas
  • Mohit Chawla at Sep 27, 2012 at 6:47 pm
    Hello,

    The OS locale, as I mentioned, is already set to UTF-8.
  • Vikas Singh at Sep 27, 2012 at 6:49 pm
    What's the distro you are using?

    - Vikas
  • Mohit Chawla at Sep 27, 2012 at 7:20 pm
    Hello,

    It's Ubuntu 12.04 (precise).
  • Vikas Singh at Sep 27, 2012 at 7:49 pm
    Hi Mohit,

    The CM installer runs initdb without specifying a locale but with the
    encoding set to UTF8, so PostgreSQL picks up the default locale from
    the OS. The error you see is generated by PostgreSQL when it checks
    the default locale and finds that its encoding is not UTF-8.

    Can you try running "initdb" on your setup without specifying a locale,
    but with --encoding=UTF8? You should hit the same error.

    Also, what do you get when you run the "locale" command? And what do
    you get when you run "locale -a"?

    - Vikas
  • Mohit Chawla at Sep 27, 2012 at 8:42 pm
    Hello,

    Yes, I am aware of that behaviour, and I was able to reproduce it as
    well: without --locale, initdb errors out even though the global
    locale settings for all users are UTF-8. That's why I was surprised
    in the first place.

    Anyway, here are the _global_ locale settings for all users:
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US"
    LC_NUMERIC="en_US"
    LC_TIME="en_US"
    LC_COLLATE="en_US"
    LC_MONETARY="en_US"
    LC_MESSAGES="en_US"
    LC_PAPER="en_US"
    LC_NAME="en_US"
    LC_ADDRESS="en_US"
    LC_TELEPHONE="en_US"
    LC_MEASUREMENT="en_US"
    LC_IDENTIFICATION="en_US"
    LC_ALL=en_US

    And locale -a:
    C
    C.UTF-8
    en_AG
    en_AG.utf8
    en_AU.utf8
    en_BW.utf8
    en_CA.utf8
    en_DK.utf8
    en_GB.utf8
    en_HK.utf8
    en_IE.utf8
    en_IN
    en_IN.utf8
    en_NG
    en_NG.utf8
    en_NZ.utf8
    en_PH.utf8
    en_SG.utf8
    en_US
    en_US.iso88591
    en_US.utf8
    en_ZA.utf8
    en_ZM
    en_ZM.utf8
    en_ZW.utf8
    POSIX
  • Vikas Singh at Sep 27, 2012 at 8:54 pm
    Your locale subcategories are wrong. Here is the output of 'locale'
    from an Ubuntu system. Note that all the LC_* values have UTF-8 in
    them. Fix that and it should take care of this issue.

    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"

    - Vikas
  • Mohit Chawla at Sep 27, 2012 at 9:45 pm
    Hello,

    Thanks for pointing that out, will give it a try.
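
The locale precedence at work in this thread can be sketched in a few lines of Python. This is only an illustration, not what initdb or the CM installer actually runs. The function names (effective_locale, codeset_of, is_utf8) are made up for the example. It assumes the usual POSIX rule that LC_ALL overrides the individual LC_* variables, which in turn override LANG. In the "locale" output posted in the thread, the LC_* values appear in double quotes, which normally means they are implied rather than set explicitly, so the sketch treats only LANG and LC_ALL as set. It also assumes that a locale name without a codeset suffix (such as en_US) is not UTF-8, which is consistent with the LATIN1 reported in the error.

```python
def effective_locale(env, category="LC_CTYPE"):
    """Return the locale name that applies to `category`.

    Assumed POSIX precedence: LC_ALL, then the category's own variable,
    then LANG. Empty strings count as unset; "C" is the final fallback.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


def codeset_of(locale_name):
    """Return the codeset suffix of a locale name, or None if it has none.

    A trailing modifier such as "@euro" is dropped before looking for ".".
    """
    base = locale_name.split("@", 1)[0]
    if "." not in base:
        return None
    return base.split(".", 1)[1]


def is_utf8(locale_name):
    """True if the name carries an explicit UTF-8 codeset (UTF-8, utf8, ...).

    Assumption: a name with no codeset suffix is treated as not UTF-8.
    """
    codeset = codeset_of(locale_name)
    return codeset is not None and codeset.replace("-", "").lower() == "utf8"


# Variables that appear to be set explicitly on the failing system.
failing = {"LANG": "en_US.UTF-8", "LANGUAGE": "", "LC_ALL": "en_US"}
# The same system with LC_ALL removed, so that LANG applies again.
fixed = {"LANG": "en_US.UTF-8", "LANGUAGE": ""}

for label, env in (("failing", failing), ("fixed", fixed)):
    name = effective_locale(env)
    verdict = "UTF-8" if is_utf8(name) else "not UTF-8"
    print(f"{label}: {name} ({verdict})")
# prints:
# failing: en_US (not UTF-8)
# fixed: en_US.UTF-8 (UTF-8)
```

If that reading of the output is right, the LC_ALL=en_US setting is what hides LANG=en_US.UTF-8 from initdb. Removing it, or changing it to en_US.UTF-8 (on Ubuntu the system-wide defaults are usually kept in /etc/default/locale), should produce "locale" output like the second listing in the thread.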

Discussion Overview
group: scm-users @
categories: hadoop
posted: Sep 27, '12 at 10:31a
active: Sep 27, '12 at 9:45p
posts: 9
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

Mohit Chawla: 5 posts Vikas Singh: 4 posts
