Hi all,

So I went through an upgrade today from CM 4.1 Enterprise with CDH4.1.3
RPMs and Impala 0.5 RPMs to CM 4.5 Enterprise and parcel based CDH and
Impala.

There were a few gotchas that came up during this that seemed worth noting.

The upgrade was carried out with a full cluster shutdown to avoid any
potential conflict issues.

1) CM 4.5 upgrade went fine. CM didn't check for parcels at start... Had to
temporarily set the check frequency to very low to get it to see the impala
and cdh parcels to download.
2) For security reasons I have apache terminating SSL on 443 and proxying
to CM port 7180. That TCP port is blocked by iptables to enforce logins etc
over an encrypted connection. The nodes were unable to get the parcels as
they specifically tried to get them from CM on this port. It would be
useful to be able to provide an override (full URL including port) for scm
agents to use to get parcels.
3) The parcels were used by CM correctly for starting the services but the
existing binaries on the systems were still the old 4.1.3 ones. No symlinks
were put in place. The systems needed yum remove \*cdh\* and the scm agents
restarted to have the symlinks correctly put in place.
4) The key parts of the hive config were correctly imported but the sections
configuring LDAP auth for hive server 2 were not... This additional config
was deployed correctly from the hive safety valve. The hive jobs through
the hive server 2 jdbc now run as the authenticating user rather than hive
- great stuff!
4) My hue config was just using sqlite ... This database was lost during
the rpm removal and I had no backup... Oops. Still it didn't take long to
re-add the users and the hive history etc is not used heavily by us...
5) The new impala interface was nice to see but after a few queries the
history has a Texception error (or something like that)... I haven't spent
time digging into this given I'm already working a Saturday ;-)
6) Although CDH4.2 supports a postgres backend for hue this option is not
shown in CM4.5

Incidentally the new graphs and metrics logging are fantastic... So much
detail on every host.

That's it for my Saturday working... Any queries I'll respond to on Monday.

James
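
Editorial note on the Hue point above: a minimal sketch of copying a SQLite database before removing the RPMs, using the backup API in Python's standard `sqlite3` module (Python 3.7+). The function name `backup_sqlite` and both paths are assumptions for illustration only. The thread does not say where the Hue database file lived, so check the database settings in your own hue.ini first.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src: str, dest: str) -> int:
    """Copy the SQLite database at src to dest.

    Returns the number of tables found in the copy, as a quick sanity check.
    """
    if not Path(src).is_file():
        # sqlite3.connect() would silently create an empty file otherwise.
        raise FileNotFoundError(src)

    source = sqlite3.connect(src)
    target = sqlite3.connect(dest)
    try:
        # Online backup: produces a consistent copy even if the source is in use.
        source.backup(target)
        row = target.execute(
            "SELECT count(*) FROM sqlite_master WHERE type = 'table'"
        ).fetchone()
        return row[0]
    finally:
        target.close()
        source.close()


# Hypothetical usage (both paths are placeholders, not real Hue locations):
# backup_sqlite("/path/to/hue/desktop.db", "/root/hue-desktop-backup.db")
```

Stopping Hue before taking the copy is still the safer option; the backup API is used here only so that the copy is consistent if something does happen to have the file open.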


  • Romain Rigaux at Mar 2, 2013 at 5:23 pm
    Thanks for the feedback!

    About Hue:

    #4, I am going to create a bug about this, sqlite is not recommended but
    this is pretty bad!
    #5, you are probably hitting this:
    https://issues.cloudera.org/browse/HUE-1052
    #6, I am going to create a bug too

    Romain
  • James Hogarth at Mar 2, 2013 at 5:35 pm
    > Thanks for the feedback!

    No problem... I'm looking forward to digging into some bits more during
    working hours (including cloudera navigator)...

    > #4, I am going to create a bug about this, sqlite is not recommended but
    > this is pretty bad!

    This was originally a CM/CDH4.0 install that's been upgraded over time and
    the change from sqlite to mysql then sort of blew up for the initial
    deployment ... Seeing as how we were just doing technical evaluations back
    then and it still 'worked' it kind of just slipped into my back burner with
    other priorities...

    I plan to get hue on a postgres backend soon...

    > #5, you are probably hitting this:
    > https://issues.cloudera.org/browse/HUE-1052

    That looks about right ... I'll double check the logs on Monday to see if
    it matches up.

    > #6, I am going to create a bug too

    Great stuff... Mid next week I'm going to try and get reproducible steps
    documented to change the mysql backend for oozie and hive to postgres with
    no data loss... Will be nice to get them all on a common platform and will
    make HA for these then simpler since they could potentially then all point
    to a common postgres cluster.
  • James Hogarth at Mar 4, 2013 at 9:10 am

    >> #5, you are probably hitting this:
    >> https://issues.cloudera.org/browse/HUE-1052

    > That looks about right ... I'll double check the logs on Monday to see if
    > it matches up.

    I applied the patch from that commit to our instance and the query history
    is indeed being displayed correctly now...

    The Impala interface doesn't seem to be returning all results however... As
    an example, one query (a simple select * from table where username='me')
    correctly shows 94 rows if you do a count(*), whether on the web UI or the
    CLI. Doing the select itself returns all 94 rows on the CLI but only two
    rows on the web UI...
  • Philip Langdale at Mar 4, 2013 at 5:59 pm
    Hi James,

    2) You can configure CM to use TLS directly, and if you do this, the agents
    will know to make an SSL connection to download the files. That should
    obviate the need for your SSL proxying (unless your security requirements
    mean you have to use a proxy regardless, in which case there isn't anything
    you can do).

    3) Yeah, you have to do exactly what you wrote. We failed to mention that
    in the docs, which we're in the process of rectifying.

    Thanks,

    --phil
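
Editorial note on point 2: since the agents fetch parcels from CM on port 7180 and that port was blocked by iptables, a quick reachability check from a node can confirm whether this is the cause. Below is a minimal sketch using only Python's standard `socket` module. The function name `port_reachable` and the hostname in the usage comment are assumptions for illustration; only port 7180 comes from the thread.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Hypothetical usage from a cluster node (hostname is a placeholder):
# port_reachable("cm-server.example.com", 7180)
```

This only tests plain TCP connectivity. It says nothing about whether the agent would then negotiate TLS, which depends on how CM and the agents are configured.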

  • James Hogarth at Mar 4, 2013 at 6:23 pm
    Hi Philip,

    For a couple of reasons I need the UI to be on 443...

    I did have TLS on CM a while back but for various reasons needed to drop it
    and proxy...

    I'll come up with a better solution than I have currently in a few weeks
    ;-)

    Don't forget (like I did) impala doesn't have cdh in its name for the time
    being and needs similar treatment...

    Regards,

    James
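
Editorial note on the package cleanup: because the Impala packages do not have "cdh" in their names, a single `*cdh*` pattern misses them. A minimal sketch of listing leftovers with both patterns follows. The helper name `leftover_packages`, the `*impala*` pattern and the sample package names are assumptions for illustration; on a real node the input would be the output of something like `rpm -qa`.

```python
import fnmatch

# Patterns to treat as leftovers from the RPM-based install.
# "*cdh*" is from the thread; "*impala*" is an assumed equivalent for Impala.
LEFTOVER_PATTERNS = ("*cdh*", "*impala*")


def leftover_packages(installed, patterns=LEFTOVER_PATTERNS):
    """Return the sorted package names that match any pattern (case-insensitive)."""
    matches = set()
    for name in installed:
        lowered = name.lower()
        if any(fnmatch.fnmatchcase(lowered, pattern) for pattern in patterns):
            matches.add(name)
    return sorted(matches)


# Made-up package names, not real `rpm -qa` output:
sample = ["bash", "hadoop-cdh", "impala", "zlib"]
print(leftover_packages(sample))
```

Reviewing the resulting list before passing it to yum is worthwhile, since a broad wildcard could match packages you want to keep.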

Discussion Overview
group: cm-users @
categories: hadoop
posted: Mar 2, '13 at 3:04p
active: Mar 4, '13 at 6:23p
posts: 6
users: 3
website: cloudera.com
irc: #hadoop
