Hi,

We are using CM 4.5.
When running the high availability wizard, it fails with:

Supervisor returned FATAL: + '[' -e /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER/log4j.properties ']'
+ perl -pi -e 's#{{CMF_CONF_DIR}}#/var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER#g' /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER/log4j.properties
++ find /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER -maxdepth 1 -name '*.py'
+ OUTPUT=/var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER/cloudera_manager_agent_fencer.py
+ '[' /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER/cloudera_manager_agent_fencer.py '!=' '' ']'
+ chmod +x /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER/cloudera_manager_agent_fencer.py
+ export HADOOP_IDENT_STRING=hdfs
+ HADOOP_IDENT_STRING=hdfs
+ '[' -n '' ']'
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = zkfc ']'
+ '[' file-operation = zkfc ']'
+ '[' bootstrap = zkfc ']'
+ '[' failover = zkfc ']'
+ '[' transition-to-active = zkfc ']'
+ '[' initializeSharedEdits = zkfc ']'
+ '[' initialize-znode = zkfc ']'
+ '[' format-namenode = zkfc ']'
+ '[' monitor-decommission = zkfc ']'
+ '[' jnSyncWait = zkfc ']'
+ '[' nnRpcWait = zkfc ']'
+ '[' monitor-upgrade = zkfc ']'
+ '[' finalize-upgrade = zkfc ']'
+ '[' mkdir = zkfc ']'
+ '[' namenode = zkfc -o secondarynamenode = zkfc -o datanode = zkfc ']'
+ exec /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop-hdfs/bin/hdfs --config /var/run/cloudera-scm-agent/process/105-hdfs-FAILOVERCONTROLLER zkfc
Exception in thread "main" java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
at org.apache.hadoop.hdfs.tools.NNHAServiceTarget.&lt;init&gt;(NNHAServiceTarget.java:57)
at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:128)
at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:172)
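[Editor's note: the trace above ends in a Guava precondition check. A minimal sketch of that failure mode, not actual Hadoop code: the ZKFC resolves its NameNode target from configuration, and a missing key yields a null that trips a checkNotNull-style precondition. `Objects.requireNonNull` stands in for Guava's `Preconditions.checkNotNull`, and the config key shown is hypothetical.]

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only: when a required HA config key is absent,
// the null value reaches a precondition check and an NPE is thrown
// before the daemon ever starts.
public class ZkfcPreconditionSketch {
    static String resolveTarget(Map<String, String> conf) {
        // Hypothetical key, in the style of HDFS HA configuration.
        String nnAddress = conf.get("dfs.namenode.rpc-address.ns1.nn1");
        try {
            // Stand-in for Guava's Preconditions.checkNotNull(...)
            Objects.requireNonNull(nnAddress, "NameNode address not configured");
            return "target: " + nnAddress;
        } catch (NullPointerException e) {
            return "NPE: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Empty configuration, as when HA was never (fully) set up.
        System.out.println(resolveTarget(new HashMap<>()));
    }
}
```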


In addition, we pressed 'retry' and then got an error at an earlier step:
"Creating name directories for the new NameNode, if they don't already
exist, and checking that they are writable and empty".

Thanks,
Amit


  • Amit Mor at Mar 6, 2013 at 12:24 pm
    O.K - resolved (for future reference - Cloudera, please do something
    with this ...)
    It's a problem with the High Availability wizard.
    The wizard is launched when one clicks the 'enable' link under the High
    Availability column on the HDFS instances page.
    Because I had a leftover stopped failovercontroller role, the wizard
    tried, and failed, to launch it, even though it was not enabled or
    selected in any other way. I assume it fails because the current
    hdfs-site.xml has no automatic-failover parameter (or it is set to
    false), which is why checkNotNull screams, yet the wizard somehow
    'thinks' it needs to start the failovercontroller service as well.
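
    [Editor's note: for reference, the ZKFC expects automatic failover to be
    declared in the Hadoop configuration. A minimal sketch of the relevant
    properties follows; the nameservice ID "nameservice1" and the ZooKeeper
    hostnames are placeholders, not taken from this cluster.]

```xml
<!-- hdfs-site.xml: declare automatic failover for the nameservice.
     "nameservice1" is a placeholder nameservice ID. -->
<property>
  <name>dfs.ha.automatic-failover.enabled.nameservice1</name>
  <value>true</value>
</property>

<!-- core-site.xml: the ZooKeeper quorum the failover controllers use.
     Hostnames are placeholders. -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```

    [Cloudera Manager normally generates these when the HA wizard completes,
    which is consistent with the NPE when a leftover failovercontroller is
    started before they exist.]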

    I resolved this issue by:
    1) deleting the existing failovercontroller
    2) deleting the existing second namenode associated with this NameService
    3) clicking 'enable' under High Availability; this launches the wizard,
    which now finishes without errors
    4) clicking 'enable' under Automatic Failover; this launches another
    wizard that starts the failovercontroller and succeeds

  • Vinithra Varadharajan at Mar 6, 2013 at 8:01 pm
    Amit,

    As you have rightly discovered, the Enable HA workflow assumes that you
    haven't taken any steps toward enabling HDFS HA yourself. I'll file a
    bug to account for any roles that shouldn't be there, such as the
    FailoverController.

    -Vinithra

Discussion Overview
group: scm-users
categories: hadoop
posted: Mar 5, '13 at 12:08p
active: Mar 6, '13 at 8:01p
posts: 3
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion
Amit Mor: 2 posts; Vinithra Varadharajan: 1 post
