FAQ
Hi,
I installed CDH4 on a single machine using Cloudera Manager, but it fails
when starting the HDFS service:

Service did not start successfully; not all of the required roles started: No DataNodes are running.

So I tried to start the DataNode and I get this error:



Supervisor returned FATAL: + '[' /usr/share/cmf ']'
++ find /usr/share/cmf/lib/plugins -name 'event-publish-*.jar'
++ tr -d '\n'
+ ADD_TO_CP=/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar
+ eval 'OLD_VALUE=$HADOOP_CLASSPATH'
++ OLD_VALUE=
+ '[' -z ']'
+ export HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar
+ HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar
+ set -x
+ perl -pi -e 's#{{CMF_CONF_DIR}}#/run/cloudera-scm-agent/process/8-hdfs-DATANODE#g' /run/cloudera-scm-agent/process/8-hdfs-DATANODE/core-site.xml /run/cloudera-scm-agent/process/8-hdfs-DATANODE/hdfs-site.xml
+ '[' -e /run/cloudera-scm-agent/process/8-hdfs-DATANODE/topology.py ']'
++ find /run/cloudera-scm-agent/process/8-hdfs-DATANODE -maxdepth 1 -name '*.py'
+ OUTPUT=
+ '[' '' '!=' '' ']'
+ export 'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true '
+ HADOOP_OPTS='-Djava.net.preferIPv4Stack=true '
+ export HADOOP_IDENT_STRING=hdfs
+ HADOOP_IDENT_STRING=hdfs
+ '[' -n '' ']'
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = datanode ']'
+ '[' file-operation = datanode ']'
+ '[' bootstrap = datanode ']'
+ '[' failover = datanode ']'
+ '[' transition-to-active = datanode ']'
+ '[' initializeSharedEdits = datanode ']'
+ '[' initialize-znode = datanode ']'
+ '[' format-namenode = datanode ']'
+ '[' monitor-decommission = datanode ']'
+ '[' monitor-upgrade = datanode ']'
+ '[' finalize-upgrade = datanode ']'
+ '[' mkdir = datanode ']'
+ '[' namenode = datanode -o secondarynamenode = datanode -o datanode = datanode ']'
+ HADOOP_OPTS='-Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ export 'HADOOP_OPTS=-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ HADOOP_OPTS='-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ exec /usr/lib/hadoop-hdfs/bin/hdfs --config /run/cloudera-scm-agent/process/8-hdfs-DATANODE datanode
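
The trace above is only the Cloudera Manager launcher script running under set -x; the real reason the DataNode died ends up in the role's log file, which is what the replies below dig into. A hedged sketch of where to look, assuming the log naming that appears later in this thread (the exact file name depends on your service and host names):

    # Assumed locations, based on the Role Log File path quoted later in this thread
    ls /var/log/hadoop-hdfs/
    tail -n 100 /var/log/hadoop-hdfs/hadoop-cmf-hdfs1-DATANODE-$(hostname).log.out
    # Many installs also keep per-attempt stdout/stderr under the agent's process directory from the trace
    ls /run/cloudera-scm-agent/process/8-hdfs-DATANODE/logs/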


  • Viktor Schmidt at Oct 8, 2012 at 11:37 am
    I have the same problem... single machine, CM...
    it fails when starting the HDFS service too.

    after reinstall:

    Starting your cluster services.

    Completed 2 of 9 steps.
    Formatting HDFS if empty

    NameNode is already formatted and contains data.

    Starting HDFS Service
    Details <http://localhost:7180/cmf/command/8/details>

    Service did not start successfully; not all of the required roles started: No DataNodes are running.




  • Viktor Schmidt at Oct 8, 2012 at 12:20 pm
    Starting the SecondaryNameNode fails too.

    Start this SecondaryNameNode <http://localhost:7180/cmf/command/12/details>
    secondarynamenode (localhost) <http://localhost:7180/cmf/services/6/instances/12/status>
    Role Log File <http://localhost:7180/cmf/process/all/logs/context?port=9000&host=localhost&token=a85ec6d217c7e79cc153fd6c6f30f644&path=%2Fvar%2Flog%2Fhadoop-hdfs%2Fhadoop-cmf-hdfs1-SECONDARYNAMENODE-localhost.log.out&roleId=12>
    08-Oct-2012 13:40:04, Finished 08-Oct-2012 13:40:32

    Supervisor returned FATAL: + eval 'OLD_VALUE=$HADOOP_CLASSPATH'
    ++ OLD_VALUE=
    + '[' -z ']'
    + export HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar
    + HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar
    + set -x
    + perl -pi -e 's#{{CMF_CONF_DIR}}#/run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE#g' /run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE/core-site.xml /run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE/hdfs-site.xml
    + '[' -e /run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE/topology.py ']'
    ++ find /run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE -maxdepth 1 -name '*.py'
    + OUTPUT=
    + '[' '' '!=' '' ']'
    + export 'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true '
    + HADOOP_OPTS='-Djava.net.preferIPv4Stack=true '
    + export HADOOP_IDENT_STRING=hdfs
    + HADOOP_IDENT_STRING=hdfs
    + '[' -n '' ']'
    + acquire_kerberos_tgt hdfs.keytab
    + '[' -z hdfs.keytab ']'
    + '[' -n '' ']'
    + '[' validate-writable-empty-dirs = secondarynamenode ']'
    + '[' file-operation = secondarynamenode ']'
    + '[' bootstrap = secondarynamenode ']'
    + '[' failover = secondarynamenode ']'
    + '[' transition-to-active = secondarynamenode ']'
    + '[' initializeSharedEdits = secondarynamenode ']'
    + '[' initialize-znode = secondarynamenode ']'
    + '[' format-namenode = secondarynamenode ']'
    + '[' monitor-decommission = secondarynamenode ']'
    + '[' monitor-upgrade = secondarynamenode ']'
    + '[' finalize-upgrade = secondarynamenode ']'
    + '[' mkdir = secondarynamenode ']'
    + '[' namenode = secondarynamenode -o secondarynamenode = secondarynamenode -o datanode = secondarynamenode ']'
    + HADOOP_OPTS='-Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
    + export 'HADOOP_OPTS=-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
    + HADOOP_OPTS='-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
    + exec /usr/lib/hadoop-hdfs/bin/hdfs --config /run/cloudera-scm-agent/process/7-hdfs-SECONDARYNAMENODE secondarynamenode


  • bc Wong at Oct 8, 2012 at 2:33 pm
    And what does it say if you click on the "Role Log File" link in that
    screenshot? (Or the same for the DN.)

  • Viktor Schmidt at Oct 8, 2012 at 4:25 pm
    Sorry, I am not sure... I reinstalled Ubuntu and Cloudera and it seems to
    work now... no problems at all.

  • Emre at Oct 9, 2012 at 7:19 am
    I get the same error on a single-node Ubuntu 12.04 cluster. One person
    solved the problem by editing /etc/hosts:
    https://groups.google.com/a/cloudera.org/d/topic/scm-users/5DMxhidRFB0/discussion

    This did not work for me yet. My log file says:

    Initialization failed for block pool Block pool
    BP-1480654419-127.0.1.1-1349759642196 (storage id
    DS-1335128386-127.0.1.1-50010-1349759655474) service to
    Minerva/127.0.1.1:8020
    org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException):
    Datanode denied communication with namenode:
    DatanodeRegistration(127.0.0.1,
    storageID=DS-1335128386-127.0.1.1-50010-1349759655474, infoPort=50075,
    ipcPort=50020, storageInfo=lv=-40;cid=cluster6;nsid=277638881;c=0)
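
    The /etc/hosts fix being discussed usually amounts to making the hostname resolve to the
    machine's real address instead of Ubuntu's default 127.0.1.1 entry, so that the DataNode and
    NameNode agree on where each other live. A sketch, using the host name (Minerva) and IP
    (192.168.1.103) that appear elsewhere in this thread; substitute your own values:

        # /etc/hosts -- hypothetical single-node layout
        127.0.0.1       localhost
        # remove or comment out the stock "127.0.1.1  Minerva" line that Ubuntu adds
        192.168.1.103   Minerva.localdomain   Minerva

    After the change, restart the HDFS roles so the DataNode re-registers with an address the
    NameNode can resolve both forward and backward.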
  • Emre at Oct 9, 2012 at 9:27 am
    I get the same error too, also using a single-node Ubuntu 12.04 cluster.
    The Fatal line of my role log file starts

    Exception in namenode join
    java.io.IOException: Failed to load image from FSImageFile(file=/mnt/dfs/nn/current/fsimage_0000000000000000366, cpktTxId=0000000000000000366)

  • bc Wong at Oct 9, 2012 at 8:54 pm

    On Tue, Oct 9, 2012 at 2:00 AM, Emre wrote:

    I get the same error too, also using a single-node Ubuntu 12.04 cluster.
    The Fatal line of my role log file starts

    Exception in namenode join
    java.io.IOException: Failed to load image from FSImageFile(file=/mnt/dfs/nn/current/fsimage_0000000000000000366, cpktTxId=0000000000000000366)
    * Is your NN running? If not, does that file exist? And does the `hdfs'
    user have permission to read/write to it?
    * If your NN is running, is it listening on 127.0.1.1 port 8020? (You can
    do a `netstat -tln' to find out.) Because that's where your DN thinks your
    NN is. Hadoop really needs both forward and reverse DNS to work.

    Cheers,
    bc
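
    A hedged sketch of those checks, with the file path taken from the log excerpt quoted above
    (adjust to your own dfs.name.dir and hostname):

        # Is anything listening on the NameNode RPC port?
        netstat -tln | grep 8020
        # Does the fsimage exist, and can the hdfs user read it?
        ls -l /mnt/dfs/nn/current/fsimage_0000000000000000366
        sudo -u hdfs test -r /mnt/dfs/nn/current/fsimage_0000000000000000366 && echo readable
        # Do forward and reverse DNS agree for this host?
        hostname -f
        getent hosts "$(hostname -f)"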

  • Emre at Oct 9, 2012 at 8:24 pm
    The NameNode is not running, and (Re)starting fails; that is the problem.
    How do I know whether hdfs has permissions to read/write "it"? Are you
    referring to file fsimage_0000000000000000366? That is owned by user and
    group hdfs.

    I set my computer's IP to 192.168.1.103 in /etc/hosts, as recommended.
  • Chris at Oct 9, 2012 at 9:02 pm
    This may be unrelated, but we had some issues a couple of weeks ago when we
    had to do a full cluster outage to migrate our name node to another ESX
    host. The problem was that when I started the cluster through Cloudera
    Manager, I was getting messages similar to the ones you are seeing. Other
    messages I saw were that the "Filesystem was not formatted" and that
    things couldn't be found in what appeared to be a default directory for CM
    and Hadoop. Basically it was as if my cluster's critical namenode data was
    completely gone.

    Turns out, when I started all the cluster components up... Cloudera
    Manager, namenode, datanodes, Tableau, Hive server, etc. - I didn't
    wait long enough in some phases of starting everything back up. I don't
    know exactly where I went "too fast" through things, but it was like the
    cluster started up with blank configurations instead of our production
    cluster. I tried starting the cluster several times through the Service
    button in CM and got the same results.

    I started to research what had gone wrong, and about 20
    minutes later I tried to start it up again and everything worked... so I
    believe that certain components have a "warm up" time, if you will, before
    they come online to interface with other components.

    I apologize if this is not relevant, but hey - if there's a 1% chance that
    it is, why not share the findings.

    Good luck,

    -Chris


  • bc Wong at Oct 9, 2012 at 11:03 pm

    On Tue, Oct 9, 2012 at 1:18 PM, Emre wrote:

    The NameNode is not running, and (Re)starting fails; that is the problem.
    How do I know whether hdfs has permissions to read/write "it"? Are you
    referring to file fsimage_0000000000000000366? That is owned by user and
    group hdfs.
    If you follow the stack trace, you'll see that the "Failed to load image
    from FSImageFile" is actually caused by another IOException. We need to
    know what that underlying IOException is. (It could range from permission
    problem to corrupted fsimage.) Can you paste the whole stack trace here?

    Thanks,
    bc

  • Emre at Oct 9, 2012 at 10:10 pm
    Absolutely. Sorry for omitting important details; I just didn't want to
    clog the thread with cruft. The full entry is:

    Exception in namenode join java.io.IOException: Failed to load image from FSImageFile(file=/mnt/dfs/nn/current/fsimage_0000000000000000366, cpktTxId=0000000000000000366)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:658)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:267)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1128)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1192)
    Caused by: *java.io.IOException: No MD5 file found corresponding to image file* /mnt/dfs/nn/current/fsimage_0000000000000000366
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:743)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:639)


  • Emre at Oct 9, 2012 at 10:34 pm
    I hasten to add that fsimage_0000000000000000366.md5 does not exist, but
    fsimage_0000000000000000366.md5.tmp *does*, and it looks just like the
    other MD5 files in the folder.

    The obvious thing to do was to rename the file, which I did, and the
    service started! Then I tried to enable the other services and I got as far
    as the JobTracker, which failed with:

    org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x

    ...

    Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
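
    For reference, the rename described above would look roughly like this; the path and
    transaction id come from the stack trace earlier in the thread, so substitute whatever your
    own NameNode reports:

        # rename the leftover .md5.tmp to the .md5 name the NameNode looks for
        sudo -u hdfs mv /mnt/dfs/nn/current/fsimage_0000000000000000366.md5.tmp \
                        /mnt/dfs/nn/current/fsimage_0000000000000000366.md5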

  • bc Wong at Oct 9, 2012 at 10:38 pm

    You hit https://issues.apache.org/jira/browse/HDFS-3736. Glad you
    resolved it.

    For the JT startup, go to your CM. HDFS -> NameNode -> Actions -> Create
    /tmp directory. That should fix it.

    Cheers,
    bc
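
    The CM action is the easy route; a roughly equivalent command-line sketch (assuming a host
    with the HDFS client configuration in place) would be:

        # create /tmp in HDFS with the usual world-writable, sticky-bit permissions
        sudo -u hdfs hadoop fs -mkdir /tmp
        sudo -u hdfs hadoop fs -chmod 1777 /tmp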


  • Emre at Oct 9, 2012 at 10:53 pm
    Thanks, I think creating /tmp worked, but now I got

    Can not start task tracker because org.apache.hadoop.util.DiskChecker$DiskErrorException: No mapred local directories are writable


    I believe it is referring to /mnt/dos/mapred/local on my system. It exists
    and is empty.
  • bc Wong at Oct 9, 2012 at 11:12 pm

    On Tue, Oct 9, 2012 at 3:53 PM, Emre wrote:

    Thanks, I think creating /tmp worked, but now I got

    Can not start task tracker because org.apache.hadoop.util.DiskChecker$DiskErrorException: No mapred local directories are writable


    I believe it is referring to /mnt/dos/mapred/local on my system. It exists
    and is empty.
    Is it writable by the `mapred' user? You can do `sudo -u mapred test
    /mnt/dos/mapred/local && echo ok'.

    Btw, it looks like that directory is remote (NFS/CIFS). That's very
    bad for performance. You want a real local directory.

    Cheers,
    bc
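
    A note on that check: test with a single bare argument only verifies the string is non-empty,
    so a sketch that actually exercises write access might look like this (path as in the message
    above):

        # is the directory writable by the mapred user?
        sudo -u mapred test -w /mnt/dos/mapred/local && echo writable
        # or prove it with a real write
        sudo -u mapred touch /mnt/dos/mapred/local/.probe && sudo -u mapred rm /mnt/dos/mapred/local/.probe && echo ok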


  • Emre at Oct 10, 2012 at 1:17 am
    Your test returns "ok". The folder is local, not removable, but it is NTFS.
    I think it's because NTFS has user permission issues in Linux; I don't know
    if it's an architectural shortcoming or if it's due to the ntfs-3g driver
    in Linux. Thinking that might be the problem I tried an ext4 partition and
    got:

    org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x


    Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hbase, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x


    I successfully ran your "sudo -u mapred test /mnt/mapred/local && echo ok"
    on the ext4 partition. The /mnt/mapred/local folder appears to be owned by
    mapred and group hadoop, while its subfolders (taskTracker, toBeDeleted,
    tt_log_tmp, ttprivate, userlogs) are owned by mapred and group mapred. Any
    ideas?
  • Emre at Oct 10, 2012 at 1:52 am
    Apparently I needed to run "Create Root Directory" from HBase after I moved
    it from one partition to another. Now everything is started and in good
    health. Including this user.
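
    For reference, the HBase "Create Root Directory" action corresponds roughly to creating
    HBase's root directory in HDFS and handing it to the hbase user; a sketch, assuming the
    default /hbase root (check hbase.rootdir in your configuration):

        sudo -u hdfs hadoop fs -mkdir /hbase
        sudo -u hdfs hadoop fs -chown hbase /hbase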