Hello everybody,

I need your help, please, to solve the following problem I encountered:

I was running a Hadoop job on CDH 4 when I got an "OutOfMemoryError:
Java heap space" exception. I have 8 GB of RAM on my machine, so I
assumed I should use it to increase the heap space allocated to Java on
the NameNode and DataNode, and the "mapred.child.java.opts" property,
using Cloudera Manager.

I modified "Java Heap Size of DataNode in Bytes" in the HDFS configuration
page, and did the same for the NameNode and Secondary NameNode. I set it to
the default value of 1 GB, where it had been about 60 MB before.

The HDFS and MapReduce services stopped unexpectedly when I was trying to
run that Hadoop job. So I changed the Java heap size on the nodes back to
62 MB and tried to start the services. Unfortunately they did not start;
the detailed error was that the services failed to start their roles
(DataNode, NameNode): "command timed out after 150 seconds".

For the whole machine I also have a "memory overcommit" warning. Why did
the changes not take effect when I changed the heap back to 62 MB, and how
can I start the services as they were before?
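
For reference, a minimal way to check how much memory the host actually has
free, and what heap the running DataNode JVM really received (a hedged
sketch; the grep pattern is only illustrative):

    # How much RAM is free on the 8 GB host?
    free -m
    # What -Xmx did the running DataNode JVM actually get?
    ps aux | grep -i '[d]atanode' | grep -o '\-Xmx[0-9]*[mgMG]'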

The log shows the following:

Error getting HDFS summary for hdfs: org.apache.avro.AvroRemoteException: java.net.SocketTimeoutException: Read timed out



thanks in advance,



  • Darren Lo at May 8, 2014 at 3:31 pm
    Hi Ghadeer,

    If CM auto-configured your heaps to 62 MB when the default is 1 gig, then
    you are trying to run too many things on one host. You should probably not
    select the "All Services" option on such a small machine, especially if
    it's a single-host cluster.

    The socket timeout exception doesn't really indicate the real problem.
    There's got to be something more interesting in the role logs, stderr, or
    stdout, but it's likely related to memory issues.
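
    On a CM-managed host the role logs and each role's stdout/stderr usually
    live in the default locations below (a hedged sketch; exact paths can
    differ per installation):

        # Daemon logs for the HDFS roles
        ls /var/log/hadoop-hdfs/
        # One directory per supervised role, each with its stdout/stderr
        sudo ls /var/run/cloudera-scm-agent/process/
        sudo sh -c 'tail -n 50 /var/run/cloudera-scm-agent/process/*/logs/stderr.log'
        # Typical evidence of the memory problem
        sudo grep -R "OutOfMemoryError" /var/log/hadoop-hdfs/ 2>/dev/null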

    If you don't have any data on your cluster yet anyway, it's probably
    easiest to delete your cluster (and management service) and then re-create
    it with a smaller set of services, allowing CM to auto-configure memory to
    be a bit better. Alternatively you could delete services / roles and
    manually modify heaps to be more reasonable.

    I'm a little confused by your email since you are talking about HDFS roles
    and "mapred.java.child.opts", which only applies to MapReduce. It's
    possible that you didn't correctly revert everything to where you had it
    before if you're getting these mixed up. Still, removing services is likely
    the best option for you.
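
    To illustrate the distinction: mapred.child.java.opts sets the heap of
    each MapReduce task JVM and can even be overridden per job, while the
    NameNode and DataNode heaps are separate daemon settings. A hedged
    example of a per-job override (the examples-jar path and the input/output
    paths are placeholders for a CDH 4 MRv1 install):

        # Give each map/reduce task a 1 GB heap for this job only
        hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar wordcount \
          -D mapred.child.java.opts=-Xmx1024m /user/ghadeer/input /user/ghadeer/output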

    Thanks,
    Darren

  • Ghadeer Abo uda at May 9, 2014 at 4:16 am
    " CM auto-configured your heaps to 62Mb when the default is 1 gig, then you
    are trying to run too many things on one host"
    yes, I have one host where all services are running on it and the
    auto-configured memory heap was 62Mb before I changed it to 1gig

    "The socket timeout exception doesn't really indicate the real problem.
    There's got to be something more interesting in the role logs, stderr, or
    stdout, but it's likely related to memory issues."

    The log is long, but it keeps repeating the following lines:

    Error getting HDFS summary for hdfs: org.apache.avro.AvroRemoteException: java.net.SocketTimeoutException: connect timed out
    WARN [767509864@scm-web-40092:tsquery.TimeSeriesQueryService@503] com.cloudera.server.cmf.tsquery.TimeSeriesQueryService@2fb39e9b failed on nozzle HOST_MONITORING
    java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at com.cloudera.server.cmf.tsquery.NozzleRequest.getResponse(NozzleRequest.java:70)
        at com.cloudera.server.cmf.tsquery.TimeSeriesQueryService.queryTimeSeries(TimeSeriesQueryService.java:307)
        at com.cloudera.server.web.cmf.charts.TimeSeriesQueryController.queryTimeSeriesHelper(TimeSeriesQueryController.java:310)
        at com.cloudera.server.web.cmf.charts.TimeSeriesQueryController.queryTimeSeries(TimeSeriesQueryController.java:271)
        at sun.reflect.GeneratedMethodAccessor1159.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)

    "If you don't have any data on your cluster yet anyway, it's probably
    easiest to delete your cluster (and management service) and then re-create
    it with a smaller set of services, allowing CM to auto-configure memory to
    be a bit better. Alternatively you could delete services / roles and
    manually modify heaps to be more reasonable."

    Since I have a single host, can I delete the cluster without deleting CM?
    Can you please point me to how to safely delete it?


    "I'm a little confused by your email since you are talking about HDFS roles
    and "mapred.java.child.opts", which only applies to MapReduce. It's
    possible that you didn't correctly revert everything to where you had it
    before if you're getting these mixed up. Still, removing services is likely
    the best option for you."

    I think changing the "mapred.child.java.opts" property of mapred-site.xml
    is done through the configuration panel, as I already did, rather than by
    modifying the file in the conf folder. Am I right, or are they different?
    Please correct me if I am wrong.


  • Darren Lo at May 9, 2014 at 4:58 am
    Hi Ghadeer,

    That error just means your Host Monitor isn't running. It probably died
    because the heap was too low as well.
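
    A quick way to confirm that on the host (a hedged sketch; the firehose
    log location below is the usual default):

        # Is a Host Monitor process being supervised at all?
        sudo ls /var/run/cloudera-scm-agent/process/ | grep -i HOSTMONITOR
        # What did it log before it died?
        sudo sh -c 'tail -n 50 /var/log/cloudera-scm-firehose/*HOSTMONITOR*.log.out'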

    To delete the cluster, you do not need to delete CM. Stop all services,
    then click the dropdown menu next to your cluster name and delete the
    cluster. Also stop and delete the Management service. When you re-create
    the cluster, it will normally use the same directories, so you'll find that
    all of your HDFS data is still present. Some steps like formatting the
    namenode will "fail", but that's just because it already has data, and
    isn't a failure you should worry about.
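
    That "failure" is just the formatter refusing to touch a non-empty
    NameNode directory. A hedged way to see this for yourself (CM's usual
    default NameNode data directory is /dfs/nn; yours may differ):

        sudo ls /dfs/nn/current | head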

    When using CM, you should generally not modify any files directly, and
    instead configure things through the CM web UI. You are correct that
    mapred.child.java.opts applies to mapred-site.xml. This controls the size
    of your MapReduce tasks and is not related to the NameNode, DataNode,
    JobTracker, TaskTracker, or any other daemon's heap size.

    When re-installing, try to limit the services you install to just the bare
    minimum that you need. Don't select All Services.

    Thanks,
    Darren

  • Ghadeer Abo uda at May 9, 2014 at 6:13 am
    Hello Darren,

    Thank you for the helpful mail.

    I have deleted the cluster, the management service, and the host roles.
    The fresh cluster setup then stopped at the parcel distribution phase,
    claiming "Host is in bad health", the same error I got even after
    deleting the host.
    What shall I do to solve the "health" problem?

    Thanks in advance ,
    Ghadeer




  • Darren Lo at May 9, 2014 at 2:30 pm
    Hi Ghadeer,

    Why is the host health bad? What errors do you see if you click on that
    host in the Hosts page? What do the agent's logs say on that host?
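
    A hedged sketch of those checks on the host (the agent log path is the
    default):

        sudo service cloudera-scm-agent status
        sudo tail -n 100 /var/log/cloudera-scm-agent/cloudera-scm-agent.log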

    I'm not sure why it still affected you after deleting the host.

  • Darren Lo at May 9, 2014 at 4:01 pm
    If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?

    Your agent keeps looking for files in /proc that don't seem to exist, which
    is very odd. On that host, try running "sudo service cloudera-scm-agent
    hard_restart" and see if that helps.

    snippet from log for other readers:
    [08/May/2014 18:12:53 +0000] 1921 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to get file descriptor count for process 2517: [Errno 2] No such file or directory: '/proc/2517/fd/'
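
    A hedged sequence for that host (the PID check is only illustrative; the
    restart is the command suggested above):

        # The agent was polling /proc for a process that no longer exists
        ls /proc/2517/fd 2>/dev/null || echo "process 2517 is gone"
        sudo service cloudera-scm-agent hard_restart
        sudo tail -n 20 /var/log/cloudera-scm-agent/cloudera-scm-agent.log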

    On Fri, May 9, 2014 at 8:28 AM, Ghadeer Abo uda wrote:

    Hi Darren,

    No error appears on the host page; it only shows "unknown health", and
    when distributing parcels it shows "Host is in bad health".
    Please see the attached cloudera-scm-agent log and a screenshot of the
    error.

    Thanks in advance,


  • Ghadeer Abo uda at May 9, 2014 at 4:34 pm
    Hi,

    "If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?"
    Yes, this shows "Host is in bad health"

    The restart command worked and the health issue is gone. But, as you said
    before, First Run now shows an error at the step that formats the NameNode
    ("Format HDFS only if empty"). Shall I try the format command manually and
    then re-run this step?

    And again, for the same issue: if I need more heap space for a MapReduce
    task, shall I modify mapred-site.xml directly at /etc/hadoop/conf, without
    CM, so that I won't have this issue again?

    Thank you, Darren.


  • Darren Lo at May 9, 2014 at 5:40 pm
    First Run should continue even if the HDFS format step fails. You should
    just ignore that failure. Did HDFS start up ok?
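
    A couple of hedged checks that HDFS really came up after First Run:

        sudo -u hdfs hdfs dfsadmin -report | head -n 20
        sudo -u hdfs hadoop fs -ls /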

    If you want to modify your mapreduce task heap configuration, do so via the
    CM UI (MapReduce Child Java Maximum Heap Size) and then re-deploy client
    configuration.
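
    After "Deploy Client Configuration", the value CM pushed can be inspected
    (but should not be hand-edited) in the generated client config. A hedged
    check, assuming the setting lands in mapred-site.xml as
    mapred.child.java.opts:

        grep -A 1 'mapred.child.java.opts' /etc/hadoop/conf/mapred-site.xml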

    On Fri, May 9, 2014 at 9:34 AM, Ghadeer Abo uda wrote:

    Hi ,

    "If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?"
    Yes, this shows "Host is in bad health"

    and the restart command worked and did not show that health issue , but
    "as you said before" at first run it shows error for formating namenode
    "Format HDFS if only is empty" , Shall I try format command , then re-run
    this step ?

    and again for the same issue if i need more heap space for a mapreduce
    task , shall i modify the mapreduce-core.xml directly without the cm at
    /etc/hadoop/conf , so won't have this issue again ?

    thank you Darren ,


    On Fri, May 9, 2014 at 6:52 PM, Darren Lo wrote:

    If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?

    Your agent keeps looking for files in /proc that don't seem to exist,
    which is very odd. On that host, try running "sudo service
    cloudera-scm-agent hard_restart" and see if that helps.

    snippet from log for other readers:
    [08/May/2014 18:12:53 +0000] 1921 MonitorDaemon-Reporter
    proc_metrics_utils ERROR Failed to get file descriptor count for process
    2517: [Errno 2] No such file or directory: '/proc/2517/fd/'

    On Fri, May 9, 2014 at 8:28 AM, Ghadeer Abo uda wrote:

    Hi Darren ,

    No error appears on host page , it only shows "unknown health" and at
    distributing parcels it shows the "host is in bad health"
    Sincerely , See the attached log of cloudera-scm-agent log and a print
    screen of error ,

    thanks in advance,


    On Fri, May 9, 2014 at 5:30 PM, Darren Lo wrote:

    Hi Ghadeer,

    Why is the host health bad? What errors do you see if you click on that
    host in the Hosts page? What do the agent's logs say on that host?

    I'm not sure why it still affected you after deleting the host.


    On Thu, May 8, 2014 at 11:12 PM, Ghadeer Abo uda <dewet.sends@gmail.com
    wrote:
    Hello Darren,

    Thank you for helpful mail ,

    I have deleted the cluster , management service , and host roles .
    and for the fresh cluster setup it stopped at the Parcels distribution
    phase
    claiming that "Host is in bad Health" , the same error Igot even after
    deleting the host.
    What I shall do to solve "health" problem ?

    Thanks in advance ,
    Ghadeer




    On Fri, May 9, 2014 at 7:58 AM, Darren Lo wrote:

    Hi Ghadeer,

    That error just means your Host Monitor isn't running. It probably
    died because the heap was too low as well.

    To delete the cluster, you do not need to delete CM. Stop all
    services, then click the dropdown menu next to your cluster name and delete
    the cluster. Also stop and delete the Management service. When you
    re-create the cluster, it will normally use the same directories, so you'll
    find that all of your HDFS data is still present. Some steps like
    formatting the namenode will "fail", but that's just because it already has
    data, and isn't a failure you should worry about.

    When using CM, you should generally not modify any files directly,
    and instead configure things through the CM web UI. You are correct that
    mapred.java.child.opts applies to mapred-site.xml. This controls the size
    of your MapReduce tasks and is not related to the NameNode, DataNode,
    JobTracker, TaskTracker, or any other daemon's heap size.

    When re-installing, try to limit the services you install to just the
    bare minimum that you need. Don't select All Services.

    Thanks,
    Darren


    On Thu, May 8, 2014 at 9:16 PM, Ghadeer Abo uda <
    dewet.sends@gmail.com> wrote:
    " CM auto-configured your heaps to 62Mb when the default is 1 gig,
    then you are trying to run too many things on one host"
    yes, I have one host where all services are running on it and the
    auto-configured memory heap was 62Mb before I changed it to 1gig

    "The socket timeout exception doesn't really indicate the real
    problem. There's got to be something more interesting in the role logs,
    stderr, or stdout, but it's likely related to memory issues."

    The log is too long , however it keeps repeating the following
    lines :

    Error getting HDFS summary for hdfs:
    org.apache.avro.AvroRemoteException: java.net.SocketTimeoutException:
    connect timed out
    WARN [767509864@scm-web-40092:tsquery.TimeSeriesQueryService@503]
    com.cloudera.server.cmf.tsquery.TimeSeriesQueryService@2fb39e9bfailed on nozzle HOST_MONITORING
    java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:201)
    at
    com.cloudera.server.cmf.tsquery.NozzleRequest.getResponse(NozzleRequest.java:70)
    at
    com.cloudera.server.cmf.tsquery.TimeSeriesQueryService.queryTimeSeries(TimeSeriesQueryService.java:307)
    at
    com.cloudera.server.web.cmf.charts.TimeSeriesQueryController.queryTimeSeriesHelper(TimeSeriesQueryController.java:310)
    at
    com.cloudera.server.web.cmf.charts.TimeSeriesQueryController.queryTimeSeries(TimeSeriesQueryController.java:271)
    at sun.reflect.GeneratedMethodAccessor1159.invoke(Unknown Source)
    at
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)


    "If you don't have any data on your cluster yet anyway, it's
    probably easiest to delete your cluster (and management service) and then
    re-create it with a smaller set of services, allowing CM to auto-configure
    memory to be a bit better. Alternatively you could delete services / roles
    and manually modify heaps to be more reasonable."

    Since I have a single host , Can I delete the cluster without
    deleting CM ? Can you please point me to how to safely delete it?


    "I'm a little confused by your email since you are talking about
    HDFS roles and "mapred.java.child.opts", which only applies to MapReduce.
    It's possible that you didn't correctly revert everything to where you had
    it before if you're getting these mixed up. Still, removing services is
    likely the best option for you."

    I think changing the "mapred.java.child.opts" property of mapred-site.xml
    is done through the configuration panel, as I already did, rather than by
    modifying the file in the conf folder. Am I right, or are they different?
    Please correct me if I am wrong.


  • Ghadeer Abo uda at May 9, 2014 at 5:57 pm
    HDFS did not start yet and I cannot continue the setup. Please see the
    attached print screen.

    On Fri, May 9, 2014 at 8:40 PM, Darren Lo wrote:

    First Run should continue even if the HDFS format step fails. You should
    just ignore that failure. Did HDFS start up ok?

    If you want to modify your mapreduce task heap configuration, do so via
    the CM UI (MapReduce Child Java Maximum Heap Size) and then re-deploy
    client configuration.
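
    If you want to double-check that the new value actually reached the client
    configuration after re-deploying it, a quick look at the generated file is
    enough (the stock Hadoop property is spelled mapred.child.java.opts; the
    path is the one already mentioned in this thread):

      # The deployed client config should now carry the new task heap:
      grep -A1 "child.java.opts" /etc/hadoop/conf/mapred-site.xml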

    On Fri, May 9, 2014 at 9:34 AM, Ghadeer Abo uda wrote:

    Hi,

    "If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?"
    Yes, it shows "Host is in bad health".

    The restart command worked and no longer showed that health issue, but, as
    you said before, First Run shows an error when formatting the NameNode
    ("Format HDFS only if empty"). Shall I try the format command manually and
    then re-run this step?

    For the same issue: if I need more heap space for a MapReduce task, shall I
    modify mapred-site.xml directly at /etc/hadoop/conf, without CM, so that I
    won't have this issue again?

    Thank you, Darren.


    On Fri, May 9, 2014 at 6:52 PM, Darren Lo wrote:

    If you hover over (or maybe click) the red X on the parcel distribution
    page, you'll see an error message. What does that say?

    Your agent keeps looking for files in /proc that don't seem to exist,
    which is very odd. On that host, try running "sudo service
    cloudera-scm-agent hard_restart" and see if that helps.

    snippet from log for other readers:
    [08/May/2014 18:12:53 +0000] 1921 MonitorDaemon-Reporter
    proc_metrics_utils ERROR Failed to get file descriptor count for process
    2517: [Errno 2] No such file or directory: '/proc/2517/fd/'
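
    If the hard_restart does not clear it up, the agent's own status and log
    are the next things to check. The log path below is the usual default and
    is assumed here:

      sudo service cloudera-scm-agent status
      sudo tail -n 100 /var/log/cloudera-scm-agent/cloudera-scm-agent.log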

    On Fri, May 9, 2014 at 8:28 AM, Ghadeer Abo uda wrote:

    Hi Darren,

    No error appears on the host page; it only shows "unknown health", and when
    distributing parcels it shows "Host is in bad health".
    Please see the attached cloudera-scm-agent log and a print screen of the
    error.

    thanks in advance,


    On Fri, May 9, 2014 at 5:30 PM, Darren Lo wrote:

    Hi Ghadeer,

    Why is the host health bad? What errors do you see if you click on
    that host in the Hosts page? What do the agent's logs say on that host?

    I'm not sure why it still affected you after deleting the host.


    On Thu, May 8, 2014 at 11:12 PM, Ghadeer Abo uda <
    dewet.sends@gmail.com> wrote:
    Hello Darren,

    Thank you for the helpful mail.

    I have deleted the cluster, the management service, and the host roles. The
    fresh cluster setup then stopped at the parcel distribution phase, claiming
    "Host is in bad health", which is the same error I got even after deleting
    the host. What shall I do to solve the health problem?

    Thanks in advance,
    Ghadeer




  • Darren Lo at May 9, 2014 at 5:55 pm
    What do the details of that command say? Stderr logs should give more
    information.
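
    For CM-managed roles, each start attempt usually gets its own process
    directory with stdout and stderr captured under the agent's run directory;
    the layout below is typical but may differ by version, so treat the paths
    as assumptions:

      # Most recent process directories first:
      sudo ls -t /var/run/cloudera-scm-agent/process/ | head
      # Stderr of the latest NameNode start attempt:
      sudo bash -c 'tail -n 50 /var/run/cloudera-scm-agent/process/*-hdfs-NAMENODE/logs/stderr.log'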

  • Darren Lo at May 9, 2014 at 8:57 pm
    Glad you got it working!

    On Fri, May 9, 2014 at 1:03 PM, Ghadeer Abo uda wrote:

    Hi Darren,

    I started the NameNode as you suggested and re-ran the setup, and everything
    went fine.

    Thank you very much for your time and the helpful mails,
    Ghadeer

    On Fri, May 9, 2014 at 9:23 PM, Darren Lo wrote:

    Try opening a new window, starting the NameNode (make sure that works),
    then going back to the first run page and retrying first run.
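
    A few quick checks that the NameNode really is up before retrying First Run
    (CDH4 defaults such as the 50070 web UI port are assumed here):

      ps -ef | grep -i namenode | grep -v grep       # process is running
      curl -sf http://localhost:50070/ > /dev/null && echo "NameNode web UI answers"
      hadoop fs -ls /                                # HDFS answers client requests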

    On Fri, May 9, 2014 at 11:19 AM, Ghadeer Abo uda wrote:

    Here are the stderr logs; I left them out of the previous email because I
    thought they contained no errors.

    Thanks for your time.


Discussion Overview

group: scm-users
categories: hadoop
posted: May 8, 2014 at 9:18 AM
active: May 9, 2014 at 8:57 PM
posts: 12
users: 2 (Darren Lo: 7 posts, Ghadeer Abo uda: 5 posts)
website: cloudera.com
irc: #hadoop
