In Cloudera Manager on CDH5, the NameNode has the following configuration:
Filesystem Checkpoint Transaction Threshold: *1000000 (default value)*

Filesystem Checkpoint Period: *1 hour (default value)*

After some time, the NameNode health moves from Concerning to Bad.

The details are as follows:

*The filesystem checkpoint is 16 hour(s), 18 minute(s) old. This is
1,630.03% of the configured checkpoint period of 1 hour(s). Critical
threshold: 400.00%. 8,900 transactions have occurred since the last
filesystem checkpoint. This is 0.89% of the configured checkpoint
transaction target of 1,000,000. *

Restarting the NameNode resolves the issue for a few hours, but after about
4 hours the NameNode moves to Bad health again.

Is there a configuration issue, or could something else be wrong? Please advise.

Thanks
Mv
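
The percentages in the health message above follow directly from the configured values. Below is a minimal Python sketch of that arithmetic, assuming the two Cloudera Manager settings map to the HDFS properties dfs.namenode.checkpoint.period (3600 seconds) and dfs.namenode.checkpoint.txns (1,000,000). The function names are hypothetical, and the strict ">" comparison is an assumption; this is not Cloudera Manager's actual health-test code.

```python
from datetime import timedelta

# Values taken from the post; the constant names are illustrative only.
CHECKPOINT_PERIOD = timedelta(hours=1)   # Filesystem Checkpoint Period
CHECKPOINT_TXNS = 1_000_000              # Checkpoint Transaction Threshold
CRITICAL_PERCENT = 400.0                 # "Critical threshold" in the message


def checkpoint_age_percent(age, period=CHECKPOINT_PERIOD):
    """Age of the last checkpoint as a percentage of the configured period."""
    return 100.0 * age.total_seconds() / period.total_seconds()


def txn_percent(txns, target=CHECKPOINT_TXNS):
    """Transactions since the last checkpoint as a percentage of the target."""
    return 100.0 * txns / target


def is_critical(age, period=CHECKPOINT_PERIOD, critical=CRITICAL_PERCENT):
    """Assumed rule: the check is critical once the age percentage exceeds
    the critical threshold."""
    return checkpoint_age_percent(age, period) > critical


if __name__ == "__main__":
    age = timedelta(hours=16, minutes=18)
    print(f"age: {checkpoint_age_percent(age):.2f}% of the period")
    print(f"txns: {txn_percent(8_900):.2f}% of the target")
    print(f"critical: {is_critical(age)}")
```

With the reported age of 16 hours 18 minutes, the age is roughly 1,630% of the 1-hour period, while 8,900 transactions is only 0.89% of the transaction target. Since HDFS starts a checkpoint when either limit is reached, the period limit was passed long ago, which suggests that checkpoints are failing rather than that the thresholds are misconfigured.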

To unsubscribe from this group and stop receiving emails from it, send an email to scm-users+unsubscribe@cloudera.org.

  • Harsh J at May 12, 2014 at 1:01 pm
    Do you use HA HDFS with 2x NameNodes? If not, is your SecondaryNameNode running?
  • Manish Verma at May 12, 2014 at 3:22 pm
    No, we are not using HA HDFS with 2x NameNodes.

    Yes, the SecondaryNameNode is running.
  • Harsh J at May 13, 2014 at 12:13 pm
    Thanks Manish,

    Can you look into your SNN log to check whether any errors may be
    delaying or aborting checkpoint operations? You can also share the
    log via pastebin.com if you'd like someone on the list to take a
    look at it.
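
One way to narrow down a large SNN log before sharing it is to pull out only the lines that look like errors. The following is a minimal Python sketch, assuming a plain-text log in which problems show up as ERROR or FATAL levels or as Java exception names. The function names and the matching pattern are hypothetical, and the log path is whatever your SNN host actually uses.

```python
import re

# Assumed markers of trouble: ERROR/FATAL log levels or any "...Exception" text.
SUSPICIOUS = re.compile(r"\b(?:ERROR|FATAL)\b|Exception")


def find_suspicious_lines(lines):
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return find_suspicious_lines(log)
```

Calling scan_log with the path of the SecondaryNameNode log returns the matching lines with their line numbers, which is usually enough to find the first failing checkpoint attempt and the stack trace that follows it.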
  • Manish Verma at May 13, 2014 at 12:22 pm
    Thanks for the reply. Link for the SNN log:

    http://pastebin.com/hHqNQ0i2

    The SNN is not in bad health, but it is still logging an exception in doCheckpoint.

  • Harsh J at May 14, 2014 at 5:37 am
    Thanks for sharing the log. It appears your cluster has been upgraded
    to CDH5 but the SNN directory was not cleared as part of the upgrade.

    Can you stop your SNN, then check your HDFS configuration for the
    value of dfs.namenode.checkpoint.dir (typically /dfs/snn if not
    changed), visit that path on the SNN host, and clear out all contents
    within it (move them to /tmp or a similar location and delete them
    later, or delete them outright)? Then start your SNN again, and
    checkpointing should work.
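
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official procedure: the SNN must already be stopped (for example from Cloudera Manager) before anything is moved, the helper names are hypothetical, the backup location is any directory you choose, and the configured value is assumed to be a single plain local path.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def read_property(hdfs_site_xml, name):
    """Return the value of a property from hdfs-site.xml text, or None."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def move_checkpoint_contents(checkpoint_dir, backup_dir):
    """Move everything inside checkpoint_dir into backup_dir.

    The checkpoint directory itself is kept so the SNN can reuse it.
    Run this only while the SecondaryNameNode is stopped.
    """
    src = Path(checkpoint_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(src.iterdir()):
        target = dst / entry.name
        if target.exists():
            # Refuse to overwrite an earlier backup.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved
```

For example, read_property(xml_text, "dfs.namenode.checkpoint.dir") returns the configured directory, and move_checkpoint_contents("/dfs/snn", "/tmp/snn-backup") sets its contents aside; the backup path here is only an example. Moving the files rather than deleting them keeps a copy until the SNN has completed a fresh checkpoint.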
  • Gvr.deepak at May 14, 2014 at 4:48 am
    I am also having the same issue and am not sure how to resolve it. Please help.

    Thanks
    Deepak Gattala


  • Harsh J at May 14, 2014 at 5:37 am
    Hello Deepak,

    Unless your SNN log shows the same error as Manish's, your issue may
    be something else. If the error is different, I'd suggest starting a
    new topic.
  • Manish Verma at May 14, 2014 at 9:49 am
    Thanks for the reply. The NameNode and SNN both work fine now.

    I checked the current SNN logs, and the following warning appears:

    WARN org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode

    Checkpoint done. New Image Size: 504800

    Does it require any configuration change?

    Thanks for the help.
  • Harsh J at May 14, 2014 at 1:44 pm
    Thanks for following up!

    That log is just informative. It should ideally be at an INFO level. It
    does not represent an issue.

  • Manish Verma at May 14, 2014 at 6:24 pm
    Thanks for your kind suggestions and for helping to sort out the issue.

Discussion Overview
group: scm-users
categories: hadoop
posted: May 12, '14 at 6:42a
active: May 14, '14 at 6:24p
posts: 11
users: 3
website: cloudera.com
irc: #hadoop
