going to fill up fast. This partition is always <= 10 GB on EC2 and much of
that space is consumed by the OS install. You should redirect your logs to
some place under /mnt (/dev/sdb1); that's 160 GB.
- Aaron
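For reference, this kind of redirect is usually done through the daemons' environment files. A minimal sketch, assuming the stock conf/hadoop-env.sh and conf/hbase-env.sh are in use; the directory names under /mnt are placeholders, not paths taken from this thread:

```shell
# conf/hadoop-env.sh: where the Hadoop daemons write their logs
export HADOOP_LOG_DIR=/mnt/hadoop/logs

# conf/hbase-env.sh: same idea for the HBase daemons
export HBASE_LOG_DIR=/mnt/hbase/logs
```

The directories have to exist and be writable by the user that runs the daemons, and the daemons need a restart before the new location takes effect.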
On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani wrote:
Hi,
I have faced a somewhat similar issue...
I have a couple of map reduce jobs running on EC2... after a week or so,
I get a "no space left on device" exception while performing any Linux command...
so I end up shutting down Hadoop and HBase, clearing the logs and then restarting
them.
Is there a cleaner way to do it?
Thanks,
Raakhi
On Fri, Apr 24, 2009 at 11:59 PM, Todd Lipcon wrote:
Does it sound like this JIRA describes your problem?
https://issues.apache.org/jira/browse/HADOOP-4766
If so, restarting just the JT should help with the symptoms. (I say symptoms
because this is clearly a problem! Hadoop should be stable and performant
for months without a cluster restart!)
-Todd
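A sketch of what restarting only the JobTracker could look like, assuming the standard bin/hadoop-daemon.sh wrapper under $HADOOP_HOME. The function name and the JT_RESTART_PAUSE variable are made up for illustration:

```shell
# Restart only the JobTracker, leaving HDFS and the TaskTrackers alone.
# JT_RESTART_PAUSE (hypothetical) is the number of seconds to wait between
# stop and start so the old JVM has time to exit; defaults to 10.
restart_jobtracker() {
    "$HADOOP_HOME/bin/hadoop-daemon.sh" stop jobtracker || return 1
    sleep "${JT_RESTART_PAUSE:-10}"
    "$HADOOP_HOME/bin/hadoop-daemon.sh" start jobtracker
}
```

Run it on the JobTracker host as the user that owns the daemon.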
On Fri, Apr 24, 2009 at 11:18 AM, Marc Limotte wrote:
Actually, I'm concerned about performance of map/reduce jobs for a
long-running cluster. I.e. it seems to get slower the longer it's running.
After a restart of HDFS, the jobs seem to run faster. Not concerned about
the start-up time of HDFS.
Of course, as you suggest, this could be poor configuration of the cluster
on my part; but I'd still like to hear best practices around doing a
scheduled restart.
Marc
-----Original Message-----
From: Allen Wittenauer
Sent: Friday, April 24, 2009 10:17 AM
To: [email protected]
Subject: Re: Advice on restarting HDFS in a cron
On 4/24/09 9:31 AM, "Marc Limotte" wrote:
I've heard that HDFS starts to slow down after it's been running for a long
time. And I believe I've experienced this.
We did an upgrade (== complete restart) of a 2000 node instance in ~20
minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV.
I suspect people aren't running the secondary name node and therefore have
a massively large edits file. The name node appears slow on restart because
it has to apply the edits to the fsimage rather than having the secondary
keep it up to date.
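One quick way to see whether this applies to a given cluster is to look at the size of the edits file on the name node. A sketch, assuming the usual on-disk layout where the edits log sits at current/edits under the directory configured as dfs.name.dir; the function name is made up for illustration:

```shell
# Print the size in bytes of the name node's edits log.
# $1 is the local directory configured as dfs.name.dir.
edits_size_bytes() {
    wc -c < "$1/current/edits" | tr -d ' '
}
```

Called as edits_size_bytes followed by the dfs.name.dir path. A file that keeps growing from day to day suggests that nothing is checkpointing it into the fsimage.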
-----Original Message-----
From: Marc Limotte
Hi.
I've heard that HDFS starts to slow down after it's been running for a long
time. And I believe I've experienced this. So, I was thinking to set up a
cron job to execute every week to shutdown HDFS and start it up again.
In concept, it would be something like:
0 0 * * 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh
But I'm wondering if there is a safer way to do this. In particular:
* What if a map/reduce job is running when this cron hits? Is
there a way to suspend jobs while the HDFS restart happens?
* Should I also restart the mapred daemons?
* Should I wait some time after "stop-dfs.sh" for things to settle
down, before executing "start-dfs.sh"? Or maybe I should run a command to
verify that it is stopped before I run the start?
Thanks for any help.
Marc
PRIVATE AND CONFIDENTIAL - NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT
ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION, AND MAY BE A
COMMUNICATION
PRIVILEGE BY LAW. IF YOU RECEIVED THIS E-MAIL IN ERROR, ANY REVIEW,
DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS EMAIL IS STRICTLY
PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR BY RETURN E-MAIL AND
PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM.
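On the last question in the original message (verifying that HDFS is stopped before starting it again), here is a minimal sketch of a wrapper that checks each step instead of chaining stop and start blindly. It is not something proposed in this thread. It assumes the name node's pid file follows the hadoop-daemon.sh naming (hadoop-USER-namenode.pid under HADOOP_PID_DIR, /tmp by default) and that "hadoop dfsadmin -safemode wait" is available to block until the name node has left safe mode. The function names and retry counts are made up for illustration:

```shell
# Retry a command until it succeeds: $1 attempts, $2 seconds between tries.
wait_until() {
    attempts=$1; delay=$2; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then return 0; fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# True when there is no pid file, or the recorded pid is no longer alive.
# Assumption: pid file location and name as written by hadoop-daemon.sh.
namenode_stopped() {
    pidfile="${HADOOP_PID_DIR:-/tmp}/hadoop-${HADOOP_IDENT_STRING:-$USER}-namenode.pid"
    [ ! -f "$pidfile" ] || ! kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Stop HDFS, confirm the name node is gone, start it, wait for safe mode to end.
restart_hdfs() {
    "$HADOOP_HOME/bin/stop-dfs.sh" || return 1
    wait_until 30 10 namenode_stopped || {
        echo "name node still running, not starting HDFS" >&2
        return 1
    }
    "$HADOOP_HOME/bin/start-dfs.sh" || return 1
    "$HADOOP_HOME/bin/hadoop" dfsadmin -safemode wait
}
```

Saved as a script that ends with a call to restart_hdfs, this could replace the two chained commands in the cron entry. It does not deal with map/reduce jobs that are running when the restart happens.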