Bill Graham at Jan 27, 2010 at 7:33 pm
Thanks Rekha.
These issues seem to be related to cleaning up Pig/Hadoop files upon shutdown
of the VM. I just checked, and when I shut down the VM, all files are cleaned
up as expected.
My issue is that I have Pig jobs that run in an app server and are
triggered by Quartz. It might be days or weeks between app server bounces.
If anyone knows a way to configure or kick off some sort of cleanup process
without shutting down the VM, please let me know.
Otherwise, I need to deploy a hacky crontab script like this:
find /tmp/Job[0-9]*.jar -type f -mmin +50 -exec rm {} \;
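A slightly more defensive version of that one-liner might look like the sketch below. It is only an illustration, not anything Pig or Hadoop ships: the function name `clean_job_jars` and its two arguments are made up here, the 50-minute threshold is carried over from the command above, and it assumes a `find` that supports `-maxdepth` and `-mmin` (GNU and BSD `find` both do). Quoting the pattern and passing it to `-name` lets `find` do the matching, so a directory with no matching jars does not produce an error the way an unmatched shell glob would.

```shell
#!/bin/sh
# Sketch: remove leftover Pig job jars older than a given age.
# clean_job_jars is a hypothetical helper name, not part of Pig or Hadoop.
# Usage: clean_job_jars [dir] [min_age_minutes]
clean_job_jars() {
    dir="${1:-/tmp}"   # directory to scan; defaults to /tmp
    age="${2:-50}"     # minimum age in minutes; defaults to 50

    # -maxdepth 1 keeps the search out of subdirectories.
    # The quoted -name pattern is matched by find itself, not the shell.
    # -exec ... + batches the matched files into as few rm calls as possible.
    find "$dir" -maxdepth 1 -type f -name 'Job[0-9]*.jar' \
        -mmin "+$age" -exec rm -f {} +
}
```

Saved in a script that ends with a call such as `clean_job_jars /tmp 50`, this could be scheduled from cron in the same way as the one-liner. The age threshold should stay longer than the longest-running job so that jars still in use are not removed.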
On Tue, Jan 26, 2010 at 8:40 PM, Rekha Joshi wrote:
You might like to check PIG-116 and HADOOP-5175. I also think there is a
JobCleanup task that takes care of cleaning, AFAIK, unless it's a failed
job.
Cheers,
/R
On 1/27/10 12:01 AM, "Bill Graham" wrote:
Hi,
Every time I run a Pig script, a number of Job jars are left in the /tmp
directory of my client, one per MR job, it seems. The file names look like
/tmp/Job875278192.jar.
I have scripts that run every five minutes and fire 10 MR jobs each, so the
amount of space used by these jars grows rapidly. Is there a way to tell Pig
to clean up after itself and remove these jars, or do I need to just write
my own clean-up script?
thanks,
Bill