Grokbase Groups Pig user August 2009
Krishna,
Any chance you can find a small subset of your input data that this is
reproducible on, and send that along with the script?



On Mon, Aug 10, 2009 at 5:21 PM, Shrikrishna Shrin wrote:
Dmitriy,

I don't think that is the issue, because the same logs were processed
successfully by other scripts. I have pasted the logs below; it looks
like Hadoop was unable to read from or write to /tmp for some reason.

Eg: Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does
not exist.

Should I try modifying some property in hadoop-site.xml?

Thanks,

Krishna


*FULL ERROR LOG:*

ERROR 2998: Unhandled internal error. Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
java.lang.Exception: Task attempt_200908101519_0005_m_000002_0 failed to
report status for 602 seconds. Killing!
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062
does not exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)

at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
... 13 more
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
... 13 more
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
... 13 more
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)


On Mon, Aug 10, 2009 at 1:09 PM, Dmitriy Ryaboy wrote:

Krishna,
Is it possible that the data you are reading in is malformed in such a
way that a mapper doesn't see an end of record for a very long time
and keeps reading your input? Did any other jobs that read the same
input, but perform different operations, succeed?

-Dmitriy

On Mon, Aug 10, 2009 at 12:59 PM, Shrikrishna Shrin <krishna@cooliris.com> wrote:
Hi,

I haven't seen this before, but nightly jobs failed over the weekend,
apparently due to memory issues. The weird part is that the jobs failed
during the map phase (at about 98% complete).

The task tracker for the failed map jobs shows the following errors:

Task attempt_200908100026_0065_m_000002_0 failed to report status for
602 seconds. Killing!
Task attempt_200908100026_0065_m_000002_1 failed to report status for
603 seconds. Killing!
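
The attempt IDs above can be read mechanically. The following is a minimal sketch, assuming the usual Hadoop layout `attempt_<jobtracker start>_<job number>_<m|r>_<task number>_<attempt number>`; the helper name `parse_attempt` is hypothetical and not part of Hadoop or Pig. It shows that both messages refer to retries of the same map task.

```python
import re

# Assumed layout: attempt_<jobtracker start>_<job number>_<m|r>_<task>_<attempt>
ATTEMPT = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


def parse_attempt(attempt_id):
    """Split a task attempt ID into job, phase, task number and attempt number."""
    match = ATTEMPT.fullmatch(attempt_id)
    if match is None:
        raise ValueError(f"not a task attempt ID: {attempt_id!r}")
    start, job, kind, task, attempt = match.groups()
    return {
        "job": f"job_{start}_{job}",
        "phase": "map" if kind == "m" else "reduce",
        "task": int(task),
        "attempt": int(attempt),
    }


# The two IDs from the task tracker messages above.
first = parse_attempt("attempt_200908100026_0065_m_000002_0")
second = parse_attempt("attempt_200908100026_0065_m_000002_1")
print(first)
print(second)
```

Both IDs resolve to map task 2 of the same job, differing only in the attempt number, so the same task timed out on its first try and again on its retry.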

The logs indicate memory to be the issue:

2009-08-10 11:53:37.829 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
287290336(280556K) committed = 363593728(355072K) max =
536870912(524288K)

2009-08-10 11:53:43.522 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
350217672(342009K) committed = 422510592(412608K) max =
536870912(524288K)

2009-08-10 11:53:45.290 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Usage threshold exceeded) init = 5439488(5312K) used =
376781240(367950K) committed = 422510592(412608K) max =
536870912(524288K)

2009-08-10 11:53:45.290 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
380504752(371586K) committed = 456720384(446016K) max =
536870912(524288K)

2009-08-10 11:53:46.752 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
401755464(392339K) committed = 482344960(471040K) max =
536870912(524288K)

2009-08-10 11:53:50.599 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
443763584(433362K) committed = 527171584(514816K) max =
536870912(524288K)

2009-08-10 11:53:54.686 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
491575560(480054K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:56.414 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
514928920(502860K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:57.553 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
520781832(508576K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:58.747 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
526636552(514293K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:59.935 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
532493568(520013K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:01.158 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:02.389 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:03.778 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
489852536(478371K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 12:03:40.298 WARN [Comm thread for
attempt_200908100026_0065_m_000077_1]
org.apache.hadoop.mapred.TaskRunner - Parent died.  Exiting
attempt_200908100026_0065_m_000077_1
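
To see how quickly the heap fills, the `used` and `max` figures can be pulled out of the low-memory records above. This is a rough sketch, assuming the records keep the format shown here and may be wrapped across lines as in this message; the names `RECORD` and `heap_usage` are made up for illustration.

```python
import re

# One SpillableMemoryManager record, after line wraps have been collapsed.
RECORD = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) INFO \[Low Memory Detector\].*?"
    r"used = (\d+)\(\d+K\) committed = (\d+)\(\d+K\) max = (\d+)\(\d+K\)"
)


def heap_usage(log_text):
    """Return (timestamp, used / max) for each low-memory record in log_text."""
    flat = " ".join(log_text.split())  # undo the e-mail line wrapping
    return [
        (ts, int(used) / int(maximum))
        for ts, used, _committed, maximum in RECORD.findall(flat)
    ]


# Two records copied from the log above.
sample = """
2009-08-10 11:53:37.829 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
287290336(280556K) committed = 363593728(355072K) max =
536870912(524288K)

2009-08-10 11:54:01.158 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)
"""

for ts, fraction in heap_usage(sample):
    print(f"{ts}  {fraction:.1%} of max heap")
```

Run over the full log, this makes the trend explicit: usage climbs from a little over half of the 512 MB maximum to essentially all of it in under 30 seconds.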

I have seen this before when jobs fail in the reduce phase, but this is the
first time I have noticed jobs failing during the map phase. Surprisingly,
jobs that load and process much more data ran successfully, but when I reran
the ones that failed, they failed again. Some of the jobs that failed do
nothing more than load, filter, and write out the filtered data. This leads
me to believe that the problem is more specific than I had originally
thought. Any pointers on what the issue might be would be extremely helpful.

Thanks,

Krishna

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 10, '09 at 8:00p
active: Aug 11, '09 at 4:04p
posts: 5
users: 3
website: pig.apache.org