Grokbase Groups Pig user January 2011
Thanks to Joe and Daniel, I was able to fix this issue.

It was a combination of ambiguity about file paths (which Joe's message
helped me confirm) and an error in my Java code that was failing silently
instead of throwing an exception.

Thanks,
Geoff
On Wed, Jan 12, 2011 at 7:43 AM, Joe Crobak wrote:

A = LOAD 'file://home/geoffeg/test.json' will try to load using a relative
path. Pig will understand file:/home/geoffeg/test.json or
file:///home/geoffeg/test.json to load the absolute path. Same goes for a
file in hdfs://
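
One way to see why the two-slash form misbehaves is to look at how generic URI syntax splits each string. The sketch below uses only java.net.URI from the JDK; it does not call Pig or Hadoop, so how they resolve the resulting path is not shown and is left as an assumption. The class name FileUriDemo and its helper methods are hypothetical, made up for this illustration.

```java
import java.net.URI;

class FileUriDemo {
    // Authority component as parsed by java.net.URI (null when absent).
    static String authorityOf(String s) {
        return URI.create(s).getAuthority();
    }

    // Path component as parsed by java.net.URI.
    static String pathOf(String s) {
        return URI.create(s).getPath();
    }

    public static void main(String[] args) {
        String[] forms = {
            "file://home/geoffeg/test.json",   // two slashes: "home" is read as the authority
            "file:/home/geoffeg/test.json",    // one slash: the whole string after "file:" is the path
            "file:///home/geoffeg/test.json"   // three slashes: empty authority, then the absolute path
        };
        for (String f : forms) {
            System.out.println(f + " -> authority=" + authorityOf(f)
                    + ", path=" + pathOf(f));
        }
    }
}
```

With the two-slash form, "home" ends up in the authority slot and the path that remains is /geoffeg/test.json, so the file that actually exists at /home/geoffeg/test.json is never named. The one-slash and three-slash forms both keep /home/geoffeg/test.json as the path.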

HTH,
Joe

On Sun, Jan 9, 2011 at 11:47 PM, Geoffrey Gallaway <geoffeg@geoffeg.org> wrote:
Hello, I'm looking for some clues to help me fix an annoying error I'm
getting using Pig.

I need to parse a large JSON file, so I grabbed kimsterv's JSON loader
(https://gist.github.com/601331), compiled it, and successfully tested it
on my laptop via -x local. However, when I try to run it on the edgenode of
our dev Hadoop instance I am unable to get it to work, even if I run it in
-x local. I get "org.apache.pig.backend.executionengine.ExecException:
ERROR 2118: Unable to create input splits for test.json". I looked through
the mailing list for this message, only to find a mention of it being
related to LZO compression issues. I'm not using any file compression, and
this error still occurs when running in -x local on the edgenode of the dev
cluster. Are there some environment variables I'm missing? Maybe some
permissions issues I'm unaware of? Suggestions and theories welcome!

Hadoop version: Hadoop 0.20.2+737
Pig version: 0.7.0+16 (compiled against the pig 0.7.0 jar)

Command line:
java -cp '/usr/lib/pig/*:/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*:libs/*:.' org.apache.pig.Main -v -x local json.pig

Pig script:
REGISTER /home/geoffeg/pig-functions/jsontester.jar;
-- file:// should specify the local FS, remove file:// to specify HDFS
A = LOAD 'file://home/geoffeg/test.json' using
org.geoffeg.hadoop.pig.loader.PigJsonLoader() as ( json: map[] );
B = foreach A generate json#'_keyword';
DUMP B;

Full error/log:
2011-01-09 22:33:29,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2011-01-09 22:33:30,345 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
2011-01-09 22:33:30,345 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - Map key required for A: $0->[_keyword]
2011-01-09 22:33:30,455 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp1814319995/tmp1141533149:org.apache.pig.builtin.BinStorage) - 1-36 Operator Key: 1-36)
2011-01-09 22:33:30,482 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-01-09 22:33:30,482 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-01-09 22:33:30,517 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2011-01-09 22:33:30,522 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-01-09 22:33:32,520 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-01-09 22:33:32,552 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2011-01-09 22:33:32,552 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-01-09 22:33:32,562 [Thread-2] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2011-01-09 22:33:32,692 [Thread-2] INFO org.apache.hadoop.mapred.JobClient - Cleaning up the staging area file:/tmp/hadoop-geoffeg/mapred/staging/geoffeg395595954/.staging/job_local_0001
2011-01-09 22:33:33,054 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-01-09 22:33:33,054 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-01-09 22:33:33,054 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2011-01-09 22:33:33,064 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/tmp/temp1814319995/tmp1141533149"
2011-01-09 22:33:33,064 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Records written : Unable to determine number of records written
2011-01-09 22:33:33,065 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Bytes written : Unable to determine number of bytes written
2011-01-09 22:33:33,065 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Spillable Memory Manager spill count : 0
2011-01-09 22:33:33,065 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Proactive spill count : 0
2011-01-09 22:33:33,065 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2011-01-09 22:33:33,133 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file://home/geoffeg/test.json
2011-01-09 22:33:33,134 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B
    at org.apache.pig.PigServer.openIterator(PigServer.java:607)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:545)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:163)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:139)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
    at org.apache.pig.Main.main(Main.java:414)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file://home/geoffeg/test.json
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:169)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:270)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:308)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1007)
    at org.apache.pig.PigServer.store(PigServer.java:697)
    at org.apache.pig.PigServer.openIterator(PigServer.java:590)
    ... 6 more

--
Sent from my email client.

