Grokbase Groups Pig user March 2011
I have a very simple Pig script, which I run on a single-node
machine. The script works in local mode, but fails with a null pointer
exception in pseudo-distributed mode. The script:

A = LOAD '/user/paepcke/Datasets/triplets.csv' USING PigStorage(',');
DUMP A;
B = ORDER A BY $0;
DUMP B;

The CSV file is:

activity,100_2,631
populations,100_3,937
image,100_3,1408
td,100_4,521
Swiss,100_4,594
238,100_4,697

The dump of A always works (local or distributed):

(activity,100_2,631)
(populations,100_3,937)
(image,100_3,1408)
(td,100_4,521)
(Swiss,100_4,594)
(238,100_4,697)

But the ORDER never completes in pseudo-distributed mode; it generates the
trace below. When I put the CSV file into my local file system and run
with -x local, everything works, and I get the expected output:

(238,100_4,697)
(Swiss,100_4,594)
(activity,100_2,631)
(image,100_3,1408)
(populations,100_3,937)
(td,100_4,521)
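(A note on the ordering above, added in editing: since the LOAD has no schema, $0 is compared as a chararray, so ORDER sorts lexicographically by character value, which puts digits before uppercase letters and uppercase before lowercase. A quick Python sketch of the same comparison, using the rows from the CSV:

```python
# Illustrative sketch (not the poster's code): Python's default string
# sort is also lexicographic by code point, so it reproduces Pig's
# chararray ordering: '2' (0x32) < 'S' (0x53) < 'a' (0x61).
rows = [
    ("activity", "100_2", "631"),
    ("populations", "100_3", "937"),
    ("image", "100_3", "1408"),
    ("td", "100_4", "521"),
    ("Swiss", "100_4", "594"),
    ("238", "100_4", "697"),
]

# Equivalent of: B = ORDER A BY $0;
ordered = sorted(rows, key=lambda t: t[0])
for row in ordered:
    print(row)  # first row printed is ('238', '100_4', '697')
```

This matches the -x local output shown above: 238, Swiss, then the lowercase keys.)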


Advice?

Thanks,

Andreas

P.S.: Is this the right mailing list for this type of question?

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias B. Backend error : null

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B. Backend error : null
at org.apache.pig.PigServer.openIterator(PigServer.java:742)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:510)
at org.apache.pig.Main.main(Main.java:107)
Caused by: java.lang.NullPointerException
at java.util.Arrays.binarySearch(Arrays.java:2043)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
================================================================================
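(Context on where the trace points, added in editing: Pig's parallel ORDER BY first runs a sampling job over the input to pick split-point quantiles, and the sort job's WeightedRangePartitioner then binary-searches each map key against those quantiles to pick a reducer. An NPE thrown from Arrays.binarySearch inside getPartition is consistent with that quantiles array being null, i.e. the sampler's output never reaching the partitioner. A minimal Python sketch of the mechanism, with all names and data illustrative rather than Pig's actual code:

```python
# Hedged sketch of a weighted range partitioner: bisect plays the role
# of Arrays.binarySearch in WeightedRangePartitioner.getPartition.
import bisect

def get_partition(key, quantiles):
    """Map a sort key to a reducer index via the sampled split points."""
    if quantiles is None:
        # Analogue of the NullPointerException in the trace above: the
        # split points computed by the sampling job were never loaded.
        raise ValueError("quantiles were never loaded")
    return bisect.bisect_right(quantiles, key)

# With 3 reducers, the sampler might choose 2 split points:
quantiles = ["image", "td"]
print(get_partition("238", quantiles))          # -> reducer 0
print(get_partition("populations", quantiles))  # -> reducer 1
```

The "Successfully sampled 6 records" line further down shows the sampling job itself ran, which is why the failure surfaces only later, inside the sort job's map phase.)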

Failed Jobs:
JobId                  Alias  Feature   Message               Outputs
job_201103262110_0003  B      ORDER_BY  Message: Job failed!  hdfs://localhost/tmp/temp-136868177/tmp573209873,

Input(s):
Successfully sampled 6 records (101 bytes) from:
"/user/paepcke/Datasets/triplets.csv"
Failed to read data from "/user/paepcke/Datasets/triplets.csv"

Output(s):
Failed to produce result in
"hdfs://localhost/tmp/temp-136868177/tmp573209873"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201103262110_0002 -> job_201103262110_0003,
job_201103262110_0003

Discussion Overview
group: user @ pig.apache.org
categories: pig, hadoop
posted: Mar 27, '11 at 4:27a
active: Mar 27, '11 at 4:27a
posts: 1
users: 1 (Andreas Paepcke: 1 post)
website: pig.apache.org
