Grokbase Groups Pig user April 2010
Hello,

We have a file hierarchy that we want to be accessible from MR/Hive/Pig. That
way everyone can pick their favorite :)

Currently the layout looks like this.

/user/root/data/datepartition1/subpartition2/{sequence file1, sequence fileN}

I have just installed pig-0.6.0. I am trying to follow the advice here
(http://stackoverflow.com/questions/2423949/storing-data-to-sequencefile-from-apache-pig):

REGISTER /opt/pig-0.6.0/contrib/piggybank/java/piggybank.jar;
DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
raw = load 'datafile' USING SequenceFileLoader as (version:chararray, id:int, date:chararray);

2010-04-20 12:10:46,821 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. org.apache.pig.impl.logicalLayer.FrontendException cannot be cast to java.lang.Error

[root@rs01 piggybank]# more /root/pig_1271779744816.log
Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. org.apache.pig.impl.logicalLayer.FrontendException cannot be cast to java.lang.Error

java.lang.ClassCastException: org.apache.pig.impl.logicalLayer.FrontendException cannot be cast to java.lang.Error
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1440)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:949)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:738)
        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1036)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:986)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:386)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:720)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:352)

So either I have hit a bug or I have done something wrong. It looks like a
bug to me, because if Pig can't cast its own error correctly, something is wrong.

Two questions:
1) Can I load all the files in a directory rather than operating on a single
file? That is, can I write

raw = load '/datadir/*' USING SequenceFileLoader as (version:chararray, id:int, date:chararray);

rather than

raw = load '/datafile' USING SequenceFileLoader as (version:chararray, id:int, date:chararray);

2) PigStorage seems to let me specify a tab delimiter. How does one specify
a tab delimiter with SequenceFileLoader? Or does one have to pass the entire
line to some other Pig component to be tokenized?
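
For comparison, this is the PigStorage form I am referring to in question 2. It is only a sketch: 'datafile.txt' is a made-up path to a plain tab-delimited text file, not one of my sequence files, and the schema is the same one I used above.

```pig
-- Hypothetical plain-text input; PigStorage takes the field delimiter as its argument.
-- Tab is already PigStorage's default, so '\t' is spelled out here only for clarity.
raw = load 'datafile.txt' USING PigStorage('\t') as (version:chararray, id:int, date:chararray);
```

What I am looking for is the equivalent way to get these three fields out of a sequence file.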

Thank you,

Discussion Overview
group: user @
categories: pig, hadoop
posted: Apr 20, '10 at 4:37p
active: Apr 20, '10 at 7:25p
posts: 6
users: 3
website: pig.apache.org