Date is not a separate type in Pig.

If you want to group on date, I think what you want is this:

A = load 'atest.csv' using PigStorage(',') as (v1,v2,v3,v4);
B = foreach A generate tokenize(A.v2) as (date,time), v3;
C = foreach B generate date, v3;
D = group C by date;
E = foreach D generate group, SUM(C.v3);
dump E;

This script will first tokenize the datestamp into date and time, then
project just the date and data you're going to sum, and then do the
grouping.
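
As a hedged sketch of the same idea for more recent Pig releases: there,
TOKENIZE returns a bag of tokens rather than separate date and time
fields, so the variant below takes the first eight characters of the
timestamp with the built-in SUBSTRING instead. The explicit schema types
and the alias names day and total are assumptions for illustration, not
part of the script above, and this has not been run against the Pig
version discussed in this thread.

```pig
-- Assumes a Pig release where SUBSTRING is available as a built-in.
-- Field types are declared explicitly (an assumption) so that v2 is
-- treated as a string and v3 as a number.
A = LOAD 'atest.csv' USING PigStorage(',')
    AS (v1:int, v2:chararray, v3:long, v4:int);

-- Keep only the yyyymmdd part of 'yyyymmdd hh:mm:ss' and the value to sum.
B = FOREACH A GENERATE SUBSTRING(v2, 0, 8) AS day, v3;

-- Group by day, then sum v3 within each group.
C = GROUP B BY day;
D = FOREACH C GENERATE group AS day, SUM(B.v3) AS total;
DUMP D;
```

On the sample input quoted below, grouping this way is intended to give
the totals listed there as the desired output (122 for 20090201 and 130
for 20090202).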

Alan.

On Feb 18, 2009, at 3:19 PM, Avram Aelony wrote:

Hello,

I have a question regarding the treatment of dates in Pig.

My input is a comma-delimited file containing a timestamp field in
'yyyymmdd hh:mm:ss' format (e.g. 20090201 14:42:00). I want to
aggregate to day level using only the date portion of the timestamp
(the yyyymmdd part, e.g. 20090201). I have been experimenting with
the tokenize function, but I am unclear how to accomplish an
aggregation by date.

What am I doing wrong? How can I get a date-level aggregation?
Is there a 'Date' data type?


Here are the details:


Input Data:

4,20090201 23:59:56,8,1
3,20090202 23:59:56,101,1
4,20090201 23:59:56,114,1
5,20090202 23:59:56,29,1

Desired Output:
20090201, 122
20090202, 130

--My attempt in Pig:
A = load 'atest.csv' using PigStorage(',') as (v1,v2,v3,v4);
describe A;
B = foreach A generate group, tokenize(A.v2) as (date,time); --fails here.
describe B;
C = group B by B.date;
describe C;
D = foreach C generate B.date, SUM(A.v3);
dump D;


grunt> A = load 'atest.csv' using PigStorage(',') as (v1,v2,v3,v4);
grunt> describe A;
A: (v1, v2, v3, v4 )
grunt> B = foreach A generate group, tokenize(A.v2) as (date,time);
2009-02-18 15:11:44,278 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: group in A: (v1, v2, v3, v4 )
    at org.apache.pig.PigServer.registerQuery(PigServer.java:278)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:475)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:233)
    at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:91)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:54)
    at org.apache.pig.Main.main(Main.java:270)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: group in A: (v1, v2, v3, v4 )
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:3301)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:3225)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:2236)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:2175)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:2106)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:2038)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:2006)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:1955)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:1894)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:1862)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:1604)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:1569)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:711)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:512)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:362)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:47)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:275)
    ... 5 more

2009-02-18 15:11:44,279 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: group in A: (v1, v2, v3, v4 )
grunt>


Thanks in advance,
Avram

Discussion Overview
group: user @
categories: pig, hadoop
posted: Feb 18, '09 at 11:20p
active: Feb 24, '09 at 5:10p
posts: 10
users: 5
website: pig.apache.org