FAQ
Thanks, Prashant and Pablomar. That fixed it!




----- Original Message -----
From: Prashant Kommireddi <prash1784@gmail.com>
To: "user@pig.apache.org" <user@pig.apache.org>
Cc:
Sent: Tuesday, November 13, 2012 1:40 PM
Subject: Re: Help with Script

SUM function requires that you specify the specific element from the
grouping. In this case, U_tm and U_cnt are both within group/bags and need
to be accessed as "reqd.U_tm" and "reqd.U_cnt".


--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as
count;

On Nov 13, 2012, at 9:58 AM, ingvay7 wrote:

(Apologies for resending but corrected script below)


This is the error I got:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the
matching function for org.apache.pig.builtin.SUM as multiple or none of
them fit. Please use an explicit cast.

Updated code:

a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float
);


--Remove Test Servers
remtest = filter a by not Server matches 'Test%';

-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;

--Groupby
G2 = group reqd by (Server,Type,Ops);

--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;

store G3 into 'Servertest';







----- Original Message -----
From: Prashant Kommireddi <prash1784@gmail.com>
To: "user@pig.apache.org" <user@pig.apache.org>
Cc:
Sent: Tuesday, November 13, 2012 11:59 AM
Subject: Re: Help with Script

Hi,

Can you paste the error message here?

Sent from my iPhone

On Nov 13, 2012, at 8:34 AM, "ingvay7@yahoo.com" wrote:

hey all,


Very new Pig user here. I think I'm trying to get something very simple
done but getting a few errors. See me script below.Any guidance will be
appreciated.Thanks.


I get errors such as  Error during parsing. Invalid alias: serverin {time:
double,count: double}

I am basically trying to duplicate the following SQL query:


select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)

from TableA

group by 1, 2, 3;




My script is as follows:


a = LOAD 'Report' AS (

dt:chararray,

Server:chararray,

Type:chararray,

Ops:chararray,

UserID:chararray,

U_cnt:int,

U_tm:int,

U_min_tm:int,

U_max_tm:int,

U_avg_tm:float,

);



--Remove Test Servers

remtest = filter a by not Server matches 'Test%';

-- Filter to required columns

reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;

--Groupby

G2 = group reqd by Server,Type,Ops;

--Sum the User Counts and Times

G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;

--byServeroperation = order G3 by Server;

store G3 into 'Servertest';


ingvay7

Search Discussions

Discussion Posts

Previous

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 9 of 9 | next ›
Discussion Overview
groupuser @
categoriespig, hadoop
postedNov 13, '12 at 4:12p
activeNov 13, '12 at 6:50p
posts9
users3
websitepig.apache.org

People

Translate

site design / logo © 2021 Grokbase