Hey all,

I'm a very new Pig user. I think I'm trying to do something very simple, but I'm getting a few errors. See my script below. Any guidance would be appreciated. Thanks.

I get errors such as: Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query:

select Server, Type, Ops, count(*) users, sum(U_tm), sum(U_cnt)
from TableA
group by 1, 2, 3;

My script is as follows:

a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);

--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm) as time,SUM(U_cnt) as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest';

ingvay7
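
For comparison, here is one way the script above might be rewritten so that it matches the SQL query. Treat it as an untested sketch, not a confirmed fix. It assumes that MATCHES takes a Java regular expression (hence 'Test.*' instead of the SQL-style 'Test%'), that COUNT_STAR is available in the installed Pig version, and that the input is tab-delimited as in the original LOAD. The names users, total_tm, total_cnt and byServer are illustrative only.

```pig
-- No trailing comma after the last field of the schema.
a = LOAD 'Report' AS (
    dt:chararray,
    Server:chararray,
    Type:chararray,
    Ops:chararray,
    UserID:chararray,
    U_cnt:int,
    U_tm:int,
    U_min_tm:int,
    U_max_tm:int,
    U_avg_tm:float
);

-- Remove test servers; MATCHES expects a regex, not a SQL LIKE pattern.
remtest = FILTER a BY NOT (Server MATCHES 'Test.*');

-- Keep only the required columns, referenced by name.
reqd = FOREACH remtest GENERATE Server, Type, Ops, UserID, U_cnt, U_tm;

-- Grouping on several keys needs the keys wrapped in parentheses.
G2 = GROUP reqd BY (Server, Type, Ops);

-- After GROUP, each row holds 'group' (the key tuple) and a bag named 'reqd'.
-- Aggregates must reference fields through that bag, and FLATTEN turns the
-- key tuple back into ordinary columns so Server can be used later.
G3 = FOREACH G2 GENERATE
    FLATTEN(group) AS (Server, Type, Ops),
    COUNT_STAR(reqd) AS users,
    SUM(reqd.U_tm) AS total_tm,
    SUM(reqd.U_cnt) AS total_cnt;

byServer = ORDER G3 BY Server;
STORE byServer INTO 'Servertest';
```

The main differences from the original are the parenthesized group key, the reqd. prefix inside SUM, and FLATTEN(group), without which G3 has no field called Server to order by.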

Discussion Overview
group: user @
categories: pig, hadoop
posted: Nov 13, '12 at 4:12p
active: Nov 13, '12 at 6:50p
posts: 9
users: 3
website: pig.apache.org