Grokbase Groups Pig user June 2008

18 discussions - 78 posts

  • (a) I see a lot of places where PIG doesn't correctly deal with results that are empty bags. Here's an example - Counting Tuples. Let's say I want to count the number of tuples in 'b' ( a subset ...
    Prashanth Pappu
    Jun 5, 2008 at 10:31 pm
    Jun 7, 2008 at 1:16 am
  • Hi, How does pig handle really large tuples? Assuming after a group, the resulting alias has a small subset of tuples (out of the many which were generated) which are really large in size. In excess of ...
    Mridul Muralidharan
    Jun 10, 2008 at 7:57 pm
    Jun 14, 2008 at 5:26 am
  • Let's suppose 1.txt is: 1 1 1 2 1 1 3 1 1 4 1 1 5 1 1 And 2.txt is: 4 2 2 5 2 2 6 2 2 7 2 2 The script: A = LOAD 'ivan/1.txt' USING PigStorage(); B = LOAD 'ivan/2.txt' USING PigStorage(); C = COGROUP ...
    Iván de Prado
    Jun 11, 2008 at 2:17 pm
    Jun 11, 2008 at 4:51 pm
  • I just upgraded my PIG src to top of svn and see really poor performance with group and cogroup queries. Is there a recommended svn version to be used with hadoop 1.17? Prashanth
    Prashanth Pappu
    Jun 4, 2008 at 12:07 am
    Jun 10, 2008 at 11:30 pm
  • I think there are a few unresolved issues due to lack of explicit type declarations in PIG. Especially with data atoms that have leading or trailing spaces (' '), implicit typecast into strings and ...
    Prashanth Pappu
    Jun 4, 2008 at 11:37 pm
    Jun 9, 2008 at 2:38 pm
  • Hi All, I downloaded the pig tutorial to give it a whirl, set it up on a hadoop cluster I've used for a few other tasks (7 nodes, ec2) and went through the instructions to launch tutorial script1 ...
    Mark Snow
    Jun 26, 2008 at 3:09 am
    Jun 26, 2008 at 9:13 pm
  • If I have multiple files in a directory, how do I load this into Pig? I want to run Pig over an input directory, not an individual file. %ls Data myfile1.txt myfile2.txt myfile3.txt myfile4.txt ...
    Kayla Jay
    Jun 6, 2008 at 1:30 pm
    Jun 9, 2008 at 2:43 pm
  • I have a PIG script that simply generates a lot of 'counts' over very large data. For example, a = load 'data' as (x,y,z); b1 = filter a by x==1; b1_group = group b1 all; b1_count = foreach b1_group ...
    Prashanth Pappu
    Jun 19, 2008 at 7:20 pm
    Jun 19, 2008 at 7:53 pm
  • Hello folks! I've been trying to build Pig from trunk, but I've got the following error: Buildfile: build.xml BUILD FAILED /projetos/woovi/pig/trunk/build.xml:135: Problem: failed to create task or ...
    Rafael Turk
    Jun 6, 2008 at 11:27 pm
    Jun 7, 2008 at 12:46 am
  • Hi, I am grouping by $1, but got IndexOutofBoundException because some rows are malformed; they only have 1 field. Is there a built-in function to return the number of fields for each row? So I can ...
    Haijun Cao
    Jun 3, 2008 at 11:51 pm
    Jun 4, 2008 at 5:51 am
  • Hi, Is there any twiki or doc which details the circumstances when pig uses a combiner? I am interested specifically in the interaction of foreach with (co)group, load and store. Thanks, Mridul
    Mridul Muralidharan
    Jun 20, 2008 at 1:51 am
    Jun 24, 2008 at 6:07 pm
  • Hi, If you are new to Pig, the best place to start is by trying out our brand new tutorial: http://wiki.apache.org/pig/PigTutorial. We hope that you find it useful and informative! As always, your ...
    Olga Natkovich
    Jun 20, 2008 at 10:36 pm
    Jun 20, 2008 at 11:31 pm
  • We have a custom input formatter that we use for regular map/reduce jobs. Is there a way to make use of this input formatter in pig? We've looked at most of the docs, and haven't found much. The issue ...
    Manish Shah
    Jun 5, 2008 at 11:49 pm
    Jun 6, 2008 at 1:04 pm
  • I'm having a problem loading data from multiple paths in Pig. What I'm trying to do is to load data from a range of dates, so I would like to specify an input of two globbed paths: x = LOAD ...
    Tom White
    Jun 4, 2008 at 2:56 pm
    Jun 4, 2008 at 9:21 pm
  • We are planning to host a mini-summit (aka "Camp Hadoop") in conjunction with ApacheCon this year - Nov 6th and 7th - in New Orleans. We are working on putting together the agenda for this now, and ...
    Ajay Anand
    Jun 30, 2008 at 10:23 pm
    Jun 30, 2008 at 10:23 pm
  • Hi, We have just created Piggy Bank - the place for users to share the Pig functions. Details on how to use and how to contribute can be found at http://wiki.apache.org/pig/PiggyBank. Please, let us ...
    Olga Natkovich
    Jun 20, 2008 at 10:38 pm
    Jun 20, 2008 at 10:38 pm
  • Please note that the location of the user group meeting has been changed to Building 2, Training Rooms 5&6, at Yahoo! Mission College (2811 Mission College Blvd, Santa Clara). User Group Meeting June ...
    Ajay Anand
    Jun 17, 2008 at 5:09 pm
    Jun 17, 2008 at 5:09 pm
  • The next user group meeting is scheduled for June 18th from 6-7:30 pm at the Yahoo! Mission College campus (2821 Mission College, Santa Clara). Registration, driving directions etc are at ...
    Ajay Anand
    Jun 4, 2008 at 10:02 pm
    Jun 4, 2008 at 10:02 pm
Group Overview
group: user @
categories: pig, hadoop
discussions: 18
posts: 78
users: 21
website: pig.apache.org