Grokbase Groups Pig user May 2008

19 discussions - 79 posts

  • I upgraded to hadoop 17 and the latest Pig from svn. I'm now getting a ton of lines in my log files that say: 2008-05-23 00:49:27,832 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory ...
    Tanton Gibbs
    May 23, 2008 at 5:52 am
    May 30, 2008 at 4:32 pm
  • I think one common case for PIG query optimization, at least for batch mode, is to compile multiple (CO)GROUPS into a minimum number of Map-Reduce jobs so as to avoid scanning large datasets repeatedly, ...
    Goel, Ankur
    May 21, 2008 at 10:56 am
    May 21, 2008 at 7:55 pm
  • One of the things we had discussed at the Hadoop summit was to set up monthly user group meetings to discuss topics of interest to the hadoop community. We have scheduled the first of these meetings ...
    Ajay Anand
    May 6, 2008 at 4:55 pm
    May 22, 2008 at 12:22 am
  • I need to use the equivalents of SQL functions FLOOR, CEIL, etc. in a few PIG queries, and I'm looking for a complete list of PIG-supported functions. Is there any such documentation? Else, does anyone ...
    Prashanth Pappu
    May 12, 2008 at 8:29 pm
    May 15, 2008 at 10:17 pm
  • How do I get pig to process a file that is already loaded on the hadoop file system? Right now, from GRUNT, I can do an ls, but it shows the local file system. I've also tried A = load 'myfile' ...
    Tanton Gibbs
    May 21, 2008 at 9:49 pm
    May 21, 2008 at 11:18 pm
  • Hi all, I read "pig_hadoopsummit.pdf" and tried it. I made a 320MB file (visit) in dir1 and a 20MB file (page) in dir2, and ran this script. Visits = load '/dir1/visit' as (user, url, time); Visits= ...
    Hyoung jun kim
    May 7, 2008 at 7:58 am
    May 7, 2008 at 11:56 pm
  • All: I've seen a thread with a similar issue and it was left unresolved. So, here it goes again - (a) I'm trying to get PIG to connect to a HADOOP cluster and execute a script. (a.1) The ...
    Prashanth Pappu
    May 23, 2008 at 9:53 pm
    May 23, 2008 at 10:32 pm
  • Is there a way to see how Pig has divided up my script? Basically, I have a script that runs in two map/reduce steps. I'd like to know what happens in each of those steps. I don't mind digging into ...
    Tanton Gibbs
    May 23, 2008 at 2:42 am
    May 23, 2008 at 4:19 am
  • Hi folks, I am new to PIG and have a little Hadoop Map-Reduce experience. I recently had a chance to use PIG for a data analysis task for which I had written a Map-Reduce program earlier. A few ...
    Goel, Ankur
    May 19, 2008 at 8:24 am
    May 21, 2008 at 7:41 am
  • Join did it on purpose or Is this my fault? -- B. Regards, Edward J. Yoon,
    Edward J. Yoon
    May 8, 2008 at 3:09 am
    May 8, 2008 at 8:00 pm
  • I have a processing task that will span multiple map/reduce stages. I think pig will help in simplifying the creation of the task. I was wondering if anyone has written anything to make it possible ...
    Manish Shah
    May 29, 2008 at 7:53 pm
    May 30, 2008 at 6:53 am
  • Shouldn't applying SPLIT on an empty bag return empty bags? PIG/GRUNT throws an exception in such cases. ...
    Prashanth Pappu
    May 29, 2008 at 9:05 pm
    May 29, 2008 at 11:13 pm
  • James: forwarding your mail to the pig-user mailing list (probably a good idea for you and/or your students to subscribe). Regarding Hadoop and "hadoop-on-demand", I do not know the answer but will ...
    Chris Olston
    May 20, 2008 at 7:55 pm
    May 20, 2008 at 8:02 pm
  • Hi, is there a way (besides custom functions) to deal with void tuples in a relation, or with non-numeric atoms (like an empty atom) in a numeric relation? Thank you, Cosmin
    Cosmin Lehene
    May 16, 2008 at 9:10 am
    May 16, 2008 at 11:50 am
  • Is it possible to configure things like * *using a pig script, or am I bound to the values defined in hadoop-site.xml? I would prefer to run some of my scripts with less ...
    May 14, 2008 at 4:24 pm
    May 14, 2008 at 11:29 pm
  • Is there any support for the use of these at present?
    Jason Venner
    May 14, 2008 at 5:12 pm
    May 14, 2008 at 11:27 pm
  • Hi, I just updated with performance numbers for streaming. Olga
    Olga Natkovich
    May 30, 2008 at 9:29 pm
    May 30, 2008 at 9:29 pm
  • There seems to be a problem with SPLIT when the conditions include 'IsEmpty'. Here's an example - grunt> a = load '/test' using PigStorage(' ') as (x,y,z); grunt> dump a; (1, 2, 3) (2, 3, 4) (3, 4, 5) ...
    Prashanth Pappu
    May 30, 2008 at 6:44 pm
    May 30, 2008 at 6:44 pm
  • We have just started looking at Pig, and on a first search don't see how to handle loading/storing data to hadoop sequence files. In addition, many of our datasets have a local object as the value. I ...
    Jason Venner
    May 14, 2008 at 3:07 pm
    May 14, 2008 at 3:07 pm
Group Overview
group: user @
categories: pig, hadoop

24 users for May 2008

Pi song: 13 posts, Tanton Gibbs: 10 posts, Cosmin Lehene: 7 posts, Olga Natkovich: 7 posts, Prashanth Pappu: 7 posts, Ajay Anand: 5 posts, Alan Gates: 3 posts, Chris Olston: 3 posts, Iván de Prado: 3 posts, Arun C Murthy: 2 posts, Edward J. Yoon: 2 posts, Goel, Ankur: 2 posts, Hyoung jun kim: 2 posts, Jason Venner: 2 posts, Vitthal Gogate: 2 posts, Amir Youssefi: 1 post, Casper Rasmussen: 1 post, Derek Springer: 1 post, Iván de Prado: 1 post, Manish Shah: 1 post