-
Hey folks, I've been trying HBaseStorage 0.8.0 trunk with hbase-0.89 and it does not seem to work. It gets stuck at: [...] 2010-10-13 14:58:44,064 [Thread-4] INFO org.apache.zookeeper.ClientCnxn - ...
George Stathis
Oct 13, 2010 at 8:09 pm
Oct 14, 2010 at 5:15 pm -
Hi All, I'm quite new to Pig/Hadoop, so maybe my cluster size will make you laugh. I wrote a Pig script that handles 1.5GB of logs in less than one hour in Pig local mode on an Intel Core 2 Duo with ...
Vincent
Oct 8, 2010 at 6:53 am
Oct 8, 2010 at 6:31 pm -
Hi all! I am struggling to find a working solution to load data from HBase directly. I am using Cloudera CDH3b3 which comes with Pig 0.7. What would be the easiest way to load data from HBase? If it ...
Anze
Oct 25, 2010 at 2:01 pm
Oct 29, 2010 at 3:46 pm -
I propose that we adopt the bylaws proposed at http://wiki.apache.org/pig/ProposedByLaws as the bylaws for the Pig project. In a self referential use of these bylaws I further propose that this vote ...
Alan Gates
Oct 7, 2010 at 4:23 pm
Oct 15, 2010 at 8:23 pm -
All, I have a solution for writing unit tests in Java to test Pig scripts, including stats and output, if anyone is interested.
Dave Wellman
Oct 20, 2010 at 4:19 pm
Oct 20, 2010 at 11:32 pm -
My Pig script is roughly like this: A = LOAD input1 USING JsonLoader AS (x:map[]); B = LOAD input2 USING JsonLoader AS (x:map[]); A = FOREACH A GENERATE x, x#'item' AS item:chararray; B = ...
Rakesh kothari
Oct 21, 2010 at 6:30 am
Oct 25, 2010 at 4:10 pm -
Hi, If I have bags that have a dynamic number of fields that look something like this: ("park", "building", "office") ("store", "school") ("building", "school", "restaurant", "hotel") Is it possible ...
Kim Vogt
Oct 14, 2010 at 5:30 pm
Oct 15, 2010 at 6:05 pm -
Hi again! :) I am trying to run Pig on a local machine, but I want it to connect to a remote cluster. I can't make it use my settings - whatever I do, I get this: ----- $ pig -x mapreduce 10/10/16 ...
Anze
Oct 16, 2010 at 8:53 pm
Oct 25, 2010 at 5:57 am -
Hi, I'm currently working on a simple Cassandra Loader that reads an index and then works on that data. Now whenever I try to work on the retrieved data I get a strange error: java.io.IOException: ...
Christian Decker
Oct 12, 2010 at 7:38 pm
Oct 15, 2010 at 10:57 am -
Hey guys - I have a script that loads a list of ~800,000 category hierarchies, filters them a bit and streams them through a PHP script for some quick procedural work. The file contains one column ...
Rob Wilkerson
Oct 1, 2010 at 11:33 am
Oct 1, 2010 at 4:57 pm -
Hi Pig Users, I am currently writing a UDF loader. In one of my use cases, one line in the input stream results in multiple tuples. Has anyone encountered or solved this issue on their end? The current ...
John Hui
Oct 27, 2010 at 10:39 pm
Oct 28, 2010 at 3:52 pm -
Hello, I face an issue with PIG temporary files: they are not deleted once a job is terminated. I got my HDFS storage full of PIG temporary files. I use PIG from Java using a PigServer object. Is ...
Vincent Barat
Oct 23, 2010 at 11:30 am
Nov 29, 2010 at 9:59 pm -
What's the best way to do something like this in PIG: JOIN A with B where (A.property1 = B.property1 OR A.property2 = B.property2) ? Thanks, -Rakesh
Rakesh kothari
Oct 18, 2010 at 12:03 am
Oct 27, 2010 at 9:20 pm -
Hello, I want to set more heap space to my scripts, but I can't make Pig support this, when I call: "pig -Dmapred.child.java.opts=-Xmx2048" It fails (just prints help), and option --help doesn't show ...
Wojciech Langiewicz
Oct 22, 2010 at 4:17 pm
Oct 25, 2010 at 4:09 pm -
Hi, I have a pig script that needs certain parameters (passed using "-p" in pig shell) to execute. Is there a way to pass these parameters if I want to execute this script using "PigServer" after ...
Rakesh kothari
Oct 7, 2010 at 6:47 pm
Oct 8, 2010 at 5:05 pm -
Hi there, I have some doubts about zebra usage. The thing is that all my data is already in HDFS, and want to use the zebra storers and loaders, but I don't want to reprocess all my data just to get ...
Renato Marroquín Mogrovejo
Oct 24, 2010 at 8:15 pm
Oct 28, 2010 at 3:51 am -
Hi, What's the best way to diagnose which M/R step Pig is executing? I was hoping the name of the Pig job could have some relationship with the operator it is executing. It gets difficult to diagnose ...
Rakesh kothari
Oct 27, 2010 at 9:25 pm
Oct 28, 2010 at 12:41 am -
I've seen a few threads about counters, PigStats, Elephant-Bird's stats utility class, etc. http://www.mail-archive.com/pig-user@hadoop.apache.org/msg00900.html ...
Josh Devins
Oct 17, 2010 at 4:15 pm
Oct 18, 2010 at 6:15 pm -
Hi, a couple of questions: 1. What's the best way to get "Counters" out of a Pig job execution (e.g. the Counters object exposed by "org.apache.hadoop.mapreduce.Job")? 2. Is there anything special needs ...
Rakesh kothari
Oct 7, 2010 at 11:48 pm
Oct 8, 2010 at 8:26 pm -
Hi all! I'm trying to create a replacement for Pig shell - for running Pig batches from within our control program (dashboard). I'm having problems with calling Pig and would appreciate some help. I ...
Anze
Oct 16, 2010 at 5:58 pm
Oct 16, 2010 at 7:08 pm -
Hi, I came across this patch (https://issues.apache.org/jira/browse/PIG-1518) which supports multifile input format from Pig 0.8 onwards. A patch is also available for Pig 0.7. I was ...
Uppuluri, Rohini
Oct 29, 2010 at 3:46 pm
Oct 30, 2010 at 5:28 am -
Hi! I hope this is not too newbie a question, but it's driving me crazy... How do you count the records in a relation? Like DUMP, but instead of a list of records, I would like their count. Thanks, Anze
Anze
Oct 29, 2010 at 11:01 am
Oct 29, 2010 at 12:44 pm -
Hi all, I'm facing a weird problem and wondering if anyone has run into this before. I've been playing with PigServer to programmatically run some simple Pig scripts and it does not seem to be connecting ...
Zach Bailey
Oct 27, 2010 at 11:26 pm
Oct 28, 2010 at 5:53 am -
Hello, I am having an error that is driving me crazy. Any help will be appreciated. First, I have configured hadoop and hdfs according to this tutorial (I did not create a hadoop account, used mine ...
Ruth Garcia
Oct 25, 2010 at 3:53 pm
Oct 25, 2010 at 4:17 pm -
Hi, I've hit a headache using UDFs with many dependencies: adding them with the register command is very painful and not extensible. I can make a self-contained jar for a hadoop job using maven (a jar with ...
Yong-gang Cao
Oct 22, 2010 at 12:09 am
Oct 22, 2010 at 3:28 pm -
Hi everybody, I'm trying to use vanilla Pig 0.7.0 to generate monthly consolidations of log files with relatively long lines: 95 fields and growing, of which I'll be using just 7. Just so I didn't ...
Marcos Medrado Rubinelli
Oct 20, 2010 at 2:28 pm
Oct 22, 2010 at 2:30 am -
Hi, Our data contain tuples one of whose fields is a tuple containing a bag field and we've seen the following exceptions when we access the bag field: java.lang.ClassCastException: ...
Lin Guo
Oct 12, 2010 at 10:38 pm
Oct 14, 2010 at 10:54 pm -
Hey everyone! The Pig Journal page (http://wiki.apache.org/pig/PigJournal) says something about getting statistics for Pig's optimizer. Is there any work being done on that? Or are there any other ...
Renato Marroquín Mogrovejo
Oct 14, 2010 at 2:47 am
Oct 14, 2010 at 6:32 pm -
I'm trying to count N-gram occurrences as a percentage of total tuples, and I'm running into a problem that I assume has a simple solution I'm not thinking of. My script basically looks like: log = ...
Mark Stetzer
Oct 8, 2010 at 7:47 pm
Oct 11, 2010 at 7:04 pm -
I have a simple schema that contains an inner bag. What I need to essentially do is that for each tuple in the inner bag, I need to create a new tuple in a new outer bag. This is easier shown than ...
Josh Devins
Oct 8, 2010 at 4:13 pm
Oct 8, 2010 at 9:00 pm -
Hi, I need to integrate existing Pig scripts into a Java application. The scripts have so far been run from the command line. I'm wondering how to do this? I just want to run a Pig file (*.pig) from Java code. Any advice ...
김영우
Oct 7, 2010 at 1:51 am
Oct 7, 2010 at 7:41 am -
I have seen a Pig error reported in a .processed file. I have not been able to find the documentation about what a .processed file is. Is it akin to a .substituted file?
Dave Wellman
Oct 28, 2010 at 9:06 pm
Oct 28, 2010 at 9:43 pm -
Hi All, Is there a need/mechanism to report progress in a storage UDF? Thanks in advance. - Sandesh
Sandesh Devaraju
Oct 27, 2010 at 4:41 pm
Oct 28, 2010 at 4:25 pm -
Hi, I have this pig script. 1 data = LOAD '$INPUT' USING PigStorage(',') AS (app:chararray, user:chararray , timestamp:int, duration:int); 2 3 appUserIn = FOREACH data GENERATE app, user; 5 ...
John Hui
Oct 20, 2010 at 8:13 pm
Oct 20, 2010 at 9:32 pm -
I need to use the output of one alias in a future calculation: Suppose I have: C=(5) and then later, I have G=(1,A) (3,B) (5,C) then I want to do a foreach on G where I multiply each G.$0 by C.$0, ...
Matt Tanquary
Oct 20, 2010 at 7:41 pm
Oct 20, 2010 at 7:56 pm -
Hi, Does anyone know if Bash shell works with Pig streaming the same way as Python? I've been struggling with it without success. Here is the bash code (filter.sh) #!/usr/bin/env bash command ...
Alex Wang
Oct 18, 2010 at 10:22 pm
Oct 18, 2010 at 10:34 pm -
$ cat a.out [key1#val1,key2#val2]*[key3#[val31,val32]] grunt> a = load 'a.out' using PigStorage('*') AS (A:[], B:[]); grunt> dump a; Here is the output: ([key2#val2,key1#val1],) I guess it is not ...
Prasenjit mukherjee
Oct 13, 2010 at 6:06 am
Oct 14, 2010 at 10:39 pm -
What would be the right syntax to stream through a python script? This does not seem to work, as pig complains about the syntax: DEFINE force_layout `ForceDirected.py` SHIP ...
Sal Uryasev
Oct 12, 2010 at 2:04 am
Oct 12, 2010 at 10:04 pm -
I have a python script defined as import sys for line in sys.stdin: if not line: break sys.stdout.write(line) my data test looks like ...
Felix gao
Oct 7, 2010 at 1:09 am
Oct 8, 2010 at 7:11 pm -
Has anyone ever read a Pig output file with bags/tuples into a Java MapReduce program?
Corbin Hoenes
Oct 7, 2010 at 7:04 pm
Oct 8, 2010 at 1:30 am -
Assume that I would like to write this pig script: REGISTER myudfs.jar; A = LOAD 'hist_data' AS (id: chararray, word: chararray, count : float ); B = GROUP A BY id C = CROSS B, B D = FOREACH C ...
Paolo D'alberto
Oct 22, 2010 at 9:49 pm
Oct 22, 2010 at 9:49 pm -
Hi, It is my pleasure to welcome Corinne Chandel as our newest Pig committer. Corinne has been responsible for documentation for all Pig releases 0.3.0 and later. We are very happy to have her on ...
Olga Natkovich
Oct 22, 2010 at 6:56 pm
Oct 22, 2010 at 6:56 pm -
Hi, I haven't been able to stream pig data to a command line script, can someone help out? I want to execute a command line script called GMTFilter (all stdin, stdout, and stderr work) from a pig ...
Alex Wang
Oct 20, 2010 at 7:14 pm
Oct 20, 2010 at 7:14 pm -
Wrong list...
Anthony Urso
Oct 15, 2010 at 1:13 am
Oct 15, 2010 at 1:13 am -
Anyone have any pointers on how to test against ZK outside of the source distribution? All the fun classes (e.g. ClientBase) do not make it into the ZK release jar. Right now I am manually running a ...
Anthony Urso
Oct 15, 2010 at 1:12 am
Oct 15, 2010 at 1:12 am -
I'm using Pig 0.6.0 and a fix for bug PIG-619 is causing a performance issue with some of my Jobs. In Pig 0.3.0 a fix was added to create an empty slice for any file with a zero file length. In some ...
Robert Goodman
Oct 12, 2010 at 12:39 am
Oct 12, 2010 at 12:39 am
Group Overview
group | user |
categories | pig, hadoop |
discussions | 46 |
posts | 251 |
users | 67 |
website | pig.apache.org |
67 users for October 2010