Anyone have an implementation or ideas towards a StoreFunc for JDBC or MySQL? It looks like I would need to spawn a thread to read the InputStream and reparse the tuples. Overall, it sounds a little ...
Anthony Urso
Feb 7, 2010 at 12:24 pm
Feb 8, 2010 at 8:48 pm -
Hello, I seem to have broken my Pig install, and I don't know where to look. If I use the script directly (grunt), everything works OK, but every time I try to run a Pig script: 'java -cp ...
Alex Parvulescu
Feb 12, 2010 at 2:51 pm
Feb 24, 2010 at 8:10 pm -
Hello, I ran into an NPE today, which seems to be my fault, but I'm wondering if there is anything that could be done to make the error clearer. What I did is: 'C = FOREACH B GENERATE group, ...
Alex Parvulescu
Feb 9, 2010 at 10:22 am
Feb 11, 2010 at 11:13 am -
Is there a way to reuse Pig scripts (like def in Python, or function calls, etc.) from inside a calling Pig script? I have a set of basic Pig scripts which I would like to call from a high-level ...
Prasenjit mukherjee
Feb 10, 2010 at 2:49 am
Feb 11, 2010 at 3:07 pm -
Guys, I know this must be a common use case, but how do you explode and implode in pig? so, I have a file like this... 1, asdf 2, qewrty 3, zcxvb and I want to apply an explode operation to it: 1, a ...
Hc busy
Feb 19, 2010 at 6:21 pm
Feb 22, 2010 at 10:41 pm -
Is there any way I can have a Pig statement wait for a condition? This is what I am trying to do: I am first creating and storing a relation in Pig, and then I want to upload that relation via ...
Prasenjit mukherjee
Feb 11, 2010 at 5:12 pm
Feb 16, 2010 at 11:05 am -
Hi All, We have a use-case where we want to automatically register certain jars for command-line users. I tried using jar, but this switch seems to do absolutely nothing. How do we go about ...
Chris Riccomini
Feb 5, 2010 at 7:27 am
Feb 5, 2010 at 10:15 pm -
For some reason, I am unable to filter inside my nested foreach. The basic outline of my script is as follows: 1. Load input 1. 2. Load input 2. 3. Join input1 by key1, input2 by key2; 4. foreach ...
Zaki rahaman
Feb 25, 2010 at 7:27 pm
Mar 1, 2010 at 8:50 am -
Hi, Hope this gets to the right list... I'm fairly new to Pig, been playing around with it for a couple of days. Essentially I'm doing a bit of work to evaluate Pig and its ability to simplify the ...
Guy Jeffery
Feb 2, 2010 at 2:24 pm
Feb 3, 2010 at 5:28 pm -
Hi, I have this script; how do I achieve the results? A = LOAD 'file:///home/hadoop/1.csv' USING PigStorage(',') AS ...
Jumping
Feb 25, 2010 at 6:34 am
Feb 25, 2010 at 8:09 am -
Hi, I am running a pig script to process some webapp logs, and got this java heap error. The task logs look like: ... 2010-02-24 10:54:52,147 INFO org.apache.hadoop.mapred.ReduceTask: Read 2186251 ...
Xiaomeng Wan
Feb 24, 2010 at 6:28 pm
Feb 24, 2010 at 7:37 pm -
Excuse me, I may have missed an important part of the Pig documentation and be asking a trivial question here :) What is the best way to find out the total number of tuples (rows) in the bag of data loaded? For ...
Jiang licht
Feb 23, 2010 at 11:55 pm
Feb 24, 2010 at 6:39 am -
Hey, First off, @Ankur, great work so far on the patch. This probably is not an efficient way of doing mass dumps to DB (but why would you want to do that anyway when you have HDFS?), but it hits the ...
Zaki rahaman
Feb 18, 2010 at 5:38 pm
Feb 19, 2010 at 11:00 am -
Just wondering if I can use the DEFINE command to write my custom mapper/reducer functions. The mapper (I believe) I can, but I'm not sure about the reducer. I guess this depends on how the define commands ...
Prasenjit mukherjee
Feb 18, 2010 at 8:48 am
Feb 18, 2010 at 10:56 am -
Any thoughts on this problem? I am using a DEFINE command (in Pig) and hence the actions are not idempotent. As a result, duplicate execution does have an effect on my results. Any way to ...
Prasenjit mukherjee
Feb 10, 2010 at 2:53 am
Feb 10, 2010 at 4:50 pm -
Hi, I have a small doubt about how Pig handles queries containing a join of more than 2 tables. Suppose we have 3 tables A, B, C, and the plan is "((AB)C)". We can join A, B in a map reduce job and ...
Bharath v
Feb 3, 2010 at 6:53 am
Feb 4, 2010 at 4:15 am -
Hi, I recently found Pig, really like it and want to use it for one of our actual projects. Getting the basics running was easy, but now I am struggling with a problem. I am trying to get customers ...
Jan Zimmek
Feb 25, 2010 at 6:18 pm
Mar 1, 2010 at 8:22 am -
Any thoughts on including python-based UDFs like the following : http://arnab.org/blog/baconsnake-inlined-python-udfs-pig This will be a big help indeed. -Thanks, Prasen
Prasenjit mukherjee
Feb 26, 2010 at 3:14 am
Mar 1, 2010 at 8:09 am -
gzip
I've been up and down the docs, and I see people using gzipped files. But when I try to load them, I get garbage. Basically it loads it as raw data from the local file system. test = LOAD ...
Cory Radcliff
Feb 27, 2010 at 4:17 am
Feb 27, 2010 at 6:56 pm -
Hi, I'm facing what seem to be re-entrance errors when using PIG through the Java API. I know that the PigServer object is not reentrant, so I instantiate several PigServers and run them in separated ...
Vincent Barat
Feb 25, 2010 at 3:04 pm
Feb 25, 2010 at 6:19 pm -
Generally the stderr goes to the file <hadoop /logs/userlog/attempt_XXXX_XXXX_N/stderr in the hadoop node running that script. But it is not practical as it requires user to go and search all the ...
Prasenjit mukherjee
Feb 23, 2010 at 11:52 am
Feb 23, 2010 at 4:32 pm -
I had a Pig script which reads a folder of ".gz" files and performs some operation on the data. However, here's a problem. The folder contains some corrupted gz files and this causes the hadoop job ...
Jiang licht
Feb 21, 2010 at 6:47 am
Feb 22, 2010 at 8:45 am -
Could someone please point out the mistake in this UDF? package UDF; import java.io.IOException; import java.util.Map; import org.apache.pig.EvalFunc; import org.apache.pig.data.Tuple; import ...
Kelvin Moss
Feb 16, 2010 at 6:39 pm
Feb 17, 2010 at 12:16 am -
Hello, I am trying to FILTER and then ORDER an inner bag, for example: A = LOAD ...blah... AS (first, last, age, kids:bag{kid:tuple(name, age)}); B = FOREACH A { filteredkids = FILTER kids BY age != ...
Rusty Klophaus
Feb 7, 2010 at 5:02 pm
Feb 9, 2010 at 5:03 pm -
I'm having a problem getting the SequenceFileLoader, from the Piggybank, to read sequence files whose values are block compressed (gzip'd). I'm using Pig 0.4.99.0+10, and Hadoop hadoop-0.20.1+152, via ...
Derek Brown
Feb 19, 2010 at 10:45 pm
Feb 22, 2010 at 3:09 pm -
Hi, I have a (hopefully) small request regarding JIRA. I quite like the Road Map feature[1] but unfortunately it doesn't work correctly for Pig as all versions (except 0.0.0) are set to ...
Lars Francke
Feb 11, 2010 at 3:22 am
Feb 11, 2010 at 10:41 pm -
Hello, I have a problem starting the grunt shell. I think this affects the 0.6 branch and forward. This is the error I get when I try to start the shell or when I try to run any script: ...
Alex Parvulescu
Feb 9, 2010 at 9:55 am
Feb 11, 2010 at 8:39 am -
OK, this might sound a little weird. My schema is f1, f2, f3, f4, f5, f6 when grouping by f1, f2, f3. I need to drop exactly one tuple when I have more than one tuple after grouping by f1, f2, f3. Also the ...
Felix gao
Feb 8, 2010 at 7:44 pm
Feb 9, 2010 at 12:49 pm -
The Seattle Hadoop/Scalability/NoSQL (yeah, we vary the title) meetup is tonight! We're going to have a guest speaker from MongoDB :) As always, it's at the University of Washington, Allen Computer ...
Bradford Stephens
Feb 24, 2010 at 10:16 pm
Feb 25, 2010 at 8:01 am -
well... I have this data: [key#'1', b#'2', c#'3', key2#5] [key#'2', b#'i', c#'m', key2#6] [key#'3', b#'j', c#'n', key2#7] [key#'4', b#'k', c#'o', key2#8] and I run A= load 'simple_map.data' as ...
Hc busy
Feb 24, 2010 at 8:15 pm
Feb 24, 2010 at 9:59 pm -
Hi there I am using the following version of pig: ~/workspace$ pig-test --version Apache Pig version 0.5.0 (r829623) compiled Oct 25 2009, 18:58:38 I expect the following simple script to reduce the ...
Adil Aijaz
Feb 23, 2010 at 11:52 pm
Feb 24, 2010 at 8:19 pm -
Hi, I would like to welcome Thejas Nair as our newest Pig committer. Thejas has been contributing to Pig for over a year now. He is the main contributor to the Pig SQL effort. He also has done ...
Olga Natkovich
Feb 23, 2010 at 6:19 pm
Feb 24, 2010 at 12:59 am -
I'm writing my own LoadFunc which takes parameters. I'm finding the only valid parameter type is String. I can't seem to pass an int. Are the parameter types for LoadFunc restricted to strings? I'm ...
Robert Goodman
Feb 19, 2010 at 11:58 pm
Feb 20, 2010 at 12:33 am -
Greetings, It's time for another awesome Seattle Hadoop/Lucene/Scalability/NoSQL Meetup! As always, it's at the University of Washington, Allen Computer Science building, Room 303 at 6:45pm. You can ...
Bradford Stephens
Feb 17, 2010 at 2:10 am
Feb 19, 2010 at 8:24 pm -
I have a range of values that can have an associated gender like 'm', 'f'. I want to include all distinct values that have the same gender across all records. Like if the records are - abc f abc m ...
Kelvin Moss
Feb 11, 2010 at 10:25 pm
Feb 12, 2010 at 12:39 am -
Hi Folks This note is to let you know that we'll be kicking off the inaugural Austin Hadoop User Group on March the 18th. At present, we have speakers lined up from IBM and Rackspace and will cover ...
Stephen Watt
Feb 1, 2010 at 9:44 pm
Feb 2, 2010 at 9:59 pm -
Hi, I would like to welcome Dmitriy Ryaboy as yet another committer to the Pig project! Dmitriy has been contributing consistently to Pig for the last eight months. He has been very active on the lists ...
Olga Natkovich
Feb 23, 2010 at 6:57 pm
Feb 23, 2010 at 6:57 pm -
I'm fairly new to Pig and am having a problem with a pig script that works fine in local mode, but fails in Hadoop mode. I'm using Cloudera CDH2, which includes Pig 0.5.0 and Hadoop 0.20.1. The line ...
Jon Armstrong
Feb 22, 2010 at 4:20 am
Feb 22, 2010 at 4:20 am -
The merge from load-store-redesign branch to trunk is now completed. New commits can now proceed on trunk. The load-store-redesign branch is deprecated with this merge and no more commits should be ...
Pradeep Kamath
Feb 19, 2010 at 8:07 pm
Feb 19, 2010 at 8:07 pm -
Hi, I will begin this activity now - a request to all committers to not commit to trunk or load-store-redesign till I send an all clear message - I am anticipating this will hopefully be completed by ...
Pradeep Kamath
Feb 18, 2010 at 7:21 pm
Feb 18, 2010 at 7:21 pm -
Hi, We would like to merge the load-store-redesign branch to trunk tentatively on Thursday. To do this, I would like to request all committers to not commit anything to load-store-redesign branch or ...
Pradeep Kamath
Feb 16, 2010 at 7:34 pm
Feb 16, 2010 at 7:34 pm -
Hey guys! As some of you know from my blog (and occasional posts here), Drawn to Scale has been building a complete end-to-end platform that makes dealing with data easy and scalable. You can Process, ...
Bradford Stephens
Feb 15, 2010 at 10:44 pm
Feb 15, 2010 at 10:44 pm -
Hadoop Fans, we have scheduled additional developer sessions in both the bay area and NYC. Also, due to popular demand, we'll be offering a public sysadmin training session immediately following our ...
Christophe Bisciglia
Feb 11, 2010 at 2:01 am
Feb 11, 2010 at 2:01 am -
Hi I am wondering if there is a way to make Pig stop writing zero byte files to output. Thanks Swaminathan
P Swaminathan
Feb 10, 2010 at 4:49 pm
Feb 10, 2010 at 4:49 pm -
Hi everybody I have lots of logs in LZMA format. By the API documentation I haven't seen any Storage class that handles compressed files, does anyone know of an LZMA implementation? What would I need ...
Gustavo Enrique Salazar Torres
Feb 3, 2010 at 6:07 pm
Feb 3, 2010 at 6:07 pm
Group Overview
group | user |
categories | pig, hadoop |
discussions | 45 |
posts | 192 |
users | 49 |
website | pig.apache.org |
49 users for February 2010