-
I'm currently testing the PIG 0.9.x branch. Several of my jobs that used to work correctly with PIG 0.8.1 now fail due to a cast error returning a null pointer in one of my UDFs. Apparently, PIG ...
Vincent Barat
Aug 29, 2011 at 3:06 pm
Oct 6, 2011 at 9:36 pm -
I'm analyzing a daily apache log file. I'd like to get the number of requests and of visits by hour. I managed to get the requests, but how do I get the visits? grunt RAW_LOGS = LOAD '<log-file ' ...
David Riccitelli
Aug 19, 2011 at 2:13 pm
Aug 19, 2011 at 4:52 pm -
I think this is similar to the 'merge' join issue not being automatically supported. If we have done a GROUP in the past, this data should have been mapped, then handed off to the reducers and ...
Kevin Burton
Aug 30, 2011 at 8:21 pm
Aug 31, 2011 at 11:00 pm -
This just bit me. I can do: STORE data INTO '/tmp/brokenfs.out'; but fs -ls '/tmp/brokenfs.out'; won't work because it can't be quoted. fs -ls /tmp/brokenfs.out; works though. ... I'm pretty sure ...
Kevin Burton
Aug 24, 2011 at 8:28 pm
Aug 25, 2011 at 6:26 pm -
All, I have some data that I would like to store into a file and then load it in a UDF to do some operations in the next pig statement. For example, doc_ids = FOREACH docs GENERATE doc_id; STORE ...
Eshwaran Vijaya Kumar
Aug 10, 2011 at 4:10 pm
Aug 11, 2011 at 12:51 am -
Hello, I found that pig-0.8.1 includes classes from Hbase-0.90.0. My questions are: 1. If I replace Hbase-0.90.3 with Hbase-0.90.0, could pig-0.8.1 work normally? 2. Why are Hbase class files included in ...
Lulynn_2008
Aug 7, 2011 at 3:58 pm
Jun 12, 2012 at 1:22 am -
I'm optimizing a somewhat large pig job. One of the intermediate steps is a group which we use moving forward. The data right now looks like: 0 {(1),(2),(3),(4)} which has a second column of a bag of ...
Kevin Burton
Aug 20, 2011 at 8:48 am
Aug 23, 2011 at 2:37 am -
Hi Eric, Thanks for your response. Brisk sounds nice, but I feel that disregarding HDFS and totally switching to Cassandra is not the right thing to do. Just my opinion there. I feel we are not using ...
Tharindu Mathew
Aug 30, 2011 at 5:07 pm
Aug 31, 2011 at 5:31 am -
I have a Jython UDF I've written that works fine in local mode but bombs out when I run it on my cluster. I'm running 0.8.0, and my stack trace and environment variables are below. ...
Mark Roddy
Aug 30, 2011 at 12:15 am
Aug 30, 2011 at 6:00 pm -
Hello, I ran the TestDataModel test case with the IBM JDK and got the following error: Testcase: testTupleToString took 0.002 sec FAILED toString expected:<...ad a little ...
Lulynn_2008
Aug 17, 2011 at 2:56 am
Aug 26, 2011 at 3:28 am -
I'm looking at BinStorage which, if I've read correctly, is used for all Pig intermediate files. … so any optimizations here would be transparent to the user. I just did a simple STORE using ...
Kevin Burton
Aug 20, 2011 at 10:44 am
Aug 22, 2011 at 3:15 pm -
OK... I still can't get this to work. I've read the documentation and I still get the same error on 0.9.0 … Here's my code. I think it's implying that I need to have the predecessor as a LOAD and ...
Kevin Burton
Aug 20, 2011 at 9:03 pm
Aug 22, 2011 at 5:18 am -
I was reading about USING 'merge' with JOIN when relations are already sorted. I actually was just looking through some code and realized that one of my JOINs was on two relations that were *already* ...
Kevin Burton
Aug 20, 2011 at 5:52 am
Aug 21, 2011 at 4:59 pm -
Hi, I'm running pig jobs using Amazon pig support, where you submit jobs with comma concatenated parameters like this: elastic-mapreduce --pig-script --args myscript.pig --args ...
Dexin Wang
Aug 17, 2011 at 11:22 pm
Aug 18, 2011 at 7:44 pm -
Hi, I have some metrics stored on a Cassandra supercolumn and the subcolumns are the timestamps of each metric, I'm loading the metrics in pig with this line: all_metrics = LOAD ...
Fabio Souto
Aug 17, 2011 at 3:10 pm
Aug 18, 2011 at 6:40 am -
Despite any amount of finagling I do with the classpath, I can't get pig to connect to my local pseudo-distributed hadoop instance NOR my cluster on EC2. My EC cluster is 20.2 CDH, local ...
Chris Allen
Aug 15, 2011 at 10:38 pm
Aug 17, 2011 at 8:59 pm -
Hi all, I'm curious if it's possible to migrate Apache Pig to other MR runtimes instead of Hadoop? I assume it requires tons of work to do so, right? Thanks, Yuduo
Yuduo Zhou
Aug 29, 2011 at 8:18 pm
Sep 1, 2011 at 1:10 am -
Hey everyone! I've encountered a problem: I need to pass a null or empty -param to pig, but I can't figure out how it could be done. The following does not work: /pig/bin/pig -param rootPath= ...
Marek Miglinski
Aug 22, 2011 at 10:13 am
Aug 30, 2011 at 3:03 pm -
I'm reading the documentation and it says: "*Regular Join Optimizations* Optimization for regular joins ensures that the last table in the join is not brought into memory but streamed through ...
Kevin Burton
Aug 28, 2011 at 7:12 pm
Aug 29, 2011 at 9:43 pm -
My apologies if this is in the docs somewhere, I was unable to find anything, but I might be calling it the wrong name. I'm doing a full outer join in Pig - as such, one or the other join keys may be ...
James Kebinger
Aug 29, 2011 at 6:15 pm
Aug 29, 2011 at 7:18 pm -
Hi, I run PIG jobs from a Java process (using PigServer). Most of which use HBaseStorage to load data from HBase. Each job is run using a new PigServer object, and I correctly call ...
Vincent Barat
Aug 26, 2011 at 12:30 pm
Aug 29, 2011 at 1:43 pm -
Hi, Of the requests I run using PIG 0.8.1, the heaviest one is the following: /* load session data from HBase */ start_sessions = load ... (start of sessions) end_sessions = load ... (end ...
Vincent Barat
Aug 23, 2011 at 4:28 pm
Aug 26, 2011 at 12:11 pm -
Hello, On https://issues.apache.org/jira/browse/PIG-200, one can find the scripts to generate data. But the data seems generic, meaning no relation to the pigmix scripts 1-17 published on ...
Keren Ouaknine
Aug 26, 2011 at 1:48 am
Aug 26, 2011 at 4:33 am -
Hi Folks, I want to delete a file "xyz.tmp" from my hdfs location below: hdfs://MASTER/user/test/xyz.tmp I have embedded the following statement in my pigscript: --a.pig fs -rmr 'xyz.tmp'; Every time ...
Ipshita chatterji
Aug 25, 2011 at 8:31 am
Aug 25, 2011 at 5:03 pm -
Here's an explain I'm trying to grok. The last Load is frustrating because the file isn't descriptive at all. I have to scroll up and find out which file it was from which mapred job. If the file had ...
Kevin Burton
Aug 23, 2011 at 8:19 pm
Aug 23, 2011 at 11:46 pm -
Hi, I am able to compile a pig UDF for pig-0.8.0. It gives me an error when I try compiling against pig-0.8.1. The following is the error message: cannot access ...
SRINIVAS SURASANI
Aug 19, 2011 at 6:33 pm
Aug 23, 2011 at 5:10 pm -
Hello, I'm trying to generate a tuple from a very wide data set, but running into problems. I'm running Pig 0.9.0 r1148983 in local mode. Because the data set is so wide, I'd prefer not to ...
Ggrambo
Aug 16, 2011 at 10:17 pm
Aug 18, 2011 at 6:02 pm -
Hi All, I am trying to perform a join of some hbase tables in pig and I am using HBaseStorage to load the data from hbase into pig. I was able to load my data using HBaseStorage but I have one ...
Gayatri Rao
Aug 15, 2011 at 5:58 pm
Aug 15, 2011 at 6:38 pm -
Hi folks, We have a ~35 GB Hbase table that's split across several hundred regions. I'm using the Pig version bundled with CDH3u1, which is 0.8.1 plus a few patches. In particular, it includes ...
Norbert Burger
Aug 15, 2011 at 4:20 pm
Aug 15, 2011 at 6:14 pm -
org.apache.pig.PigCounters PROACTIVE_SPILL_COUNT_RECS 2,372,598 2,372,598 SPILLABLE_MEMORY_MANAGER_SPILL_COUNT 64 64 PROACTIVE_SPILL_COUNT_BAGS I was checking my jobtracker and I have no idea what ...
Sean Barry
Aug 2, 2011 at 6:43 pm
Aug 3, 2011 at 6:30 pm -
How is UNION implemented? Does it read from two source files or does it create a temporary file by reading the N source files/relations and then writing a new temp file which is then read from? I ...
Kevin Burton
Aug 29, 2011 at 8:32 pm
Aug 29, 2011 at 9:45 pm -
Hi I read on the wiki that further developments will be carried out allowing users to write their UDFs in other languages. I am specifically interested in being able to use R functions in Pig. Also, ...
Asif Jan
Aug 29, 2011 at 2:25 pm
Aug 29, 2011 at 6:36 pm -
The COUNT_STAR thing bites people a lot -- clearly, even the most advanced Pig users mess this up once in a while. It's a really hard bug to track down. We should reconsider our decision to make ...
Dmitriy Ryaboy
Aug 1, 2011 at 7:18 pm
Aug 22, 2011 at 5:09 pm -
Hi, Please see the code snippet below: register pig.jar; register piggybank.jar; o1 = load 'observations.csv' as (obs_id, encounter_id, sub_form_id, observed_by, verified_by, remark); oc1 = load ...
Ipshita chatterji
Aug 19, 2011 at 2:42 pm
Aug 19, 2011 at 6:27 pm -
I have a need within a larger Pig script to pull just a few records from an Hbase table. I know the exact key, so it'd be trivial with a get() from a UDF. Another alternative is to use a custom ...
Norbert Burger
Aug 19, 2011 at 4:17 pm
Aug 19, 2011 at 4:31 pm -
Hi Folks, I am very new to PIG. I am facing problems using the DiffDate function in org.apache.pig.piggybank.evaluation.datetime. How do I pass 2 dates in a tuple format? I get an error. This ...
Ipshita chatterji
Aug 19, 2011 at 4:06 am
Aug 19, 2011 at 6:05 am -
Hey, I wanted to see if the following is possible in pig-0.8.1. a = load '/logs/apache/*/today/access.log.txt' USING PigStorage() AS ('.... tuple') I want to add to the existing tuple a chararray ...
Sridhar basam
Aug 18, 2011 at 7:19 pm
Aug 18, 2011 at 7:53 pm -
Hello, In the pig-0.8.1/src directory, I did not find any java file importing javax.servlet.jsp... So my question is: why is jsp-api-2.1-6.1.14 included in pig-0.8.1-SNAPSHOT.jar? Can I replace it with ...
Lulynn_2008
Aug 12, 2011 at 7:03 am
Aug 17, 2011 at 6:44 pm -
Hi group, Can I use a nested group in foreach? For example: A = load data ... as (a1:..., a2:..., a3:..., ...) B = group A by a1; C = foreach B { * inner_group = group A by a2;* generate group, ...
唐亮
Aug 16, 2011 at 11:12 pm
Aug 17, 2011 at 6:20 pm -
Hi dear pigs, I got a problem: When I use UNION command to combine some results in one relation at the end of pig script, it sometimes will miss some results from UNION. For example: union_all_res = ...
唐亮
Aug 17, 2011 at 7:15 am
Aug 17, 2011 at 5:54 pm -
Hi all, first, sorry if this has been asked before, but I could not find any reference in the list archives. I have tried to run the PigUnit example (top_queries.pig) provided on ...
Sotiris Matzanas
Aug 11, 2011 at 8:53 am
Aug 16, 2011 at 7:14 am -
I am trying to use the PIG SUM function to sum a group of integers created by a UDF and I am getting Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer A = ...
Rob parker
Aug 11, 2011 at 3:16 pm
Aug 11, 2011 at 6:27 pm -
Hello, I found that pig-0.8.1 includes junit-4.5 class files. Could you please give me some suggestions on my questions: Can pig-0.8.1 work with junit 4.3.1, 4.8.1, or 4.8.2? Why included classes in ...
Lulynn_2008
Aug 7, 2011 at 3:53 pm
Aug 9, 2011 at 10:16 am -
Hi, I have been struck with this exception: java.io.IOException: config() at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211) at ...
Jagaran das
Aug 6, 2011 at 3:23 am
Aug 6, 2011 at 5:00 am -
I have been very excited to give Pig 0.9 a try and run it against our Cloudera CDH3U0 hadoop cluster and I need to point Pig to the cloudera hadoop libraries to make it work. I tried re-building pig ...
Andy Sautins
Aug 4, 2011 at 7:58 pm
Aug 4, 2011 at 11:20 pm -
Fang Fang FF Chen
Aug 1, 2011 at 3:08 pm
Aug 1, 2011 at 3:18 pm -
Hi pigs: Can I distinct by multiple columns? For example: A = load ... as (a1:int, a2:int, a3:int); B = DISTINCT A; -- It's OK. -- But can I distinct by a1 and a2? C = DISTINCT A.a1, A.a2; -- It's ...
唐亮
Aug 31, 2011 at 6:32 am
Aug 31, 2011 at 4:07 pm -
I'm trying to spend more time understanding EXPLAIN so I can see what optimizations pig is doing under the hood. I was actually trying to answer my own question using EXPLAIN and avoid sending a ...
Kevin Burton
Aug 30, 2011 at 8:03 pm
Aug 30, 2011 at 9:34 pm -
Hello Thejas Nair, While running the ant -Dtestcase=TestMergeJoinOuter test with a JDK other than the SUN JDK, I found my output is different from the one under the SUN JDK: SUN JDK: passed Other JDK: failed ...
Lulynn_2008
Aug 23, 2011 at 8:45 am
Aug 25, 2011 at 12:58 am -
I seem to have a need for pre compiler directives in pig. These aren't part of the compiled map reduce job…. For example: if file_exists( "/foo" ): run prepare.pig run execute.pig …. the prepare.pig ...
Kevin Burton
Aug 24, 2011 at 7:28 am
Aug 24, 2011 at 5:07 pm
Group Overview
group | user |
categories | pig, hadoop |
discussions | 71 |
posts | 271 |
users | 58 |
website | pig.apache.org |
58 users for August 2011