Grokbase Groups Pig user

3,031 discussions - 12,338 posts

  • Hi, I have a very weird issue with my PIG script. The following is the content of my script: REGISTER /home/hadoopuser/Workspace/lib/piggybank.jar REGISTER /home/hadoopuser/Workspace/lib/datafu.jar; ...
    Praveen Bysani
    May 14, 2013 at 3:10 am
    May 14, 2013 at 10:29 pm
  • I want to do something like this: B = FOREACH A GENERATE a1, if a2 = 0: a2=a2+1 else a2, a3). How do I do "if a2 = 0: a2=a2+1 else a2" in PIG (or it could be "if a2 matches < some regex : a2+0 else a2") ...
    Ashish Gupta
    May 14, 2013 at 4:11 pm
    May 14, 2013 at 4:20 pm
  • Hi, I have a dataset with three columns: group_id, position, and name. For each group, I need to generate a concatenated string of all names ordered by their position. I can do this by sorting all ...
    Ahmed Eldawy
    May 13, 2013 at 5:35 pm
    May 13, 2013 at 7:49 pm
  • Hello, What is the best way to get the count of records in an HDFS file generated by a PIG script? Thanks
    Mix Nin
    May 13, 2013 at 5:52 pm
    May 13, 2013 at 5:52 pm
  • Hi, I am using PigTest in order to verify a script reading and storing data in avro format. However, at the moment, the script fails due to the optimisation rule ColumnMapKeyPrune. I know I can ...
    Bertrand Dechoux
    May 13, 2013 at 9:25 am
    May 13, 2013 at 5:50 pm
  • Hey all, One of my scripts is giving the below error. The script works fine when I run it in Grunt but I get the "Error to read counters into Rank operation counterSize 0". ?? I see this ...
    John Meek
    May 13, 2013 at 5:48 pm
    May 13, 2013 at 5:48 pm
  • Hello, In a Pig script I want to store the results in 2 different MySql tables (using DBStorage) and a file on HDFS. This means 3 different STORE statements. Right now when I do that, it does give ...
    Shahab Yunus
    May 8, 2013 at 2:11 pm
    May 10, 2013 at 1:21 am
  • Hi all, In my script a = load 'data' using PigStorage(); b = foreach a generate 342 as col1, substring(x,0,4) as col2, ; I want to use col2 later in foreach statement. derived col2 should be used ...
    Abhishek
    May 7, 2013 at 2:52 am
    May 9, 2013 at 10:24 pm
  • Has anyone built the Piggybank jar with the DSE-Cassandra distribution of Pig? I'm using Pig 0.9.2 on DSE 3.0, and would ultimately just like to use CSVExcelStorage UDF from the Piggybank ...
    Anita Mehrotra
    May 9, 2013 at 3:53 pm
    May 9, 2013 at 3:53 pm
  • Hi there, I have a huge input on an HDFS and I would like to use Pig to calculate several unique metrics. To help explain the problem more easily, I assume the input file has the following schema ...
    Thomas Edison
    May 6, 2013 at 3:11 am
    May 9, 2013 at 3:50 am
  • Any fix for parsing string array in the near future? https://issues.apache.org/jira/browse/PIG-2949 -- Wayne Zhu
    Zhu Wayne
    May 7, 2013 at 10:39 pm
    May 8, 2013 at 7:32 pm
  • I'm trying to set useMatches=false in REGEX_EXTRACT_ALL as per the javadoc: http://pig.apache.org/docs/r0.11.0/api/org/apache/pig/builtin/REGEX_EXTRACT_ALL.html (and yes, I'm using pig 0.11). But it ...
    William Oberman
    May 8, 2013 at 5:21 pm
    May 8, 2013 at 5:31 pm
  • Hi, [ I know this question is probably CDH specific, yet I'm hoping one of you may be able to point me in the right direction. ] I want to make a small change to the piggybank for pig 0.10 that is in ...
    Niels Basjes
    Apr 24, 2013 at 8:46 pm
    May 8, 2013 at 2:59 pm
  • NVL: I'd like to write a java UDF that functions more or less the same as a SQL NVL command. I've been stymied on writing a general function by the fact that I want it to work on all data types--in ...
    Catherine Miller
    May 8, 2013 at 2:15 pm
    May 8, 2013 at 2:55 pm
  • Hi, I'm working on https://issues.apache.org/jira/browse/PIG-3297 and I ran into something strange. The issue is that Pig crashes with a Java Exception on a specific feature in the Avro format (I ...
    Niels Basjes
    May 8, 2013 at 2:52 pm
    May 8, 2013 at 2:52 pm
  • Hi, I've got some objects originally loaded, using the JSON loader from elephantbird, into nested maps, and subsequently stored using LZOPigStorage after various stages of processing. When I ...
    Kris Coward
    May 7, 2013 at 5:06 pm
    May 7, 2013 at 5:06 pm
  • Greetings! Did someone encounter the same issue? Well-formatted XML for <Sellers> </Sellers> is fine: grunt> register /usr/lib/pig/piggybank.jar; grunt> a = load 'sample.xml' using ...
    Zhu Wayne
    May 6, 2013 at 5:05 pm
    May 7, 2013 at 4:25 pm
  • Dear all, I wonder if someone can tell me whether the current version of Pig supports loops and branching? Regards! Yong
    Yonghu
    May 7, 2013 at 11:14 am
    May 7, 2013 at 3:16 pm
  • Hi, I have 2 different behaviours for the same Pig version (with hadoop 1.1.2) on different servers. If anyone can tell me why with the same versions and the same parameters I don't have the same ...
    Cscetbon Ext
    May 3, 2013 at 9:45 am
    May 7, 2013 at 3:06 pm
  • Hi, How can I add days to the current date in PIG? Is there any built-in function? Regards Soniya
    soniya B
    May 7, 2013 at 12:55 am
    May 7, 2013 at 1:17 am
  • Hey all, If I need to load an HBase table with Hex values into Pig, does that require a specific UDF? Is there any built-in function in Pig? I searched the documentation but cannot find anything that ...
    John Meek
    May 6, 2013 at 3:16 am
    May 6, 2013 at 9:14 pm
  • PIG 0.11 Query : I register the below string String query = "A = LOAD '" + BENCHMARK_PARQUET_MR_DATA_TEXTINPUT + "' using PigStorage() as (" + schemaString + ");"; with ...
    ÐΞ€ρ@Ҝ (๏̯͡๏)
    May 4, 2013 at 5:53 am
    May 6, 2013 at 5:26 am
  • I'm new to pig and I'm getting a ClassCastException when I try to run the following script in pig 0.11.1: A = LOAD 'test.log' AS (timestamp:long, pk_id:int, array_field:chararray, fk_id:int); B = ...
    Peter Connolly
    May 3, 2013 at 7:01 pm
    May 5, 2013 at 6:57 am
  • it will help me to debug the pig script. Thanks, Jack
    Jinyuan Zhou
    May 3, 2013 at 9:39 pm
    May 3, 2013 at 9:39 pm
  • I'm using pig 0.11.1 to load about 100 rows into mysql using DBStorage. If I run my script from grunt everything works fine and the rows are committed. If I run pig in batch mode the script says it ...
    Peter Connolly
    May 3, 2013 at 1:19 pm
    May 3, 2013 at 1:19 pm
  • All, Please join me in welcoming Prashant Kommireddi as our newest Pig committer. He's been contributing to Pig for a while now. We look forward to him being a part of the project. Julien
    Julien Le Dem
    May 2, 2013 at 7:59 pm
    May 3, 2013 at 5:45 am
  • Hi everyone, I would like to override the input schema in AvroStorage to make a pig script robust to schema evolution. For example, suppose a new field is added to an avro schema with a default value ...
    Enns, Steven
    Apr 25, 2013 at 11:22 pm
    May 3, 2013 at 5:01 am
  • I was wondering if there is a way to compute the product of all the values in a bag much like the built in function SUM does currently. For reference, I am currently implementing a multinomial naive ...
    Sergey Goder
    May 2, 2013 at 11:42 pm
    May 2, 2013 at 11:42 pm
  • Hi, Can anyone help me generate a system-generated DateTime in PIG? I have tried and didn't get any clue. Regards Soniya
    soniya B
    May 2, 2013 at 7:18 pm
    May 2, 2013 at 7:18 pm
  • Hi, I am just wondering if there is any project that can boost the execution times of PIG scripts through in memory computing or any other possible way. Just like there is Shark/IMPALA for Hive, are ...
    Praveen Bysani
    Apr 29, 2013 at 9:16 am
    May 1, 2013 at 6:26 pm
  • Thought I understood how to output to a single file but it doesn't seem to be working. Anything I'm missing here? -- Dedupe and store rows = LOAD '$input'; unique = DISTINCT rows PARELLEL 1; STORE ...
    Mark
    May 1, 2013 at 4:52 pm
    May 1, 2013 at 5:21 pm
  • Dear all, I am currently using HBaseStorage to load and store data between HBase and Pig. I have the newest version of Pig, 0.11.1, and I work with hbase-0.90.6. But I found that HBaseStorage in pig ...
    Weiping Qu
    Apr 29, 2013 at 8:08 pm
    Apr 29, 2013 at 8:50 pm
  • Hi all: I can run Pig with Cassandra and Hadoop in EC2. I'm trying to run Pig with a Cassandra ring and Hadoop. The Cassandra ring has the tasktrackers and datanodes, too. and ...
    Miguel Angel Martin junquera
    Apr 29, 2013 at 3:21 pm
    Apr 29, 2013 at 4:53 pm
  • Hi Friends, I have registered piggybank. I tried to use the BinCond function but it is not working. Can anyone suggest why it is not working?
    soniya B
    Apr 26, 2013 at 5:05 pm
    Apr 29, 2013 at 3:29 pm
  • Hi, Can anyone explain the use of the BinCond function with an example? I have tried a lot but could not get it to work. Regards Soniya
    soniya B
    Apr 27, 2013 at 2:53 pm
    Apr 28, 2013 at 2:12 am
  • Hi, I have data of format id1,id2, value 1 , abc, 2993 1, dhu, 9284 1,dus,2389 2, acs,29392 and so on For each id1, I want to find the maximum value and then divide value by max_value so in example ...
    Jamal sasha
    Apr 27, 2013 at 9:32 am
    Apr 27, 2013 at 9:41 am
  • I wrote a record loader MyLoader and used it to load aa = LOAD 'input_on_hdfs' USING MyLoader() AS ( blah:chararray, blahblah:chararray ); bb = FOREACH aa generate *; store bb into 'somewhere_else' ...
    Yang
    Apr 27, 2013 at 6:53 am
    Apr 27, 2013 at 6:53 am
  • Hi, I have a WHERE condition in a SQL query like below: Table1.col1=Table2.col3 and Table2.col2=Table3.col1 and Table3.col3=Table1.col2 In Pig, can I write like below: A= Table1 B=Table2 C=Table3 Joins = ...
    Raj hadoop
    Apr 25, 2013 at 2:40 am
    Apr 26, 2013 at 4:53 pm
  • Hi, I want to pass a filter statement within my pig script using parameter substitution. For that I have tried exec -param flt='a1==1 AND a2=2' filterscript.pig But sadly it is throwing an exception ...
    Abhijit Chanda
    Apr 24, 2013 at 11:26 am
    Apr 25, 2013 at 7:00 am
  • Hi, I have just started learning about Pig, and I had a task of importing a line from a text file as a bag in pig. The contents of my file were: {(2,3) (5,6,7,8)} {(2,4) (5,7,8,9)} {(1,3) (4,5,7,9)} ...
    Sachin Sudarshana
    Apr 24, 2013 at 10:17 am
    Apr 25, 2013 at 6:22 am
  • I am using PigServer to run a pig script that communicates with a pseudo-distributed hadoop installation on my local box. I have included *-site.xml in CLASSPATH. While running the pig script I get ...
    ÐΞ€ρ@Ҝ (๏̯͡๏)
    Apr 25, 2013 at 4:47 am
    Apr 25, 2013 at 4:47 am
  • Hi, Can you please help me to generate a sequence number using Pig? Raj
    Raj hadoop
    Apr 24, 2013 at 8:25 pm
    Apr 25, 2013 at 2:25 am
  • Can anybody help on this to convert sql to pig for below query. ---------- Forwarded message ---------- From: suneel hadoop ...
    Raj hadoop
    Apr 22, 2013 at 6:45 pm
    Apr 25, 2013 at 12:24 am
  • Hi, I need a way to invoke a pig script from a java program and capture the output returned by the pig script. I was looking at the PigRunner api, however I did not find many examples. Is there any way ...
    Siddhi Borkar
    Apr 23, 2013 at 6:51 am
    Apr 24, 2013 at 4:34 am
  • If the first field is in UTF-8 format, the output contains unrecognized characters: SSCNT = FOREACH SSG {UV = DISTINCT SS.ukey; GENERATE '主动搜索uv、pv' as scnt, FLATTEN(group) AS platform, COUNT(UV) as uv, ...
    Centerqi hu
    Apr 24, 2013 at 2:47 am
    Apr 24, 2013 at 3:04 am
  • Hi all, I am using Ambrose but encountered the following problem. Has anyone else encountered it? https://github.com/twitter/ambrose/issues/68 Thanks ...
    Centerqi hu
    Apr 23, 2013 at 3:46 am
    Apr 23, 2013 at 9:08 am
  • Hi friends, I am new to PIG script. I need to convert below sql query to PIG script. SELECT ('CSS'||DB.DISTRICT_CODE||DB.BILLING_ACCOUNT_NO) BAC_KEY, CASE WHEN T1.TAC_142 IS NULL THEN 'N' ELSE ...
    Raj hadoop
    Apr 22, 2013 at 8:59 pm
    Apr 22, 2013 at 10:10 pm
  • Hi Everyone, I have absolutely no experience with Pig and limited experience with hadoop, so please bear with me. We built a small hadoop cluster for experimental purposes and installed pig with all ...
    Mehmet Belgin
    Apr 22, 2013 at 9:07 pm
    Apr 22, 2013 at 10:09 pm
  • Hi, can someone help me convert the SQL below to Pig Latin: SELECT ('CSS'||DB.DISTRICT_CODE||DB.BILLING_ACCOUNT_NO) BAC_KEY, CASE WHEN T1.TAC_142 IS NULL THEN 'N' ELSE T1.TAC_142 END TAC_142 FROM ( ...
    Raj hadoop
    Apr 22, 2013 at 10:05 am
    Apr 22, 2013 at 10:05 am
Group Overview
group: user @
categories: pig, hadoop
discussions: 3,031
posts: 12,338
users: 1,182
website: pig.apache.org

Top users

  • Dmitriy Ryaboy: 1056 posts
  • Jonathan Coveney: 456 posts
  • Alan Gates: 455 posts
  • pRaShAnT: 285 posts
  • Russell Jurney: 270 posts
  • Olga Natkovich: 193 posts
  • Bill Graham: 192 posts
  • Mridul Muralidharan: 183 posts
  • Alan Gates: 178 posts
  • Jeff Zhang: 178 posts
  • Daniel Dai: 160 posts
  • Thejas M Nair: 135 posts
  • Mohit Anchlia: 125 posts
  • Daniel Dai: 122 posts
  • Cheolsoo Park: 115 posts
  • Thejas Nair: 111 posts
  • Hc busy: 105 posts
  • Santhosh Srinivasan: 85 posts
  • Norbert Burger: 80 posts
  • Jeremy Hanna: 79 posts