Grokbase Groups Pig user April 2013

65 discussions - 226 posts

  • Hi pig users, I tried to load data using PigStorage that was previously stored using PigStorage but it failed. Each line looks like this in the data file that is generated by PigStorage ...
    Jerry Lam
    Apr 17, 2013 at 1:29 am
    Apr 20, 2013 at 12:40 am
  • Hi Friends, I have registered piggybank. I tried to use the BinCond function but it is not working. Can anyone suggest why it is not working?
    soniya B
    Apr 26, 2013 at 5:05 pm
    Apr 29, 2013 at 3:29 pm
  • Hi, I am very new to PIG/Hadoop, I just started writing my first PIG script a couple days ago. I ran into this problem. My cluster has 9 nodes. I have to join two data sets big and small, each is ...
    Mua Ban
    Apr 12, 2013 at 3:18 pm
    Apr 15, 2013 at 5:49 pm
  • I'm somewhat familiar with WTF code (my day job is managing the analytics infrastructure team at Twitter). WTF is implemented using Pig 0.11 (in fact some of the Pig 11 features/improvements are ...
    Dmitriy Ryaboy
    Apr 1, 2013 at 4:20 pm
    Apr 9, 2013 at 7:11 am
  • Hi, I am trying to concatenate an open brace ( "{" ) to a string and I believe pig thinks that I am trying to open a bag or something. This does work: A = LOAD 'short' USING PigStorage('\t') AS ...
    Will Ford
    Apr 5, 2013 at 12:09 am
    Apr 5, 2013 at 2:17 am
  • Hi, I need a way to invoke a Pig script from a Java program and capture the output returned by the Pig script. I was looking at the PigRunner API, but did not find many examples. Is there any way ...
    Siddhi Borkar
    Apr 23, 2013 at 6:51 am
    Apr 24, 2013 at 4:34 am
  • +User group Hi Bhooshan, By default you should be running in MapReduce mode unless specified otherwise. Are you creating a PigServer object to run your jobs? Can you provide your code here? Sent from ...
    Prashant Kommireddi
    Apr 13, 2013 at 4:57 am
    Apr 16, 2013 at 12:58 am
  • I'm running a simple script to add a sequence_number to a relation, sort the result and store to a file: a0 = load '<filename>' using PigStorage('\t','-schema'); a1 = rank a0; a2 = foreach a1 ...
    Lauren Blau
    Apr 4, 2013 at 1:31 pm
    Apr 5, 2013 at 3:41 pm
  • Hi everyone, I would like to override the input schema in AvroStorage to make a pig script robust to schema evolution. For example, suppose a new field is added to an avro schema with a default value ...
    Enns, Steven
    Apr 25, 2013 at 11:22 pm
    May 3, 2013 at 5:01 am
  • Hi, [ I know this question is probably CDH specific, yet I'm hoping one of you may be able to point me in the right direction. ] I want to make a small change to the piggybank for pig 0.10 that is in ...
    Niels Basjes
    Apr 24, 2013 at 8:46 pm
    May 8, 2013 at 2:59 pm
  • Dear all, I am currently using HBaseStorage to load and store data between HBase and Pig. I have the newest version of Pig, 0.11.1, and I have been working with hbase-0.90.6. But I found that HBaseStorage in pig ...
    Weiping Qu
    Apr 29, 2013 at 8:08 pm
    Apr 29, 2013 at 8:50 pm
  • Hi, I have a where condition in sql query like below Table1.col1=Table2.col3 and Table2.col2=Table3.col1 and Table3.col3=Table1.col2 In Pig, Can i write like below A= Table1 B=Table2 C=Table3 Joins = ...
    Raj hadoop
    Apr 25, 2013 at 2:40 am
    Apr 26, 2013 at 4:53 pm
  • Hi, I want to pass a filter statement within my pig script using parameter substitution. For that I have tried exec -param flt='a1==1 AND a2=2' filterscript.pig But sadly it is throwing an exception ...
    Abhijit Chanda
    Apr 24, 2013 at 11:26 am
    Apr 25, 2013 at 7:00 am
  • Hi Everyone, I have absolutely no experience with Pig and limited experience with hadoop, so please bear with me. We built a small hadoop cluster for experimental purposes and installed pig with all ...
    Mehmet Belgin
    Apr 22, 2013 at 9:07 pm
    Apr 22, 2013 at 10:09 pm
  • Is there a way to do RANK within a group in PIG 0.11.1? In the following sample dataset, I would like to Rank DESC by Income, and further RANK by Income for each Industry. Name Industry Income ...
    M G
    Apr 15, 2013 at 8:25 pm
    Apr 19, 2013 at 2:17 am
  • Hi, I am using Pig to analyze the percentage of each UserAgent from an apache log. The following program failed because of the ORDER command at the very end (the result variable is correct and can be ...
    Lei Liu
    Apr 13, 2013 at 3:03 am
    Apr 14, 2013 at 1:11 am
  • Greetings all, I am trying to run Pigunit and receiving an error. I had this previously working, but had to rebuild my local workstation and didn't have everything I should have had checked in. This ...
    j.barrett Strausser
    Apr 10, 2013 at 2:06 pm
    Apr 10, 2013 at 5:26 pm
  • Hello, I am trying to run a pig script which is supposed to read input from s3 and write back to s3. The cluster scenario is as follows: * Cluster is installed on EC2 using Cloudera Manager 4.5 ...
    Panshul Whisper
    Apr 7, 2013 at 5:12 pm
    Apr 10, 2013 at 10:01 am
  • Hi friends, I am new to PIG script. I need to convert below sql query to PIG script. SELECT ('CSS'||DB.DISTRICT_CODE||DB.BILLING_ACCOUNT_NO) BAC_KEY, CASE WHEN T1.TAC_142 IS NULL THEN 'N' ELSE ...
    Raj hadoop
    Apr 22, 2013 at 8:59 pm
    Apr 22, 2013 at 10:10 pm
  • Hi guys, This is probably just a quick answer, but how do I set the pig job name? I'm generating Pig jobs in Java, and each job has the name "PigLatin:DefaultJobName" in the hadoop tracker. How can I ...
    Jeff Yuan
    Apr 15, 2013 at 11:25 pm
    Apr 20, 2013 at 8:59 pm
  • Hey, With the last release support for jRuby was added to Pig. I've started using this now for some work I'm doing but there are a few details missing that are hard to pull out of the pig code. I ...
    Apr 11, 2013 at 2:21 pm
    Apr 12, 2013 at 5:04 pm
  • Hi, I have a simple join question. base = load 'input1' USING PigStorage( ',' ) as (id1, field1, field2); stats = load 'input2' USING PigStorage(',') as (id1, mean, median); joined = JOIN base BY ...
    Jamal sasha
    Apr 1, 2013 at 9:07 pm
    Apr 2, 2013 at 1:22 am
  • Hi, I am just wondering if there is any project that can boost the execution times of PIG scripts through in memory computing or any other possible way. Just like there is Shark/IMPALA for Hive, are ...
    Praveen Bysani
    Apr 29, 2013 at 9:16 am
    May 1, 2013 at 6:26 pm
  • Hi, Can anyone explain the use of the BinCond function with an example? I have tried a lot but could not get it to work. Regards Soniya
    soniya B
    Apr 27, 2013 at 2:53 pm
    Apr 28, 2013 at 2:12 am
  • Hi, I have just started learning about Pig, and i had a task of importing a line from a text file as a bag in pig. The contents of my file were: {(2,3) (5,6,7,8)} {(2,4) (5,7,8,9)} {(1,3) (4,5,7,9)} ...
    Sachin Sudarshana
    Apr 24, 2013 at 10:17 am
    Apr 25, 2013 at 6:22 am
  • Hi, Can you please help me to generate sequence number using Pig? Raj
    Raj hadoop
    Apr 24, 2013 at 8:25 pm
    Apr 25, 2013 at 2:25 am
  • Hi all, I am using Ambrose but encountered the following problem. Has anyone else encountered it? Thanks ...
    Centerqi hu
    Apr 23, 2013 at 3:46 am
    Apr 23, 2013 at 9:08 am
  • Hi, We are new to the Hadoop family (Pig, Hive). We started a project on Pig, and we are set to define some coding standards as well as performance benchmarking activities, so kindly help us with any specific doc ...
    Raj hadoop
    Apr 21, 2013 at 1:54 pm
    Apr 21, 2013 at 2:09 pm
  • Hi, I created an addition to the ua-parser project so that the methods for parsing a useragent can be called as a UDF from Pig ...
    Niels Basjes
    Apr 20, 2013 at 10:10 pm
    Apr 20, 2013 at 10:35 pm
  • Hi there, I have a pig script similar to the one below. When testing this on a cluster with an empty file, I see it taking ages to complete - It goes through all the commands and runs the jobs across ...
    John Farrelly
    Apr 9, 2013 at 1:29 pm
    Apr 18, 2013 at 12:59 pm
  • Hi, I'm attempting to build a custom LoadFunc for pig and I'm running into a rather silly issue. My project has several dependencies and I've been trying to create a single jar that contains all of ...
    Niels Basjes
    Apr 16, 2013 at 8:44 pm
    Apr 16, 2013 at 9:49 pm
  • Hi y'all, First time on this list, and hoping you might be able to help me with a (possible) issue. I'm working with some data in Pig that includes strings of interest, optionally separated by ...
    Dylan Sather
    Apr 16, 2013 at 5:04 am
    Apr 16, 2013 at 9:46 pm
  • Is there an easy way to non-truncate ILLUSTRATE and make each statement more verbose for large nested complex Objects (aka, force removal of the ellipsis from ILLUSTRATE and show all data)? Many ...
    Dan DeCapria, CivicScience
    Apr 9, 2013 at 3:15 pm
    Apr 9, 2013 at 6:09 pm
  • Hi, all I have a tuple like this: (group_a,group_b,group_c,value) and I want to calculate the values in a data cube way, which means I want to generate new tuples from the original one ...
    Haitao Yao
    Apr 3, 2013 at 3:34 am
    Apr 3, 2013 at 6:08 am
  • Hi all: I can run Pig with Cassandra and Hadoop in EC2. I'm trying to run Pig with a Cassandra ring and Hadoop. The Cassandra ring has the tasktrackers and datanodes, too. And ...
    Miguel Angel Martin junquera
    Apr 29, 2013 at 3:21 pm
    Apr 29, 2013 at 4:53 pm
  • Hi, I have data of format id1,id2, value 1 , abc, 2993 1, dhu, 9284 1,dus,2389 2, acs,29392 and so on For each id1, I want to find the maximum value and then divide value by max_value so in example ...
    Jamal sasha
    Apr 27, 2013 at 9:32 am
    Apr 27, 2013 at 9:41 am
  • Can anybody help convert the SQL query below to Pig? ---------- Forwarded message ---------- From: suneel hadoop ...
    Raj hadoop
    Apr 22, 2013 at 6:45 pm
    Apr 25, 2013 at 12:24 am
  • If the first field is in UTF-8 format, the output contains unrecognizable characters: SSCNT = FOREACH SSG {UV = DISTINCT SS.ukey; GENERATE '主动搜索uv、pv' as scnt, FLATTEN(group) AS platform, COUNT(UV) as uv, ...
    Centerqi hu
    Apr 24, 2013 at 2:47 am
    Apr 24, 2013 at 3:04 am
  • Hello, I have set up Flume to write my logs to Hadoop. Flume creates tmp files which I can cat and read using hdfs dfs -cat, but when I try to read that data I only get the first line. Is there a way ...
    Bojan Kostić
    Apr 8, 2013 at 4:28 pm
    Apr 22, 2013 at 6:05 am
  • I have a relation of about 50000 tuples that I want to join to itself either by using a cross operator or something similar. Then I would be doing pair wise computation of half the matrix (avoiding ...
    Apr 19, 2013 at 4:18 am
    Apr 19, 2013 at 9:13 pm
  • Consider two aliases (T) and (U), loaded from data with schema defined below. I was considering a left outer join to 'merge' the two records, overriding those in U with the join fields in T, but the ...
    Dan DeCapria, CivicScience
    Apr 19, 2013 at 6:25 pm
    Apr 19, 2013 at 9:11 pm
  • Hi, Q1:I have a question about how to use filter on tuple. The code is: -------------------------------------------------------- REGISTER pig.jar; raw = LOAD 'data.txt' USING PigStorage('|') AS ...
    Apr 19, 2013 at 3:07 am
    Apr 19, 2013 at 9:01 pm
  • Hey guys, is there any way to log the detailed warning or error messages from my UDFs? Apache Pig version 0.8.1-cdh3u1. I wrote the UDF warning like this: IpRegion region = null; try { region = ...
    Apr 17, 2013 at 8:28 am
    Apr 17, 2013 at 4:05 pm
  • I have defined a Pig UDF and want to track problems using warnings like this: warn("My warning", PigWarning.UDF_WARNING_1); I'm testing this in local mode first, but I never see this warning anywhere ...
    James Newhaven
    Apr 13, 2013 at 4:24 pm
    Apr 13, 2013 at 11:45 pm
  • I set default_parallel=15 but when I did a y = group z ALL; x = foreach y generate SIZE(z); the 2 lines generate a MR job with only 1 reducer. I guess it's because SIZE() needs to count all the ...
    Apr 11, 2013 at 10:14 pm
    Apr 12, 2013 at 12:53 am
  • Check out this massive ILLUSTRATE that worked in Pig 0.11: -- Russell Jurney ...
    Russell Jurney
    Apr 10, 2013 at 9:27 pm
    Apr 10, 2013 at 9:50 pm
  • I have the following data: records = foreach std generate request_date as request_date, SubtractDuration(time, CONCAT('PT', CONCAT((chararray)CEIL(response_time), 'S')) as time_requested, ...
    Apr 9, 2013 at 7:04 pm
    Apr 9, 2013 at 7:53 pm
  • Hi, I'm trying to run a pig script with builtin macros like count/sum/avg ... (org.apache.pig.builtin), but with every method I tried I get the same ERROR ERROR - ERROR ...
    Raanan nitzan
    Apr 6, 2013 at 8:08 pm
    Apr 9, 2013 at 4:16 pm
  • The following gist illustrates my question: It seems pretty surprising to me that all of these cases all return 1.0, at least in python (I will now do this in ...
    Jonathan Coveney
    Apr 5, 2013 at 4:06 pm
    Apr 5, 2013 at 4:11 pm
Group Overview
group: user @
categories: pig, hadoop

69 users for April 2013

  • Prashant Kommireddi: 20 posts
  • Ruslan Al-Fakikh: 16 posts
  • Jerry Lam: 10 posts
  • j.barrett Strausser: 9 posts
  • Raj hadoop: 9 posts
  • Russell Jurney: 7 posts
  • Centerqi hu: 6 posts
  • Dan DeCapria, CivicScience: 6 posts
  • soniya B: 6 posts
  • Cheolsoo Park: 5 posts
  • Dmitriy Ryaboy: 5 posts
  • Johnny Zhang: 5 posts
  • Lauren Blau: 5 posts
  • Niels Basjes: 5 posts
  • Bill Graham: 4 posts
  • Burakkk: 4 posts
  • Jamal sasha: 4 posts
  • Miguel Angel Martin junquera: 4 posts
  • Mua Ban: 4 posts
  • Siddhi Borkar: 4 posts