Grokbase Groups Pig user

3,031 discussions - 12,338 posts

  • Hi pig users, I tried to load data using PigStorage that was previously stored using PigStorage but it failed. Each line looks like this in the data file that is generated by PigStorage ...
    Jerry Lam
    Apr 17, 2013 at 1:29 am
    Apr 20, 2013 at 12:40 am
  • In Java, I am trying to convert a DataBag from its String representation with its schema String to a valid DataBag Object: String databag_string = "{(apples,1024)}"; String schema_string = ...
    Dan DeCapria, CivicScience
    Mar 18, 2013 at 8:19 pm
    Mar 21, 2013 at 3:52 pm
  • Hi, I executed the Pig commands below. X= LOAD '/user/lnindrakrishna/input/ExpTag.txt' AS (line:chararray); Y=foreach data { generate STRSPLIT(line,',') ;}; And I get the error below. What is wrong in my ...
    Mix Nin
    Mar 5, 2013 at 10:49 pm
    Mar 5, 2013 at 11:51 pm
  • Hi, I'm using hadoop 1.0.4, cassandra 1.2.2 and pig 0.11.0. Can anyone help me with an example of how to use pig either for storing to cassandra from pig using CassandraStorage, or loading rows ...
    Mohammed Abdelkhalek
    Mar 18, 2013 at 3:15 pm
    Mar 18, 2013 at 5:41 pm
  • All, Please join me in welcoming Prashant Kommireddi as our newest Pig committer. He's been contributing to Pig for a while now. We look forward to him being a part of the project. Julien
    Julien Le Dem
    May 2, 2013 at 7:59 pm
    May 3, 2013 at 5:45 am
  • If I define and set tuple like this: Tuple t1 = mTupleFactory.newTuple(2); t1.set(0, "Hello"); t1.set(1, NULL); and have schema like: b:bag{t:tuple(a:chararray, b:chararray) and then in the pig ...
    Mohit Anchlia
    Mar 7, 2013 at 12:59 am
    Mar 7, 2013 at 8:34 pm
  • When I try to run pig 0.12.0, I get the following error $ pig12 -param input="t" -param output="s" -c b224G_1.pig log4j:ERROR Could not find value for key log4j.appender.NullAppender log4j:ERROR ...
    Danfeng Li
    Mar 12, 2013 at 9:50 pm
    Mar 13, 2013 at 5:28 pm
  • Please welcome Bill Graham as our latest Pig PMC member. Congrats Bill!
    Daniel Dai
    Feb 19, 2013 at 9:48 pm
    Feb 20, 2013 at 10:23 pm
  • Hi, I need to pass parameters dynamically to a pig script. Is there any way to read the parameters passed and their corresponding values without giving the parameter names in the pig script? Thanks, ...
    Siddhi Borkar
    Feb 19, 2013 at 11:59 am
    Feb 21, 2013 at 8:30 am
  • Hello All, I have dataset like 0, 10.1, 20.1, 30, 40, 50, 60, 70, 80.1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1, 2, 3, 4, 5, 56, 6, 7, 8, 9, 9, 9, 9, 12, 1, 3, 14, 1, 5, 6, 7, 8, 8, ...
    Preeti Gupta
    Mar 4, 2013 at 11:19 pm
    Mar 5, 2013 at 10:49 pm
  • How do I remove the last item in a bag? For example: (group_1,{(2012-12-15,a),(2012-12-17,a),(2012-12-23,c)}) I would like to remove the last item so that the following is the result ...
    Chan, Tim
    Mar 12, 2013 at 11:33 pm
    Mar 15, 2013 at 7:46 pm
  • I am writing a loader for a storage format, which partitions by a particular field in the record. So I would like to implement something which can push down filters on the partitioned field so that ...
    Jeff Yuan
    Mar 14, 2013 at 8:31 pm
    Mar 15, 2013 at 10:17 am
  • I can start a grunt shell just fine: -bash-3.2$ pwd /home/rfcompton/Downloads/pig-0.11.0-src -bash-3.2$ ./bin/pig 2013-03-21 12:55:00,048 [main] INFO org.apache.pig.Main - Apache Pig version ...
    Ryan Compton
    Mar 21, 2013 at 8:06 pm
    Mar 21, 2013 at 11:17 pm
  • I'm somewhat familiar with WTF code (my day job is managing the analytics infrastructure team at Twitter). WTF is implemented using Pig 0.11 (in fact some of the Pig 11 features/improvements are ...
    Dmitriy Ryaboy
    Apr 1, 2013 at 4:20 pm
    Apr 9, 2013 at 7:11 am
  • Hi, I am trying to concatenate an open brace ( "{" ) to a string and I believe pig thinks that I am trying to open a bag or something. This does work: A = LOAD 'short' USING PigStorage('\t') AS ...
    Will Ford
    Apr 5, 2013 at 12:09 am
    Apr 5, 2013 at 2:17 am
  • Hi, I am very new to PIG/Hadoop, I just started writing my first PIG script a couple days ago. I ran into this problem. My cluster has 9 nodes. I have to join two data sets big and small, each is ...
    Mua Ban
    Apr 12, 2013 at 3:18 pm
    Apr 15, 2013 at 5:49 pm
  • Hi Friends, I have registered piggybank. I tried to use the BinCond function but it is not working. Can anyone suggest why it is not working?
    soniya B
    Apr 26, 2013 at 5:05 pm
    Apr 29, 2013 at 3:29 pm
  • Hi, I am trying to find the way to run the explain command over the entire pig script in java. I was using PigServer but it offers only to do explain over the single query (alias) not the entire ...
    Petar Jovanovic
    Feb 18, 2013 at 4:19 pm
    Feb 18, 2013 at 11:28 pm
  • The JsonLoader works, but problem is I'm not loading a JSON file, but just trying to parse a json string as part of a bigger data set. That's why I needed to use JsonStringToMap.
    Eli Finkelshteyn
    Mar 1, 2013 at 8:24 pm
    Mar 4, 2013 at 5:05 pm
  • Hello, Can I compute SUM or AVG without using GROUPBY OR FILTER?
    Preeti Gupta
    Mar 4, 2013 at 11:50 pm
    Mar 5, 2013 at 10:06 pm
  • Sorry for posting the same issue multiple times. I wrote a pig script as follows and stored it in an x.pig file: Data = LOAD '/....' as (,,,, ) NoNullData= FILTER Data by qe is not null; STORE (foreach (group ...
    Mix Nin
    Mar 27, 2013 at 9:58 pm
    Mar 28, 2013 at 4:20 pm
  • I'm running a simple script to add a sequence_number to a relation, sort the result and store to a file: a0 = load '<filename ' using PigStorage('\t','-schema'); a1 = rank a0; a2 = foreach a1 ...
    Lauren Blau
    Apr 4, 2013 at 1:31 pm
    Apr 5, 2013 at 3:41 pm
  • +User group Hi Bhooshan, By default you should be running in MapReduce mode unless specified otherwise. Are you creating a PigServer object to run your jobs? Can you provide your code here? Sent from ...
    Prashant Kommireddi
    Apr 13, 2013 at 4:57 am
    Apr 16, 2013 at 12:58 am
  • Hi, I need a way to invoke a pig script from a Java program and capture the output returned by the pig script. I was looking at the PigRunner API, but did not find many examples. Is there any way ...
    Siddhi Borkar
    Apr 23, 2013 at 6:51 am
    Apr 24, 2013 at 4:34 am
  • Hi, I have a file with a few hundred columns of doubles and I am interested in creating a correlation matrix for the columns: A = load 'myData' using PigStorage(':'); B = group A all; D = ...
    Houssam H.
    Feb 21, 2013 at 7:39 pm
    Feb 25, 2013 at 1:58 pm
  • I pulled together some of the highlights of the pig 0.11 release on the Apache Pig blog (which now officially exists!): https://blogs.apache.org/pig/ D
    Dmitriy Ryaboy
    Feb 23, 2013 at 2:35 am
    Feb 26, 2013 at 10:25 am
  • I have a file with the data below: xxxxx 11,22,33 44,55,66 77,88,99 I wrote the Pig script below: X= LOAD '/user/lnindrakrishna/tmp/ExpTag.txt' AS (id :chararray,qc :chararray ,qt :chararray ,qe :chararray ) ...
    Mix Nin
    Mar 7, 2013 at 12:42 am
    Mar 7, 2013 at 8:43 pm
  • Hello I'm trying to find a SUM of a range of fields, and am having difficulty. I have the following data structure (from the movielens public dataset) where there's a "fixed" field of "Name" and ...
    Nathan Neff
    Mar 10, 2013 at 2:45 pm
    Mar 19, 2013 at 8:17 pm
  • Hi! I am using Pig version 0.10 and I have a question about mapping nested JSON objects from Hbase. For example: The command below loads the field family from Hbase. fields = load ...
    Kiran chitturi
    Mar 14, 2013 at 3:38 am
    Mar 14, 2013 at 3:09 pm
  • Hi, I am trying to run a simple pig script that uses the HbaseStorage class to load data from an hbase table. The pig script runs perfectly fine when run standalone in mapreduce mode. But when I submit it ...
    Praveen Bysani
    Mar 14, 2013 at 9:29 am
    Mar 19, 2013 at 8:46 pm
  • Hi there, I have an EvalFunc which uses an internal class that opens up connections to a Redis and MongoDB server. This class has a close() method which closes connections to both Redis and MongoDB ...
    Mike Sukmanowsky
    Mar 14, 2013 at 9:05 pm
    Mar 26, 2013 at 2:48 pm
  • Hi all, Could anyone be kind enough to point me to some examples of using the COVARIANCE and CORRELATION UDFs described here [1]? Renato M. [1] https://issues.apache.org/jira/browse/PIG-277
    Renato Marroquín Mogrovejo
    Mar 26, 2013 at 10:29 pm
    Mar 28, 2013 at 9:42 pm
  • Hi, I am unable to typecast fields loaded from my hbase to anything other than default bytearray. I tried both during the LOAD statement and using typecast after loading. Neither works. The script ...
    Praveen Bysani
    Mar 27, 2013 at 8:30 am
    Apr 1, 2013 at 2:43 am
  • We have some very long pig scripts that run several times per day. We believe that the script parsing process takes very long (about 1h). During this time, the pig command just hangs before any ...
    Patrick Salami
    Mar 28, 2013 at 7:51 pm
    Apr 3, 2013 at 8:28 pm
  • Hi everyone, I would like to override the input schema in AvroStorage to make a pig script robust to schema evolution. For example, suppose a new field is added to an avro schema with a default value ...
    Enns, Steven
    Apr 25, 2013 at 11:22 pm
    May 3, 2013 at 5:01 am
  • Dear all, I wonder if someone can tell me whether the current version of pig supports loops and branching? Regards! Yong
    Yonghu
    May 7, 2013 at 11:14 am
    May 7, 2013 at 3:16 pm
  • Hello, In a Pig script I want to store the results in 2 different MySql tables (using DBStorage) and a file on HDFS. This means 3 different STORE statements. Right now when I do that, it does give ...
    Shahab Yunus
    May 8, 2013 at 2:11 pm
    May 10, 2013 at 1:21 am
  • Hi there, I have a huge input on HDFS and I would like to use Pig to calculate several unique metrics. To help explain the problem more easily, I assume the input file has the following schema ...
    Thomas Edison
    May 6, 2013 at 3:11 am
    May 9, 2013 at 3:50 am
  • Hi guys, I'm running pig from the command line in local mode, and trying to pass in some properties, for example: pig -x local ... -p mapred.map.tasks=2 -p mapred.reduce.tasks=1 ... I'm getting ...
    Jeff Yuan
    Mar 2, 2013 at 12:04 am
    Mar 3, 2013 at 6:12 am
  • I have a couple of questions regarding job result and schema. The context is that I'm trying to create a custom entry point for Pig that takes a script, executes it, and always stores the last ...
    Jeff Yuan
    Mar 5, 2013 at 7:18 pm
    Mar 5, 2013 at 10:09 pm
  • Hello, I have a 9GB file with approximately 109.5 million records. I execute a pig script on this file that does the following: 1. Group by on a field of the file 2. Count number of records in ...
    Panshul Whisper
    Mar 6, 2013 at 2:29 pm
    Mar 8, 2013 at 2:48 am
  • Hi! I am using Pig 0.10.0 with Hbase in distributed mode to read the records and I have used this command below. fields = load 'hbase://documents' using ...
    Kiran chitturi
    Mar 13, 2013 at 2:49 pm
    Mar 15, 2013 at 3:17 am
  • Hi there, I would like to do something very similar to a nested foreach using order by and then limit. But I would like to limit on a relation to the total number of records. users = load ...
    Marco Cadetg
    Mar 18, 2013 at 10:23 am
    Mar 19, 2013 at 7:49 am
  • Hi, Can we define a UDF in pig that takes a bag as an input and returns another bag as output? How can this be done? Thanks, -- regards Pranjal
    Pranjal rajput
    Mar 18, 2013 at 9:27 am
    Mar 18, 2013 at 3:58 pm
  • Greetings all, I am trying to run Pigunit and receiving an error. I had this previously working, but had to rebuild my local workstation and didn't have everything I should have had checked in. This ...
    j.barrett Strausser
    Apr 10, 2013 at 2:06 pm
    Apr 10, 2013 at 5:26 pm
  • Is there a way to do RANK within a group in PIG 0.11.1? In the following sample dataset, I would like to Rank DESC by Income, and further RANK by Income for each Industry. Name Industry Income ...
    M G
    Apr 15, 2013 at 8:25 pm
    Apr 19, 2013 at 2:17 am
  • Hi Everyone, I have absolutely no experience with Pig and limited experience with hadoop, so please bear with me. We built a small hadoop cluster for experimental purposes and installed pig with all ...
    Mehmet Belgin
    Apr 22, 2013 at 9:07 pm
    Apr 22, 2013 at 10:09 pm
  • Hi, [ I know this question is probably CDH specific, yet I'm hoping one of you may be able to point me in the right direction. ] I want to make a small change to the piggybank for pig 0.10 that is in ...
    Niels Basjes
    Apr 24, 2013 at 8:46 pm
    May 8, 2013 at 2:59 pm
  • Hi, I have a where condition in a sql query like the one below: Table1.col1=Table2.col3 and Table2.col2=Table3.col1 and Table3.col3=Table1.col2 In Pig, can I write it like below? A= Table1 B=Table2 C=Table3 Joins = ...
    Raj hadoop
    Apr 25, 2013 at 2:40 am
    Apr 26, 2013 at 4:53 pm
  • Any fix for parsing string array in the near future? https://issues.apache.org/jira/browse/PIG-2949 -- Wayne Zhu
    Zhu Wayne
    May 7, 2013 at 10:39 pm
    May 8, 2013 at 7:32 pm
Group Overview
group: user @
categories: pig, hadoop
discussions: 3,031
posts: 12,338
users: 1,182
website: pig.apache.org

Top users

  • Dmitriy Ryaboy: 1056 posts
  • Jonathan Coveney: 456 posts
  • Alan Gates: 455 posts
  • pRaShAnT: 285 posts
  • Russell Jurney: 270 posts
  • Olga Natkovich: 193 posts
  • Bill Graham: 192 posts
  • Mridul Muralidharan: 183 posts
  • Alan Gates: 178 posts
  • Jeff Zhang: 178 posts
  • Daniel Dai: 160 posts
  • Thejas M Nair: 135 posts
  • Mohit Anchlia: 125 posts
  • Daniel Dai: 122 posts
  • Cheolsoo Park: 115 posts
  • Thejas Nair: 111 posts
  • Hc busy: 105 posts
  • Santhosh Srinivasan: 85 posts
  • Norbert Burger: 80 posts
  • Jeremy Hanna: 79 posts