72 discussions - 321 posts

  • Hi I am getting the below mentioned exception after I load a file and do Filter on it. The file(test.txt) is saved inside PIG home/data/ folder. grunt A= LOAD 'data/test.txt' USING PigStorage(); ...
    Kiranprasad
    Sep 16, 2011 at 3:03 pm
    Sep 22, 2011 at 4:45 pm
  • Heya! I've been trying to do something with Pig for about 4 days now and I have nothing but failure to show for it. I was wondering if anybody could look at my queries and slap some sense into me? ...
    Pierre-Luc Brunet
    Sep 8, 2011 at 11:56 pm
    Sep 14, 2011 at 7:03 pm
  • Hi, I am trying to run a pig streaming perl job using GeoLite DB and I am getting the following failure 2011-09-23 15:49:44,902 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: ...
    Deepak Reddy
    Sep 23, 2011 at 10:54 pm
    Sep 26, 2011 at 7:00 pm
  • Hi, I'm currently working on trying to load lzos that contain some JSON elements. This is of the form: item1 item2 {'thing1':'1','thing2':'2'} item3 item4 {'thing3':'1','thing27':'2'} item5 item6 ...
    Eli Finkelshteyn
    Sep 9, 2011 at 10:50 pm
    Sep 13, 2011 at 6:23 pm
  • Hi, I am new to PIG and trying to set up a HADOOP cluster. The error I am getting is [kiranprasad.g@pig1 pig-0.8.1]$ bin/pig 2011-09-07 19:45:50,606 [main] INFO org.apache.pig.Main - Logging error messages ...
    Kiranprasad
    Sep 7, 2011 at 8:55 am
    Sep 11, 2011 at 1:01 am
  • Hey guys, How can I LIMIT a relation by percentage? What I need is to sort a relation by a numeric column and then take top 5% of tuples. As far as I understand I cannot use an expression in the ...
    Ruslan Al-Fakikh
    Sep 8, 2011 at 1:14 pm
    Sep 12, 2011 at 9:48 am
  • Hello, First of all, I'm new at Pig and NoSQL so I hope you'll forgive stupid questions ;-) So, I'm playing with OpenTSDB (software layer on top of HBase to handle timeseries data) and now I'd like ...
    shazz Ng
    Sep 6, 2011 at 11:59 am
    Sep 7, 2011 at 6:41 am
  • Hi all! I have been experimenting with wrapping some of Apache Mahout's machine learning -related jobs inside Pig macros, via the MAPREDUCE keyword. This seemed quite nearly do-able but I hit a few ...
    Dan Brickley
    Sep 7, 2011 at 5:26 pm
    Sep 9, 2011 at 5:33 pm
  • Hi dear pigs: Sometimes when I run pig job with huge data, the number of reducers is very small(only 1 reducer), even if set PARALLEL, so the job runs extremely slow! Can I or how can I increase the ...
    唐亮
    Sep 15, 2011 at 7:48 am
    Sep 16, 2011 at 1:32 am
  • Hi, I'd like to generate based on exclusive conditions (something like the CASE statement in SQL). An example: Say I have data that looks like: (a, 1) (a, 2) (b, 2) (c, 1) (d, 3) (d, 4) And I want to ...
    Eli Finkelshteyn
    Sep 14, 2011 at 8:28 pm
    Sep 14, 2011 at 9:56 pm
  • Hello, I tried several versions to generate data for pigmix queries: *- Hadoop apache 0.20.204 with pig 0.7* *== * java.lang.RuntimeException: Error in configuring object at ...
    Keren Ouaknine
    Sep 11, 2011 at 5:05 am
    Sep 11, 2011 at 7:39 pm
  • Hello, I'm getting this error when running a pig job. I want to disable LZO so it doesn't try to load and fail. How do I do that? This is a totally stock installation I did from downloading from the ...
    Bradford Stephens
    Sep 28, 2011 at 12:27 am
    Sep 28, 2011 at 7:48 pm
  • Hello, I have a folder with many XML files. I want to load and parse them in order to get some information and then store this info to a file. My problem is that I cannot send the xml file to my ...
    Baraa Mohamad
    Sep 15, 2011 at 4:48 pm
    Sep 15, 2011 at 5:59 pm
  • Hi I am trying to start the PIG in hadoop mode, but it is getting stuck. Pls help. Below is where the process is getting stuck. [kiranprasad.g@pig4 pig-0.8.1]$ bin/pig 2011-09-14 21:48:25,589 [main] ...
    Kiranprasad
    Sep 14, 2011 at 11:32 am
    Sep 15, 2011 at 3:08 pm
  • Hi, I have a serious task to finish, hope somebody will help me... I have two inputs with data: record1: epoch, game_id, user_id, other data record2: epoch, game_id, user_id, other data Now I need to ...
    Marek Miglinski
    Sep 12, 2011 at 2:25 pm
    Sep 13, 2011 at 9:23 pm
  • Using pig 0.9. My data is very dynamic so I use a custom LoadFunc to parse it. The problem is that I cant figure out how to access the schema that is defined in the load statement. I am forced to do ...
    Reza
    Sep 12, 2011 at 8:12 pm
    Sep 13, 2011 at 1:24 am
  • Hi, When doing an inner join on a column where some values are NULL, I get the following error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 I can fix this by simply filtering out the NULL ...
    Eli Finkelshteyn
    Sep 16, 2011 at 3:04 pm
    Sep 20, 2011 at 4:22 pm
  • Hello there, Based on http://www.cloudera.com/blog/2009/06/analyzing-apache-logs-with-pig/ I want to add geolocalisation to my haproxy raw logs stored in Hbase Table. Here is my pig script ...
    Damien Hardy
    Sep 16, 2011 at 9:12 am
    Sep 16, 2011 at 1:51 pm
  • Hello, Based on previous discussion with Thejas Nair about pig-0.8.1 unit test. I think jira should be opened for the following unit test(my research indicate the cause is HashMap does not guarantee ...
    Lulynn_2008
    Sep 6, 2011 at 2:17 am
    Sep 14, 2011 at 2:43 am
  • Hi, I get an error on executing following: register piggybank.jar; register joda-time-1.6.jar ; A= LOAD 'date.csv' USING PigStorage(',') AS (a1:chararray); dump A; Output : (20110123) toISOA = ...
    Ipshita chatterji
    Sep 12, 2011 at 9:13 am
    Sep 13, 2011 at 3:59 am
  • Hi, I am trying to build locally like so: $ svn co http://svn.apache.org/repos/asf/pig/trunk pig $ cd pig $ ant test And the build fails with dependency problems: [ivy:resolve] :: problems summary :: ...
    Ryan Hoegg
    Sep 8, 2011 at 9:03 pm
    Sep 9, 2011 at 4:17 pm
  • I'm using Pig 0.8.1. I received an error trying to use a dynamic invoker with URLDecoder.decode(String, String) caused by java.lang.NoClassDefFoundError: com/google/common/collect/Sets which is used ...
    Sean Timm
    Sep 7, 2011 at 9:07 pm
    Sep 8, 2011 at 1:27 am
  • Hi I am new to PIG, I would like to know how to generate only single output file by using STORE. Regards Kiran.G
    Kiranprasad
    Sep 6, 2011 at 9:02 am
    Sep 6, 2011 at 4:00 pm
  • Do we need to load the file multiple times if we want to perform actions again and again on the same file? Regards Kiran.G
    Kiranprasad
    Sep 30, 2011 at 12:35 pm
    Sep 30, 2011 at 4:44 pm
  • Hello, This is my pig script : DEFINE iplookup `wrapper.sh GeoIP` ship ('wrapper.sh') cache('/GeoIP/GeoIPcity.dat#GeoIP') input (stdin using PigStreaming(',')) output (stdout using ...
    Damien Hardy
    Sep 20, 2011 at 4:04 pm
    Sep 21, 2011 at 10:08 am
  • I am trying to filter a set of records by key and last modified date so that only one record is returned per key with the most recent date. I have a working script below but wonder if there is a ...
    Cheung, Po
    Sep 16, 2011 at 7:36 am
    Sep 16, 2011 at 8:41 am
  • Hey all, 4 hours of true torture, hope you will help me (the task is easy) up = LOAD '/up.log' USING PigStorage(',') AS (upEpoch:long, upInstance:chararray, upKeyword:chararray); tx = LOAD '/tx.log' ...
    Marek Miglinski
    Sep 13, 2011 at 6:32 pm
    Sep 14, 2011 at 10:57 am
  • Folks -- we have a timeseries-based table we recently converted to a salted key schema [1] in order to avoid region hotspotting. The rowkey format is: salt-timestamp-sessionid-eventtype, where: salt ...
    Norbert Burger
    Sep 12, 2011 at 3:10 pm
    Sep 13, 2011 at 4:36 pm
  • I have a large data set ( 2 TB) and I tried scanning 100 records from it. a = load '/usr/largedata/' using PigStorage(','); b = limit a 100; dump b; 2011-09-11 21:56:34,262 [main] INFO ...
    Rajesh Balamohan
    Sep 12, 2011 at 5:14 am
    Sep 12, 2011 at 6:49 am
  • Hi Iam using Hadoop version : hadoop-0.20.2-cdh3u0 and PIG : pig-0.8.1 For Cluster I have 3 VMs(10.0.0.61-master, 10.0.0.62,10.0.0.63 - Slaves) and another VM 10.0.0.64 in which I ve installed PIG ...
    Kiranprasad
    Sep 9, 2011 at 9:44 am
    Sep 10, 2011 at 1:19 am
  • Hello, Blocks on test accumulator, details below. Thanks. buildJar-withouthadoop: [echo] svnString exported [jar] Building jar: /mnt/hadoop/CDH3-keren/hadoopCDH3-keren/pig/build/pig- ...
    Keren Ouaknine
    Sep 7, 2011 at 1:02 am
    Sep 9, 2011 at 6:33 am
  • Hi Pig users, I wanted to share with you all that we recently open sourced a library we have been developing at LinkedIn. In it we have collected many of the useful UDFs we have developed for ...
    Matthew Hayes
    Sep 27, 2011 at 5:20 pm
    Sep 28, 2011 at 3:43 pm
  • Hi all, I have a simple schema {"name": "Record", "type": "record", "fields": [ {"name": "name", "type": "string"}, {"name": "id", "type": "int"} ] } which I use to write 2 records to an Avro file ...
    Alex Holmes
    Sep 22, 2011 at 12:01 am
    Sep 27, 2011 at 5:44 am
  • Hi, My data looks like 6202445(2284,11096,2931,11168) 6202446(83258,738,10215,12987) 6202447(83258,738,10215,12987) 6202448(1001,1284,11550) 6202449(1560,752,13505,12876,2906) ...
    Ayon Sinha
    Sep 23, 2011 at 1:35 am
    Sep 23, 2011 at 7:39 am
  • Is there a way to use -Dpig.additional.jars with pigunit to auto-register jars for unit test scripts? Maybe we're just missing something because this seems like a basic thing that people would like ...
    Jeremy Hanna
    Sep 22, 2011 at 6:44 pm
    Sep 22, 2011 at 9:09 pm
  • Hi After setting up Hadoop cluster how to test it by loading a file from PIG grunt and getting the out files in mapreduce mode Warm Regards, Sebtain.
    Sebtain MD R
    Sep 21, 2011 at 3:44 pm
    Sep 21, 2011 at 6:28 pm
  • Hello, I have a question please How I can read a file in a UDF in pig ex: A = load 'xmlFiles' using myXMLParser ( xmlfile) can I do something like that, so that I can parse the xml file using some ...
    Baraa Mohamad
    Sep 14, 2011 at 2:41 pm
    Sep 14, 2011 at 3:26 pm
  • Hi there, I do have a problem with pig 9.0 and hadoop 0.20.204: http://hadoop.apache.org/common/releases.html#5+Sep%2C+2011%3A+release+0.20.204.0+available I tried several things but I am unable to ...
    Marco Cadetg
    Sep 13, 2011 at 9:09 am
    Sep 13, 2011 at 1:51 pm
  • Hello, In pig script of pig-0.8.1, I did not find command line for adding pig*.jar into CLASSPATH environment variable. Just the following: # during development pig jar might be in build for f in ...
    Lulynn_2008
    Sep 9, 2011 at 8:07 am
    Sep 13, 2011 at 12:06 pm
  • hi, i installed pig 0.8.1 and ran the script in local mode but got the following error: ERROR 2998: Unhandled internal error. org/jets3t/service/S3ServiceException i have my S3 connection working, i ...
    Dan Yi
    Sep 9, 2011 at 3:59 pm
    Sep 10, 2011 at 6:09 pm
  • Hey all, correct me if I'm wrong, but there is no way to test a Pig 0.8.1 script through JUnit PigTest if I have multiple STORE statements? Ex Pig end file: STORE result1 INTO '$outputPath/result1' USING ...
    Marek Miglinski
    Sep 9, 2011 at 10:55 am
    Sep 9, 2011 at 6:14 pm
  • Hello, What is the latest version of pig supporting the pigmix queries? The jira latest update mentions pig 0.7 only: Assuming its 0.8 or 0.9, can I use hadoop cdh3 or should I switch to apache's ...
    Keren Ouaknine
    Sep 9, 2011 at 2:00 pm
    Sep 9, 2011 at 6:07 pm
  • Hi, does pig-0.8.1 work with hadoop-0.20.2-cdh3u0? I am getting the error mentioned below. ERROR 2999: Unexpected internal error. Failed to create DataStorage Regards Kiran.G
    Kiranprasad
    Sep 9, 2011 at 12:02 pm
    Sep 9, 2011 at 3:43 pm
  • HI all, How can I generate a Map from a Tuple or a Bag? I am looking for an example. Ruihong
    石瑞红
    Sep 8, 2011 at 5:13 am
    Sep 8, 2011 at 8:29 pm
  • Hi there, We hit a possible issue with Pig (version 0.9.1) and HBaseStorage where we try to LOAD multiple sets of data and UNION them. Here's a simple example that shows the problem: HBase Data (use ...
    Eduardo Afonso Ferreira
    Sep 6, 2011 at 4:50 pm
    Sep 6, 2011 at 5:41 pm
  • Hi, I am trying to use a cached file called GeoLiteCity.dat.gz#datafile in my pig script. For that I used the CACHE keyword as CACHE('HDFS archivefile#symlink'); But when I try to refer to this file ...
    Deepak Reddy
    Sep 2, 2011 at 2:14 am
    Sep 2, 2011 at 6:29 pm
  • Due to the syntax of left joins, you can't change the order: alias = JOIN left-alias BY left-alias-column [LEFT|RIGHT|FULL] [OUTER], right-alias BY right-alias-column [USING 'replicated' | 'skewed' | ...
    Kevin Burton
    Sep 1, 2011 at 1:14 am
    Sep 1, 2011 at 1:23 am
  • Hi I am encountering an issue where the pig job just hangs indefinitely. On debugging, I find that the issue seems to happen when GROUPing a null relation on multiple fields. code: a = LOAD ...
    Sirchabesan, Kannappan
    Sep 30, 2011 at 4:03 am
    Sep 30, 2011 at 8:42 pm
  • Small question—the python UDF doc says that "variable names inside a schema string are not used anywhere, they just make the syntax identifiable to the parser" ...
    Doug Daniels
    Sep 30, 2011 at 7:28 pm
    Sep 30, 2011 at 7:31 pm
  • Does PIG use Zebra by default to store and retrieve data in columnar table format? Regards Kiran.G
    Kiranprasad
    Sep 30, 2011 at 7:26 am
    Sep 30, 2011 at 1:21 pm
Group Overview
period: Sep 2011
group: user @
categories: pig, hadoop
discussions: 72
posts: 321
users: 68
website: pig.apache.org

68 users for September 2011

  • Dmitriy Ryaboy: 34 posts
  • Daniel Dai: 33 posts
  • Kiranprasad: 26 posts
  • Marek Miglinski: 18 posts
  • Alan Gates: 13 posts
  • Keren Ouaknine: 13 posts
  • Norbert Burger: 12 posts
  • Xiaomeng Wan: 10 posts
  • Deepak Reddy: 9 posts
  • Eli Finkelshteyn: 9 posts
  • Damien Hardy: 8 posts
  • shazz Ng: 6 posts
  • Thejas Nair: 6 posts
  • Baraa Mohamad: 5 posts
  • Jiang licht: 5 posts
  • Kevin Burton: 5 posts
  • Lulynn_2008: 5 posts
  • Pierre-Luc Brunet: 5 posts
  • Ryan Hoegg: 5 posts
  • Ipshita chatterji: 4 posts