48 discussions - 192 posts

  • What should I do to fill in the user's home directory when writing about pig? SET HOME `echo $HOME` # Does not work -- Russell Jurney twitter.com/rjurney ...
    Russell Jurney
    Feb 11, 2013 at 5:38 am
    Feb 11, 2013 at 5:45 pm
  • Please welcome Bill Graham as our latest Pig PMC member. Congrats Bill!
    Daniel Dai
    Feb 19, 2013 at 9:48 pm
    Feb 20, 2013 at 10:23 pm
  • Hi, I need to pass parameters dynamically to a pig script. Is there any way to read the parameters passed and their corresponding values without giving the parameter names in the pig script? Thanks, ...
    Siddhi Borkar
    Feb 19, 2013 at 11:59 am
    Feb 21, 2013 at 8:30 am
  • Thanks a lot. It works fine. But one more point, I have only one mapper running with this pig job as my cluster has 4 slaves. How could it be different? Regards, Jérôme On 31/01/2013 20:45, Cheolsoo ...
    Jerome Pierson
    Feb 5, 2013 at 6:18 pm
    Feb 6, 2013 at 5:10 pm
  • Hello, folks! I'm using greatly customized HBaseStorage in my pig script. And during HBaseStorage.setLocation() I'm preparing a file with values that would be source for my filter. The filter is used ...
    Eugene Morozov
    Feb 4, 2013 at 9:27 pm
    Feb 20, 2013 at 4:55 am
  • Is there a way to specify max number of reduce tasks that a job should span in pig script without having to restart the cluster?
    Mohit Anchlia
    Feb 1, 2013 at 10:42 pm
    Feb 6, 2013 at 9:31 pm
  • Hi, I am trying to find the way to run the explain command over the entire pig script in java. I was using PigServer but it offers only to do explain over the single query (alias) not the entire ...
    Petar Jovanovic
    Feb 18, 2013 at 4:19 pm
    Feb 18, 2013 at 11:28 pm
  • I pulled together some of the highlights of the pig 0.11 release on the Apache Pig blog (which now officially exists!): https://blogs.apache.org/pig/ D
    Dmitriy Ryaboy
    Feb 23, 2013 at 2:35 am
    Feb 26, 2013 at 10:25 am
  • Hi, I have a file with a few hundreds of columns with doubles and I am interested in creating a correlation matrix for the columns: A = load 'myData' using PigStorage(':'); B = group A all; D = ...
    Houssam H.
    Feb 21, 2013 at 7:39 pm
    Feb 25, 2013 at 1:58 pm
  • Hi everyone. I'm having a problem loading log files based on parameter input and was wondering whether someone would be able to provide some guidance. The logs in question are Omniture logs, stored ...
    Stevens, Ian
    Feb 14, 2013 at 10:17 pm
    Feb 20, 2013 at 8:06 pm
  • Hello, We are running into performance issues with Pig/Hadoop because our input files are small. Everything goes to only 1 Mapper. To get around this, we are trying to use our own Loader like this ...
    Something Something
    Feb 11, 2013 at 6:22 pm
    Feb 12, 2013 at 7:33 pm
  • Hi, guys, was wondering what's going on with this. In pig 0.9 if I do something like this: grouped = group data by (field1, field2); count = foreach grouped generate COUNT(data); That count is 0 ...
    Adair Kovac
    Feb 5, 2013 at 11:01 pm
    Feb 7, 2013 at 9:12 pm
  • All, I have a Pig script that reads data from HBase using HBaseStorage, does some manipulation with some Python UDFs and then writes it using PigStorage. It works fine when I run it as a standalone ...
    Shawn Hermans
    Feb 2, 2013 at 8:46 pm
    Feb 3, 2013 at 9:51 pm
  • Hi, Could someone help me and explain why Pig prints the help options and exits on startup? Thanks in advance!
    Ionut Ignatescu
    Feb 4, 2013 at 5:32 pm
    Feb 11, 2013 at 1:50 pm
  • Hi! I am trying to use pig and hbase but I keep running into a ClassNotFoundException error. I have tried a few things but they have never worked. I am using pig 0.10.1 and hbase 0.94.1, hadoop 1.0.4. I ...
    Kiran chitturi
    Feb 9, 2013 at 12:57 am
    Feb 9, 2013 at 6:58 pm
  • Hi, I need to do some modifications here and need to know how Pig generates DAG. Can someone throw some light on this? regards preeti
    Preeti Gupta
    Feb 25, 2013 at 8:02 pm
    Feb 26, 2013 at 2:38 am
  • I'm trying to write a loader, extending LoadFunc, to read a specific file format. My question, how do I pass properties to it (for example the schema of the file type I'm loading)? Would it be using ...
    Jeff Yuan
    Feb 24, 2013 at 11:33 pm
    Feb 25, 2013 at 3:25 am
  • I have a set of jobs to run with different parameters. I'm using Python to prepare the parameter sets and then I'm executing them in batches with Pig.run(batchOfParams). The number of jobs is quite ...
    Jakub Glapa
    Feb 15, 2013 at 4:27 pm
    Mar 29, 2013 at 9:46 am
  • If I have some information in A, that contains dt_dt and platform, I want to store it in a different json format, So I can create a simple new bag like this X = FOREACH A GENERATE dt_dt as ...
    Robert McCarthy
    Feb 21, 2013 at 10:56 am
    Mar 1, 2013 at 4:12 pm
  • Hello, Can somebody please explain to me the difference between the Limit and Sample statements? Does it read the entire input file in case of Sample if the value is set to 0.1 or it reads randomly only ...
    Panshul Whisper
    Feb 26, 2013 at 11:20 pm
    Feb 27, 2013 at 1:09 am
  • I got the following error when using the new built in function CurrentTime() 2013-02-15 14:42:37,228 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from ...
    Danfeng Li
    Feb 16, 2013 at 12:12 am
    Feb 26, 2013 at 11:04 pm
  • Hi All, I just got started with Pig. It looks very interesting. It might offer me a great alternative to my SQL-like tools. Would you please give me some suggestions on a few good books or tutorials ...
    William Kang
    Feb 24, 2013 at 4:45 pm
    Feb 26, 2013 at 2:27 am
  • Dear All, We are using pig with elephant-bird thrift to process structured records. We were writing tons of UDFs in Java before, and we are trying to use Jython UDFs more since it is much easier to ...
    Stanley Xu
    Feb 8, 2013 at 3:19 am
    Feb 8, 2013 at 7:04 am
  • I am using the -tagsource option while loading the input data in order to identify the input source. It seems that, later while I project only selected fields from the input tuple, there are some ...
    Prabu Dhakshinamurthy
    Feb 3, 2013 at 7:57 pm
    Feb 4, 2013 at 9:08 pm
  • Is there a way to have LOAD ignore missing paths in a list rather than fail? Thanks, Ben
    Benjamin Juhn
    Feb 2, 2013 at 9:13 pm
    Feb 3, 2013 at 6:02 am
  • I called CurrentTime() twice in my pig code, but the final results end up with the same timestamp. The code is as follows: A = load 'test.txt' as (a:chararray); dump A; B = foreach A generate a, ...
    Danfeng Li
    Feb 27, 2013 at 12:09 am
    Feb 27, 2013 at 12:51 am
  • Hi All, I am using pig-0.10.0 with hadoop-1.1.1, and the following tests failed with the information: Testcase: testSkewedJoin took 45.001 sec FAILED expected:<0> but was:<2> ...
    Feb 25, 2013 at 11:10 am
    Feb 25, 2013 at 6:21 pm
  • Hi All, It is my understanding that Pig engine will translate the pig script into jar files. But I don't know what is the default location to hold those jar files. I guess it is "/tmp" on local ...
    Colin Wang
    Feb 21, 2013 at 8:42 am
    Feb 22, 2013 at 6:12 am
  • Hello, I have trouble with Pig and Java. In mapreduce mode pig works, but it is slow for testing and learning... I'm trying to set up a Java class to test pig scripts and learn how to use those ...
    Bojan Kostić
    Feb 21, 2013 at 11:40 pm
    Feb 22, 2013 at 12:06 am
  • Is there any way to use script parameters here and cast them to int? In other words, can LoadFunc constructors only accept Strings? These do not work: A = LOAD '/foo/replace_me_with_regex' USING ...
    Stevens, Ian
    Feb 21, 2013 at 5:21 pm
    Feb 21, 2013 at 9:31 pm
  • Hello there, I'm starting to use Pig for processing events and I'm having one specific issue. Currently, the writing process, writes a line to the file and syncs the file to readers ...
    Lucas Bernardi
    Feb 5, 2013 at 9:06 pm
    Feb 14, 2013 at 11:37 pm
  • I am new to Hadoop/PIG. I have two data sets in my HDFS. One set is the Persons and the second set is the Addresses (CSV files). Both data sets have the unique id called personid. I want to be able ...
    Satish Kolli
    Feb 12, 2013 at 6:30 pm
    Feb 12, 2013 at 7:15 pm
  • I'm writing a UDF of my own that would produce tuples, each tuple has a string field that could be real large. I did a quick test and the current size of the field is 146,447 characters and it ...
    Dexin Wang
    Feb 7, 2013 at 12:40 am
    Feb 7, 2013 at 3:47 pm
  • I would like to see what the pig script generates. How can I do that?
    Matthew Purdy
    Feb 7, 2013 at 2:54 am
    Feb 7, 2013 at 5:31 am
  • Hey All, I have a pig script where the load uses PigStorage with a different delimiter. I noticed that pigunit does not correctly parse the line of input. Pigunit parsing happens correctly only with ...
    Siddhi mehta
    Feb 5, 2013 at 7:44 pm
    Feb 5, 2013 at 11:13 pm
  • Hi community, I have the following pig script: define FormatMessage com.cision.hadoop.pig.MessageFormatter(); --If you want the message to have no empty fields use this --define FormatMessage ...
    Jonas Hartwig
    Feb 4, 2013 at 2:37 pm
    Feb 4, 2013 at 5:32 pm
  • I have a requirement to parse the XML, perform some calculations and generate a CSV report. I checked XML Loader; however, it seems to display even the XML tags. Is there any working example on xml ...
    Siddhi Borkar
    Feb 4, 2013 at 12:17 pm
    Feb 4, 2013 at 1:12 pm
  • Hello Everyone, I want to make some changes in the way Pig generates Hadoop jobs. Any one got some idea on how to do this? regards Preeti
    Preeti Gupta
    Feb 26, 2013 at 9:54 pm
    Feb 26, 2013 at 9:54 pm
  • Hi, Forgive me if this is a really dumb question but I have been stuck on this for some time. Please guide me on how I can do this. I have a relation like this - (a, { (x), (y), (z) } ) (b, { (s), (t), (u) } ) I ...
    Kodre, Ravi
    Feb 25, 2013 at 7:37 pm
    Feb 25, 2013 at 7:37 pm
  • The Pig team is happy to announce the Pig 0.11.0 release. Apache Pig provides a high-level data-flow language and execution framework for parallel computation on Hadoop clusters. More details about ...
    Bill Graham
    Feb 22, 2013 at 5:04 am
    Feb 22, 2013 at 5:04 am
  • Hi, I'm a new user of pig, so I apologize if my question seems simplistic. Is there a way to specify (via configuration or cmdline input) a different loader to be used as default? What I mean is, if ...
    Jeff Yuan
    Feb 22, 2013 at 2:35 am
    Feb 22, 2013 at 2:35 am
  • I think the email was filtered out. Resending. ---------- Forwarded message ---------- From: Aniket Mokashi ...
    Aniket Mokashi
    Feb 22, 2013 at 2:08 am
    Feb 22, 2013 at 2:08 am
  • Hello, I've been experimenting with Pig using the Accumulo-Pig extension for reading and writing data to an Accumulo table where I've run into a problem doing a join. I'm hoping someone on this list ...
    Robert Saccone
    Feb 19, 2013 at 10:11 pm
    Feb 19, 2013 at 10:11 pm
  • Hello, I am working with a large dataset of logs (approximately 1.5TB every month). Each record in the log contains a list of fields and a common query by the users on a daily basis is to filter ...
    Prabu Dhakshinamurthy
    Feb 17, 2013 at 6:45 pm
    Feb 17, 2013 at 6:45 pm
  • I have a requirement to parse an xml and generate columns based on parameters specified by the user to the pig script. For eg, consider the following xml <school <students <student <name test</test ...
    Siddhi Borkar
    Feb 11, 2013 at 11:56 am
    Feb 11, 2013 at 11:56 am
  • Hi, I am a fresher in Hadoop technologies. I want to take part in any (Hive, Pig) related projects (I used to be an Informatica developer) and start off my career. All enterprises need experienced ...
    Feb 11, 2013 at 5:08 am
    Feb 11, 2013 at 5:08 am
  • When I try to use the following statement explain -brief A; I got the following error 2013-02-06 19:18:34,250 [Low Memory Detector] INFO org.apache.pig.impl.util.SpillableMemoryManager - first memory ...
    Danfeng Li
    Feb 7, 2013 at 10:43 pm
    Feb 7, 2013 at 10:43 pm
  • Hi, Pig contributors, For those who plan to attend the Pig meetup today, here are couple of notes: Google handout link ...
    Daniel Dai
    Feb 7, 2013 at 7:00 pm
    Feb 7, 2013 at 7:00 pm
Group Overview
group: user @
categories: pig, hadoop

64 users for February 2013

  • Jonathan Coveney: 16 posts
  • Cheolsoo Park: 12 posts
  • Prashant Kommireddi: 10 posts
  • Russell Jurney: 9 posts
  • Siddhi Borkar: 6 posts
  • Bill Graham: 5 posts
  • Eugene Morozov: 5 posts
  • Harsha: 5 posts
  • Johnny Zhang: 5 posts
  • Preeti Gupta: 5 posts
  • Stevens, Ian: 5 posts
  • Alan Gates: 4 posts
  • Danfeng Li: 4 posts
  • David LaBarbera: 4 posts
  • Jameson Li: 4 posts
  • Jeff Yuan: 4 posts
  • Jerome Pierson: 4 posts
  • Mohit Anchlia: 4 posts
  • Petar Jovanovic: 4 posts
  • Prabu Dhakshinamurthy: 4 posts