Grokbase Groups Pig user August 2010

46 discussions - 201 posts

  • Hi, I am trying to integrate Pig with Hadoop for processing of jobs. I am able to run Pig in local mode and Hadoop with streaming API perfectly. But when I try to run Pig with Hadoop I get following ...
    Rahul
    Aug 27, 2010 at 12:33 am
    Aug 27, 2010 at 4:17 pm
  • Wondering about performance and count... A = load 'test.csv' as (a1,a2,a3); B = GROUP A by a1; -- which preferred? C = FOREACH B GENERATE COUNT(A); -- or would this only send a single field through ...
    Corbin Hoenes
    Aug 25, 2010 at 8:59 pm
    Sep 10, 2010 at 3:52 pm
  • Hi all, I am new to pig. I am wondering is there any recommended way to call Pig code from Java? Is there any Java interface which can be called directly from Java and makes them work smoothly? It ...
    Wenhao Xu
    Aug 5, 2010 at 3:57 am
    Aug 5, 2010 at 5:14 pm
  • Hi, I have a very simple script and seeing a very strange behavior, getting wrong results when running this script from a file, while running the same statements on the pig grunt shell I get accurate ...
    Wasti, Syed
    Aug 24, 2010 at 8:12 pm
    Aug 26, 2010 at 12:03 am
  • I'm trying to perform a top-n query in pig. For example's sake, let's say my input data is (employeeid, departmentid, salary). I'm trying to get the top n-highest-salaried employees of each ...
    Neil Kodner
    Aug 22, 2010 at 4:26 pm
    Aug 23, 2010 at 2:28 pm
  • Hi all, I'm trying to read some data from CassandraStorage (contrib by Cassandra) and then work on it, but the format of the data is just incredibly ugly. When just loading it and dumping it I can ...
    Christian Decker
    Aug 25, 2010 at 6:00 pm
    Sep 8, 2010 at 7:24 pm
  • All, I am running pig-0.7.0 and I have been running into an issue running the ORDER command. I have attempted to run pig out of the box on 2 separate LINUX OS (Ubuntu 10.4 and OpenSuse 11.2) and the ...
    Matthew Smith
    Aug 19, 2010 at 6:36 pm
    Aug 25, 2010 at 4:04 pm
  • Hey, While running in Java a LIMIT statement is not getting executed. /code myServer.registerQuery("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;"); ...
    Matthew Smith
    Aug 4, 2010 at 10:07 pm
    Aug 9, 2010 at 12:58 am
  • Hi folks, at the last Pig contributor meeting, the piggybank question was discussed -- namely, how to make it easier to contribute to. (by the way, the contributor meetings are generally open to ...
    Dmitriy Ryaboy
    Aug 27, 2010 at 9:14 pm
    Aug 31, 2010 at 7:01 pm
  • What loader should I use on csv files with quoted strings that contain embedded commas? (i.e. Embedded commas should not be a separator.) And when LOADing large files in local mode, does Pig just ...
    Defenestrator
    Aug 19, 2010 at 7:49 am
    Aug 20, 2010 at 2:26 pm
  • Hi All, We have a requirement of reading sequence files, I have used PIG 0.7.0 and created a custom reader by extending LoadSync. The custom reader is working very well and able to process large ...
    Raman Yakkala
    Aug 16, 2010 at 8:36 pm
    Aug 18, 2010 at 10:24 pm
  • Hi, I am trying to run Pig 0.7.0 on Hadoop 0.21.0. Getting the below error: $ pig 10/08/27 14:38:18 INFO pig.Main: Logging error messages to: /Users/ ...
    Saurav Datta
    Aug 27, 2010 at 9:42 pm
    Aug 27, 2010 at 10:05 pm
  • Hi all, I'm pretty new to Pig and Hadoop so excuse me if this is trivial, but I couldn't find anyone able to help me. I'm trying to get Pig to read data from a Cassandra cluster, which I thought ...
    Christian Decker
    Aug 13, 2010 at 11:22 am
    Aug 17, 2010 at 3:25 pm
  • Adding pig-user@ Sanjay, You can do this in Pig by setting following -D switch at the command line through which you invoke Pig. -Dpig.streaming.ship.files=myTopLevel.jar In 0.8 release you will be ...
    Ashutosh Chauhan
    Aug 11, 2010 at 5:10 pm
    Aug 13, 2010 at 6:34 am
  • Hi All, Just wanted to know: has anyone used Pig on a standalone basis in a production environment, i.e. without integrating with Hadoop? Is that even a good idea? How is the performance if we try ...
    Somdip
    Aug 27, 2010 at 12:43 am
    Aug 27, 2010 at 1:11 am
  • Hi Guys, I am trying to join two data sets and the jobs are failing. There are some warnings reported and I am not good at understanding them. I am seeking your help in adjusting any parameters to ...
    Raman Yakkala
    Aug 20, 2010 at 11:22 pm
    Aug 21, 2010 at 5:50 am
  • All, I have what should be a simple problem. I have 2 tuples that are chararrays t1, t2 and want to do a comparison. Using x = FILTER y BY (t1 == t2); results in zero (0) records. x = FILTER y BY ...
    Dave Wellman
    Aug 18, 2010 at 10:38 pm
    Aug 20, 2010 at 12:49 am
  • I need to sort the DataBags that are input to my UDF after a COGROUP. I am currently sorting them in memory but it is not going to scale in the long term. Is there a way to control the way that Pig ...
    Anthony Urso
    Aug 17, 2010 at 7:00 pm
    Aug 17, 2010 at 11:19 pm
  • I'm attempting to parse some log files using the RegexExtractAll function in the piggybank. Everything was going along swimmingly until I tried to include an expression which contains a semi-colon. ...
    Christopher Hackman
    Aug 30, 2010 at 6:39 pm
    Aug 30, 2010 at 7:08 pm
  • The title might be a bit misleading but I hope you can help me. I have some data (let's say a Web Log file) and I want to be able to compare multiple items with each other. For example I want to know ...
    Christian Decker
    Aug 28, 2010 at 6:13 pm
    Aug 29, 2010 at 4:09 pm
  • Hi All, My PIG jobs are failing since yesterday which was completed successfully in the past. I would appreciate any pointers on the possible root cause. Here is the console log from the job and the ...
    Raman Yakkala
    Aug 27, 2010 at 4:55 pm
    Aug 27, 2010 at 5:49 pm
  • In case of many small files that are smaller than a block size, will pig combine them into one map? Thanks, Michael
    Jiang licht
    Aug 24, 2010 at 11:44 pm
    Aug 24, 2010 at 11:53 pm
  • Hi I have the following scenario- Pig version used 0.70 Sample HDFS directory structure: /user/training/test/20100810/<data files /user/training/test/20100811/<data files ...
    Arun A K
    Aug 18, 2010 at 7:16 pm
    Aug 18, 2010 at 8:00 pm
  • Hello! I try to get the tutorial of pig0.7.0 running in mapreduce mode. But I always get IOExceptions. Looking into the HDFS-Logfiles I found a message "Incorrect header or version mismatch from ... ...
    Rico Bergmann
    Aug 16, 2010 at 3:55 pm
    Aug 17, 2010 at 3:36 pm
  • Just asking again in case someone may know, is there a way to set the priority of a pig job from within the pig script?
    Dave Wellman
    Aug 12, 2010 at 4:48 am
    Aug 12, 2010 at 3:04 pm
  • I keep getting this error, it seems pig cannot locate the jar that contains my udf ImageProcessor. How do I build a jar that contains my udf (p.s. the UDF documentation wasn't that helpful ...
    Ifeanyichukwu Osuji
    Aug 11, 2010 at 5:07 pm
    Aug 11, 2010 at 5:28 pm
  • Hi all, I have successfully installed pig on the machine which serves as the namenode/master node. The only problem is when I run pig in hadoop mode it only makes use of the namenode to perform tasks ...
    Ife joeseph
    Aug 9, 2010 at 5:40 pm
    Aug 10, 2010 at 4:17 pm
  • Is there a way to tell Pig to restrict the size of map/reduce output that can be saved to dfs? E.g. if a job creates over-limit data, it won't be allowed to save the result to the dfs and the job ...
    Jiang licht
    Aug 25, 2010 at 10:52 pm
    Aug 26, 2010 at 9:51 pm
  • I come from the DBMS world and am not really familiar with PIG, so hopefully I'm asking reasonable questions. I was basically wondering if there are patterns in PIG to do the following set ...
    Defenestrator
    Aug 21, 2010 at 5:45 pm
    Aug 22, 2010 at 5:17 am
  • New error, don't know what to do... this is the error I get: laptop:~/pig-0.7.0/trunk$ javac -cp $CLASSPATH:pig.jar ImageProcessorr.java Note: ImageProcessorr.java uses or overrides a deprecated API. ...
    Ifeanyichukwu Osuji
    Aug 11, 2010 at 5:52 pm
    Aug 11, 2010 at 6:49 pm
  • The UDF I am making uses JAI from the javax.media.jai library. I have the UPPER udf working fine but when I add this import statement: import javax.media.jai.JAI; I get an error. How can I fix this? ...
    Ifeanyichukwu Osuji
    Aug 10, 2010 at 10:05 pm
    Aug 11, 2010 at 6:39 am
  • I get this error when I run my pig script. 2010-08-10 13:01:08,100 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve ImageProcessor using imports: [, ...
    Ifeanyichukwu Osuji
    Aug 10, 2010 at 5:10 pm
    Aug 10, 2010 at 5:24 pm
  • I get two types of errors when I try to run this pig script: REGISTER reu.log; A = LOAD 'image.log' AS (image: chararray); B = FOREACH A GENERATE reu.imageprocessor1(image); DUMP B; ...
    Iferulz Yahadinel
    Aug 10, 2010 at 4:56 pm
    Aug 10, 2010 at 5:03 pm
  • I have a simple java file that adds a star to a word and prints the result. (This is the simplest java program I could think of) import java.util.*; public class Star { public static void main ...
    Ifeanyichukwu Osuji
    Aug 9, 2010 at 9:27 pm
    Aug 9, 2010 at 9:46 pm
  • Hello! I created a custom UDF and I need to have different outputSchema columns (in one case - 2 chararray fields, in the other - 3 chararray fields for output) depending on the UDF input parameter (mode_1 = ...
    Васяйчев Сергей
    Aug 6, 2010 at 12:31 pm
    Aug 6, 2010 at 7:54 pm
  • Hi, Have a pig question. I have two HDFS file, a smaller file that has and a larger file that has I would like to replace field2 and field3 in my larger file when they are null match on field1. I am ...
    Kochis, Allan
    Aug 2, 2010 at 2:14 pm
    Aug 2, 2010 at 7:37 pm
  • With 15 +1 votes (14 from PMC members) the proposal passes. Thanks for voting. Owen, please push this to the Apache board for their consideration. Alan.
    Alan Gates
    Aug 26, 2010 at 5:18 pm
    Aug 26, 2010 at 5:18 pm
  • Hello guys, Over at http://search-hadoop.com we index Pig subprojects mailing lists, wiki, web site, source code, javadoc, jira... Would the community be interested in a patch that replaces the ...
    Alex Baranau
    Aug 25, 2010 at 4:12 pm
    Aug 25, 2010 at 4:12 pm
  • Hi there! I have hadoop up and running, processing jobs, both in a local and in a clustered configuration. It works fine. I now have a simple pig script written, which works fine in local mode. But ...
    Mr. Jan Walter
    Aug 19, 2010 at 9:24 pm
    Aug 19, 2010 at 9:24 pm
  • ROOM CHANGE TO 209 (one floor up from usual) Hello Fellow Hadoopists, We are meeting at 7:15 pm on August 19th at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room ...
    Sean jensen-grey
    Aug 19, 2010 at 5:02 pm
    Aug 19, 2010 at 5:02 pm
  • All, I am running pig-0.7.0 and I have been running into an issue running the ORDER command. I have attempted to run pig out of the box on 2 separate LINUX OS (Ubuntu 10.4 and OpenSuse 11.2) and the ...
    Matthew Smith
    Aug 18, 2010 at 6:32 pm
    Aug 18, 2010 at 6:32 pm
  • Are there any complete examples of doing a map side merge join on sorted Zebra tables using java map/reduce code anywhere? The one on the wiki ...
    Deem, Mike
    Aug 17, 2010 at 5:23 pm
    Aug 17, 2010 at 5:23 pm
  • My java file makes use of JAI... will I be able to convert this program to a PIG-UDF? I've tried and it keeps giving me errors. One of the errors I get is "cannot find symbol" (symbol being the ...
    Ifeanyichukwu Osuji
    Aug 10, 2010 at 5:00 pm
    Aug 10, 2010 at 5:00 pm
  • I get two types of errors when I try to run this pig script: A = LOAD 'image.log' AS (image: chararray); B = FOREACH A GENERATE reu.imageprocessor1(image); DUMP B; laptop:~/pig-0.7.0$ java -cp ...
    Ifeanyichukwu Osuji
    Aug 10, 2010 at 4:57 pm
    Aug 10, 2010 at 4:57 pm
  • Hi all, I have successfully installed pig on the machine which serves as the namenode/master node. The only problem is when I run pig in hadoop mode it only makes use of the namenode to perform tasks ...
    Ifeanyichukwu Osuji
    Aug 9, 2010 at 5:40 pm
    Aug 9, 2010 at 5:40 pm
  • PigServer is our interface to Java. This provides a JDBC-like interface. Also, I think you want to send emails to pig-user, not pig-user-owner, so all of the list sees your mail instead of just me, ...
    Alan Gates
    Aug 9, 2010 at 5:39 pm
    Aug 9, 2010 at 5:39 pm
Group Overview
Group: user @
Categories: pig, hadoop
Discussions: 46
Posts: 201
Users: 55
Website: pig.apache.org

55 users for August 2010

  • Dmitriy Ryaboy: 20 posts
  • Thejas M Nair: 16 posts
  • Jeff Zhang: 13 posts
  • Mridul Muralidharan: 11 posts
  • Ifeanyichukwu Osuji: 10 posts
  • Matthew Smith: 10 posts
  • Ashutosh Chauhan: 9 posts
  • Rahul: 8 posts
  • Raman Yakkala: 7 posts
  • Alan Gates: 6 posts
  • Christian Decker: 5 posts
  • Corbin Hoenes: 5 posts
  • Dave Wellman: 5 posts
  • Kaluskar, Sanjay: 5 posts
  • Wasti, Syed: 5 posts
  • Defenestrator: 4 posts
  • Neil Kodner: 4 posts
  • Bill Graham: 3 posts
  • Dmitriy Lyubimov: 3 posts
  • Harsh J: 3 posts