32 discussions - 119 posts

  • Hi, I have the following query where I want to generate (sld, count of distinct domains). The traffic data comes with domain, subnet and the sld is obtained by a second file (with a join). I had a ...
    Tamir Kamara
    Feb 17, 2009 at 7:36 am
    Feb 25, 2009 at 12:02 am
  • Hello, I have a question regarding treatment of dates with PIG. My input files contain a timestamp field in 'yyyymmdd hh:mm:ss' format (e.g. 20090201 14:42:00 ) within a comma delimited file. I want ...
    Avram Aelony
    Feb 18, 2009 at 11:20 pm
    Feb 24, 2009 at 5:10 pm
  • I am processing about 4GB of data on 4-node Hadoop cluster using Pig. The first MAP job it executes generates 80K map tasks. I wonder if this number is a bit excessive. Does such task granularity ...
    Vadim Zaliva
    Feb 10, 2009 at 6:40 pm
    Feb 12, 2009 at 9:19 am
  • What is the best way to install the PigPen Eclipse plugin? I have found the following jira ...
    Avram Aelony
    Feb 26, 2009 at 12:03 am
    Mar 2, 2009 at 11:03 pm
  • I forgot to add the error that is returned when I use the Find and Install tool. Here it is "Selected archive does not contain an update site. Please select another archive" I had problem using the ...
    Iman Elghandour
    Feb 23, 2009 at 9:13 pm
    Feb 25, 2009 at 4:44 pm
  • Greetings everyone, I installed, built, and am able to run the types-stable-2 tag on my local machine. I am also able to run Hadoop jobs on our Hadoop cluster. However, when I try to run Pig on the ...
    Dmitriy Ryaboy
    Feb 2, 2009 at 8:49 pm
    Feb 9, 2009 at 11:14 pm
  • Hi, I have the following queries while going through the types func spec. a) What does MATCHES on two bytearrays mean? The spec says it is supported without any comment. b) Multiplication/Division between ...
    Mridul Muralidharan
    Feb 9, 2009 at 11:11 am
    Feb 9, 2009 at 10:27 pm
  • Hi, I'm trying to run the following query in mapreduce mode: A = LOAD 'item.tbl' USING PigStorage('|') AS (item: int, quantity: double, price: double, discount: double, tax: double, returnflag: ...
    Shirley Cohen
    Feb 3, 2009 at 8:21 pm
    Feb 4, 2009 at 6:07 pm
  • Hi, I had a few queries regarding how nulls interact with the rest of the system, which was not very clear to me (based on my reading of the types func spec). a) nulls are getting treated both as value (a#b) ...
    Mridul Muralidharan
    Feb 9, 2009 at 10:43 am
    Feb 9, 2009 at 8:09 pm
  • Hi, I passed 3,344,109,862 records to ORDER and got 3,339,587,570 in the output with no noticeable errors. There were three jobs. First got 3,344,109,862 records (map input) and produced the same ...
    Daga
    Feb 18, 2009 at 11:03 pm
    Mar 3, 2009 at 9:23 pm
  • Hi, I am trying to run the scripts in the pig tutorial. For script 1, a lexical error is returned: [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at ...
    Iman Elghandour
    Feb 20, 2009 at 12:06 pm
    Feb 23, 2009 at 4:58 pm
  • Hi, I'm trying to use utf-8 strings as follows: phrases = load 'phrases' as (data: chararray, f: int); a = group phrases by f; b = foreach a generate group as f, phrases.data as data; store b into ...
    Daga
    Feb 16, 2009 at 2:11 pm
    Feb 17, 2009 at 6:34 pm
  • I am trying to understand the memory bottleneck in Pig's operators. Are all the row-elements of a group-entry loaded into memory while processing a foreach ... command? For example in the following ...
    Prasenjit mukherjee
    Feb 11, 2009 at 5:07 am
    Feb 12, 2009 at 4:54 pm
  • Hello, I would like to do something like this in a pig script but it's not working (I guess the syntax is not right): the pig script: *%default period 'day' %default dataname 'session'* LOAD 'mysource' ...
    Mathias Fryde
    Feb 26, 2009 at 1:50 pm
    Mar 2, 2009 at 11:18 am
  • NPE: I am getting this exception running pig task: 2009-02-25 01:11:44,167 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) ...
    Vadim Zaliva
    Feb 25, 2009 at 5:36 pm
    Feb 25, 2009 at 7:49 pm
  • Hi, I've recently started experimenting with hadoop (0.18.3) and pig, but I would like to start using pigpen. I'm having problems with configuration of pen on my desktop (as the hadoop is installed ...
    Tamir Kamara
    Feb 11, 2009 at 7:16 pm
    Feb 15, 2009 at 7:46 am
  • Hi all, I have just started running pig. I am trying to follow the steps that are found in the pig tutorial. I was able to compile the latest pig code. I validated the pig.jar using the unit tests. Well, ...
    Iman Elghandour
    Feb 13, 2009 at 10:19 pm
    Feb 14, 2009 at 3:43 am
  • Hi, I've downloaded the latest Pig code from trunk today. It does not compile with the 'ant' command. Trace: nitesh-bhatias-macbook:trunk niteshbhatia$ ant Buildfile: build.xml init: cc-compile: ...
    Nitesh bhatia
    Feb 13, 2009 at 1:02 pm
    Feb 13, 2009 at 2:25 pm
  • Hi, If I understood it right, if there is a schema specified with load, only those fields will be available - that is, there is an implicit project after the load? To illustrate, A = load 'myFile' ...
    Mridul Muralidharan
    Feb 9, 2009 at 12:05 pm
    Feb 9, 2009 at 10:20 pm
  • Hi, I'm trying to understand the map reduce plan that is given by the explain operator for the following query: A = load 'student.tbl' USING PigStorage('|') AS (name: chararray, course: chararray, ...
    Shirley Cohen
    Feb 4, 2009 at 6:07 pm
    Feb 5, 2009 at 8:56 pm
  • Did this JIRA ever get opened? If so, I can't find it. I'm also interested in this feature, and although I'm completely new to Pig, would be willing to dive in and try to implement it given ...
    Avi Bryant
    Feb 11, 2009 at 12:45 am
    Feb 11, 2009 at 12:51 am
  • I've noticed a couple of old threads about implementing storage to a database, but I haven't seen anything in the mailing list archives or the online docs about sourcing data from a database - it's ...
    Gregory Harman
    Feb 10, 2009 at 4:28 am
    Feb 10, 2009 at 5:37 am
  • Hi all, I'm just getting started with Pig, and am having a problem that I'm hoping is some standard rookie mistake. I have the following data in a file "t.csv": 59001000,FOO,6/29/08,22,23,BAR ...
    Gregory Harman
    Feb 7, 2009 at 8:53 pm
    Feb 8, 2009 at 12:31 am
  • I am looking for documentation that clearly specifies how each of the operators uses the mapreduce paradigm. For example foreach..generate may not use any reducer at all (I am assuming). ...
    Prasenjit mukherjee
    Feb 6, 2009 at 2:36 pm
    Feb 6, 2009 at 3:51 pm
  • Hi: I am using pig latin to write map/reduce programs. I run a daily job to create daily results, and then aggregate daily results to weekly/monthly results. I consistently hit a stack overflow ...
    Charles du
    Feb 2, 2009 at 10:52 pm
    Feb 3, 2009 at 6:23 pm
  • I see a lot of warnings like this in my task logs (trunk version) running in mapreduce mode: ne.physicalLayer.expressionOperators.POProject: Attempt to access field which was not found in the input ...
    Vadim Zaliva
    Feb 26, 2009 at 8:31 pm
    Feb 26, 2009 at 8:31 pm
  • Hello, I have a map with an int in it, something like: [*data:userID#232340* ,data:Time#1234464616347L,metaData:agentVersion#1.0-beta,metaData:eventName#Event_PlayerAuthentication] By doing this ...
    Mathias Fryde
    Feb 16, 2009 at 1:37 pm
    Feb 16, 2009 at 1:37 pm
  • The next Bay Area Hadoop User Group meeting is scheduled for Wednesday, February 18th at Yahoo! 2811 Mission College Blvd, Santa Clara, Building 2, Training Rooms 5 & 6 from 6:00-7:30 pm. Agenda: ...
    Ajay Anand
    Feb 12, 2009 at 9:47 pm
    Feb 12, 2009 at 9:47 pm
  • Subject: ApacheCon Europe 2009: Early Bird Deadline Extended until 13th of February Here's some great news for everyone who's thinking of traveling to Amsterdam for this year's ApacheCon Europe. The ...
    Olga Natkovich
    Feb 10, 2009 at 6:26 pm
    Feb 10, 2009 at 6:26 pm
  • Just curious if anyone has developed the PLSI EM (original Hofmann 04's) in Pig. Thanks, -Prasen
    Prasenjit mukherjee
    Feb 9, 2009 at 11:00 am
    Feb 9, 2009 at 11:00 am
  • We are planning the 2009 Hadoop Summit, to be held the second week of June in Santa Clara, CA. Please send me (aanand@yahoo-inc.com) your presentation proposals and suggested topics. Areas we plan to ...
    Ajay Anand
    Feb 5, 2009 at 7:26 pm
    Feb 5, 2009 at 7:26 pm
  • Hey All Just wanted to let everyone know that Scale Unlimited will start offering many of its courses heavily discounted, if not free, to independent consultants and contractors. ...
    Chris K Wensel
    Feb 3, 2009 at 12:33 am
    Feb 3, 2009 at 12:33 am
Group Overview
group: user @
categories: pig, hadoop
discussions: 32
posts: 119
users: 27
website: pig.apache.org

27 users for February 2009

Alan Gates: 17 posts
Iman Elghandour: 11 posts
Mridul Muralidharan: 11 posts
Tamir Kamara: 9 posts
Avram Aelony: 8 posts
Olga Natkovich: 8 posts
Santhosh Srinivasan: 7 posts
Dmitriy Ryaboy: 5 posts
Prasenjit mukherjee: 5 posts
Shirley Cohen: 5 posts
Daga: 4 posts
Vadim Zaliva: 4 posts
Vadim Zaliva: 4 posts
Nitesh bhatia: 3 posts
Shubham Chopra: 3 posts
Ajay Anand: 2 posts
Gregory Harman: 2 posts
Mathias Fryde: 2 posts
Avi Bryant: 1 post
Charles du: 1 post