Grokbase Groups Pig user May 2009
Thank you, Shubham, for the reply.
Actually, my Eclipse can find Hadoop and Pig, and the console displays:

Connected to map-reduce job tracker at: localhost:9001
Connected to hadoop file system at: hdfs://localhost:9000/

My Pig version: 0.1.1
Hadoop version: 0.18.3

Then I copied the file "excite-small.log" to HDFS.

But after I save the script (the one in my last email) and run the Pig
example (by pressing the pig icon), no example is displayed.
After I run it on Hadoop (by pressing the elephant icon), the console shows:

Launching the job!
Using the configuration from pigpen
2009-05-29 10:26:25,319 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2009-05-29 10:26:25,729 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-05-29 10:26:26,430 [main] INFO - Choosing to move algebraic foreach to combiner
2009-05-29 10:26:28,684 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-29 10:26:28,703 [Thread-3] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2009-05-29 10:26:29,173 [main] ERROR - Map reduce job failed
2009-05-29 10:26:29,176 [main] ERROR - excite-small.log does not exist

It looks like it can't find HDFS again, even though it showed at the
beginning that it could.

I also have a question: how does PigPen find the Pig installation? I only see
the path for Is it how it finds Pig?

Thank you!


2009/5/29 Shubham Chopra wrote:
Hi George,

Set the variable ConfigurationPath in Preferences->Pig to point to the
directory containing hadoop-site.xml that contains the configuration of your
hadoop cluster.
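
For reference, a minimal hadoop-site.xml for a setup like the one in this
thread might look like the sketch below. The two addresses are taken from the
console output quoted above (hdfs://localhost:9000/ and localhost:9001); a
single-node installation is an assumption, and a real cluster will need its
own values.

```xml
<?xml version="1.0"?>
<!-- Sketch of a hadoop-site.xml, assuming a single-node setup.
     Both values are copied from the console output in this thread;
     replace them with the addresses of your own cluster. -->
<configuration>
  <!-- Default file system: the HDFS NameNode address -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <!-- Address of the map-reduce JobTracker -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```

ConfigurationPath would then be set to the directory that holds this file,
not to the file itself.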


George Pang wrote:
Dear users,
I run the query in the "Pig Script" window. My query is:

log = LOAD 'excite-small.log' AS (user, timestamp, query);
grpd = GROUP log BY user;
cntd = FOREACH grpd GENERATE group, COUNT(log);
STORE cntd INTO 'output_1';

Nothing unusual. But now I have a question: where do I indicate the Pig
mode (local or hadoop)? I ask because an error message says that it can't
find "excite-small.log". Also, how do I indicate that this file is
actually located in HDFS?

Thank you,


Discussion Overview
group: user @
categories: pig, hadoop
posted: May 28, '09 at 11:29p
active: May 29, '09 at 5:46p

2 users in discussion

George Pang: 2 posts Shubham Chopra: 1 post


