I think it's running at last. I added a "HADOOPDIR" variable and its value
under Build Path / Configure Build Path / Add Variable.

However, the outcome is a little strange. This is the message from
my console:
09/06/10 02:34:13 INFO executionengine.HExecutionEngine: Connecting to
hadoop file system at: hdfs://localhost:9000
09/06/10 02:34:14 INFO executionengine.HExecutionEngine: Connecting to
map-reduce job tracker at: localhost:9001
09/06/10 02:34:14 INFO
mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
POPackage->POForEach to POJoinPackage
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
before optimization: 2
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: Merged 0 out of
total 1 splittees.
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
after optimization: 2
09/06/10 02:34:16 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:16 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
09/06/10 02:34:16 INFO mapReduceLayer.MapReduceLauncher: 0% complete
09/06/10 02:34:20 INFO mapReduceLayer.MapReduceLauncher: 12% complete
09/06/10 02:34:22 INFO
........

mapReduceLayer.MapReduceLauncher: 50% complete
BYTES WRITTEN : 70731111
09/06/10 02:34:38 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:39 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
BYTES WRITTEN : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: 100% complete

09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "hdfs://localhost:9000/tmp/temp-1002982376/tmp-536709268"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "TEST"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Records written :

09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Bytes written : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Success!

For some reason it writes nothing to the output file. That's odd, because
I can run the same script in Grunt and get a correct result.

What would you suggest? Thanks.

George


2009/6/10 Alan Gates <gates@yahoo-inc.com>
You are running in map reduce mode, but you are not attaching to your
hadoop cluster. It's running locally. That's what the "Connecting to
hadoop file system at file:///" message means. If you were connecting to a
cluster it would say "hdfs://yournamenode" instead of "file:///". Is the
directory containing your hadoop-site.xml in your classpath when executing
the pig command? See
http://hadoop.apache.org/pig/docs/r0.2.0/tutorial.html, the section
"Running the Pig Scripts in Hadoop Mode".

Alan.
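
[Editor's sketch] One way to test the classpath question above from inside the embedded Java program itself is to ask the class loader whether it can see hadoop-site.xml. This is a minimal sketch, not something posted in the thread: the class name ClasspathCheck and the method locate are hypothetical, and it uses only the standard Java class loader API. It assumes, as the reply suggests, that Hadoop picks up hadoop-site.xml from a directory on the classpath.

```java
import java.net.URL;

// Hypothetical helper (not part of Pig or Hadoop): reports where a named
// resource is visible on the classpath, or null if it is not visible.
class ClasspathCheck {

    // Looks the resource up through the context class loader, falling back
    // to the system class loader if no context loader is set.
    static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hadoop-site.xml");
        if (url == null) {
            // The Hadoop conf directory is probably not on the classpath.
            System.out.println("hadoop-site.xml is NOT visible on the classpath");
        } else {
            System.out.println("hadoop-site.xml found at " + url);
        }
    }
}
```

Running this with the same Eclipse run configuration as the Pig program should show whether the conf directory is really visible to that launch, which may differ from what the shell sees.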


On Jun 9, 2009, at 11:03 PM, George Pang wrote:

I think it's not in mapreduce mode, because I found the same error again:
INFO executionengine.HExecutionEngine: Connecting to hadoop file system
at:
file:///
09/06/09 21:18:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=

George

2009/6/9 George Pang <p0941p@gmail.com>

Now I can run id.hadoop (from the official tutorial
http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html) as an embedded
Java program, and I can get the result from HDFS. But one line of the
console message before the "Success! " reads:
WARN mapReduceLayer.MapReduceLauncher: Jobs not found in the JobClient.
Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes
instead of Hadoop LocalExecution

What does it mean, and does it matter? Is my program running in map-reduce
mode at all? Thanks for any ideas!

George


2009/6/3 George Pang <p0941p@gmail.com>

Hi Ankur,
Everything runs from the command line; the error only happens when I use
Eclipse. My Eclipse version is 3.4.2. When I run the embedded Java
program, it gives me the error
"INFO executionengine.HExecutionEngine: Connecting to hadoop file system
at: file:/// INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId="

The environment variables are set. Does it have something to do with where
the data files are put? Thank you.

George

2009/6/3 Ankur Goel <gankur@yahoo-inc.com>

Make sure you have the following parameters set:-
PIGDIR=your/pig/dir

# you will need to set this, else pig assumes the version to be 17
# and may not be able to find/connect your namenode/jobtracker
PIG_HADOOP_VERSION=18

HADOOPDIR=your/hadoop/dir/conf

PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR

Also make sure you have the yahoo specific lines at the bottom of
pig.properties (under PIGDIR/conf) commented out.

-Ankur
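
[Editor's sketch] Since the settings above only help if every PIG_CLASSPATH entry actually exists, a quick sanity check is to split the value and report entries that are missing on disk. This is a minimal sketch under stated assumptions: the class ClasspathEntries and the method missingEntries are hypothetical names, and the check only tests that each path exists, not that it contains the right jar or configuration files.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Pig): lists classpath entries that do
// not exist on disk.
class ClasspathEntries {

    // Splits a classpath string on the platform path separator (":" on
    // Unix) and returns the entries that are neither a file nor a directory.
    static List<String> missingEntries(String classpath) {
        List<String> missing = new ArrayList<>();
        if (classpath == null || classpath.isEmpty()) {
            return missing;
        }
        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty() && !new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Reads the variable as seen by this JVM, which inside Eclipse is
        // whatever the run configuration passes, not the shell's value.
        String cp = System.getenv("PIG_CLASSPATH");
        if (cp == null) {
            System.out.println("PIG_CLASSPATH is not set for this process");
            return;
        }
        for (String entry : missingEntries(cp)) {
            System.out.println("missing classpath entry: " + entry);
        }
    }
}
```

An unexpanded value such as a literal $HADOOPDIR would show up here as a missing entry, which is one plausible way a setting can look right and still not take effect.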

----- Original Message -----
From: "George Pang" <p0941p@gmail.com>
To: pig-user@hadoop.apache.org
Sent: Wednesday, June 3, 2009 6:05:22 AM GMT +05:30 Chennai, Kolkata,
Mumbai, New Delhi
Subject: Re: Error on running pig-embedded Java code

Is anyone able to answer this one?
Thanks

George

2009/5/30 George Pang <p0941p@gmail.com>

Dear users,
I compiled and ran the pig-embedded Java code from the "pig quick start"
example on Eclipse. I got the following error:

INFO executionengine.HExecutionEngine: Connecting to hadoop file
system at:
file:///
INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker,
sessionId=

Obviously it can't find HDFS or Hadoop. But I have set
PIG_CLASSPATH to

/usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
and set the other environment variables under Run Configurations / Environment.
Is there anything I forgot to do? Any ideas are much appreciated!

Pig: 0.1.1
Hadoop: 0.18.3
Eclipse: 3.4.2

George

Discussion Overview
group: user @
categories: pig, hadoop
posted: May 31, '09 at 2:44a
active: Jun 11, '09 at 12:56a
posts: 13
users: 4
website: pig.apache.org