Grokbase Groups Pig user May 2009
Dear users,

I compiled and ran the Pig-embedded Java code from the "Pig Quick Start"
example in Eclipse. I got the following error:

INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
file:///
INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
sessionId=

Apparently it can't find HDFS or Hadoop, even though I have set PIG_CLASSPATH
to
/usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
along with the other environment variables, under Run Configurations / Environment.
Is there anything I forgot to do? Any ideas are much appreciated!

Pig: 0.1.1
Hadoop: 0.18.3
Eclipse: 3.4.2

George

  • Zhang jianfeng at May 31, 2009 at 2:48 am
    From your logs I can see that you are running Pig in local mode rather than
    Hadoop mode.


  • George Pang at May 31, 2009 at 3:03 am
    My Java code is the "idhadoop.java" example from
    http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html
    so I think it ran in Hadoop mode.

    George


  • George Pang at Jun 3, 2009 at 12:35 am
    Is anyone able to answer this one?
    Thanks

    George

  • Ankur Goel at Jun 3, 2009 at 7:59 am
    Make sure you have the following parameters set:

    PIGDIR=your/pig/dir

    # you will need to set this, else pig assumes the version to be 17
    # and may not be able to find/connect your namenode/jobtracker
    PIG_HADOOP_VERSION=18

    HADOOPDIR=your/hadoop/dir/conf

    PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR

    Also make sure the Yahoo-specific lines at the bottom of pig.properties
    (under PIGDIR/conf) are commented out.

    -Ankur
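
Editorial sketch, not part of the original reply: variables entered under Run Configurations / Environment reach the launched program only as environment variables. As far as I know, PIG_CLASSPATH is read by the pig launcher script rather than by the JVM, so setting it in Eclipse may not change the classpath of an embedded program. The small Java class below (the name ClasspathDump is hypothetical) prints both views so they can be compared.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathDump {

    /** Splits a classpath string on the platform path separator, skipping empty entries. */
    public static List<String> entries(String classpath) {
        List<String> result = new ArrayList<>();
        if (classpath == null) {
            return result;
        }
        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Environment variables as the launched JVM sees them (null when unset).
        String[] names = {"PIGDIR", "HADOOPDIR", "PIG_HADOOP_VERSION", "PIG_CLASSPATH"};
        for (String name : names) {
            System.out.println(name + "=" + System.getenv(name));
        }
        // The classpath the JVM is actually using, one entry per line.
        for (String entry : entries(System.getProperty("java.class.path"))) {
            String note = new File(entry).exists() ? "" : "  (missing on disk)";
            System.out.println("classpath entry: " + entry + note);
        }
    }
}
```

If the Hadoop conf directory shows up in PIG_CLASSPATH but not among the classpath entries, the embedded program is not seeing it.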

  • George Pang at Jun 3, 2009 at 5:28 pm
    Hi Ankur,
    Everything runs from the command line; the error only happens when I use
    Eclipse. My Eclipse version is 3.4.2. When I run the embedded Java
    program, it gives me this output:
    INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
    file:///
    INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
    sessionId=

    The environment variables are set. Could it have something to do with where
    the data files are put? Thank you.

    George

  • George Pang at Jun 10, 2009 at 5:37 am
    Now I can run id.hadoop (from the official tutorial,
    http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html) as an embedded
    Java program, and I can get the result from HDFS. But one line of the
    console output before the "Success!" reads:
    WARN mapReduceLayer.MapReduceLauncher: Jobs not found in the JobClient.
    Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes
    instead of Hadoop LocalExecution

    What does it mean, and does it matter? Is my program running in map-reduce
    mode at all? Thanks for any ideas!

    George


  • George Pang at Jun 10, 2009 at 6:03 am
    I think it's not in map-reduce mode, because I found the same message again:
    INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
    file:///
    09/06/09 21:18:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with
    processName=JobTracker, sessionId=

    George

  • Alan Gates at Jun 10, 2009 at 1:00 pm
    You are running in map-reduce mode, but you are not attaching to your
    Hadoop cluster; the job is running locally. That's what "Connecting
    to hadoop file system at: file:///" means. If you were connecting to a
    cluster it would say "hdfs://yournamenode" instead of "file:///".
    Is the directory containing your hadoop-site.xml in your classpath
    when executing the pig command? See the section "Running the Pig Scripts
    in Hadoop Mode" in http://hadoop.apache.org/pig/docs/r0.2.0/tutorial.html.

    Alan.
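
Editorial sketch, not part of the original reply: the two checks described above can be tried from inside the same Eclipse run configuration with plain Java. The class name HadoopConfProbe and its method names are hypothetical. The sketch assumes that Hadoop of this era picks up hadoop-site.xml as a classpath resource, so the file must be visible to the class loader for the cluster settings to apply.

```java
import java.net.URI;
import java.net.URL;

public class HadoopConfProbe {

    /** Classifies the file system URI from the "Connecting to hadoop file system at:" log line. */
    public static String fileSystemKind(String fsUri) {
        String scheme = URI.create(fsUri).getScheme();
        if ("file".equals(scheme)) {
            return "local";   // local file system, no cluster attached
        }
        if ("hdfs".equals(scheme)) {
            return "hdfs";    // a namenode is being contacted
        }
        return "other";
    }

    /** Returns the location of a classpath resource, or null when it is not visible. */
    public static URL findOnClasspath(String resourceName) {
        return HadoopConfProbe.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        URL site = findOnClasspath("hadoop-site.xml");
        if (site == null) {
            System.out.println("hadoop-site.xml is NOT on the classpath");
        } else {
            System.out.println("hadoop-site.xml found at " + site);
        }
        System.out.println("file:/// -> " + fileSystemKind("file:///"));
        System.out.println("hdfs://yournamenode -> " + fileSystemKind("hdfs://yournamenode"));
    }
}
```

Running this with the same classpath as the embedded Pig program shows whether the conf directory is actually visible to the JVM.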
  • George Pang at Jun 10, 2009 at 8:21 pm
    I think it's running at last. I added "HADOOPDIR" and its value under
    Build Path / Configure Build Path / Add Variable.

    However, the outcome is a little strange. This is the output on
    my console:
    09/06/10 02:34:13 INFO executionengine.HExecutionEngine: Connecting to
    hadoop file system at: hdfs://localhost:9000
    09/06/10 02:34:14 INFO executionengine.HExecutionEngine: Connecting to
    map-reduce job tracker at: localhost:9001
    09/06/10 02:34:14 INFO
    mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
    POPackage->POForEach to POJoinPackage
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
    before optimization: 2
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: Merged 0 out of
    total 1 splittees.
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
    after optimization: 2
    09/06/10 02:34:16 INFO mapReduceLayer.JobControlCompiler: Setting up single
    store job
    09/06/10 02:34:16 WARN mapred.JobClient: Use GenericOptionsParser for
    parsing the arguments. Applications should implement Tool for the same.
    09/06/10 02:34:16 INFO mapReduceLayer.MapReduceLauncher: 0% complete
    09/06/10 02:34:20 INFO mapReduceLayer.MapReduceLauncher: 12% complete
    09/06/10 02:34:22 INFO
    ........

    mapReduceLayer.MapReduceLauncher: 50% complete
    BYTES WRITTEN : 70731111
    09/06/10 02:34:38 INFO mapReduceLayer.JobControlCompiler: Setting up single
    store job
    09/06/10 02:34:39 WARN mapred.JobClient: Use GenericOptionsParser for
    parsing the arguments. Applications should implement Tool for the same.
    BYTES WRITTEN : 0
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: 100% complete
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
    result in: "hdfs://localhost:9000/tmp/temp-1002982376/tmp-536709268"
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
    result in: "TEST"
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Records written :
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Bytes written : 0
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Success!

    For some reason it writes nothing to the output file. That's odd, because
    I can run the same script in Grunt and get the correct result.

    What do you think? Thanks.

    George


  • Zjffdu at Jun 11, 2009 at 12:45 am
    Hi George,

    Are you using a custom load function? If so, you could add some logging to
    its getNext() method. Maybe an exception is being thrown there.

    I ran into this problem before too.

    Jeff Zhang



  • George Pang at Jun 11, 2009 at 12:56 am
    Thank you, Jeff. I didn't use a custom load function, but I did debug the
    script line by line. Since it runs fine in my PigPen, there is no reason it
    shouldn't run in my embedded program. Do you remember what the bug was when
    you had the same problem?
    George

    2009/6/11 zjffdu <zjffdu@gmail.com>
    Hi George,

    Do you use your customed Load Func? if then you can add some log in the
    getNext() method. Maybe some exceptions happened there.

    I also meet this problem before.

    Jeff Zhang



    -----Original Message-----
    From: George Pang
    Sent: 2009年6月10日 13:21
    To: pig-user@hadoop.apache.org
    Subject: Re: Error on running pig-embedded Java code

    I think it's running at last. I add to the Build Path/Configure Build
    Path/Add Variable "HADOOPDIR" and its value.

    However, something a little strange of my outcome. This is the message
    from
    my console:
    09/06/10 02:34:13 INFO executionengine.HExecutionEngine: Connecting to
    hadoop file system at: hdfs://localhost:9000
    09/06/10 02:34:14 INFO executionengine.HExecutionEngine: Connecting to
    map-reduce job tracker at: localhost:9001
    09/06/10 02:34:14 INFO
    mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
    POPackage->POForEach to POJoinPackage
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
    before optimization: 2
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: Merged 0 out of
    total 1 splittees.
    09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
    after optimization: 2
    09/06/10 02:34:16 INFO mapReduceLayer.JobControlCompiler: Setting up single
    store job
    09/06/10 02:34:16 WARN mapred.JobClient: Use GenericOptionsParser for
    parsing the arguments. Applications should implement Tool for the same.
    09/06/10 02:34:16 INFO mapReduceLayer.MapReduceLauncher: 0% complete
    09/06/10 02:34:20 INFO mapReduceLayer.MapReduceLauncher: 12% complete
    09/06/10 02:34:22 INFO
    ........

    mapReduceLayer.MapReduceLauncher: 50% complete
    BYTES WRITTEN : 70731111
    09/06/10 02:34:38 INFO mapReduceLayer.JobControlCompiler: Setting up single
    store job
    09/06/10 02:34:39 WARN mapred.JobClient: Use GenericOptionsParser for
    parsing the arguments. Applications should implement Tool for the same.
    BYTES WRITTEN : 0
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher:
    100% complete

    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully
    stored
    result in: "hdfs://localhost:9000/tmp/temp-1002982376/tmp-536709268"
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully
    stored
    result in: "TEST"
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Records written :

    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Bytes written :
    09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Success!

    For some reason it writes nothing into the output file. It's weird, because
    I can run the same script in Grunt and get a correct result.

    What would you say? Thanks.

    George


    2009/6/10 Alan Gates <gates@yahoo-inc.com>
    You are running in map reduce mode, but you are not attaching to your
    hadoop cluster. It's running locally. That's what the "Connecting to
    hadoop file system at file:///" means. If you were connecting to a cluster
    it would say "hdfs://yournamenode" instead of "file:///". Is the
    directory containing your hadoop-site.xml in your classpath when executing
    the pig command? See
    http://hadoop.apache.org/pig/docs/r0.2.0/tutorial.html, the section
    "Running the Pig Scripts in Hadoop Mode".

    Alan.
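
    Alan's point can be checked mechanically: the URI scheme in the
    "Connecting to hadoop file system at:" line shows which file system Pig
    attached to. The sketch below classifies such a log line. The class name
    FsModeCheck and its method are hypothetical; it only inspects text and
    assumes nothing about Pig's API.

```java
/** Hypothetical helper: classifies the file system URI that Pig logs at startup. */
class FsModeCheck {

    // Text that precedes the URI in the log lines quoted in this thread.
    private static final String MARKER = "Connecting to hadoop file system at:";

    /** Returns "local" for file: URIs, "hdfs" for hdfs: URIs, otherwise "unknown". */
    static String classify(String logLine) {
        int idx = logLine.indexOf(MARKER);
        if (idx < 0) {
            return "unknown"; // not a file-system connection line
        }
        String target = logLine.substring(idx + MARKER.length()).trim();
        if (target.startsWith("file:")) {
            return "local";
        }
        if (target.startsWith("hdfs:")) {
            return "hdfs";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // The two variants that appear in the console output above.
        String[] samples = {
            "INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: file:///",
            "INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:9000"
        };
        for (String line : samples) {
            System.out.println(classify(line) + " <- " + line);
        }
    }
}
```

    A "local" result corresponds to the situation Alan describes, where the
    directory holding hadoop-site.xml was probably not on the classpath.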


    On Jun 9, 2009, at 11:03 PM, George Pang wrote:

    I don't think it's in mapreduce mode, because I found the error
    again:
    INFO executionengine.HExecutionEngine: Connecting to hadoop file system
    at:
    file:///
    09/06/09 21:18:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with
    processName=JobTracker, sessionId=

    George

    2009/6/9 George Pang <p0941p@gmail.com>

    Now I can run idhadoop (from the official tutorial) as an embedded
    Java program, and I can get the result from HDFS. But one line of the
    console message before the "Success!" reads:
    WARN mapReduceLayer.MapReduceLauncher: Jobs not found in the JobClient.
    Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes
    instead of Hadoop LocalExecution

    What does it mean, and does it matter? Is my program running in
    map-reduce mode at all? Thanks for any idea!

    George


    2009/6/3 George Pang <p0941p@gmail.com>

    Hi Ankur,
    Everything runs fine from the command line; the error only happens when I
    use Eclipse. My Eclipse version is 3.4.2. When I run the embedded Java
    program, it gives me the error
    "INFO executionengine.HExecutionEngine: Connecting to hadoop file
    system
    at: file:/// INFO jvm.JvmMetrics: Initializing JVM Metrics with
    processName=JobTracker, sessionId="

    The environment variables are set. Does it have something to do with
    where the data files are put? Thank you.

    George

    2009/6/3 Ankur Goel <gankur@yahoo-inc.com>

    Make sure you have the following parameters set:-
    PIGDIR=your/pig/dir

    # you will need to set this, else pig assumes the version to be 17
    # and may not be able to find/connect your namenode/jobtracker
    PIG_HADOOP_VERSION=18

    HADOOPDIR=your/hadoop/dir/conf

    PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR

    Also make sure the Yahoo-specific lines at the bottom of pig.properties
    (under PIGDIR/conf) are commented out.

    -Ankur
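
    One way to confirm that the settings Ankur lists actually reach the JVM
    that Eclipse launches is to print them from inside the program. This is a
    minimal sketch that assumes the four variable names from the message
    above. The class and method names are hypothetical, and the check does
    not establish whether an embedded program consults each variable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which of the variables from the thread are unset. */
class PigEnvCheck {

    // Variable names taken from Ankur's message.
    static final String[] EXPECTED = {
        "PIGDIR", "PIG_HADOOP_VERSION", "HADOOPDIR", "PIG_CLASSPATH"
    };

    /** Returns the expected variables that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : EXPECTED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    /** Builds the value shown as PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR (Unix separator). */
    static String pigClasspath(String pigDir, String hadoopConfDir) {
        return pigDir + "/pig.jar:" + hadoopConfDir;
    }

    public static void main(String[] args) {
        // System.getenv() reflects the environment of this JVM, which under
        // Eclipse comes from the Run Configurations / Environment tab.
        List<String> unset = missing(System.getenv());
        if (unset.isEmpty()) {
            System.out.println("All expected variables are set.");
        } else {
            System.out.println("Not set in this JVM: " + unset);
        }
    }
}
```

    Running it with the same Run Configuration as the embedded program shows
    the environment that program would see.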

    ----- Original Message -----
    From: "George Pang" <p0941p@gmail.com>
    To: pig-user@hadoop.apache.org
    Sent: Wednesday, June 3, 2009 6:05:22 AM GMT +05:30 Chennai, Kolkata,
    Mumbai, New Delhi
    Subject: Re: Error on running pig-embedded Java code

    Is anyone trying to answer this one?
    Thanks

    George

  • George Pang at Jun 10, 2009 at 8:22 am
    This is how I run it in Eclipse:
    1) Create a project and put idhadoop.java under it
    2) Use Build Path/Configure Build Path/Add External Jar/($PIGDIR)
    3) Run Configurations/ under "Main": Main Class: idhadoop

    Then I get the error message as described.
    Please help if you see anything wrong or missing in this process. Thank you.

    George
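
    The steps above add the Pig jar to the build path but do not mention the
    Hadoop conf directory. A quick way to see whether that directory reached
    the runtime classpath is to scan it for the configuration file. This is a
    sketch under assumptions: the file name hadoop-site.xml comes from Alan's
    reply in this thread, and the class and method names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: finds classpath directories that contain a given file. */
class ConfOnClasspath {

    /** Returns the classpath entries that are directories containing fileName. */
    static List<String> dirsContaining(String classpath, String fileName) {
        List<String> hits = new ArrayList<>();
        // Split on the platform separator (":" on Unix, ";" on Windows).
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue;
            }
            File dir = new File(entry);
            // Jar entries are skipped: only plain directories are inspected.
            if (dir.isDirectory() && new File(dir, fileName).isFile()) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        List<String> hits = dirsContaining(classpath, "hadoop-site.xml");
        if (hits.isEmpty()) {
            System.out.println("No classpath directory contains hadoop-site.xml");
        } else {
            System.out.println("hadoop-site.xml found in: " + hits);
        }
    }
}
```

    If nothing is found when this runs from the same Eclipse project, adding
    the conf directory as a class folder on the build path is one thing to try.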


Discussion Overview
group: user @
categories: pig, hadoop
posted: May 31, '09 at 2:44a
active: Jun 11, '09 at 12:56a
posts: 13
users: 4
website: pig.apache.org
