Hi,

I am new to HBase/Hadoop. Here is the scenario:

1) Our Hadoop cluster is installed on a remote system. Data is loaded into
HBase through an HBase writer.

2) I am trying to install Pig on my local Mac OS X (version 10.6.5) machine so
that I can fetch data from that remote system. I downloaded the latest Pig
release (0.8.0, released 17 December 2010) from
http://pig.apache.org/releases.html.

I did the following:

supp:~ rashmi$ export PATH=/Users/rashmi/Desktop/pig-0.8.0/bin:$PATH
supp:~ rashmi$ pig -help
Error: JAVA_HOME is not set.
supp:~ rashmi$ export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
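
These exports only apply to the current shell session. A minimal way to make
them persistent across sessions, assuming the default bash shell on 10.6, is
to append the same two lines to ~/.bash_profile:

# ~/.bash_profile -- read by new login shells; same paths as used above
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
export PATH=/Users/rashmi/Desktop/pig-0.8.0/bin:$PATH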

When I ran pig -help, I got the following output:

supp:~ rashmi$ pig -help

Apache Pig version 0.8.0 (r1043805)
compiled Dec 08 2010, 17:26:09

USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
-4, -log4jconf - Log4j configuration file, overrides log conf
-b, -brief - Brief logging (no timestamps)
-c, -check - Syntax check
-d, -debug - Debug level, INFO is default
-e, -execute - Commands to execute (within quotes)
-f, -file - Path to the script to execute
-h, -help - Display this message. You can specify topic to get help for
that topic.
properties is the only topic currently supported: -h properties.
-i, -version - Display version information
-l, -logfile - Path to client side log file; default is current working
directory.
-m, -param_file - Path to the parameter file
-p, -param - Key value pair of the form param=val
-r, -dryrun - Produces script with substituted parameters. Script is not
executed.
-t, -optimizer_off - Turn optimizations off. The following values are
supported:
SplitFilter - Split filter conditions
MergeFilter - Merge filter conditions
PushUpFilter - Filter as early as possible
PushDownForeachFlatten - Join or explode as late as possible
ColumnMapKeyPrune - Remove unused data
LimitOptimizer - Limit as early as possible
AddForEach - Add ForEach to remove unneeded columns
MergeForEach - Merge adjacent ForEach
LogicalExpressionSimplifier - Combine multiple expressions
All - Disable all optimizations
All optimizations are enabled by default. Optimization values are
case insensitive.
-v, -verbose - Print all error messages to screen
-w, -warning - Turn warning logging on; also turns warning aggregation
off
-x, -exectype - Set execution mode: local|mapreduce, default is
mapreduce.
-F, -stop_on_failure - Aborts execution on the first failed job; default
is off
-M, -no_multiquery - Turn multiquery optimization off; default is on
-P, -propertyFile - Path to property file


When I ran the pig command, I got the following error:

supp:~ rashmi$ pig
2011-02-22 12:48:26,319 [main] INFO org.apache.pig.Main - Logging error messages to: /Users/rashmi/pig_1298359106317.log
2011-02-22 12:48:26,474 [main] ERROR org.apache.pig.Main - ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath). If you plan to use local mode, please put -x local option in command line
Details at logfile: /Users/rashmi/pig_1298359106317.log
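
As the error message itself suggests, Pig will start without any Hadoop
configuration if it is run in local mode, which is a quick way to verify the
installation before pointing it at the remote cluster. A minimal sketch (the
script name is a placeholder):

supp:~ rashmi$ pig -x local                  # grunt shell over the local filesystem; no cluster configs needed
supp:~ rashmi$ pig -x local myscript.pig     # or run a script in local mode (myscript.pig is hypothetical)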


My question is:

1) What do I need to do so that I can connect to the remote Hadoop system and
fetch data? I read the documentation for this, but couldn't get a clear idea,
maybe because I am not a Java developer. Could you please explain what changes
I need to make in my case? I would be highly grateful.






--
Thanks and Regards

Rashmi R B


  • Jacob Perkins at Feb 22, 2011 at 1:08 pm
    Your mac needs to have the hadoop configuration files (e.g. hdfs-site.xml,
    mapred-site.xml, core-site.xml, depending on the version of hadoop)
    available somewhere in pig's classpath. It may be enough to simply copy
    them directly from one of the remote machines.

    --jacob
    @thedatachef
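
    A minimal sketch of that, assuming ssh access to one of the cluster
    machines, that its client configs live under /path/to/hadoop/conf (a
    placeholder; adjust to the real layout), and using PIG_CLASSPATH, which
    bin/pig adds to Pig's classpath:

    supp:~ rashmi$ mkdir -p ~/hadoop-conf
    supp:~ rashmi$ scp user@remote-hadoop-host:/path/to/hadoop/conf/*-site.xml ~/hadoop-conf/
    supp:~ rashmi$ export PIG_CLASSPATH=~/hadoop-conf    # Pig can now find core-site.xml etc.
    supp:~ rashmi$ pig                                   # should start in mapreduce mode against the remote cluster

    Once Pig can reach the cluster, fetching the HBase data from the grunt
    shell looks roughly like this in 0.8 (the table and column names are made
    up, and the HBase/ZooKeeper client jars and settings also have to be
    reachable from the client):

    grunt> raw = LOAD 'hbase://my_table'
                 USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col1 cf:col2')
                 AS (col1:chararray, col2:chararray);
    grunt> DUMP raw;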



