Grokbase Groups Pig user October 2008
Greetings!

My requirement is to search for an input string in a given file and output all
lines of the file that contain the string.

I am writing the following searchString.pig script. (I have integrated Hadoop
with Pig, so all my files are available on HDFS.)

searchString.info: a text file containing the single word 'sample'.
indexFile: a text file with 10 key-value pairs separated by a
space, with the word 'sample' appearing in 2 or 3 lines.

<searchString.pig>

searchStr = load 'searchString.info' using PigStorage();
A = load 'indexFile' using PigStorage(' ') as (str,filename);
-- This string comparison works!
Y = FILTER A BY str eq 'abc';

-- None of the below work
Y = FILTER A BY str eq searchStr ;
Y = FILTER A BY str eq searchStr.$0;


Instead of hardcoding the search string, I would like to search for the string
held in the searchStr data atom.
1. How can I compare "A.str" with the "searchStr" data atom?
2. Is there a way I can pass the search string as an argument to the Pig script?

Thank you for your assistance.
Regards,
Srilatha


  • Santhosh Srinivasan at Oct 5, 2008 at 1:49 am
    Use parameter substitution -
    http://wiki.apache.org/pig/ParameterSubstitution

    Santhosh

    -----Original Message-----
    From: Latha
    Sent: Saturday, October 04, 2008 6:33 PM
    To: pig-user@incubator.apache.org
    Subject: How to pass arguments to PigScript; + How to compare data atoms
    within pig script

  • Latha at Oct 5, 2008 at 3:20 am
    Thank you, Santhosh! The link is quite useful.

    However, I am not able to pass parameters to the Pig script the following way.
    Could you please help me resolve this error?

    bash$ java -cp $CLASSPATH/pig.jar:$HADOOPSITEPATH org.apache.pig.Main
    search.pig -param string=sample

    java.lang.RuntimeException: You can only run one pig script at a time from
    the command line.
    at org.apache.pig.Main.main(Main.java:276)


    I am using Pig 0.1.0. I downloaded the source and compiled it to get
    pig.jar. I am also using the Hadoop 0.17 branch.

    Regards,
    Srilatha

  • Santhosh Srinivasan at Oct 5, 2008 at 3:41 am
    The pig script should be the last argument in your command line, i.e.,

    bash$ java -cp $CLASSPATH/pig.jar:$HADOOPSITEPATH org.apache.pig.Main
    -param string=sample search.pig
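
    For illustration, a minimal sketch of what search.pig might contain so
    that the value passed with -param is used in the filter. The parameter
    name string matches the command line above; the relation names and the
    load statement are taken from the original script, and the final dump
    is an assumption about how the result is to be viewed. Parameter
    substitution is textual, so $string is replaced before the script is
    parsed and the quotes around it must stay in the script.

    ```pig
    -- search.pig (sketch): expects -param string=<word> on the command line
    A = load 'indexFile' using PigStorage(' ') as (str, filename);

    -- $string is replaced with the parameter value before parsing, so with
    -- -param string=sample this line becomes: FILTER A BY str eq 'sample';
    -- 'eq' is the string comparison operator used in the original script.
    Y = FILTER A BY str eq '$string';

    -- Assumption: print the matching lines to the console
    dump Y;
    ```

    With this approach the search word no longer needs to be loaded from
    searchString.info at all.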

    Santhosh

  • Latha at Oct 5, 2008 at 4:32 am
    Thank you :) It worked!

    Thanks & Regards,
    Srilatha
