Grokbase Groups Pig user May 2011
Set visible name of a running pig job
When I run a Pig job, the Hadoop JobTracker GUI (the one on port 50030) shows ‘PigLatin:myscript.pig’ as the name of the job. How can I configure it to show a name other than the name of the script?

Thanks in advance,

Will

William F Dowling

  • Eric Gaudet at May 26, 2011 at 6:15 pm
    At the beginning of your script, use:

    SET job.name 'this is my alternative name';

    You can also use parameters like $PARAM in the name.

    EG
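
    A minimal sketch of the above, written as shell so the script and the command line sit side by side. The parameter name RUN_DATE, the file names and the LOAD statement are hypothetical; only SET job.name and -param come from this thread. The pig command is printed rather than executed.

    ```shell
    # Write a small Pig script whose job name includes a parameter.
    # RUN_DATE, named_job.pig and input.txt are illustrative names only.
    script=named_job.pig
    printf '%s\n' \
        "SET job.name 'daily report \$RUN_DATE';" \
        "raw = LOAD 'input.txt' AS (line:chararray);" \
        "DUMP raw;" > "$script"

    # Command that would supply the parameter (printed here, not run).
    cmd="pig -param RUN_DATE=2011-05-26 $script"
    echo "$cmd"
    ```

    Pig substitutes $RUN_DATE before the script runs, so the job name should carry the date passed on the command line.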

  • Mark Laczin at May 26, 2011 at 6:23 pm
    This works, but (in 0.8.0 at least) it changes only the part of the
    job name that follows the 'PigLatin:' prefix.

    That is, if you use SET job.name 'hello' in a script named test.pig,
    you end up with a full name of:

    PigLatin:hello

    instead of PigLatin:test.pig.

    Just FYI.

  • William Dowling at May 26, 2011 at 6:49 pm
    Thanks Eric and Mark.

    Now I see that job.name is documented in http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html#set (duh). It also says there "All Pig and Hadoop properties can be set."

    Trying to figure out exactly what those properties are (is there a list someplace?), I looked at my job configuration at

    http://myserver:50030/jobconf.jsp?jobid=job_2011999999_9999

    where I now see 'mapred.job.name' --> PigLatin:hello

    after I set job.name to 'hello'.

    But
    SET mapred.job.name hello
    doesn't seem to have any effect at all.

    So I am confused about which properties I can set, and how to refer to them. Is there a doc or wiki page someplace that explains this to a Pig/Hadoop novice?

    Thanks again for your help.

    William F Dowling

  • Jonathan Coveney at May 26, 2011 at 8:47 pm
    Another option is to use -Dmapred.job.name=whatever on the command line.
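
    A minimal sketch of that suggestion, with one caveat taken from a later reply in this thread: on Pig 0.8.1, a -D option placed after -param produced the usage message, while putting -D first let the script run. The helper name pig_cmd is hypothetical; it only prints the command, and it does not handle arguments that contain spaces.

    ```shell
    # Hypothetical helper: print a pig command line with every -D option
    # moved ahead of Pig's own switches (such as -param).
    pig_cmd() {
        script=$1; shift
        dopts=""; others=""
        for arg in "$@"; do
            case $arg in
                -D*) dopts="$dopts $arg" ;;
                *)   others="$others $arg" ;;
            esac
        done
        echo "pig$dopts$others $script"
    }

    pig_cmd myscript.pig -param a=b -Dmapred.job.name=whatever
    ```

    Whether the value then reaches Hadoop is a separate question; in the same later reply the job tracker still showed the default name.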

  • William Dowling at May 26, 2011 at 9:36 pm
    Thanks Jonathan. I've seen other references to using -D... on the command line, but I haven't had success with it. I tried:
    pig -param a=b -Dmapred.job.name=whatever myscript.pig
    The script failed and I got a usage message:

    Apache Pig version 0.8.1 (r1094835)
    compiled Apr 18 2011, 19:26:53

    USAGE: Pig [options] [-] : Run interactively in grunt shell.
    Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
    Pig [options] [-f[ile]] file : Run cmds found in file.
    options include:
    -4, -log4jconf - Log4j configuration file, overrides log conf
    [...]

    The usage message doesn't mention -D.

    I think there's a bug in command line processing: when I reversed the order of the -param and -D, then I did not get the usage message, and my script ran.

    But mapred.job.name did not get passed through to Hadoop. The configuration reported by the JobTracker shows the default name for the job, PigLatin:myscript.pig, not the string following the -D.

    So that looks like a different bug. I have not tried the -propertyFile switch. What would be the format of entries in that file?
    a b
    a = b
    <xml>???</>

    Thanks again,
    Will
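
    On the format question, a tentative sketch. It assumes (not confirmed anywhere in this thread) that Pig reads the -propertyFile argument as a standard Java properties file, in which case one "key = value" pair per line is the usual form. The file name job.properties is hypothetical, and nothing here shows that setting mapred.job.name this way actually changes the displayed name. The pig command is printed rather than executed.

    ```shell
    # Assumption: -propertyFile expects Java properties syntax.
    # job.properties is an illustrative file name.
    props=job.properties
    printf '%s\n' \
        '# one key = value pair per line' \
        'mapred.job.name = whatever' > "$props"

    # Command that would pass the file to Pig (printed here, not run).
    cmd="pig -propertyFile $props myscript.pig"
    echo "$cmd"
    ```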
