FAQ
Is there a way to reuse Pig scripts (like def in Python, or function
calls, etc.) from inside a calling Pig script? I have a set of basic
Pig scripts which I would like to call from a high-level Pig script.
Currently I have to copy/paste the exact same code with a different
input relation. This makes the code unnecessarily bulky and
error-prone. Even a macro definition would be a great help.

-Thanks,
Prasen

  • Zaki Rahaman at Feb 10, 2010 at 3:07 am
    Hi Prasen,

    If the only thing changing is the input and/or output, you can parametrize
    your script and reuse it through parameter substitution.
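
    A minimal sketch of what parameter substitution could look like, assuming
    a script named sort_by_age.pig and a parameter named input (both names are
    hypothetical, not from this thread):

    ```pig
    -- sort_by_age.pig (hypothetical file name)
    -- $input is replaced with the value supplied on the command line, e.g.:
    --   pig -x local -param input=student sort_by_age.pig
    a = LOAD '$input' AS (name, age, gpa);
    b = ORDER a BY age;
    DUMP b;
    ```

    Running the same file with a different -param value reuses the logic
    against another input without copying the script.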

    --
    Zaki Rahaman
  • Prasenjit Mukherjee at Feb 10, 2010 at 4:12 am
    Maybe I was not clear enough about my problem. I would like to call
    another Pig script from a Pig script. How can I do that?

    As far as I understand, you can call a Pig script from a Unix shell
    (or Windows), passing those parameters, but not from another Pig
    script? I would be glad to be proved wrong. I hope I am wrong :)
  • Alan Gates at Feb 10, 2010 at 5:08 am
    You are not wrong. This is a feature we'd like to add but haven't
    gotten to yet.

    Alan.
  • Theo Hultberg at Feb 10, 2010 at 8:51 am
    Hi,

    I wrote Piglet, a Ruby DSL, for exactly this purpose:

    http://github.com/iconara/piglet

    It makes it possible to reuse blocks of operations, and even define
    new operations (that are blocks of other operations), as well as use
    looping and other control-flow structures. It compiles down to Pig
    Latin.

    T#
  • Ankur C. Goel at Feb 11, 2010 at 9:50 am
    I tried this in local mode and it works: http://hadoop.apache.org/pig/docs/r0.5.0/piglatin_reference.html#exec

    $ cat s1.pig

    a = LOAD 'student' AS (name, age, gpa);
    b = ORDER a BY age;
    DUMP b;

    $ cat s.pig

    a = LOAD 'student' AS (name, age, gpa);
    b = LIMIT a 3;
    DUMP b;
    exec s1.pig


    $ pig -x local s.pig
    I can't find HOD configuration for cluster, hopefully you weren't planning on using HOD.
    2010-02-11 09:47:45,708 [main] INFO org.apache.pig.Main - Logging error messages to: /homes/gankur/pig_1265881665706.log
    2010-02-11 09:47:46,506 [main] WARN org.apache.pig.impl.io.FileLocalizer - FileLocalizer.create: failed to create /tmp/temp-1460326964
    2010-02-11 09:47:46,545 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully stored result in: "file:/tmp/temp-1460326964/tmp-1949967585"
    2010-02-11 09:47:46,546 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records written : 3
    2010-02-11 09:47:46,546 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written : 101
    2010-02-11 09:47:46,546 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
    2010-02-11 09:47:46,546 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
    (alice,20,2.47)
    (luke,18,4.00)
    (holly,24,3.27)
    2010-02-11 09:47:46,625 [main] WARN org.apache.pig.impl.io.FileLocalizer - FileLocalizer.create: failed to create /tmp/temp-1460326964
    2010-02-11 09:47:46,652 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully stored result in: "file:/tmp/temp-1460326964/tmp1891452502"
    2010-02-11 09:47:46,652 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records written : 3
    2010-02-11 09:47:46,652 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written : 101
    2010-02-11 09:47:46,652 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
    2010-02-11 09:47:46,653 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
    (luke,18,4.00)
    (alice,20,2.47)
    (holly,24,3.27)

    -@nkur

  • Prasenjit Mukherjee at Feb 11, 2010 at 3:02 pm
    Thanks Ankur, this comes closest to what I was looking for. I have some
    questions though:

    *) Can you pass relation names to the called script? e.g.

    SPLIT r1 INTO split1 IF a > 5, split2 IF a <= 5;
    exec -param relation1=split1 -param relation2=split2 caller_script.pig

    *) Does it work properly in distributed mode, or are there some
    exceptions I need to worry about?

    -Prasen
  • Ankur C. Goel at Feb 11, 2010 at 3:07 pm
    You should be able to pass any parameter value to the called script. However,
    it is not right to think of this as passing a relation from the calling script
    to the called script.

    AFAIK, it should work in distributed mode too, unless there's a bug in Pig :-)
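
    A minimal sketch of passing a parameter value through exec, assuming the
    -param option of exec and a parameter named input (the parameter name is
    illustrative, not from this thread). The value is substituted into the
    called script as text, so it can carry something like a path, but not a
    relation. The two parts below are meant to live in two separate files:

    ```pig
    -- s1.pig (called script): $input is replaced with the passed value
    a = LOAD '$input' AS (name, age, gpa);
    b = ORDER a BY age;
    DUMP b;

    -- s.pig (calling script): pass the input location to s1.pig
    exec -param input=student s1.pig
    ```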

    -@nkur

  • Dmitriy Ryaboy at Feb 10, 2010 at 3:08 am
    There isn't. There are a few projects that wrap Pig to enable this
    sort of functionality.
    It doesn't really have to be a sophisticated thing: you can simply
    use a templating library like Template Toolkit
    (http://template-toolkit.org/) to get fairly far.

    For Pig-specific tools, check out Wukong or Piglet in Ruby.

    -D

Discussion Overview
group: user @
categories: pig, hadoop
posted: Feb 10, '10 at 2:49a
active: Feb 11, '10 at 3:07p
posts: 9
users: 6
website: pig.apache.org
