Grokbase Groups Pig user June 2011
Why does assertOutput in PigTest call registerScript twice, once in
assertOutput itself and then again in getAlias? I added a mv to the
end of my Pig script, and it runs each time registerScript is called,
so the second run fails because the source directory is no longer
there.

Thanks,
Jennie


  • Romain Rigaux at Jul 8, 2011 at 11:39 pm
    Hi,

    The double parsing is there because I did not find an easy way to modify
    the Pig plan after the first parse.

    So PigUnit parses the script once, modifies it, and then reparses it
    with changes such as:

    - modifying an alias (e.g. changing A = LOAD 'txt' to A = LOAD 'anotherdata')
    - guessing a schema
    - remembering what the last alias is
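
The alias rewrite described above can be pictured as a plain text substitution applied to the script before it is parsed again. The sketch below is only an illustration of that idea, not PigUnit's actual code: the class `AliasOverride` and its `override` method are hypothetical names, and it assumes one statement per line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of PigUnit. */
final class AliasOverride {
    private AliasOverride() {}

    /**
     * Returns a copy of the script in which the line defining {@code alias}
     * is replaced by {@code replacement}. Assumes one statement per line.
     */
    static List<String> override(List<String> script, String alias, String replacement) {
        // Matches "A = LOAD ..." but not "AB = ..." and not a comparison "A == ...".
        Pattern definition =
                Pattern.compile("^\\s*" + Pattern.quote(alias) + "\\s*=(?!=).*");
        List<String> result = new ArrayList<>();
        for (String line : script) {
            result.add(definition.matcher(line).matches() ? replacement : line);
        }
        return result;
    }
}
```

Calling `AliasOverride.override(lines, "A", "A = LOAD 'anotherdata';")` would swap the LOAD statement and leave every other line untouched. Because the whole rewritten script is then parsed again, any side-effecting command in it also runs again.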

    I could post a patch where PigUnit drops all mv commands (or any other
    shell command) by default, if you want. The plan may also be easier to
    modify now.
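
Until such a patch exists, one possible workaround is to strip side-effecting commands from the script text yourself before handing it to the test. This is a rough sketch under stated assumptions: `ShellCommandFilter` is a hypothetical helper, the keyword set is a guess at the Grunt commands worth dropping, and each command is assumed to sit on its own line.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not part of PigUnit. */
final class ShellCommandFilter {
    // Assumed list of Grunt commands with file-system side effects; adjust as needed.
    private static final Set<String> COMMANDS = Set.of(
            "mv", "rm", "rmf", "cp", "mkdir", "fs", "sh",
            "copyfromlocal", "copytolocal");

    private ShellCommandFilter() {}

    /** True if the first word of the line is one of the assumed commands. */
    static boolean isShellCommand(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        String firstWord = trimmed.split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
        return COMMANDS.contains(firstWord);
    }

    /** Returns the script without the lines recognised as shell commands. */
    static List<String> strip(List<String> script) {
        return script.stream()
                .filter(line -> !isShellCommand(line))
                .collect(Collectors.toList());
    }
}
```

With a trailing `mv` removed this way, parsing the script a second time no longer depends on the source directory still being in place.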

    Romain
  • Jennie Cochran-Chinn at Jul 11, 2011 at 4:46 pm
    Hey Romain,

    Thanks for the reply. Instead of calling assertOutput directly, I'm using
    just its second line for now:

    Assert.assertEquals(StringUtils.join(tupleOutput, "\n"),
        StringUtils.join(pigTest.getAlias(alias), "\n"));

    It seems to work. Can you think of any gotchas I should look out for?

    Thanks,
    Jennie
  • Romain Rigaux at Jul 15, 2011 at 11:03 pm
    Hi,

    I just tried it and it does work, because you skip the extra
    registerScript calls.

    Some tips:
    - With this method, if you want to load input data from an array of
      String (instead of a file), you will need to create a temporary input
      file yourself and load it.
    - Overriding aliases is still possible:
      PigTest test = new PigTest(pig, args);
      test.override("B", "B = FILTER A BY ...");
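
For the first tip, writing the rows to a temporary file takes only a few lines of standard Java. The following is a minimal sketch, not something PigUnit provides: `TempInput` and `write` are hypothetical names, and the file prefix and UTF-8 encoding are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical test helper for illustration; not part of PigUnit. */
final class TempInput {
    private TempInput() {}

    /** Writes each row as one line of a new temporary file and returns its path. */
    static Path write(String[] rows) {
        try {
            Path file = Files.createTempFile("pigunit-input-", ".txt");
            // Ask the JVM to remove the file when the test run ends.
            file.toFile().deleteOnExit();
            Files.write(file, Arrays.asList(rows), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            // Rethrow unchecked so test methods need no throws clause.
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned path could then be handed to the script, for example through a script parameter in the args array or an overridden LOAD statement, depending on how the script reads its input.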

    I will keep this registerScript problem in mind for a future update of
    PigUnit.

    Romain

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jun 29, '11 at 7:09p
active: Jul 15, '11 at 11:03p
posts: 4
users: 2
website: pig.apache.org
