Small question: the Python UDF doc says that "variable names inside a schema string are not used anywhere, they just make the syntax identifiable to the parser" (https://pig.apache.org/docs/r0.9.0/udf.html#schemafunction). However, it looks like Pig is picking up those field names and keeping them if I don't override them.

For instance, if I have a Python UDF:

@outputSchema('a:int')
def my_udf(x):
    return 123

And a Pig script:

raw = LOAD 'data.txt' USING PigStorage() AS (x:int);
with_udf = FOREACH raw GENERATE my_udfs.my_udf(x);

Running DESCRIBE on with_udf gives me:

with_udf: {a: int}

Is the doc incorrect there?

Thanks,
Doug


  • Alan Gates at Sep 30, 2011 at 7:31 pm
    Looks like it, which is good. The behavior you're seeing is what we want.

    Alan.
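
For anyone who wants to poke at this outside of Pig, here is a minimal sketch of how a schema string can ride along with a UDF and how the field name can be read back from it. The outputSchema defined here is a local stand-in written for this example, not Pig's actual decorator, and field_names is a hypothetical helper that only handles flat "name:type" schemas. It does not show how Pig itself resolves the schema.

```python
# Local stand-in for the decorator that Pig makes available to Jython UDFs.
# This is an assumption for experimenting outside Pig, not Pig's own code.
def outputSchema(schema_str):
    def wrap(func):
        # Attach the schema string to the function so a caller can inspect it.
        func.outputSchema = schema_str
        return func
    return wrap


@outputSchema('a:int')
def my_udf(x):
    return 123


def field_names(schema_str):
    # Hypothetical helper: split a flat "name:type, name:type" schema
    # and keep only the part before each colon.
    return [part.split(':')[0].strip() for part in schema_str.split(',')]


if __name__ == '__main__':
    print(my_udf.outputSchema)
    print(field_names(my_udf.outputSchema))
```

With the schema 'a:int', the helper yields the single field name a, which matches the name DESCRIBE reports in the question.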

Discussion Overview
group: user @
categories: pig, hadoop
posted: Sep 30, '11 at 7:28p
active: Sep 30, '11 at 7:31p
posts: 2
users: 2
website: pig.apache.org

2 users in discussion

Doug Daniels: 1 post Alan Gates: 1 post
