Grokbase Groups Pig user August 2010
Hello!

I created a custom UDF and I need its output schema to have different columns
(two chararray fields in one case, three chararray fields in the other)
depending on a UDF input parameter (mode_1 = 'one'):
..FLATTEN(myUDF('int, char', 'one'))

I overrode getOutputSchema:
@Override
public Schema getOutputSchema(Schema input) {
    Schema schema = new Schema();
    // mode_1 should be 'one' here, but it is still null because it is
    // only set in evaluate()
    if (mode_1 != null && mode_1.equals("one")) {
        schema.add(new FieldSchema("type", DataType.CHARARRAY));
        schema.add(new FieldSchema("text", DataType.CHARARRAY));
        return schema;
    }
    // default
    schema.add(new FieldSchema("type", DataType.CHARARRAY));
    schema.add(new FieldSchema("text", DataType.CHARARRAY));
    schema.add(new FieldSchema("age", DataType.CHARARRAY));
    return schema;
}
but this method is called before evaluate:
@Override
public DataBag evaluate(Tuple input)
{
    // read the UDF mode parameter into mode_1
    Object xmode = input.get(1);
    mode_1 = xmode.toString();
    ...
}
which is the only place where I can check for the needed parameter 'one', so
I can't change the schema that getOutputSchema(...) defined initially.

Is there a way to change the output schema from evaluate(...)?

I tried to create a constructor
public myUDF(String param, String mode)
{
}
but it was never called. Only myUDF(), the constructor without parameters,
was called.

A static class variable for the mode doesn't work either.
I use Pig 0.3.0.

--
Best regards,
Serg.


  • Mridul Muralidharan at Aug 6, 2010 at 7:54 pm
    If I understood your problem correctly, you can use define to pass
    parameters to the constructor and then use them (after storing them in
    an instance field).


    -- note: only Strings are accepted as parameters!
    define MY_UDF org.me.udfp.MyUDF('param1', 'param2');
    -- This will call the constructor public MyUDF(String str1, String str2){...}


    ....
    B = FOREACH A GENERATE MY_UDF($0);
    ....



    - Mridul
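
To make the suggestion above concrete, here is a minimal, dependency-free sketch of the pattern: the mode arrives through the constructor, is kept in an instance field, and is therefore already known when the schema is requested. The class name ModeAwareUdf and the method outputFieldNames are hypothetical stand-ins so the example runs without Pig on the classpath. A real UDF would extend Pig's EvalFunc and build a Schema object instead of a list of names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Pig UDF. It is NOT a real EvalFunc; it only
// illustrates passing the mode through the constructor.
class ModeAwareUdf {
    private final String param; // first define argument, e.g. 'int, char' (unused here)
    private final String mode;  // second define argument, e.g. 'one'

    // Assumption: this is the constructor that a statement such as
    //   define MY_UDF org.me.udfp.MyUDF('int, char', 'one');
    // would invoke, as described in the reply above.
    ModeAwareUdf(String param, String mode) {
        this.param = param;
        this.mode = mode;
    }

    // Stand-in for the schema method: returns the output column names.
    // The mode is available here because it was set at construction time,
    // not inside the per-tuple evaluation method.
    List<String> outputFieldNames() {
        if ("one".equals(mode)) {
            return Arrays.asList("type", "text");
        }
        // default: three columns
        return Arrays.asList("type", "text", "age");
    }
}
```

With this shape, the per-tuple method no longer needs the mode as an input argument, so the script call would look like MY_UDF($0) rather than myUDF('int, char', 'one').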


    On Friday 06 August 2010 05:59 PM, Васяйчев Сергей wrote:

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 6, '10 at 12:31p
active: Aug 6, '10 at 7:54p
posts: 2
users: 2
website: pig.apache.org
