Hi,

I'm trying to write a UDF to take a bag of HTTP GET arguments (e.g.
{(s=556477989), (ts=1265964662)} ) and turn them into a map (e.g.
[s#556477989, ts#1265964662] ), and have written a class extending
EvalFunc, but am having no end of trouble declaring/defining the exec
method. My first shot at compiling it had code like this:

public InternalMap exec(DataBag input) throws IOException {

and produced the compile-time error:

httpArgParse.java:11: myudfs.httpArgParse is not abstract and does not
override abstract method exec(org.apache.pig.data.DataBag)

Clumsily sticking an @Override before it (as I noticed was
done in TOKENIZE) instead produced:

httpArgParse.java:12: method does not override or implement a method
from a supertype

I have a hunch that this isn't too awful a problem for someone who's
done much Java coding in the past decade, but I've been away from the
language for about 12 years now, and it seems to have changed
significantly in that time.

Cheers,
Kris

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3


  • Gerrit Jansen van Vuuren at Nov 26, 2010 at 10:02 pm
    Hi,

    Many of the Pig (and Hadoop) classes use Java generics. EvalFunc<T> is one
    of them, and T is the type your function returns.

    The method signature for exec is:

    abstract public T exec(Tuple input) throws IOException;

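    So if your method is returning InternalMap you need to define your eval
    function class as such:

    public class MyEvalFunc extends EvalFunc<InternalMap> {

        @Override
        public InternalMap exec(Tuple input) throws IOException {

            // The first UDF argument is the bag of key=value tuples.
            DataBag bag = (DataBag) input.get(0);

            // ... build the map from the bag and return it here ...
        }

        @Override
        public Schema outputSchema(Schema input) {
            // exec returns a map, so the schema declares a map, not a bag.
            return new Schema(new Schema.FieldSchema(null, DataType.MAP));
        }
    }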

    @Override is a great way of telling the compiler (and your IDE) that you
    intend this method to override something. If it complains with an error, it
    means your method signature does not exactly match the one in the superclass.
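
    To make the generics point concrete, here is a minimal sketch that compiles
    without Pig on the classpath. SketchEvalFunc and SketchArgCount are
    hypothetical stand-ins, not Pig classes, and List<Object> stands in for
    Tuple; the only thing it is meant to show is how the type parameter fixes
    the return type of exec and when @Override is accepted.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generic base class in the style of EvalFunc<T>.
// It is NOT Pig's class, and List<Object> stands in for Pig's Tuple.
abstract class SketchEvalFunc<T> {
    public abstract T exec(List<Object> input) throws IOException;
}

// Binding T to Map<String, Object> fixes the return type of exec.
class SketchArgCount extends SketchEvalFunc<Map<String, Object>> {
    // Accepted by the compiler because the name and parameter type match the
    // abstract method and the return type is the one T was bound to.
    @Override
    public Map<String, Object> exec(List<Object> input) throws IOException {
        Map<String, Object> result = new HashMap<>();
        result.put("argCount", input.size());
        return result;
    }
}
```

    Changing the parameter of exec in the subclass to some other type would
    declare a new overload rather than an override, which is what produces both
    of the compiler errors quoted in the question.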

    The argument to exec must be a Tuple.
    The tuple is populated with the arguments passed to your UDF.
    So if your Pig script calls MyUDF('a','b','c'),
    the input Tuple will contain the values:

    (String)input.get(0) == 'a'
    (String)input.get(1) == 'b'
    (String)input.get(2) == 'c'

    In your case input.get(0) will return a DataBag.
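
    The positional layout can be sketched with plain JDK collections. This is
    only a model: List<Object> stands in for Tuple and List<List<Object>> for a
    DataBag of single-field tuples, and SketchTupleAccess is a hypothetical
    name, not part of Pig.

```java
import java.util.Arrays;
import java.util.List;

// Assumption: List<Object> models Pig's Tuple and List<List<Object>> models a
// DataBag of single-field tuples. These are not Pig's types.
class SketchTupleAccess {
    // Returns the bag that was passed as the first UDF argument.
    @SuppressWarnings("unchecked")
    static List<List<Object>> firstArgAsBag(List<Object> input) {
        return (List<List<Object>>) input.get(0);
    }

    public static void main(String[] args) {
        // Models MyUDF('a','b','c'): three positional arguments.
        List<Object> three = Arrays.<Object>asList("a", "b", "c");
        System.out.println(three.get(1)); // prints "b"

        // Models a single argument holding {(s=556477989), (ts=1265964662)}.
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList("s=556477989"),
            Arrays.<Object>asList("ts=1265964662"));
        List<Object> input = Arrays.<Object>asList(bag);
        System.out.println(firstArgAsBag(input).size()); // prints "2"
    }
}
```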

    Cheers,
    Gerrit


    -----Original Message-----
    From: Kris Coward
    Sent: Friday, November 26, 2010 6:27 PM
    To: user@pig.apache.org
    Subject: Difficulty extending EvalFunc

  • Dmitriy Ryaboy at Nov 26, 2010 at 10:34 pm
    As an aside, there is a reason the InternalMap class is called Internal.
    It's not meant for use by code that's not core Pig, and is subject to all
    kinds of changes, including deletion, between releases. Don't use it. Use
    Map<String, Object> (which can be backed by any standard Map implementation
    in the JDK).
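
    As a rough sketch of that suggestion, the key=value strings from the
    question can be collected into a Map<String, Object> backed by a plain
    HashMap. SketchHttpArgs and toMap are hypothetical names, the input is
    modelled as a List<String> rather than a Pig bag, and skipping entries
    without '=' is an assumption; in a real UDF this logic would presumably
    sit inside exec.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SketchHttpArgs {
    // Splits each "key=value" string at the first '=' and collects the pairs.
    // Entries without '=' are skipped; that policy is an assumption.
    static Map<String, Object> toMap(List<String> args) {
        Map<String, Object> result = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq < 0) {
                continue;
            }
            result.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> m =
            toMap(Arrays.asList("s=556477989", "ts=1265964662"));
        System.out.println(m.get("s") + " " + m.get("ts")); // prints "556477989 1265964662"
    }
}
```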

    -D
