Grokbase Groups Pig user January 2010
Hi,
I want to create a UDF that compares a tuple with a string value, like this:

public class IsEqual extends FilterFunc {
    public Boolean exec(Tuple input, String str) throws IOException {
        // a bitwise AND comparison is performed here;
        // if the result of the AND is not zero, it returns true
        return true;
    }
}


Is this possible with a Pig UDF?

Actually, I want to compare two binary values with an AND operation, as follows.

The table data is:

ramana 1010101
krishna 1000010
venkata 1101010
......
load 'data' as name,category using PigStorage("\t");
cameusers = filter data by IsEqual(category,"1000010");
store cameusers;
-----------------------------------
The result I expect is:
krishna 1000010


Is there any other solution for this operation without a UDF? Can we
compare the category column with binary data?


Please respond.


thanks
ramanaiah


  • Jeff Zhang at Jan 29, 2010 at 1:12 am
    Ramana,

    Actually, there's no binary type in Pig. If you do not specify the type in
    the load statement, the default type is bytearray. I'm afraid you have to
    write a UDF to do the binary comparison. In the UDF, you should first
    convert the bytearray to a binary value and then compare the two values.
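
The conversion described above can be sketched in plain Java. This is a minimal sketch, not Pig API code: the class name `BinaryAnd` and the method `andIsNonZero` are hypothetical, and it assumes the category values are text strings of 0s and 1s, as in the sample data. Inside a real UDF, the same logic would presumably go in `exec`.

```java
class BinaryAnd {
    // Hypothetical helper: parses two strings of '0'/'1' characters as base-2
    // numbers and reports whether their bitwise AND is non-zero.
    // Assumes each string has at most 63 binary digits, so it fits in a long.
    static boolean andIsNonZero(String category, String mask) {
        long a = Long.parseLong(category.trim(), 2);
        long b = Long.parseLong(mask.trim(), 2);
        return (a & b) != 0;
    }

    public static void main(String[] args) {
        String mask = "1000010";
        // Sample category values from the question.
        for (String category : new String[] {"1010101", "1000010", "1101010"}) {
            System.out.println(category + " -> " + andIsNonZero(category, mask));
        }
    }
}
```

Note that with the sample data and this mask, a literal "AND is not zero" rule matches all three rows, because each value shares the leading 1 bit with the mask. Getting only the krishna row would need a stricter test, such as exact equality.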




    --
    Best Regards

    Jeff Zhang
  • Mridul Muralidharan at Jan 29, 2010 at 9:50 am
    There are two ways to handle this.
    You can pass it along as a parameter, as you did in the script. Note,
    though, that in your UDF the input will be a single tuple whose first
    field is category and whose second field is "1000010".

    public Boolean exec(Tuple _input) throws IOException {
        String input = (String)_input.get(0);
        String compareStr = (String)_input.get(1);
        ...
    }

    But that might be a bit more expensive, since every tuple that gets
    passed through the FilterFunc will need to have the constant "1000010"
    added to it.


    A better alternative is to use "define" and initialize your IsEqual
    class with the constant parameter you need by passing it through the
    constructor.


    Something like :


    public class IsEqual extends FilterFunc {
        private String compareStr;

        public IsEqual(String compareStr) {
            this.compareStr = compareStr;
        }

        public Boolean exec(Tuple _input) throws IOException {
            String input = (String)_input.get(0);
            ...
        }
    }


    You use it by :


    define MY_EQUAL IsEqual("1000010");

    load 'data' as name,category using PigStorage("\t");
    cameusers = filter data by MY_EQUAL(category);
    store cameusers;
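
To illustrate why the constructor approach is cheaper, here is a minimal plain-Java sketch of the same pattern. `MaskFilter` is a hypothetical stand-in for the `IsEqual` UDF and does not extend Pig's `FilterFunc`, so it runs without Pig on the classpath; `accept` plays the role of `exec`. Parsing the constant once in the constructor is an assumption about what the elided body might do, and the test used (bitwise AND is not zero) is the rule stated in the question.

```java
import java.util.ArrayList;
import java.util.List;

class MaskFilter {
    private final long mask; // parsed once, then reused for every row

    // Mirrors IsEqual(String compareStr): the constant arrives via the constructor.
    MaskFilter(String compareStr) {
        this.mask = Long.parseLong(compareStr, 2);
    }

    // Stand-in for exec(Tuple): receives only the per-row field.
    boolean accept(String category) {
        return (Long.parseLong(category, 2) & mask) != 0;
    }

    public static void main(String[] args) {
        // Plays the role of: define MY_EQUAL IsEqual('1000010');
        MaskFilter myEqual = new MaskFilter("1000010");
        List<String> kept = new ArrayList<>();
        for (String category : new String[] {"1010101", "1000010", "1101010"}) {
            if (myEqual.accept(category)) {
                kept.add(category);
            }
        }
        System.out.println(kept);
    }
}
```

Because the constant lives in the object, each row passes only its own field, which is the saving described above.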





    Hope this helps.
    Regards,
    Mridul


  • Mridul Muralidharan at Jan 29, 2010 at 9:55 am
    There is an error in the basic script, which I propagated in my
    copy-paste; it is corrected below.


    Regards,
    Mridul

    Mridul Muralidharan wrote:

    public class IsEqual extends FilterFunc {
        private String compareStr;

        public IsEqual(String compareStr) {
            this.compareStr = compareStr;
        }
    I assumed that the first param is going to be a String, which need not be the
    case (since its type is not defined in the load schema), in which case "String
    input" gets replaced with the appropriate datatype.
        public Boolean exec(Tuple _input) throws IOException {
            String input = (String)_input.get(0);
            ...
        }
    }


    You use it by :


    define MY_EQUAL IsEqual("1000010");

    load 'data' as name,category using PigStorage("\t");
    cameusers = filter data by MY_EQUAL(category);
    store cameusers;

    Corrected script:

    define MY_EQUAL IsEqual('1000010');

    data = load 'data' using PigStorage('\t') as (name,category);
    cameusers = filter data by MY_EQUAL(category);
    store cameusers;



    If you need the name and category to be strings (which I suspect you
    do), then use "data = load 'data' using PigStorage('\t') as
    (name:chararray,category:chararray);"
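
The type issue above can be sketched in plain Java. When no type is declared in the load schema, the field should reach the UDF as a bytearray rather than a String, so a direct `(String)` cast would fail. The helper below is hypothetical (`FieldText.asText` is not a Pig API) and uses a raw `byte[]` as a stand-in for Pig's bytearray value; it assumes the bytes are UTF-8 text.

```java
import java.nio.charset.StandardCharsets;

class FieldText {
    // Hypothetical helper: returns the field as text whether it arrived as a
    // String (chararray declared in the schema) or as raw bytes (no type declared).
    static String asText(Object field) {
        if (field == null) {
            return null;
        }
        if (field instanceof String) {
            return (String) field;
        }
        if (field instanceof byte[]) {
            return new String((byte[]) field, StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException(
                "Unsupported field type: " + field.getClass().getName());
    }

    public static void main(String[] args) {
        Object typed = "1000010";                                     // declared as chararray
        Object untyped = "1000010".getBytes(StandardCharsets.UTF_8); // no declared type
        System.out.println(asText(typed).equals(asText(untyped)));
    }
}
```

Declaring `category:chararray` in the load statement, as suggested above, avoids the need for this kind of conversion inside the UDF.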


    Regards,
    Mridul




Discussion Overview
group: user @
categories: pig, hadoop
posted: Jan 28, '10 at 6:16p
active: Jan 29, '10 at 9:55a
posts: 4
users: 3
website: pig.apache.org
