Grokbase Groups Pig user March 2010
Hi All,

Is there a way to get the current InputSplit in a UDF (more specifically,
a filter function)?

I have a filter function that validates input rows according to
certain criteria and I would like to report the source of failures (if
any).

Thanks in advance.

- Sandesh


  • Ashutosh Chauhan at Mar 31, 2010 at 3:36 am
    Try:

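    // Map side only: the job context is cast to the mapper's Context here.
    PigSplit pigSplit =
        (PigSplit) ((Context) PigMapReduce.sJobContext).getInputSplit();
    // Unwrap Pig's split to get the underlying Hadoop InputSplit.
    InputSplit is = pigSplit.getWrappedSplit();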

    Ashutosh
  • Mridul Muralidharan at Mar 31, 2010 at 6:25 am
    You might want to be careful with this ... the UDF could be used on
    both the map and reduce sides, no?

    Regards,
    Mridul
  • Ashutosh Chauhan at Mar 31, 2010 at 3:05 pm
    Yes, this works only if the UDF is running in the map phase. From
    Sandesh's mail, it does look like his UDF will run in map.
    Also note that this is a highly specific internal implementation detail
    of Pig, which may change in the future.

    Ashutosh
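
    To illustrate the map-versus-reduce caveat raised above, here is a minimal sketch of guarding the lookup instead of casting unconditionally. It deliberately does not use the real Pig or Hadoop classes: StubJobContext, StubMapContext, StubReduceContext and SplitLookup are hypothetical stand-ins invented for this example so that it compiles on its own, and the split is reduced to a plain string. In a real UDF the same check would presumably be applied to whatever PigMapReduce.sJobContext holds, which is an assumption and not something confirmed in this thread.

```java
import java.util.Optional;

// Hypothetical stand-ins for the Hadoop/Pig context types (NOT the real API),
// so that this sketch compiles without any Hadoop or Pig jars.
interface StubJobContext {}

final class StubMapContext implements StubJobContext {
    private final String splitDescription;

    StubMapContext(String splitDescription) {
        this.splitDescription = splitDescription;
    }

    // Stand-in for getInputSplit(); returns a description instead of a split object.
    String getInputSplit() {
        return splitDescription;
    }
}

final class StubReduceContext implements StubJobContext {}

final class SplitLookup {
    private SplitLookup() {}

    // Returns the split description on the map side, or empty anywhere else
    // (reduce side, or no context at all).
    static Optional<String> currentSplit(StubJobContext ctx) {
        if (ctx instanceof StubMapContext) {
            return Optional.of(((StubMapContext) ctx).getInputSplit());
        }
        return Optional.empty();
    }

    // Builds a failure message that names the source only when it is known.
    static String describeFailure(StubJobContext ctx, String row) {
        return "invalid row '" + row + "' from "
                + currentSplit(ctx).orElse("unknown source (not running in map)");
    }
}

class SplitLookupDemo {
    public static void main(String[] args) {
        StubJobContext mapSide = new StubMapContext("part-00000:0+1024");
        StubJobContext reduceSide = new StubReduceContext();
        System.out.println(SplitLookup.describeFailure(mapSide, "a,b"));
        System.out.println(SplitLookup.describeFailure(reduceSide, "a,b"));
    }
}
```

    The point of the instanceof check is that a filter function which ends up on the reduce side degrades to an "unknown source" message rather than failing on a bad cast.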

Discussion Overview
group: user @
categories: pig, hadoop
posted: Mar 30, '10 at 8:53p
active: Mar 31, '10 at 3:05p
posts: 4
users: 3
website: pig.apache.org
