Hi,

I need to get the position of the key being processed in a mapper task.
My input file is a sequence file.

I tried the Context, but the best I could get was the InputSplit
position and the file name.

My other option is to start recording the position in the key/value while
generating the sequence file, but that would mean rewriting all the files
I already have :(

Any thoughts?

Ishwar


  • Ahad Rana at Oct 9, 2009 at 5:22 am
    Hi Ishwar,
    You can implement a custom MapRunner and retrieve the position from the
    reader before calling your map function. Be aware, though, that for
    block-compressed files the position returned represents the block start
    position, not the individual record position.

    Ahad.
  • Ahad Rana at Oct 9, 2009 at 5:45 am
    Oops, memory fails me. To correct my previous statement: for
    block-compressed files, getPosition reflects the position in the input
    stream of the NEXT compressed block of data, so you have to watch for the
    change in position after reading the key/value to capture a block transition.
    Ahad.
  • Ishwar ramani at Oct 12, 2009 at 5:23 pm
    Thanks, that worked fine.

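
For anyone landing on this thread later, here is a minimal sketch of the bookkeeping Ahad describes: read the position before each record, read it again afterwards, and treat a change as a block transition. It is not Hadoop code. `PositionedReader`, `RecordLocation`, `PositionTracker` and `FakeBlockReader` are hypothetical names made up for illustration, and the fake reader only imitates the behaviour described in the thread (the position jumps to the next block once the first record of a block is read).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the reader handed to a custom map runner; not a Hadoop type. */
interface PositionedReader {
    String next();   // next record key, or null at end of input
    long getPos();   // current offset in the underlying stream
}

/** Offset of the block a record came from, plus the record's index inside that block. */
final class RecordLocation {
    final String key;
    final long blockStart;
    final int indexInBlock;

    RecordLocation(String key, long blockStart, int indexInBlock) {
        this.key = key;
        this.blockStart = blockStart;
        this.indexInBlock = indexInBlock;
    }
}

final class PositionTracker {
    /** Reads every record and tags it with the block offset observed before the read. */
    static List<RecordLocation> locate(PositionedReader reader) {
        List<RecordLocation> out = new ArrayList<>();
        long blockStart = reader.getPos();
        int index = -1;
        while (true) {
            long before = reader.getPos();
            String key = reader.next();
            if (key == null) {
                break;
            }
            long after = reader.getPos();
            if (after != before) {
                // The position moved, so this read pulled in a new block that began at 'before'.
                blockStart = before;
                index = 0;
            } else {
                // Same position as before: another record out of the already-buffered block.
                index++;
            }
            out.add(new RecordLocation(key, blockStart, index));
        }
        return out;
    }
}

/** In-memory fake, for demonstration only: the position jumps when a block's first record is read. */
final class FakeBlockReader implements PositionedReader {
    private final String[][] blocks;
    private final long[] blockEnds; // stream offset just past each block
    private int block = 0;
    private int rec = 0;
    private long pos;

    FakeBlockReader(long start, String[][] blocks, long[] blockEnds) {
        this.pos = start;
        this.blocks = blocks;
        this.blockEnds = blockEnds;
    }

    public String next() {
        // Skip past exhausted (or empty) blocks.
        while (block < blocks.length && rec == blocks[block].length) {
            block++;
            rec = 0;
        }
        if (block >= blocks.length) {
            return null;
        }
        if (rec == 0) {
            pos = blockEnds[block]; // whole block consumed from the stream at once
        }
        return blocks[block][rec++];
    }

    public long getPos() {
        return pos;
    }
}

final class PositionDemo {
    public static void main(String[] args) {
        PositionedReader reader = new FakeBlockReader(
                100L, new String[][] {{"a", "b"}, {"c"}}, new long[] {250L, 400L});
        for (RecordLocation loc : PositionTracker.locate(reader)) {
            System.out.println(loc.key + " block@" + loc.blockStart + " #" + loc.indexInBlock);
        }
    }
}
```

In a real job on the old `org.apache.hadoop.mapred` API, the same loop would presumably sit inside the `run(RecordReader, OutputCollector, Reporter)` method of a custom `MapRunnable`, calling `input.getPos()` around `input.next(key, value)` and passing the block offset and index along to the map function. For files without block compression the position changes on every read, so each record simply gets its own offset and index 0.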

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 8, '09 at 11:24p
active: Oct 12, '09 at 5:23p
posts: 4
users: 2
website: hadoop.apache.org...
irc: #hadoop
