HADOOP-6218 exposed the internal "Location" object as a global Record
Sequence Number (RecNum). The feature is useful in several ways:
(1) it supports progress reporting for upper layers (object file, zebra);
(2) a secondary index can use RecNum as a cursor; (3) it supports aligned
splits across multiple parallel TFiles. Given that TFile is still at
an early stage of adoption, I suggest that we port the patch
back to hadoop 0.20/0.21 now.

-Hong
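
To make use cases (1) and (3) concrete, here is a minimal sketch of how a global record number could drive aligned splits and progress reporting. It does not use the TFile API: the class RecNumSplits, the Range type, and both methods are hypothetical names made up for illustration, and the sketch assumes every parallel file holds the same number of records in the same order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from TFile.
public class RecNumSplits {
    /** Half-open range [begin, end) of record numbers. */
    public static final class Range {
        public final long begin;
        public final long end;

        public Range(long begin, long end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Cuts totalRecords into numSplits contiguous ranges. Applying the same
     * ranges to every parallel file keeps record i of each file in the same
     * split (assumes all files have identical record counts and order).
     */
    public static List<Range> alignedSplits(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("bad totalRecords or numSplits");
        }
        List<Range> out = new ArrayList<>();
        long base = totalRecords / numSplits;
        long extra = totalRecords % numSplits; // first `extra` splits get one more record
        long begin = 0;
        for (int i = 0; i < numSplits; i++) {
            long len = base + (i < extra ? 1 : 0);
            out.add(new Range(begin, begin + len));
            begin += len;
        }
        return out;
    }

    /** Fraction of the range consumed once the cursor sits at currentRecNum. */
    public static float progress(Range r, long currentRecNum) {
        long len = r.end - r.begin;
        if (len <= 0) {
            return 1.0f; // an empty range is trivially complete
        }
        long done = Math.min(Math.max(currentRecNum - r.begin, 0), len);
        return (float) done / len;
    }

    public static void main(String[] args) {
        for (Range r : alignedSplits(10, 3)) {
            System.out.println("[" + r.begin + ", " + r.end + ")");
        }
    }
}
```

In this sketch a reader would open each parallel file at the same begin record number and stop at the same end, which is only possible when record positions are exposed as plain sequence numbers rather than as an opaque location.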

  • Devaraj Das at Oct 12, 2009 at 10:59 pm
    After an offline discussion with Hong and others on this subject, it seems to make sense. +1

  • Owen O'Malley at Oct 12, 2009 at 11:38 pm

    On Oct 12, 2009, at 3:55 PM, Hong Tang wrote:

    I suggest that we port the patch back to hadoop 0.20/0.21 now.

    This seems like a low-risk change that only affects users of TFile
    and enables some very desirable functionality.

    +1

    -- Owen
  • Raghu Angadi at Oct 13, 2009 at 5:46 am
    +1. Risk is low since it does not involve any change to the on-disk format.
  • Hong Tang at Oct 16, 2009 at 10:00 pm
    With 3 +1's and 0 -1's, the vote passed.

    -Hong

Discussion Overview
group: common-dev @
categories: hadoop
posted: Oct 12, '09 at 10:56p
active: Oct 16, '09 at 10:00p
posts: 5
users: 4
website: hadoop.apache.org...
irc: #hadoop