Dear all,

Regarding splitting support for bzip2: I checked JIRA and found
HADOOP-7386 marked as "Won't fix". I also found some work done in
branch-0.21 (also in trunk), e.g. HADOOP-4012 and MAPREDUCE-830, but it was
not integrated/migrated into branch-1. So I guess we don't support
concatenated bzip2 in branch-1, correct? If so, is there any special reason?
Many thanks!

--
Best Regards,
Li Yu


  • Harsh J at Dec 3, 2012 at 11:43 am
    Hi Yu Li,

    The JIRA HADOOP-7823 backported support for splitting Bzip2 files plus
    MR support for it, into branch-1, and it is already available in the
    1.1.x releases out currently.

    Support for concatenated Bzip2 files, i.e., HADOOP-7386, is not
    implemented yet (AFAIK), but Chris, over on HADOOP-6335, suggests that
    HADOOP-4012 may have fixed it. Can you try it and report back?

    --
    Harsh J
  • Yu Li at Dec 3, 2012 at 3:52 pm
    Hi Harsh,

    Thanks a lot for the information!

    My fault for not looking into HADOOP-4012 carefully. I will try to verify
    whether HADOOP-7823 has resolved the issue on both the write and read
    sides, and report back.

    --
    Best Regards,
    Li Yu
  • Harsh J at Dec 4, 2012 at 4:08 am
    Thanks Yu, I would appreciate it if you could post your observations on
    https://issues.apache.org/jira/browse/HADOOP-7386.

    --
    Harsh J
  • Yu Li at Dec 10, 2012 at 2:17 pm
    Hi Harsh,

    Sorry for the slightly late response; I have been busy with other work
    these days. I have pasted my test steps and results onto HADOOP-7386, and
    if my testing approach is correct, I think concatenated BZip2 file support
    is implemented and already in branch-1. I also did some sanity testing and
    confirmed that BZip2 splitting support is also in branch-1. Please let me
    know if you have any comments. Thanks.

    --
    Best Regards,
    Li Yu
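
For readers unfamiliar with the term discussed above: a concatenated bzip2 file is several complete bzip2 streams written back to back (for example, the result of `cat a.bz2 b.bz2`). The sketch below uses Python's standard `bz2` module rather than Hadoop's Java codec, and it is not the test posted on HADOOP-7386. It only illustrates the general issue: a reader that stops at the first end-of-stream marker silently drops the remaining data. All function names here are hypothetical.

```python
import bz2


def make_concatenated(parts):
    """Compress each part as its own bzip2 stream and join the streams."""
    return b"".join(bz2.compress(p) for p in parts)


def read_first_stream_only(data):
    """Mimic a reader that stops at the first end-of-stream marker."""
    d = bz2.BZ2Decompressor()
    out = d.decompress(data)
    # Bytes after the first stream are left untouched in unused_data.
    return out, d.unused_data


def read_all_streams(data):
    """Decode every stream by starting a new decompressor on leftover bytes."""
    out = []
    while data:
        d = bz2.BZ2Decompressor()
        out.append(d.decompress(data))
        data = d.unused_data
    return b"".join(out)


parts = [b"first file\n", b"second file\n"]
blob = make_concatenated(parts)
head, leftover = read_first_stream_only(blob)
full = read_all_streams(blob)
```

Here `head` holds only the first part and `leftover` still contains the second compressed stream, while `read_all_streams` recovers both parts. A similar round trip (write two streams, concatenate, read back, compare) is one plausible way to check whether a given reader handles concatenated input.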

Discussion Overview
group: common-dev @
categories: hadoop
posted: Dec 3, '12 at 9:49a
active: Dec 10, '12 at 2:17p
posts: 5
users: 2
website: hadoop.apache.org...
irc: #hadoop
