Grokbase Groups HBase dev June 2011
Hi folks, I was wondering if there was any movement on any of these HDFS tickets for HBase. The umbrella ticket is HDFS-1599, but the last comment from stack back in Feb highlighted interest in several tickets:


1) HDFS-918 (use single selector)

a. Last comment Jan 2011



2) HDFS-941 (reuse of connection)

a. Patch available as of April 2011

b. But ticket still unresolved.



3) HDFS-347 (local reads)

a. Discussion seemed to end in March 2011 with a huge comment saying that there was no performance benefit.

b. I'm working my way through this comment/report, but intuitively it seems like it would be a good idea since, as the other comments in the ticket stated, the RS reads locally just about every time.


Doug Meil
Chief Software Architect, Explorys
doug.meil@explorys.com


  • Andrew Purtell at Jun 3, 2011 at 11:38 am
    Regarding HDFS-347, I believe the following to be true:

    - The "bastard" option, i.e. Ryan's patch against 0.20 that just does local reads via File, does lower latency enough to make a difference in HBase random read latencies as measured. I forget the magnitude of the difference offhand but seem to remember something like at least 2x. Can't say about the FD-passing variant because I don't think any HBasers have used it. I want to test both myself, but am limited to EC2 based testbeds so will have a lot of difficulty (to say the least) correcting for platform variability, so it's pretty far down the to-do list as a result.

    - HBASE-SEARCH (HBASE-3529) uses this to make Lucene embedding work: https://github.com/jasonrutherglen/HBASE-SEARCH

    - Andy
    From: Doug Meil <doug.meil@explorysmedical.com>
    Subject: HDFS-1599 status? (HDFS tickets to improve HBase)
    To: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Date: Thursday, June 2, 2011, 2:00 PM
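To make the "local reads via File" idea concrete: the point is that a reader co-located with the block file can issue a positional read against the local filesystem instead of going through the DataNode. The following is only a minimal sketch of that idea in Python, not the HDFS-347 patch itself; the function name `local_pread` is made up for illustration, and, like the patch as described in this thread, it does no checksum verification and no access control. It assumes a Unix-like system where `os.pread` is available.

```python
import os

def local_pread(path, offset, length):
    """Read up to `length` bytes starting at `offset` directly from a local file.

    Illustrative only: no checksums, no security checks.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while length > 0:
            # Positional read: does not move a shared file pointer, so
            # concurrent readers of the same file do not interfere.
            buf = os.pread(fd, length, offset)
            if not buf:  # reached end of file
                break
            parts.append(buf)
            offset += len(buf)
            length -= len(buf)
        return b"".join(parts)
    finally:
        os.close(fd)
```

A random-read workload would call something like `local_pread(block_file, offset, n)` once per lookup, which is why the per-read overhead of the path taken matters so much.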
  • Kihwal Lee at Jun 3, 2011 at 2:12 pm
    HDFS-941
    The trunk has moved on so the patch won't apply. There have been significant changes in HDFS lately, so it will require more than a simple rebase/merge. If the original assignee is busy, I am willing to help.

    HDFS-347
    The analysis is pointing out that local socket communication is actually not the problem. The initial assumption of local socket being slow should be ignored and the design should be revisited.

    I agree that improving local pread performance is critical. Based on my experiments, HDFS-941 helps a lot, and with it the communication channel was no longer the bottleneck.

    Kihwal


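For readers unfamiliar with what "reuse of connection" buys: the general technique is to keep an idle connection to a peer around after a read finishes, so the next read to the same peer skips connection setup. The sketch below shows that general technique only. It is not the HDFS-941 design, and `ConnectionCache`, `acquire`, `release`, and `max_idle_per_peer` are hypothetical names chosen for this example.

```python
from collections import defaultdict, deque

class ConnectionCache:
    """Hypothetical per-peer cache of idle connections (illustration only)."""

    def __init__(self, connect, max_idle_per_peer=4):
        self._connect = connect            # callable: peer -> new connection
        self._idle = defaultdict(deque)    # peer -> idle connections
        self._max_idle = max_idle_per_peer
        self.opened = 0                    # how many connections were actually opened

    def acquire(self, peer):
        idle = self._idle[peer]
        if idle:
            return idle.pop()              # reuse: no new connection setup
        self.opened += 1
        return self._connect(peer)

    def release(self, peer, conn):
        idle = self._idle[peer]
        if len(idle) < self._max_idle:
            idle.append(conn)              # keep it for the next read
        else:
            conn.close()                   # cache is full; drop the connection
```

With many small random reads against the same DataNode, `opened` stays near the concurrency level rather than growing with the number of reads, which is the effect being discussed here.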
  • Ryan Rawson at Jun 3, 2011 at 6:08 pm
    Could you explain your HDFS-347 comment more? I don't think people
    suggested that the socket itself was the primary issue, but dealing
    with the datanode and the socket and everything was really slow. It's
    hard to separate concerns and test only one thing at a time - for
    example you said 'local socket comm isn't the problem', but there is no
    way to build a test that uses a local socket but not the datanode.

    The basic fact is that datanode adds a lot of overhead, and under high
    concurrency that overhead grows.



  • Dhruba Borthakur at Jun 3, 2011 at 7:00 pm
    I completely agree with Ryan. Most of the measurements in HDFS-347 are point
    comparisons: data rate over socket, single-threaded sequential read from
    datanode, single-threaded random read from datanode, etc. These measurements
    are good, but when you run the entire HBase system at load, you definitely
    see a 3X performance improvement when reading data locally (instead of going
    through the datanode).

    -dhruba
    --
    Connect to me at http://www.facebook.com/dhruba
  • Doug Meil at Jun 3, 2011 at 7:49 pm
    Thanks everybody for commenting on this thread.

    We'd certainly like to lobby for movement on these two tickets, and although we don't have anybody that is familiar with the source code we'd be happy to perform some tests to get some performance numbers.

    Per Kihwal's comments, it sounds like HDFS-941 needs to get re-worked because the patch is stale.

    The patch for HDFS-347 sounds like it's still usable.

    So what else is needed to push this effort forward? Is it beneficial to get more numbers on HDFS-347 and keep lobbying on the ticket, and/or is there another path that should be taken (plying with beer, free Cleveland Indians tickets, harassing phone calls, etc.)?



  • Todd Lipcon at Jun 3, 2011 at 8:15 pm

    On Fri, Jun 3, 2011 at 12:50 PM, Doug Meil wrote:
    > Thanks everybody for commenting on this thread.
    >
    > We'd certainly like to lobby for movement on these two tickets, and although we don't have anybody that is familiar with the source code we'd be happy to perform some tests to get some performance numbers.
    >
    > Per Kihwal's comments, it sounds like HDFS-941 needs to get re-worked because the patch is stale.

    Yes - bc Wong, the original contributor, works with me but on
    unrelated projects. HDFS-941 was something he did as part of a
    "hackathon" but he only gets occasional time to circle back on it. As we
    last left it, there were just a few things that had to be addressed.
    If someone wants to finish it up, and volunteer to test it under some
    real load, I'd be happy to review and commit.

    > The patch for HDFS-347 sounds like it's still usable.

    The current patch for 347 is unworkable since it doesn't do checksums
    or security. The FD-passing approach was working at some point but
    basically needs to be re-done on trunk.

    I think doing HDFS-941 and HDFS-918 first is best, then more drastic
    things like 347 can be considered.

    > So what else is needed to push this effort forward?  Is it beneficial to get more numbers on HDFS-347 and keep lobbying on the ticket, and/or is there another path that should be taken (plying with beer, free Cleveland Indians tickets, harassing phone calls, etc.)?

    Testing! The thing that's scariest about HDFS-941 is that it was
    passing all of its unit tests, and then when I tried it under load for
    2-3 days with YCSB on a 5 node cluster, I saw a couple of checksum
    exceptions come out of it. That implies there's a bug lurking
    somewhere. It may be fixed by now, but I'm hesitant to commit it
    unless there has been testing of that sort.

    -Todd



    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Andrew Purtell at Jun 3, 2011 at 10:47 pm
    Yes, and though I have patches, and I'm happy to provide them if you want...

    Indeed, 347 doesn't do security or checksums so needs work to say the least. We use it with HBase given a privileged role such that it shares group-readable DFS data directories with the DataNodes. It works for us, though checksumming is on the to-do list.

    And I agree 941 is scary. However I did pull the last incarnation of 941 attached to the jira into CDH3U0 for some ongoing testing with real load, combined with 918, which we did put into production.

    - Andy

  • Kihwal Lee at Jun 3, 2011 at 10:58 pm
    When I tried HDFS-941, the new bottleneck was checksum. So the performance may drop significantly if checksum is added and enabled in HDFS-347.

    Kihwal


  • Jason Rutherglen at Jun 3, 2011 at 11:42 pm
    I think one would need to checksum only once, on the first file system
    instantiation or on first access of the file? As mentioned in
    HDFS-2004, HBase's usage of HDFS is outside of the initial design
    motivation. E.g., the rules may need to be bent in order to enable
    performant use of HBase with HDFS. The idea of working with HDFS at
    the block level becomes [likely] more important.
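The "checksum only once" suggestion can be sketched as follows. This is a toy model, not HDFS code: it assumes CRC32 over fixed 512-byte chunks (HDFS's long-standing default chunk size, used here only for illustration), and `VerifyOnceReader` and its fields are hypothetical names. It verifies a chunk the first time a read touches it and skips verification afterwards, which trades away detection of corruption that appears after that first read.

```python
import zlib

CHUNK = 512  # bytes covered by each checksum (assumed for this sketch)

def chunk_checksums(data, chunk=CHUNK):
    """CRC32 of each fixed-size chunk, as would be stored alongside the data."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]

class VerifyOnceReader:
    """Hypothetical reader that verifies each chunk only on first access."""

    def __init__(self, data, sums, chunk=CHUNK):
        self.data, self.sums, self.chunk = data, sums, chunk
        self.verified = set()   # indexes of chunks already checked
        self.crc_calls = 0      # counts how much checksum work was done

    def pread(self, offset, length):
        if length <= 0:
            return b""
        first = offset // self.chunk
        last = min((offset + length - 1) // self.chunk, len(self.sums) - 1)
        for idx in range(first, last + 1):
            if idx in self.verified:
                continue        # already verified on an earlier read
            start = idx * self.chunk
            self.crc_calls += 1
            if zlib.crc32(self.data[start:start + self.chunk]) != self.sums[idx]:
                raise IOError("checksum mismatch in chunk %d" % idx)
            self.verified.add(idx)
        return self.data[offset:offset + length]
```

Repeated reads of a hot region then pay the CRC cost once per chunk instead of once per read, which is the saving being proposed; whether that relaxation is acceptable is exactly the "rules may need to be bent" question.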
  • Stack at Jun 3, 2011 at 11:58 pm
    An hdfs-347 that checksums is over in the hadoop branch that fb
    published over on github (Dhruba and Jon pointed me at it); I've been
    meaning to put the patch up in the hdfs-347 issue.

    St.Ack


  • Todd Lipcon at Jun 4, 2011 at 12:02 am
    Not to be too mean and discouraging to everyone passing around patches
    against CDH3 and/or 0.20-append, but just an FYI: there is no chance
    that these things will get committed to an 0.20 branch without first
    going through trunk. Sharing patches and testing them on real
    workloads in 20 is a nice step in that direction, but if you're
    discouraged that they aren't checked in yet, please help on getting
    them up to date on trunk, finishing out pending review comments, and
    making sure they also work in trunk :)

    (this might sound hypocritical coming from someone who spent the
    better part of the last two years backporting things onto 0.20 :) I
    think we've all realized it was a mistake to set up shop on 20 for so
    long, so trunk is the way forward)


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Andrew Purtell at Jun 4, 2011 at 8:46 am

    From: Todd Lipcon <todd@cloudera.com>
    > Not to be too mean and discouraging to everyone passing around patches
    > against CDH3 and/or 0.20-append, but just an FYI: there is no chance
    > that these things will get committed to an 0.20 branch without first
    > going through trunk. Sharing patches and testing them on real
    > workloads in 20 is a nice step in that direction, but if you're
    > discouraged that they aren't checked in yet, please help on getting
    > them up to date on trunk, finishing out pending review comments, and
    > making sure they also work in trunk :)

    This is not discouraging. :-)

    HBasers patch CDH because trunk -- anything > 0.20 actually -- is not trusted by consensus if you look at all of the production deployments. Does ANYONE run trunk under anything approaching "production"? And trunk/upstream has a history of ignoring any HBase specific concern. So the use of and trading of patches will probably continue for a while, maybe forever.

    Part of the problem is the expectation that any patch provided against trunk may generate months of back and forth, as we have seen, which presents difficulties to a potential contributor who does not work on e.g. HDFS matters full time. Alternatively it may pick up a committer as sponsor and then be vetoed by Yahoo because they're mad at Cloudera over some unrelated issue and a patch appears to have a Cloudera sponsor, or vice versa. Now, that situation I describe _is_ discouraging. It's not enough to say that we must contribute through trunk. Trunk needs to earn back our trust.

    I believe I recently saw discussion that append should be removed or disabled by default on 0.22 or trunk. Did you see anything like this? If I am mistaken, fine. If not, this is going in the wrong direction, for example.

    - Andy
  • Todd Lipcon at Jun 5, 2011 at 10:43 pm

    On Sat, Jun 4, 2011 at 1:46 AM, Andrew Purtell wrote:

    This is not discouraging. :-)

    HBasers patch CDH because trunk -- anything > 0.20 actually -- is not
    trusted by consensus if you look at all of the production deployments. Does
    ANYONE run trunk under anything approaching "production"? And trunk/upstream
    has a history of ignoring any HBase specific concern. So the use of and
    trading of patches will probably continue for a while, maybe forever.
    Right - I wasn't suggesting that you run trunk in production as of yet. But
    there has been very little activity in terms of HBase people running trunk
    in dev/test clusters in the past. Stack has done some awesome work here in
    the last few weeks, so that should open it up for some more people to jump
    on board.

    I agree that HBase has been treated as a second-class citizen in recent
    years when it comes to HDFS performance, but I think that has changed. All
    of the major HDFS contributors now have serious stakes in HBase, and so long
    as there are patches with sufficient testing that apply against trunk, I
    don't see a reason they wouldn't be included.

    Part of the problem is the expectation that any patch provided against
    trunk may generate months of back and forth, as we have seen, which presents
    difficulties to a potential contributor who does not work on e.g. HDFS
    matters full time. Alternatively it may pick up a committer as sponsor and
    then be vetoed by Yahoo because they're mad at Cloudera over some unrelated
    issue and a patch appears to have a Cloudera sponsor, or vice versa.
    Now, that situation I describe _is_ discouraging. It's not enough to say
    that we must contribute through trunk. Trunk needs to earn back our trust.
    Yes, there have been some unfortunate things in the past. There have also
    been some half-finished or untested patches proposed, and you can't blame
    HDFS folks for not taking a big patch that doesn't have a lot of confidence
    behind it.

    I've been thinking about this this afternoon, and have an idea. It may prove
    to be an awful one, but maybe it's a good one, only time will tell :) I'll
    create a branch off of HDFS trunk specifically for HBase performance work.
    We can commit these "90% done" patches there, which will make it easier for
    others to test and gain confidence. Branches also can make it easier to
    maintain patches over time with a changing trunk.

    How does this sound to the HBase community? If it seems like a good idea,
    *and* there are some people who would be willing to set it up on some small
    dev clusters and run load tests, I'll move forward with it.

    I believe I recently saw discussion that append should be removed or
    disabled by default on 0.22 or trunk. Did you see anything like this? If I
    am mistaken, fine. If not, this is going in the wrong direction, for
    example.
    Not sure what you're referring to - I don't remember any discussion like
    this.

    -Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Doug Meil at Jun 5, 2011 at 11:06 pm
    Re: "*and* there are some people who would be willing to set it up on some small dev clusters and run load tests, I'll move forward with it."

    Count us in.

  • Andrew Purtell at Jun 8, 2011 at 4:01 pm
    Sure, ok, us too, but it will have to be EC2 based clusters, alas. Better than nothing one hopes.

    - Andy


  • Todd Lipcon at Jun 6, 2011 at 5:10 am
    OK guys, I did my part: I rebased HDFS-941 and HDFS-1148 to trunk. Also
    attempted to rebase HDFS-1323 (nee 918) but it's failing tests.

    If you want to help, please take a look at the failing tests on HDFS-1323
    and see if you can understand what might be going wrong.

    Unfortunately this isn't my top priority at work right now (HA for the
    namenode is), but I'm happy to spend some nights and weekends to help push
    these through if they really work.

    -Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Stack at Jun 9, 2011 at 9:03 pm
    HDFS-941 just got committed to TRUNK in a coordinated effort led by
    our man Todd (Hopefully it makes it into 0.22!). HDFS-1148 is next!
    St.Ack
  • Doug Meil at Jun 9, 2011 at 9:20 pm
    Great work, guys!

  • Andrew Purtell at Jun 3, 2011 at 10:38 pm
    I have patches for HDFS-347 and HDFS-941 (and HDFS-918) for CDH3U0.

    - Andy
    From: Doug Meil <doug.meil@explorysmedical.com>
    Subject: RE: HDFS-1599 status? (HDFS tickets to improve HBase)
    To: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Date: Friday, June 3, 2011, 12:50 PM
    Thanks everybody for commenting on this thread.

    We'd certainly like to lobby for movement on these two tickets, and although we don't have anybody that is familiar with the source code we'd be happy to perform some tests to get some performance numbers.

    Per Kihwal's comments, it sounds like HDFS-941 needs to get re-worked because the patch is stale.

    The patch for HDFS-347 sounds like it's still usable.

    So what else is needed to push this effort forward? Is it beneficial to get more numbers on HDFS-347 and keep lobbying on the ticket, and/or is there another path that should be taken (plying with beer, free Cleveland Indians tickets, harassing phone calls, etc.)?

    -----Original Message-----
    From: Dhruba Borthakur
    Sent: Friday, June 03, 2011 3:00 PM
    To: dev@hbase.apache.org
    Subject: Re: HDFS-1599 status? (HDFS tickets to improve HBase)

    I completely agree with Ryan. Most of the measurements in HDFS-347 are point comparisons: data rate over socket, single-threaded sequential read from datanode, single-threaded random read from datanode, etc. These measurements are good, but when you run the entire HBase system at load, you definitely see a 3X performance improvement when reading data locally (instead of going through the datanode).

    -dhruba
    On Fri, Jun 3, 2011 at 11:08 AM, Ryan Rawson wrote:

    Could you explain your HDFS-347 comment more? I don't think people suggested that the socket itself was the primary issue, but dealing with the datanode and the socket and everything was really slow. It's hard to separate concerns and test only 1 thing at a time - for example you said 'local socket comm isn't the problem', but there is no way to build a test that uses a local socket but not the datanode. The basic fact is that datanode adds a lot of overhead, and under high concurrency that overhead grows.

    On Fri, Jun 3, 2011 at 7:07 AM, Kihwal Lee wrote:
    HDFS-941
    The trunk has moved on so the patch won't apply. There have been significant changes in HDFS lately, so it will require more than a simple rebase/merge. If the original assignee is busy, I am willing to help.
    HDFS-347
    The analysis is pointing out that local socket communication is actually not the problem. The initial assumption of local socket being slow should be ignored and the design should be revisited. I agree that improving local pread performance is critical. Based on my experiments, HDFS-941 helps a lot and the communication channel was no longer the bottleneck.
    Kihwal


  • Todd Lipcon at Jun 3, 2011 at 10:40 pm

    On Fri, Jun 3, 2011 at 3:38 PM, Andrew Purtell wrote:
    I have patches for HDFS-347 and HDFS-941 (and HDFS-918) for CDH3U0.
    Does your 347 patch do security, or is it just the one where it sneaks around back?

    Have you tested the others under real load for a couple days?

    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Andrew Purtell at Jun 3, 2011 at 10:51 pm

    From: Todd Lipcon <todd@cloudera.com>
    I have patches for HDFS-347 and HDFS-941 (and HDFS-918) for CDH3U0.
    Does your 347 patch do security, or is it just the one where it sneaks around back?

    Have you tested the others under real load for a couple days?
    We use the sneaky 347 and, sure, it's a complete hack. We put DataNodes _and_ RegionServers into the TCB.

    We only use 918, which indeed was tested under load over a period of time.

    I put 941 into our "next" branch so it is up for evaluation next, among other changes (like snappy integration). I could very well end up pulling it, depending.

    - Andy

Discussion Overview
group: dev @
categories: hbase, hadoop
posted: Jun 2, '11 at 9:00p
active: Jun 9, '11 at 9:20p
posts: 22
users: 8
website: hbase.apache.org
