ANN: The third hbase 0.94.0 release candidate is available for download (HBase dev, May 2012)
The third 0.94.0 RC is available for download here: http://people.apache.org/~larsh/hbase-0.94.0-rc3/
(My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

HBase 0.94 is a performance release, and there are some interesting new features as well.

It is wire compatible with 0.92.x. 0.92 clients should work with 0.94 servers and vice versa.
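
As a rough illustration of that compatibility claim, a client written against the 0.92-era API, like the sketch below, should run unchanged against a 0.94 server; the table and column names here are made up for the example.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WireCompatCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        HTable table = new HTable(conf, "testtable");      // hypothetical table with family "f"

        // Write a cell with the 0.92-era client API ...
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);

        // ... and read it back; the same client code should behave the same
        // whether the servers are running 0.92.x or 0.94.0.
        Result result = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println(Bytes.toString(
            result.getValue(Bytes.toBytes("f"), Bytes.toBytes("q"))));

        table.close();
      }
    }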

You can do a rolling restart to get your 0.92.x HBase up on this 0.94.0 RC.

The full list of changes is available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419

Please take this RC for a spin, check out the docs, etc., and vote +1/-1 by May 8th on whether we should release this as 0.94.0.

Thanks.

-- Lars

  • Todd Lipcon at May 2, 2012 at 4:35 am
    +1 from me, I took it for a spin on the local filesystem with some YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the voting against a source tar
    (i.e., an svn export), since we now produce multiple binaries and it's easier
    to verify that a source tar matches SVN, etc.

    -Todd

  • Lars hofhansl at May 2, 2012 at 5:21 am
    Thanks Todd.

    I agree with doing source code releases going forward.

    For that, would it be sufficient to just vote against an SVN tag?
    Tarballs can then be pulled straight from that tag.

    -- Lars



  • Elliott Clark at May 2, 2012 at 10:00 pm
    I ran some YCSB tests against the local filesystem. I used the 0.90 client
    for 0.90.6; for the rest of the tests I used 0.92 clients. The results are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance improvement. I'll run some
    tests on a cluster later today.
  • Ted Yu at May 2, 2012 at 10:02 pm
    Elliott:
    Thanks for the report.
    Can you publish the results somewhere else? The attachments were stripped off.
  • Elliott Clark at May 2, 2012 at 10:07 pm
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
  • Ted Yu at May 2, 2012 at 10:11 pm
    I am surprised to see 0.92.1 exhibit such an unfavorable performance profile.
    Let's see whether cluster testing gives us similar results.
  • Mikael Sitruk at May 3, 2012 at 7:16 am
    Hi guys,
    Looking at the posted slides/pictures for the benchmark, the following
    things intrigue me:
    1. The recordcount is only 100,000.
    2. workloada is read 50% / update 50% with a zipfian distribution, so even
    with a 5M operation count the same keys are updated again and again.
    3. The heap size is 10G.

    Therefore it might be that the dataset is too small: even with 3 versions
    configured we have 3 (versions) * 100,000 (keys) * 1KB (record size) = 300 MB
    of "live" dataset.
    And the number of store files will be approximately 5x10^6 (op count) * 1KB
    (record size) / 256MB (default max store file size) => 20 store files; even
    taking a factor of 10 for metadata (record keys in the store files) we get
    200 files.
    If a major compaction runs, it will shrink all the store files down to a
    single small one.
    What I am trying to say is: if the maths are correct (and note that I did
    not take compression into account, which only makes things better), can we
    rely on such a scenario, with such a small dataset and such a distribution,
    for a performance benchmark?
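
    In short, the back-of-the-envelope estimate above works out to:

        \[
        3 \times 100{,}000 \times 1\,\mathrm{KB} \approx 300\,\mathrm{MB},
        \qquad
        \frac{5 \times 10^{6} \times 1\,\mathrm{KB}}{256\,\mathrm{MB}} \approx 20
        \ \text{store files} \;\; (\times 10 \text{ for metadata} \approx 200).
        \]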

    Regards
    Mikael.S
  • Elliott Clark at May 3, 2012 at 5:37 pm
    I agree it was just a micro-benchmark with no guarantee that it relates to
    the real world. Since it was running standalone, I didn't think anyone should
    take the numbers as 100% representative. Really I was just trying to shake
    out any weird behaviors, and the fact that we got a big speed-up was
    interesting.
  • Elliott Clark at May 4, 2012 at 10:05 pm
    So I got 0.94.0RC3 up on a cluster and tried to break it, killing masters and
    killing region servers. Everything seems good; hbck reports everything is
    good, and all my reads succeed.

    I'll post cluster benchmark numbers once they are done running. Should only
    be a couple more hours of PE runs.

    Looks great to me.
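
    The PE runs mentioned here are HBase's bundled PerformanceEvaluation tool,
    normally driven from the hbase script. A minimal sketch of kicking off a
    comparable run from Java is below; it assumes the stock
    org.apache.hadoop.hbase.PerformanceEvaluation entry point, and the command
    and client count are illustrative rather than the settings behind these
    numbers.

        import org.apache.hadoop.hbase.PerformanceEvaluation;

        public class PeRunSketch {
          public static void main(String[] args) throws Exception {
            // Roughly equivalent to:
            //   hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred randomWrite 5
            // --nomapred runs the clients in-process instead of as a MapReduce job;
            // "randomWrite 5" asks for a random-write test with 5 clients.
            // Values here are illustrative only.
            PerformanceEvaluation.main(new String[] {"--nomapred", "randomWrite", "5"});
          }
        }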
  • Ted Yu at May 4, 2012 at 10:07 pm
    Thanks for the update, Elliott.

    If I read your post correctly, you're using PE. YCSB is better at measuring
    performance, in my experience.

    Cheers
  • Elliott Clark at May 4, 2012 at 10:14 pm
    With the cluster size that I'm testing, YCSB was stressing the client
    machine more than the cluster; I was saturating the network of the test
    machine. So I switched over to PE; while it doesn't have a realistic
    workload, it is better than nothing.
  • Ted Yu at May 5, 2012 at 2:43 am
    0.94 also has LoadTestTool (from FB).

    I have used it to do some cluster load testing.

    Just FYI
  • Elliott Clark at May 7, 2012 at 5:43 pm
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
  • Todd Lipcon at May 7, 2012 at 5:47 pm
    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
  • Elliott Clark at May 7, 2012 at 6:08 pm
    Sorry, everything is reported as elapsed time in milliseconds, so higher is
    worse.

    The standard deviation on 0.92.1 writes is 4,591,384, so Write 5 is a little
    outside of 1 std dev. Not really sure what happened on that test, but it
    does appear that PE is very noisy.
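    A quick Java sketch of that 1-std-dev check, purely for illustration; the
    elapsed times below are made-up placeholders, not the figures from the
    linked sheet:

    public class PeNoiseCheck {
      public static void main(String[] args) {
        // Placeholder elapsed times (ms) for five PE write runs -- illustrative
        // only, not the real benchmark numbers.
        long[] elapsedMs = {31000000L, 29500000L, 30200000L, 28900000L, 38000000L};
        double mean = 0;
        for (long t : elapsedMs) {
          mean += t;
        }
        mean /= elapsedMs.length;
        double variance = 0;
        for (long t : elapsedMs) {
          variance += (t - mean) * (t - mean);
        }
        variance /= elapsedMs.length;
        double stdDev = Math.sqrt(variance);
        System.out.printf("mean=%.0f ms, stddev=%.0f ms%n", mean, stdDev);
        for (int i = 0; i < elapsedMs.length; i++) {
          // Flag any run more than one standard deviation from the mean.
          if (Math.abs(elapsedMs[i] - mean) > stdDev) {
            System.out.printf("Write %d (%d ms) is more than 1 std dev out%n",
                i + 1, elapsedMs[i]);
          }
        }
      }
    }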
    On Mon, May 7, 2012 at 10:47 AM, Todd Lipcon wrote:

    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
    On Mon, May 7, 2012 at 10:42 AM, Elliott Clark wrote:
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
    On Fri, May 4, 2012 at 7:42 PM, Ted Yu wrote:

    0.94 also has LoadTestTool (from FB)

    I have used it to do some cluster load testing.

    Just FYI

    On Fri, May 4, 2012 at 3:14 PM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    With the cluster size that I'm testing YCSB was stressing the client
    machine more than the cluster. I was saturating the network of the
    test
    machine. So I switched over to pe; while it doesn't have a realistic work
    load it is better than nothing.
    On Fri, May 4, 2012 at 3:07 PM, Ted Yu wrote:

    Thanks for the update, Elliot.

    If I read your post correctly, you're using PE. ycsb is better
    measuring
    performance, from my experience.

    Cheers

    On Fri, May 4, 2012 at 3:04 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    So I got 94.0rc3 up on a cluster and tried to break it, Killing
    masters
    and
    killing rs. Everything seems good. hbck reports everything is
    good.
    And
    all my reads succeed.

    I'll post cluster benchmark numbers once they are done running.
    Should
    only be a couple more hours of pe runs.

    Looks great to me.
    On Thu, May 3, 2012 at 10:36 AM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I agree it was just a micro benchmark with no guarantee that it
    relates
    to
    real world. With it just being standalone I didn't think anyone
    should
    take
    the numbers as 100% representative. Really I was just trying to
    shake
    out
    any weird behaviors and the fact that we got a big speed up was
    interesting.


    On Thu, May 3, 2012 at 12:15 AM, Mikael Sitruk <
    mikael.sitruk@gmail.com
    wrote:
    Hi guys
    Looking at the posted slide/pictures for the benchmark the
    following intriguing me:
    1. The recordcount is only 100,000
    2. workoloada is: read 50%, update 50% and zipfian distribution
    even
    with
    5M operations count, the same keys are updated again and again.
    3. heap size 10G

    Therefore it might be that the dataset is too small (even with
    3
    versions
    configured we have = 3(version)*100,000(keys)*1KB (size of
    record) =
    300
    MB
    of "live" dataset ?
    And approximately the number of store files will be 5x10^6 (op
    count)*1KB(record size)/256MB(max store file size
    (Default))=>20
    store
    file, even taking factor of 10 for metadata (record key, in
    store
    files)
    we
    will get 200 files.
    if a major compaction is running it will shrink all the
    storefile
    to a
    single small one.
    What I try to say is - if the maths are correct - (please note
    that
    i
    did
    not take into account compression which just make things
    better),
    can
    we
    relate on such scenario for performance benchmark with such
    small
    dataset
    and such distribution?

    Regards
    Mikael.S

    On Thu, May 3, 2012 at 1:10 AM, Ted Yu <yuzhihong@gmail.com>
    wrote:
    I am surprised to see 0.92.1 exhibit such unfavorable
    performance
    profile.
    Let's see whether cluster testing gives us similar results.

    On Wed, May 2, 2012 at 3:07 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
    On Wed, May 2, 2012 at 3:01 PM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    Elliot:
    Thanks for the report.
    Can you publish results somewhere else ?
    Attachments were stripped off.

    On Wed, May 2, 2012 at 2:59 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I ran some tests of local filesystem YCSB. I used the
    0.90
    client
    for
    0.90.6. For the rest of the tests I used 0.92 clients.
    The
    results
    are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance
    improvement.
    I'll
    run
    some
    tests on a cluster later today.

    On Tue, May 1, 2012 at 10:20 PM, lars hofhansl <
    lhofhansl@yahoo.com
    wrote:
    Thanks Todd.

    I agree with doing source code releases going forward.

    For that, would it be sufficient to just vote against
    an
    SVN
    tag?
    Tarballs can then be pulled straight from that tag.

    -- Lars



    ----- Original Message -----
    From: Todd Lipcon <todd@cloudera.com>
    To: dev@hbase.apache.org; lars hofhansl <
    lhofhansl@yahoo.com
    Cc:
    Sent: Tuesday, May 1, 2012 9:35 PM
    Subject: Re: ANN: The third hbase 0.94.0 release
    candidate
    is
    available
    for download

    +1 from me, I took it for a spin on the local
    filesystem
    with
    some
    YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the voting
    against a
    source
    tar
    (ie an svn export) since we now produce multiple
    binaries,
    and
    it's
    easier
    to verify that a source tar matches SVN, etc.

    -Todd


    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl <
    lhofhansl@yahoo.com>
    wrote:
    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id:
    7CA45750)
    HBase 0.94 is a performance release, and there are
    some
    interesting
    new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients
    should
    work
    with
    0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x
    HBase
    up
    on
    this
    0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the doc,
    etc,
    and
    vote
    +1/-1
    by
    May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Enis Söztutar at May 7, 2012 at 8:43 pm
    Elliot, any plan on running the same on 0.90.x?

    Enis
    On Mon, May 7, 2012 at 11:07 AM, Elliott Clark wrote:

    Sorry everything is in elapsed time as reported by Elapsed time in
    milliseconds. So higher is worse.

    The standard deviation on 0.92.1 writes is 4,591,384 so Write 5 is a little
    outside of 1 std dev. Not really sure what happened on that test, but it
    does appear that PE is very noisy.
    On Mon, May 7, 2012 at 10:47 AM, Todd Lipcon wrote:

    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
    On Mon, May 7, 2012 at 10:42 AM, Elliott Clark <eclark@stumbleupon.com>
    wrote:
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
    On Fri, May 4, 2012 at 7:42 PM, Ted Yu wrote:

    0.94 also has LoadTestTool (from FB)

    I have used it to do some cluster load testing.

    Just FYI

    On Fri, May 4, 2012 at 3:14 PM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    With the cluster size that I'm testing YCSB was stressing the client
    machine more than the cluster. I was saturating the network of the
    test
    machine. So I switched over to pe; while it doesn't have a
    realistic
    work
    load it is better than nothing.
    On Fri, May 4, 2012 at 3:07 PM, Ted Yu wrote:

    Thanks for the update, Elliot.

    If I read your post correctly, you're using PE. ycsb is better
    measuring
    performance, from my experience.

    Cheers

    On Fri, May 4, 2012 at 3:04 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    So I got 94.0rc3 up on a cluster and tried to break it, Killing
    masters
    and
    killing rs. Everything seems good. hbck reports everything is
    good.
    And
    all my reads succeed.

    I'll post cluster benchmark numbers once they are done running.
    Should
    only be a couple more hours of pe runs.

    Looks great to me.
    On Thu, May 3, 2012 at 10:36 AM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I agree it was just a micro benchmark with no guarantee that
    it
    relates
    to
    real world. With it just being standalone I didn't think
    anyone
    should
    take
    the numbers as 100% representative. Really I was just trying
    to
    shake
    out
    any weird behaviors and the fact that we got a big speed up
    was
    interesting.


    On Thu, May 3, 2012 at 12:15 AM, Mikael Sitruk <
    mikael.sitruk@gmail.com
    wrote:
    Hi guys
    Looking at the posted slide/pictures for the benchmark the
    following intriguing me:
    1. The recordcount is only 100,000
    2. workoloada is: read 50%, update 50% and zipfian
    distribution
    even
    with
    5M operations count, the same keys are updated again and
    again.
    3. heap size 10G

    Therefore it might be that the dataset is too small (even
    with
    3
    versions
    configured we have = 3(version)*100,000(keys)*1KB (size of
    record) =
    300
    MB
    of "live" dataset ?
    And approximately the number of store files will be 5x10^6
    (op
    count)*1KB(record size)/256MB(max store file size
    (Default))=>20
    store
    file, even taking factor of 10 for metadata (record key, in
    store
    files)
    we
    will get 200 files.
    if a major compaction is running it will shrink all the
    storefile
    to a
    single small one.
    What I try to say is - if the maths are correct - (please
    note
    that
    i
    did
    not take into account compression which just make things
    better),
    can
    we
    relate on such scenario for performance benchmark with such
    small
    dataset
    and such distribution?

    Regards
    Mikael.S

    On Thu, May 3, 2012 at 1:10 AM, Ted Yu <yuzhihong@gmail.com>
    wrote:
    I am surprised to see 0.92.1 exhibit such unfavorable
    performance
    profile.
    Let's see whether cluster testing gives us similar results.

    On Wed, May 2, 2012 at 3:07 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
    On Wed, May 2, 2012 at 3:01 PM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    Elliot:
    Thanks for the report.
    Can you publish results somewhere else ?
    Attachments were stripped off.

    On Wed, May 2, 2012 at 2:59 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I ran some tests of local filesystem YCSB. I used the
    0.90
    client
    for
    0.90.6. For the rest of the tests I used 0.92
    clients.
    The
    results
    are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance
    improvement.
    I'll
    run
    some
    tests on a cluster later today.

    On Tue, May 1, 2012 at 10:20 PM, lars hofhansl <
    lhofhansl@yahoo.com
    wrote:
    Thanks Todd.

    I agree with doing source code releases going
    forward.
    For that, would it be sufficient to just vote
    against
    an
    SVN
    tag?
    Tarballs can then be pulled straight from that tag.

    -- Lars



    ----- Original Message -----
    From: Todd Lipcon <todd@cloudera.com>
    To: dev@hbase.apache.org; lars hofhansl <
    lhofhansl@yahoo.com
    Cc:
    Sent: Tuesday, May 1, 2012 9:35 PM
    Subject: Re: ANN: The third hbase 0.94.0 release
    candidate
    is
    available
    for download

    +1 from me, I took it for a spin on the local
    filesystem
    with
    some
    YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the voting
    against a
    source
    tar
    (ie an svn export) since we now produce multiple
    binaries,
    and
    it's
    easier
    to verify that a source tar matches SVN, etc.

    -Todd


    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl <
    lhofhansl@yahoo.com>
    wrote:
    The third 0.94.0 RC is available for download
    here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key
    id:
    7CA45750)
    HBase 0.94 is a performance release, and there are
    some
    interesting
    new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients
    should
    work
    with
    0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x
    HBase
    up
    on
    this
    0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the doc,
    etc,
    and
    vote
    +1/-1
    by
    May 8th on whether we should release this as
    0.94.0.
    Thanks.

    -- Lars


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Elliott Clark at May 8, 2012 at 5:43 pm
    Probably not. The numbers on the cluster are so close that it looks like
    the only large differences in perf were on standalone installs. Since every
    place that talks about standalone also says it's not to be used as a basis
    for performance evaluation, I think things are fine. 0.94 looks great and I
    would +1 it if I had a vote.

    Also, testing on 0.90 is harder to get set up; there's no presplit in PE,
    so I'm not 100% sure that the numbers would be reliable.
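    For context, pre-splitting can also be done by hand through the admin API
    when the benchmark tool can't do it. A minimal Java sketch, assuming a
    hypothetical table "TestTable" with a single family "info" split into ten
    regions over a zero-padded numeric key space; all names and split points
    here are illustrative, not taken from the test runs above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PresplitTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor("TestTable"); // hypothetical name
        desc.addFamily(new HColumnDescriptor("info"));             // hypothetical family
        // Nine split points yield ten regions over keys 0000000000 .. 9999999999.
        byte[][] splits = new byte[9][];
        for (int i = 0; i < splits.length; i++) {
          splits[i] = Bytes.toBytes(String.format("%010d", (i + 1) * 1000000000L));
        }
        admin.createTable(desc, splits);
        admin.close();
      }
    }

    Pre-splitting up front keeps the early part of a write benchmark from being
    dominated by region splits.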
    On Mon, May 7, 2012 at 1:43 PM, Enis Söztutar wrote:

    Elliot, any plan on running the same on 0.90.x?

    Enis

    On Mon, May 7, 2012 at 11:07 AM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    Sorry everything is in elapsed time as reported by Elapsed time in
    milliseconds. So higher is worse.

    The standard deviation on 0.92.1 writes is 4,591,384 so Write 5 is a little
    outside of 1 std dev. Not really sure what happened on that test, but it
    does appear that PE is very noisy.
    On Mon, May 7, 2012 at 10:47 AM, Todd Lipcon wrote:

    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
    On Mon, May 7, 2012 at 10:42 AM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
    On Fri, May 4, 2012 at 7:42 PM, Ted Yu wrote:

    0.94 also has LoadTestTool (from FB)

    I have used it to do some cluster load testing.

    Just FYI

    On Fri, May 4, 2012 at 3:14 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    With the cluster size that I'm testing YCSB was stressing the
    client
    machine more than the cluster. I was saturating the network of
    the
    test
    machine. So I switched over to pe; while it doesn't have a
    realistic
    work
    load it is better than nothing.
    On Fri, May 4, 2012 at 3:07 PM, Ted Yu wrote:

    Thanks for the update, Elliot.

    If I read your post correctly, you're using PE. ycsb is better
    measuring
    performance, from my experience.

    Cheers

    On Fri, May 4, 2012 at 3:04 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    So I got 94.0rc3 up on a cluster and tried to break it,
    Killing
    masters
    and
    killing rs. Everything seems good. hbck reports everything is
    good.
    And
    all my reads succeed.

    I'll post cluster benchmark numbers once they are done
    running.
    Should
    only be a couple more hours of pe runs.

    Looks great to me.
    On Thu, May 3, 2012 at 10:36 AM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I agree it was just a micro benchmark with no guarantee that
    it
    relates
    to
    real world. With it just being standalone I didn't think
    anyone
    should
    take
    the numbers as 100% representative. Really I was just
    trying
    to
    shake
    out
    any weird behaviors and the fact that we got a big speed up
    was
    interesting.


    On Thu, May 3, 2012 at 12:15 AM, Mikael Sitruk <
    mikael.sitruk@gmail.com
    wrote:
    Hi guys
    Looking at the posted slide/pictures for the benchmark the
    following intriguing me:
    1. The recordcount is only 100,000
    2. workoloada is: read 50%, update 50% and zipfian
    distribution
    even
    with
    5M operations count, the same keys are updated again and
    again.
    3. heap size 10G

    Therefore it might be that the dataset is too small (even
    with
    3
    versions
    configured we have = 3(version)*100,000(keys)*1KB (size of
    record) =
    300
    MB
    of "live" dataset ?
    And approximately the number of store files will be 5x10^6
    (op
    count)*1KB(record size)/256MB(max store file size
    (Default))=>20
    store
    file, even taking factor of 10 for metadata (record key, in
    store
    files)
    we
    will get 200 files.
    if a major compaction is running it will shrink all the
    storefile
    to a
    single small one.
    What I try to say is - if the maths are correct - (please
    note
    that
    i
    did
    not take into account compression which just make things
    better),
    can
    we
    relate on such scenario for performance benchmark with such
    small
    dataset
    and such distribution?

    Regards
    Mikael.S

    On Thu, May 3, 2012 at 1:10 AM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    I am surprised to see 0.92.1 exhibit such unfavorable
    performance
    profile.
    Let's see whether cluster testing gives us similar
    results.
    On Wed, May 2, 2012 at 3:07 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
    On Wed, May 2, 2012 at 3:01 PM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    Elliot:
    Thanks for the report.
    Can you publish results somewhere else ?
    Attachments were stripped off.

    On Wed, May 2, 2012 at 2:59 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I ran some tests of local filesystem YCSB. I used
    the
    0.90
    client
    for
    0.90.6. For the rest of the tests I used 0.92
    clients.
    The
    results
    are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance
    improvement.
    I'll
    run
    some
    tests on a cluster later today.

    On Tue, May 1, 2012 at 10:20 PM, lars hofhansl <
    lhofhansl@yahoo.com
    wrote:
    Thanks Todd.

    I agree with doing source code releases going
    forward.
    For that, would it be sufficient to just vote
    against
    an
    SVN
    tag?
    Tarballs can then be pulled straight from that
    tag.
    -- Lars



    ----- Original Message -----
    From: Todd Lipcon <todd@cloudera.com>
    To: dev@hbase.apache.org; lars hofhansl <
    lhofhansl@yahoo.com
    Cc:
    Sent: Tuesday, May 1, 2012 9:35 PM
    Subject: Re: ANN: The third hbase 0.94.0 release
    candidate
    is
    available
    for download

    +1 from me, I took it for a spin on the local
    filesystem
    with
    some
    YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the
    voting
    against a
    source
    tar
    (ie an svn export) since we now produce multiple
    binaries,
    and
    it's
    easier
    to verify that a source tar matches SVN, etc.

    -Todd


    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl <
    lhofhansl@yahoo.com>
    wrote:
    The third 0.94.0 RC is available for download
    here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key
    id:
    7CA45750)
    HBase 0.94 is a performance release, and there
    are
    some
    interesting
    new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients
    should
    work
    with
    0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x
    HBase
    up
    on
    this
    0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the
    doc,
    etc,
    and
    vote
    +1/-1
    by
    May 8th on whether we should release this as
    0.94.0.
    Thanks.

    -- Lars


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Lars hofhansl at May 8, 2012 at 8:22 pm
    Hmm... So our "performance release" is slightly slower than 0.92.
    With all the optimizations that went into 0.94 I find that a bit hard to believe.

    Can you tell us more about the testing? How many machines, setup, was that test IO or CPU bound, etc?
    Anything else of note?

    Thanks for doing this!

    -- Lars

    ________________________________
    From: Elliott Clark <eclark@stumbleupon.com>
    To: dev@hbase.apache.org
    Sent: Monday, May 7, 2012 11:07 AM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download

    Sorry everything is in elapsed time as reported by Elapsed time in
    milliseconds.  So higher is worse.

    The standard deviation on 0.92.1 writes is 4,591,384 so Write 5 is a little
    outside of 1 std dev.  Not really sure what happened on that test, but it
    does appear that PE is very noisy.
    On Mon, May 7, 2012 at 10:47 AM, Todd Lipcon wrote:

    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
    On Mon, May 7, 2012 at 10:42 AM, Elliott Clark wrote:
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
    On Fri, May 4, 2012 at 7:42 PM, Ted Yu wrote:

    0.94 also has LoadTestTool (from FB)

    I have used it to do some cluster load testing.

    Just FYI

    On Fri, May 4, 2012 at 3:14 PM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    With the cluster size that I'm testing YCSB was stressing the client
    machine more than the cluster.  I was saturating the network of the
    test
    machine.  So I switched over to pe; while it doesn't have a realistic work
    load it is better than nothing.
    On Fri, May 4, 2012 at 3:07 PM, Ted Yu wrote:

    Thanks for the update, Elliot.

    If I read your post correctly, you're using PE. ycsb is better
    measuring
    performance, from my experience.

    Cheers

    On Fri, May 4, 2012 at 3:04 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    So I got 94.0rc3 up on a cluster and tried to break it, Killing
    masters
    and
    killing rs.  Everything seems good. hbck reports everything is
    good.
    And
    all my reads succeed.

    I'll post cluster benchmark numbers once they are done running.
    Should
    only be a couple more hours of pe runs.

    Looks great to me.
    On Thu, May 3, 2012 at 10:36 AM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I agree it was just a micro benchmark with no guarantee that it
    relates
    to
    real world. With it just being standalone I didn't think anyone
    should
    take
    the numbers as 100% representative.  Really I was just trying to
    shake
    out
    any weird behaviors and the fact that we got a big speed up was
    interesting.


    On Thu, May 3, 2012 at 12:15 AM, Mikael Sitruk <
    mikael.sitruk@gmail.com
    wrote:
    Hi guys
    Looking at the posted slide/pictures for the benchmark the
    following intriguing me:
    1. The recordcount is only 100,000
    2. workoloada is: read 50%, update 50% and zipfian distribution
    even
    with
    5M operations count, the same keys are updated again and again.
    3. heap size 10G

    Therefore it might be that the dataset is too small (even with
    3
    versions
    configured we have = 3(version)*100,000(keys)*1KB (size of
    record) =
    300
    MB
    of "live" dataset ?
    And approximately the number of store files will be 5x10^6 (op
    count)*1KB(record size)/256MB(max store file size
    (Default))=>20
    store
    file, even taking factor of 10 for metadata (record key, in
    store
    files)
    we
    will get 200 files.
    if a major compaction is running it will shrink all the
    storefile
    to a
    single small one.
    What I try to say is - if the maths are correct - (please note
    that
    i
    did
    not take into account compression which just make things
    better),
    can
    we
    relate on such scenario for performance benchmark with such
    small
    dataset
    and such distribution?

    Regards
    Mikael.S

    On Thu, May 3, 2012 at 1:10 AM, Ted Yu <yuzhihong@gmail.com>
    wrote:
    I am surprised to see 0.92.1 exhibit such unfavorable
    performance
    profile.
    Let's see whether cluster testing gives us similar results.

    On Wed, May 2, 2012 at 3:07 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
    On Wed, May 2, 2012 at 3:01 PM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    Elliot:
    Thanks for the report.
    Can you publish results somewhere else ?
    Attachments were stripped off.

    On Wed, May 2, 2012 at 2:59 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I ran some tests of local filesystem YCSB. I used the
    0.90
    client
    for
    0.90.6.  For the rest of the tests I used 0.92 clients.
    The
    results
    are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance
    improvement.
    I'll
    run
    some
    tests on a cluster later today.

    On Tue, May 1, 2012 at 10:20 PM, lars hofhansl <
    lhofhansl@yahoo.com
    wrote:
    Thanks Todd.

    I agree with doing source code releases going forward.

    For that, would it be sufficient to just vote against
    an
    SVN
    tag?
    Tarballs can then be pulled straight from that tag.

    -- Lars



    ----- Original Message -----
    From: Todd Lipcon <todd@cloudera.com>
    To: dev@hbase.apache.org; lars hofhansl <
    lhofhansl@yahoo.com
    Cc:
    Sent: Tuesday, May 1, 2012 9:35 PM
    Subject: Re: ANN: The third hbase 0.94.0 release
    candidate
    is
    available
    for download

    +1 from me, I took it for a spin on the local
    filesystem
    with
    some
    YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the voting
    against a
    source
    tar
    (ie an svn export) since we now produce multiple
    binaries,
    and
    it's
    easier
    to verify that a source tar matches SVN, etc.

    -Todd


    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl <
    lhofhansl@yahoo.com>
    wrote:
    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id:
    7CA45750)
    HBase 0.94 is a performance release, and there are
    some
    interesting
    new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients
    should
    work
    with
    0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x
    HBase
    up
    on
    this
    0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the doc,
    etc,
    and
    vote
    +1/-1
    by
    May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Elliott Clark at May 8, 2012 at 9:47 pm
    14 machines


    1 Master (nn 2nn jt hmaster)
    13 slaves (dn tt rs) -> 4 HDDs each, with 2x quad-core Intel w/HT

    Sampling of Configs:
    -Xmx10G -XX:CMSInitiatingOccupancyFraction=75 -XX:NewSize=256m
    -XX:MaxNewSize=256m

    hbase.hregion.memstore.flush.size = 2147483648
    hbase.hregion.max.filesize = 2147483648
    hbase.rpc.compression = snappy
    dfs.client.read.shortcircuit = true
    hbase.ipc.client.tcpnodelay = false

    mapred.tasktracker.map.tasks.maximum = 17

    The commands run to create the tables and run the tests should be in the
    previous sheets. It seems like the PerformanceEvaluation tests are pretty
    noisy, so I wouldn't trust the smaller runs on page 1; that's why I did the
    larger runs on page 2.
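    For anyone reproducing a similar setup, a minimal sketch of the same
    HBase/MR properties applied to a Configuration in Java; in practice these
    would live in hbase-site.xml / mapred-site.xml (and the JVM flags in
    hbase-env.sh), and the values below are simply the ones listed above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class BenchConf {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Same values as listed above; 2147483648 bytes is 2 GB.
        conf.setLong("hbase.hregion.memstore.flush.size", 2147483648L);
        conf.setLong("hbase.hregion.max.filesize", 2147483648L);
        conf.set("hbase.rpc.compression", "snappy");
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        conf.setBoolean("hbase.ipc.client.tcpnodelay", false);
        conf.setInt("mapred.tasktracker.map.tasks.maximum", 17);
        System.out.println("memstore flush size: "
            + conf.getLong("hbase.hregion.memstore.flush.size", -1));
      }
    }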
    On Tue, May 8, 2012 at 1:21 PM, lars hofhansl wrote:

    Hmm... So our "performance release" is slightly slower than 0.92.
    With all the optimizations that went into 0.94 I find that a bit hard to
    believe.

    Can you tell us more about the testing? How many machines, setup, was that
    test IO or CPU bound, etc?
    Anything else of note?

    Thanks for doing this!

    -- Lars

    ________________________________
    From: Elliott Clark <eclark@stumbleupon.com>
    To: dev@hbase.apache.org
    Sent: Monday, May 7, 2012 11:07 AM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Sorry everything is in elapsed time as reported by Elapsed time in
    milliseconds. So higher is worse.

    The standard deviation on 0.92.1 writes is 4,591,384 so Write 5 is a little
    outside of 1 std dev. Not really sure what happened on that test, but it
    does appear that PE is very noisy.
    On Mon, May 7, 2012 at 10:47 AM, Todd Lipcon wrote:

    Is higher better or worse? :) Any idea what happened on the "Write 5" test?
    On Mon, May 7, 2012 at 10:42 AM, Elliott Clark <eclark@stumbleupon.com>
    wrote:
    http://www.scribd.com/eclark847297/d/92715238-0-94-0-RC3-Cluster-Perf
    On Fri, May 4, 2012 at 7:42 PM, Ted Yu wrote:

    0.94 also has LoadTestTool (from FB)

    I have used it to do some cluster load testing.

    Just FYI

    On Fri, May 4, 2012 at 3:14 PM, Elliott Clark <eclark@stumbleupon.com
    wrote:
    With the cluster size that I'm testing YCSB was stressing the client
    machine more than the cluster. I was saturating the network of the
    test
    machine. So I switched over to pe; while it doesn't have a
    realistic
    work
    load it is better than nothing.
    On Fri, May 4, 2012 at 3:07 PM, Ted Yu wrote:

    Thanks for the update, Elliot.

    If I read your post correctly, you're using PE. ycsb is better
    measuring
    performance, from my experience.

    Cheers

    On Fri, May 4, 2012 at 3:04 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    So I got 94.0rc3 up on a cluster and tried to break it, Killing
    masters
    and
    killing rs. Everything seems good. hbck reports everything is
    good.
    And
    all my reads succeed.

    I'll post cluster benchmark numbers once they are done running.
    Should
    only be a couple more hours of pe runs.

    Looks great to me.
    On Thu, May 3, 2012 at 10:36 AM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I agree it was just a micro benchmark with no guarantee that
    it
    relates
    to
    real world. With it just being standalone I didn't think
    anyone
    should
    take
    the numbers as 100% representative. Really I was just trying
    to
    shake
    out
    any weird behaviors and the fact that we got a big speed up
    was
    interesting.


    On Thu, May 3, 2012 at 12:15 AM, Mikael Sitruk <
    mikael.sitruk@gmail.com
    wrote:
    Hi guys
    Looking at the posted slide/pictures for the benchmark the
    following intriguing me:
    1. The recordcount is only 100,000
    2. workoloada is: read 50%, update 50% and zipfian
    distribution
    even
    with
    5M operations count, the same keys are updated again and
    again.
    3. heap size 10G

    Therefore it might be that the dataset is too small (even
    with
    3
    versions
    configured we have = 3(version)*100,000(keys)*1KB (size of
    record) =
    300
    MB
    of "live" dataset ?
    And approximately the number of store files will be 5x10^6
    (op
    count)*1KB(record size)/256MB(max store file size
    (Default))=>20
    store
    file, even taking factor of 10 for metadata (record key, in
    store
    files)
    we
    will get 200 files.
    if a major compaction is running it will shrink all the
    storefile
    to a
    single small one.
    What I try to say is - if the maths are correct - (please
    note
    that
    i
    did
    not take into account compression which just make things
    better),
    can
    we
    relate on such scenario for performance benchmark with such
    small
    dataset
    and such distribution?

    Regards
    Mikael.S

    On Thu, May 3, 2012 at 1:10 AM, Ted Yu <yuzhihong@gmail.com>
    wrote:
    I am surprised to see 0.92.1 exhibit such unfavorable
    performance
    profile.
    Let's see whether cluster testing gives us similar results.

    On Wed, May 2, 2012 at 3:07 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    Sure, sorry about that.

    http://imgur.com/waxlS
    http://www.scribd.com/eclark847297/d/92151092-Hbase-0-94-0-RC3-Local-YCSB-Perf
    On Wed, May 2, 2012 at 3:01 PM, Ted Yu <
    yuzhihong@gmail.com>
    wrote:
    Elliot:
    Thanks for the report.
    Can you publish results somewhere else ?
    Attachments were stripped off.

    On Wed, May 2, 2012 at 2:59 PM, Elliott Clark <
    eclark@stumbleupon.com
    wrote:
    I ran some tests of local filesystem YCSB. I used the
    0.90
    client
    for
    0.90.6. For the rest of the tests I used 0.92
    clients.
    The
    results
    are
    attached.

    0.90 -> 0.94.0RC3 13% faster
    0.92 -> 0.94.0RC3 50% faster

    This seems to be a pretty large performance
    improvement.
    I'll
    run
    some
    tests on a cluster later today.

    On Tue, May 1, 2012 at 10:20 PM, lars hofhansl <
    lhofhansl@yahoo.com
    wrote:
    Thanks Todd.

    I agree with doing source code releases going
    forward.
    For that, would it be sufficient to just vote
    against
    an
    SVN
    tag?
    Tarballs can then be pulled straight from that tag.

    -- Lars



    ----- Original Message -----
    From: Todd Lipcon <todd@cloudera.com>
    To: dev@hbase.apache.org; lars hofhansl <
    lhofhansl@yahoo.com
    Cc:
    Sent: Tuesday, May 1, 2012 9:35 PM
    Subject: Re: ANN: The third hbase 0.94.0 release
    candidate
    is
    available
    for download

    +1 from me, I took it for a spin on the local
    filesystem
    with
    some
    YCSB
    load.

    Here is my signature on the non-secure tarball.


    I didn't check out the secure tarball.

    I think for future releases we should do the voting
    against a
    source
    tar
    (ie an svn export) since we now produce multiple
    binaries,
    and
    it's
    easier
    to verify that a source tar matches SVN, etc.

    -Todd


    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl <
    lhofhansl@yahoo.com>
    wrote:
    The third 0.94.0 RC is available for download
    here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key
    id:
    7CA45750)
    HBase 0.94 is a performance release, and there are
    some
    interesting
    new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients
    should
    work
    with
    0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x
    HBase
    up
    on
    this
    0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the doc,
    etc,
    and
    vote
    +1/-1
    by
    May 8th on whether we should release this as
    0.94.0.
    Thanks.

    -- Lars


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Lars hofhansl at May 10, 2012 at 12:57 am
    Gentle reminder to please provide your votes.
    (and actually this is the 4th RC, rather than the third)

    -- Lars
    ________________________________
    From: lars hofhansl <lhofhansl@yahoo.com>
    To: hbase-dev <dev@hbase.apache.org>
    Sent: Tuesday, May 1, 2012 4:26 PM
    Subject: ANN: The third hbase 0.94.0 release candidate is available for download

    The third 0.94.0 RC is available for download here: http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

    HBase 0.94 is a performance release, and there are some interesting new features as well.

    It is wire compatible with 0.92.x. 0.92 clients should work with 0.94 servers and vice versa.

    You can do a rolling restart to get your 0.92.x HBase up on this 0.94.0RC.

    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419

    Please take this RC for a spin, check out the doc, etc, and vote +1/-1 by May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars
  • Ramkrishna.S.Vasudevan at May 10, 2012 at 3:46 pm
    Hi Devs

    I discussed this with Lars and thought it would be better to get the opinion
    of the dev list. It is regarding the 0.94 RC.

    HBASE-5964 seems to be needed if we want to run 0.94.0 with the latest hadoop
    2.0. The Guava jar related issue HBASE-5955 may also need to be addressed,
    i.e. the HBASE-5739 patch backport is needed to solve that problem. All this
    happens if we take the latest hadoop 2.0. A slightly older version of hadoop
    2.0 may not cause this problem.
    But the older version has a file handle leak on the DN side, for which Todd
    has provided a fix on the hadoop side; refer to HDFS-3359. We can easily
    reproduce this problem if we have a table with multiple column families and
    frequent flushes, and it impacts HBase heavily.

    So we felt all these issues need to go into the 0.94.0 release, as we claim
    0.94 supports 0.23.

    What do you think? If you feel this is needed then we might need to go with
    one more RC, sinking the current one.

    Regards
    Ram

    -----Original Message-----
    From: lars hofhansl
    Sent: Thursday, May 10, 2012 6:27 AM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Gentle reminder to please provide your votes.
    (and actually this is 4th RC, rather than the third)

    -- Lars
    ________________________________
    From: lars hofhansl <lhofhansl@yahoo.com>
    To: hbase-dev <dev@hbase.apache.org>
    Sent: Tuesday, May 1, 2012 4:26 PM
    Subject: ANN: The third hbase 0.94.0 release candidate is available for
    download

    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

    HBase 0.94 is a performance release, and there are some interesting new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients should work with 0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x HBase up on this
    0.94.0RC.

    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=123107
    53&version=12316419

    Please take this RC for a spin, check out the doc, etc, and vote +1/-1
    by May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars
  • Andrew Purtell at May 10, 2012 at 4:33 pm
    In my opinion this can be handled in a point release. It's important but not critical, and it's not new functionality.

    Best regards,

    - Andy

    On May 10, 2012, at 8:41 AM, "Ramkrishna.S.Vasudevan" wrote:

    Hi Devs

    I discussed this with Lars, thought it would be better to get the opinion of
    the dev list. It is regarding the 0.94 RC

    HBASE-5964 seems to be needed if we want to run with 0.94.0 and hadoop 2.0
    (latest). Also the Guava jar related issue HBASE-5955 may also be needed
    i.e HBASE-5739 patch backport is needed to solve that problem. All this
    happens if we take the latest hadoop 2.0. A little older version of hadoop
    2.0 may not cause this problem.
    But older version has a filehandler leak in DN side for which Todd has
    provided fix in hadoop side. Refer to HDFS-3359. We can easily reproduce
    this problem if we have a multiple column family table and frequent flushes
    happen and it impacts HBase heavily.

    So we felt, all these issue needs to go in 0.94.0 release as we claim 0.94
    supports 0.23.

    What do you think? If you feel this is needed then we might need to go with
    one more RC sinking the current one.

    Regards
    Ram

    -----Original Message-----
    From: lars hofhansl
    Sent: Thursday, May 10, 2012 6:27 AM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Gentle reminder to please provide your votes.
    (and actually this is 4th RC, rather than the third)

    -- Lars
    ________________________________
    From: lars hofhansl <lhofhansl@yahoo.com>
    To: hbase-dev <dev@hbase.apache.org>
    Sent: Tuesday, May 1, 2012 4:26 PM
    Subject: ANN: The third hbase 0.94.0 release candidate is available for
    download

    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

    HBase 0.94 is a performance release, and there are some interesting new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients should work with 0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x HBase up on this
    0.94.0RC.

    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=123107
    53&version=12316419

    Please take this RC for a spin, check out the doc, etc, and vote +1/-1
    by May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars
  • Todd Lipcon at May 10, 2012 at 5:02 pm

    On Thu, May 10, 2012 at 8:41 AM, Ramkrishna.S.Vasudevan wrote:
    But older version has a filehandler leak in DN side for which Todd has
    provided fix in hadoop side. Refer to HDFS-3359.  We can easily reproduce
    this problem if we have a multiple column family table and frequent flushes
    happen and it impacts HBase heavily.
    Have you seen actual impact from this? Or just a lot of xceivers? We
    noticed and fixed the bug because we saw a lot of xceivers at a
    customer site, but it turned out that the problem they were
    experiencing was actually due to a different issue. The high number of
    xceivers is "unhealthy" but didn't cause errors/downtime.

    -Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Rama krishna at May 10, 2012 at 5:18 pm
    Hi
    I will let you know exactly which build produced that problem. When we upgraded to the build of May 8th or May 7th (the exact date I forgot) we did not get that problem.
    But one thing is that flushes were very frequent and the system was heavily loaded.
    Regards
    Ram
    From: todd@cloudera.com
    Date: Thu, 10 May 2012 10:01:20 -0700
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download
    To: dev@hbase.apache.org
    CC: lhofhansl@yahoo.com; ram_krish_86@hotmail.com

    On Thu, May 10, 2012 at 8:41 AM, Ramkrishna.S.Vasudevan
    wrote:
    But older version has a filehandler leak in DN side for which Todd has
    provided fix in hadoop side. Refer to HDFS-3359. We can easily reproduce
    this problem if we have a multiple column family table and frequent flushes
    happen and it impacts HBase heavily.
    Have you seen actual impact from this? Or just a lot of xceivers? We
    noticed and fixed the bug because we saw a lot of xceivers at a
    customer site, but it turned out that the problem they were
    experiencing was actually due to a different issue. The high number of
    xceivers is "unhealthy" but didn't cause errors/downtime.

    -Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Michael Stack at May 11, 2012 at 4:16 am

    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl wrote:
    The third 0.94.0 RC is available for download here: http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)
    I'm +1 on this RC going out as 0.94.0. On the hadoop 2.0.0 issues,
    I'm w/ Andy that we can fix in a point release.

    I loaded a small cluster of 5 nodes running the tip of 0.92 branch
    with 10B rows using the goraci tool (so via gora). I verified that
    there were no dangling references with the goraci Verify step. I then
    killed the master and stood up the 0.94.0rc3 master and reran the
    Verify step (a full scan of the table). Did some other poking around
    to make sure all was good w/ a 0.94.0 master in a 0.92 cluster. I
    killed a regionserver and brought up a 0.94.0 and made sure all was
    good. I stopped the cluster and brought it all up 0.94.0. I'd left a
    Verify running during the restart and the MR job completed anyways.
    The Verify step on a clean 0.94.0, a big scan, seems to have run to
    completion in about 15% less time than did 0.92. I checked out the
    logs. Nothing untoward.

    St.Ack
  • Mikael Sitruk at May 11, 2012 at 5:05 am
    Stack, do you have a latency graph from the time the RS and HMaster were
    down? (Did you see a big variance in latency?)
    BTW, is this test an MR/scan test or do you also have updates and deletes?

    Thanks
    Mikael.S
    On Fri, May 11, 2012 at 7:15 AM, Stack wrote:
    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl wrote:
    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)
    I'm +1 on this RC going out as 0.94.0. On the hadoop 2.0.0 issues,
    I'm w/ Andy that we can fix in a point release.

    I loaded a small cluster of 5 nodes running the tip of 0.92 branch
    with 10B rows using the goraci tool (so via gora). I verified that
    there no dangling references with the goraci Verify step. I then
    killed the master and stood up the 0.94.0rc3 master and reran the
    Verify step (a full scan of the table). Did some other poking around
    to make sure all was good w/ a 0.94.0 master in a 0.92 cluster. I
    killed a regionserver and brought up a 0.94.0 and made sure all was
    good. I stopped the cluster and brought it all up 0.94.0. I'd left a
    Verify running during the restart and the MR job completed anyways.
    The Verify step on a clean 0.94.0, a big scan, seems to have run to
    completion in about 15% less time than did 0.92. I checked out the
    logs. Nothing untoward.

    St.Ack
  • Michael Stack at May 11, 2012 at 5:18 am

    On Thu, May 10, 2012 at 10:04 PM, Mikael Sitruk wrote:
    Stack do you have latency graph during the time the RS and HMaster were
    down? (did you see a big variance in latency)?
    Not sure I follow Mikael. The master is not in the read/write path so
    its restart wouldn't impinge on latency.

    If you are asking about the cluster restart underneath the mapreduce
    job, yes, latency went off the charts since no reads were being served
    while the cluster being rebooted.

    (Pardon me if I am misunderstanding your question)

    BTW this test is a MR/scan test or you also have update and delete?
    It's just a scan (via gora).

    I have not done evaluation beyond what I described. What would you
    like to see Mikael?

    St.Ack
  • Mikael Sitruk at May 11, 2012 at 10:25 am
    Stack hi

    Sorry for not being precise enough.
    The point is that I'm trying to check the impact of HA scenarios; one of
    them is when the master goes down.
    It is true that the Master is not in the critical path of read/write
    unless (please correct me if I'm wrong):
    1. new clients are trying to connect
    2. a split/merge occurs
    3. another node fails.
    So in case the master goes down and starts back up, I'm interested to
    understand how long the system will be unavailable (under the scenario
    above).

    For the second case, where a RS is down, there is certainly an impact on
    performance/latency since other RSs now need to handle the regions of the RS
    that is down.
    Again, I'm interested to understand the impact on performance/latency of the
    system under failure of a component, for get/put and scan.

    I hope it is clearer and makes sense to you.
    BTW, if you have such graphs can you post them?

    Thanks
    Mikael.S

    On Fri, May 11, 2012 at 8:18 AM, Stack wrote:
    On Thu, May 10, 2012 at 10:04 PM, Mikael Sitruk wrote:
    Stack do you have latency graph during the time the RS and HMaster were
    down? (did you see a big variance in latency)?
    Not sure I follow Mikael. The master is not in the read/write path so
    its restart wouldn't impinge on latency.

    If you are asking about the cluster restart underneath the mapreduce
    job, yes, latency went off the charts since no reads were being served
    while the cluster being rebooted.

    (Pardon me if I am misunderstanding your question)

    BTW this test is a MR/scan test or you also have update and delete?
    Its just a scan (via gora).

    I have not done evaluation beyond what I described. What would you
    like to see Mikael?

    St.Ack
  • Michael Stack at May 12, 2012 at 5:03 am

    On Fri, May 11, 2012 at 3:24 AM, Mikael Sitruk wrote:
    Sorry for not being precise enough.
    The point is that i'm trying to check the impact of HA scenarios. one of
    them is when the master goes down.
    That is true that the Master is not it the critical path of read/write
    unless (please correct me if i'm wrong):
    1. new client are trying to connect
    Clients don't go to the master, not unless they are trying to do
    administrative ops.
    2. split/merge occurs
    3. another node fails.
    If master is down, these events are not processed. On reboot of
    master, it'll finish the processing of these event types.

    So in case the master goes down and start back, i'm interested to
    understand how long of unavailability the system will be (under the
    scenario above)
    For 2. from above, splits shouldn't cause off-line'd-ness (excuse the
    neologism). The regionserver edits .META. on split offlining parent
    and onlining the split daughters. It only bothers to tell the master
    about the splits so master can keep current the state of the cluster
    it keeps in its 'head' (when new master comes online, first thing it
    does is reconstitute this image).

    For item 3. above, it's the master that runs the distributed log split.
    If there is no master, there is no one to run the split, so those regions
    will be offline until a master comes online again, finds the offline server
    and runs a split of its logs.

    St.Ack
  • Mikael Sitruk at May 12, 2012 at 5:14 pm
    Thanks for the clarifications St.Ack.
    Still I have some questions regarding point 3 in the scenario discussed -
    when a region is offline, does it mean that client operations are not
    possible on it (even reads)?
    In case a second master is up (in an environment with multiple masters), I
    presume all this occurs unless the second master (slave) becomes the master,
    right? How long does it take for a "slave" master to become the master?

    Mikael.S
    On Sat, May 12, 2012 at 8:02 AM, Stack wrote:
    On Fri, May 11, 2012 at 3:24 AM, Mikael Sitruk wrote:
    Sorry for not being precise enough.
    The point is that i'm trying to check the impact of HA scenarios. one of
    them is when the master goes down.
    That is true that the Master is not it the critical path of read/write
    unless (please correct me if i'm wrong):
    1. new client are trying to connect
    Clients don't go to the master, not unless they are trying to do
    administrative ops.
    2. split/merge occurs
    3. another node fails.
    If master is down, these events are not processed. On reboot of
    master, it'll finish the processing of these event types.

    So in case the master goes down and start back, i'm interested to
    understand how long of unavailability the system will be (under the
    scenario above)
    For 2. from above, splits shouldn't cause off-line'd-ness (excuse the
    neologism). The regionserver edits .META. on split offlining parent
    and onlining the split daughters. It only bothers to tell the master
    about the splits so master can keep current the state of the cluster
    it keeps in its 'head' (when new master comes online, first thing it
    does is reconstitute this image).

    For item 3. above, its the master that runs the distributed log split.
    If no master, no one to run the split so those regions will be
    offline until a master comes online again, finds the offine server and
    runs a spit of its logs.

    St.Ack
  • Michael Stack at May 12, 2012 at 9:51 pm

    On Sat, May 12, 2012 at 10:14 AM, Mikael Sitruk wrote:
    Thanks for the clarifications St.Ack.
    Still I have some questions in regards of 3 in scenario discussed -  when a
    region is offline it means that client operation are not possible on it
    (even read)?
    Correct.
    In case a second master is up (in an environment with multiple master), i
    presume all this occurs unless the second master (slave) become the master,
    right? how long those it take for a "slave" master to become a master??
    It takes roughly seconds for the new master to assume the master role and
    to figure out the state of the cluster.

    The processing of a failed server though can take seconds, minutes, or
    even hours at an extreme where the server was running with
    pathological configs. How long to process WALs is a function of the
    number of WAL files the server was carrying in need of replay and the
    number of servers available to participate in the distributed log
    splitting affair.
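    As a rough back-of-envelope sketch of that dependence (an illustration only;
    none of these symbols or constants come from the thread):

        T_{offline} \approx T_{detect} + \frac{N_{wal} \cdot S_{wal}}{W \cdot R_{split}} + T_{replay}

    where T_{detect} is the failure-detection delay (roughly the ZooKeeper
    session timeout), N_{wal} and S_{wal} are the count and average size of the
    dead server's WAL files, W is the number of regionservers participating in
    distributed log splitting, R_{split} is the per-worker split throughput, and
    T_{replay} is the time to replay the recovered edits before the regions come
    back online.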

    Ask more questions Mikael,
    St.Ack
  • Mikael Sitruk at May 12, 2012 at 10:15 pm
    Hi St.Ack
    You asked for it :-)
    So in case a RS goes down, the master will split the log and reassign the
    regions to other RSs, then each RS will replay the log; during this step the
    regions are unavailable, and clients will get exceptions.
    1. how will the master choose a RS to assign a region to?
    2. how many RSs will be involved in this reassignment?
    3. should clients that got exceptions renew their connections, or can they
    reuse the same ones?
    4. is there a way to figure out how long this split+replay will take
    (either by a formula at deployment design time, or at runtime via an API
    asking the master, for example)?

    Thanks again
    Mikael.S

    On Sun, May 13, 2012 at 12:50 AM, Stack wrote:
    On Sat, May 12, 2012 at 10:14 AM, Mikael Sitruk wrote:
    Thanks for the clarifications St.Ack.
    Still I have some questions in regards of 3 in scenario discussed - when a
    region is offline it means that client operation are not possible on it
    (even read)?
    Correct.
    In case a second master is up (in an environment with multiple master), i
    presume all this occurs unless the second master (slave) become the master,
    right? how long those it take for a "slave" master to become a master??
    It takes seconds roughly for new master to assume master role and to
    figure the state of the cluster.

    The processing of a failed server though can take seconds, minutes, or
    even hours at an extreme where the server was running with
    pathological configs. How long to process WALs is a function of the
    number of WAL files the server was carrying in need of replay and the
    number of servers available to participate in the distributed log
    splitting affair.

    Ask more questions Mikael,
    St.Ack
  • Lars hofhansl at May 12, 2012 at 5:26 am
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a 0.94 point release.

    I would like to see a few more +1's before I declare this the official 0.94.0 release.

    Thanks.


    -- Lars

    ________________________________
    From: Stack <stack@duboce.net>
    To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
    Sent: Thursday, May 10, 2012 9:15 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download
    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl wrote:
    The third 0.94.0 RC is available for download here: http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)
    I'm +1 on this RC going out as 0.94.0.  On the hadoop 2.0.0 issues,
    I'm w/ Andy that we can fix in a point release.

    I loaded a small cluster of 5 nodes running the tip of 0.92 branch
    with 10B rows using the goraci tool (so via gora).  I verified that
    there were no dangling references with the goraci Verify step.  I then
    killed the master and stood up the 0.94.0rc3 master and reran the
    Verify step (a full scan of the table).  Did some other poking around
    to make sure all was good w/ a 0.94.0 master in a 0.92 cluster.  I
    killed a regionserver and brought up a 0.94.0 and made sure all was
    good.  I stopped the cluster and brought it all up 0.94.0.  I'd left a
    Verify running during the restart and the MR job completed anyways.
    The Verify step on a clean 0.94.0, a big scan, seems to have run to
    completion in about 15% less time than did 0.92.  I checked out the
    logs.  Nothing untoward.

    St.Ack
  • Michael Stack at May 12, 2012 at 5:39 am

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count? Usually the RM's does, if they +1 it. So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a 0.94 point release.

    I would like to see a few more +1's before I declare this the official 0.94.0 release.
    You might be waiting a while (smile). Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Lars hofhansl at May 13, 2012 at 5:23 pm
    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download
    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.  So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a 0.94 point release.

    I would like to see a few more +1's before I declare this the official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Ramkrishna.S.Vasudevan at May 14, 2012 at 6:21 am
    Hi
    We (this includes the test team here) tested the 0.94 RC and carried out various
    operations on it:
    Puts, Scans, and all the restart scenarios (using kill -9 as well). Even the
    encoding stuff was tested, and we carried out our basic test scenarios. It seems
    to work fine.

    We did not test a rolling restart with 0.92. By this week we may try to do some
    performance comparison with 0.92.
    Also, Lars and I agreed on a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download
    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.  So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Ramkrishna.S.Vasudevan at May 14, 2012 at 1:59 pm
    Hi

    One small observation after giving +1 on the RC.
    The WAL compression feature causes OOMEs and full GCs.

    The problem is, if we have 1500 regions, I need to create recovered.edits
    for each of the regions (I don't have much data in the regions, ~300MB).
    Now when I try to build the dictionary, there is a Node object getting
    created.
    Each Node object occupies 32 bytes.
    We have 5 such dictionaries.

    Initially we create the indexToNodes array, and its size is 32767.

    So now we have 32*5*32767 = ~5MB.

    Now I have 1500 regions.

    So 5MB*1500 = ~7GB (excluding actual data). This seems to be a very high
    initial memory footprint; it never allows me to split the logs, and I
    am not able to bring the cluster up at all.

    Our configured heap size was 8GB, tested on a 3-node cluster with 5000
    regions and very little data (1GB in the HDFS cluster including replication);
    some small data is spread evenly across all regions.

    The formula is 32 (Node object size) * 5 (number of dictionaries) * 32767 (number
    of node objects) * number of regions.
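
    [Editorial aside] The arithmetic above, spelled out (this just reproduces the
    numbers from this message; it is not code from HBase):

        // Reproduces the back-of-envelope footprint figures quoted above.
        public class WalDictionaryFootprint {
            public static void main(String[] args) {
                long nodeBytes = 32L;        // approximate size of one Node object
                long dictionaries = 5L;      // dictionaries per region being replayed
                long nodesPerDict = 32767L;  // initial size of the indexToNodes array
                long regions = 1500L;

                long perRegion = nodeBytes * dictionaries * nodesPerDict;  // ~5 MB
                long total = perRegion * regions;                          // roughly the ~7 GB quoted above
                System.out.printf("per region: %.1f MB, total: %.1f GB%n",
                        perRegion / 1e6, total / 1e9);
            }
        }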

    I think this initial memory needs to be documented (documentation should do
    for now) or has to be fixed with some workarounds.

    So pls give your thoughts on this.

    Regards
    Ram



    -----Original Message-----
    From: Ramkrishna.S.Vasudevan
    Sent: Monday, May 14, 2012 11:48 AM
    To: dev@hbase.apache.org; 'lars hofhansl'
    Subject: RE: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Hi
    We (it includes the test team here) 0.94 RC and carried out various
    operations on it.
    Puts, Scans, and all the restart scenarios (using kill -9 also). Even
    the
    encoding stuffs were tested and carried out our basic test scenarios.
    Seems
    to work fine.

    Did not test rolling restart with 0.92. By this week we may try to do
    some
    performance comparison with 0.92.
    Also Lars and I agreed for a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl <lhofhansl@yahoo.com>
    wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.  So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Ted Yu at May 14, 2012 at 2:17 pm
    Thanks for sharing this information, Ramkrishna.

    Dictionary WAL compression makes replication not functional - see details
    in https://issues.apache.org/jira/browse/HBASE-5778

    I would vote for the removal of Dictionary WAL compression until we make it
    more robust and it consumes much less memory.
    On Mon, May 14, 2012 at 6:59 AM, Ramkrishna.S.Vasudevan wrote:

    Hi

    One small observation after giving +1 on the RC.
    The WAL compression feature causes OOME and causes Full GC.

    The problem is, if we have 1500 regions and I need to create
    recovered.edits
    for each of the region (I don’t have much data in the regions (~300MB)).
    Now when I try to build the dictionary there is a Node object getting
    created.
    Each node object occupies 32 bytes.
    We have 5 such dictionaries.

    Initially we create indexToNodes array and its size is 32767.

    So now we have 32*5*32767 = ~5MB.

    Now I have 1500 regions.

    So 5MB*1500 = ~7GB.(Excluding actual data). This seems to a very high
    initial memory foot print and this never allows me to split the logs and I
    am not able to make the cluster up at all.

    Our configured heap size was 8GB, tested in 3 node cluster with 5000
    regions, very less data( 1GB in hdfs cluster including replication), some
    small data is spread evenly across all regions.

    The formula is 32(Node object size)*5(No of dictionary)*32767(no of node
    objects)*noofregions.

    I think this initial memory needs to be documented (documentation should do
    for now)or has to be fixed with some workarounds.

    So pls give your thoughts on this.

    Regards
    Ram



    -----Original Message-----
    From: Ramkrishna.S.Vasudevan
    Sent: Monday, May 14, 2012 11:48 AM
    To: dev@hbase.apache.org; 'lars hofhansl'
    Subject: RE: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Hi
    We (it includes the test team here) 0.94 RC and carried out various
    operations on it.
    Puts, Scans, and all the restart scenarios (using kill -9 also). Even
    the
    encoding stuffs were tested and carried out our basic test scenarios.
    Seems
    to work fine.

    Did not test rolling restart with 0.92. By this week we may try to do
    some
    performance comparison with 0.92.
    Also Lars and I agreed for a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl <lhofhansl@yahoo.com>
    wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count? Usually the RMs does, if they +1 it. So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile). Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Lars hofhansl at May 14, 2012 at 3:16 pm
    It's default off. I'd say we just say it's an experimental feature in the release notes.
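
    [Editorial aside] For reference, the switch in question is, to the best of my
    knowledge, the hbase.regionserver.wal.enablecompression property (off by default
    in 0.94). It is normally set in hbase-site.xml on the regionservers; the snippet
    below only shows the key programmatically to make it concrete, and is a sketch,
    not official documentation:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;

        public class WalCompressionToggle {
            public static void main(String[] args) {
                Configuration conf = HBaseConfiguration.create();
                // Defaults to false in 0.94; only enable it if you accept the caveats in this thread.
                conf.setBoolean("hbase.regionserver.wal.enablecompression", false);
                System.out.println("WAL compression enabled: "
                        + conf.getBoolean("hbase.regionserver.wal.enablecompression", false));
            }
        }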


    Are you saying we should have another RC?
    There was other stuff that went into 0.94 after I cut the RC, so that would potentially need to stabilize if I cut a new RC now.

    -- Lars

    ________________________________
    From: Ted Yu <yuzhihong@gmail.com>
    To: dev@hbase.apache.org
    Sent: Monday, May 14, 2012 7:17 AM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download

    Thanks for sharing this information, Ramkrishna.

    Dictionary WAL compression makes replication not functional - see details
    in https://issues.apache.org/jira/browse/HBASE-5778

    I would vote for the removal of Dictionary WAL compression until we make it
    more robust and consuming much less memory.
    On Mon, May 14, 2012 at 6:59 AM, Ramkrishna.S.Vasudevan wrote:

    Hi

    One small observation after giving +1 on the RC.
    The WAL compression feature causes OOME and causes Full GC.

    The problem is, if we have 1500 regions and I need to create
    recovered.edits
    for each of the region (I don’t have much data in the regions (~300MB)).
    Now when I try to build the dictionary there is a Node object getting
    created.
    Each node object occupies 32 bytes.
    We have 5 such dictionaries.

    Initially we create indexToNodes array and its size is 32767.

    So now we have 32*5*32767 = ~5MB.

    Now I have 1500 regions.

    So 5MB*1500 = ~7GB.(Excluding actual data).  This seems to a very high
    initial memory foot print and this never allows me to split the logs and I
    am not able to make the cluster up at all.

    Our configured heap size was 8GB, tested in 3 node cluster with 5000
    regions, very less data( 1GB in hdfs cluster including replication), some
    small data is spread evenly across all regions.

    The formula is 32(Node object size)*5(No of dictionary)*32767(no of node
    objects)*noofregions.

    I think this initial memory needs to be documented (documentation should do
    for now)or has to be fixed with some workarounds.

    So pls give your thoughts on this.

    Regards
    Ram



    -----Original Message-----
    From: Ramkrishna.S.Vasudevan
    Sent: Monday, May 14, 2012 11:48 AM
    To: dev@hbase.apache.org; 'lars hofhansl'
    Subject: RE: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Hi
    We (it includes the test team here) 0.94 RC and carried out various
    operations on it.
    Puts, Scans, and all the restart scenarios (using kill -9 also).  Even
    the
    encoding stuffs were tested and carried out our basic test scenarios.
    Seems
    to work fine.

    Did not test rolling restart with 0.92. By this week we may try to do
    some
    performance comparison with 0.92.
    Also Lars and I agreed for a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl <lhofhansl@yahoo.com>
    wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.  So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack
  • Todd Lipcon at May 14, 2012 at 3:18 pm

    On Mon, May 14, 2012 at 8:15 AM, lars hofhansl wrote:
    It's default off. I'd say we just say it's an experimental feature in the release notes.
    +1 for calling it experimental in notes and docs, and not removing it.
    Replication was in an experimental state for quite some time, too, and
    we didn't rip that out - I think shipping things off-by-default with
    clear labeling is one of the best ways to sand down rough edges.

    Are you saying we should have another RC?
    There was other stuff that went into 0.94 after I cut the RC, so that would potentially need to stabilize if I cut a new RC now.

    -- Lars

    ________________________________
    From: Ted Yu <yuzhihong@gmail.com>
    To: dev@hbase.apache.org
    Sent: Monday, May 14, 2012 7:17 AM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available for download

    Thanks for sharing this information, Ramkrishna.

    Dictionary WAL compression makes replication not functional - see details
    in https://issues.apache.org/jira/browse/HBASE-5778

    I would vote for the removal of Dictionary WAL compression until we make it
    more robust and consuming much less memory.

    On Mon, May 14, 2012 at 6:59 AM, Ramkrishna.S.Vasudevan <
    ramkrishna.vasudevan@huawei.com> wrote:
    Hi

    One small observation after giving +1 on the RC.
    The WAL compression feature causes OOME and causes Full GC.

    The problem is, if we have 1500 regions and I need to create
    recovered.edits
    for each of the region (I don’t have much data in the regions (~300MB)).
    Now when I try to build the dictionary there is a Node object getting
    created.
    Each node object occupies 32 bytes.
    We have 5 such dictionaries.

    Initially we create indexToNodes array and its size is 32767.

    So now we have 32*5*32767 = ~5MB.

    Now I have 1500 regions.

    So 5MB*1500 = ~7GB.(Excluding actual data).  This seems to a very high
    initial memory foot print and this never allows me to split the logs and I
    am not able to make the cluster up at all.

    Our configured heap size was 8GB, tested in 3 node cluster with 5000
    regions, very less data( 1GB in hdfs cluster including replication), some
    small data is spread evenly across all regions.

    The formula is 32(Node object size)*5(No of dictionary)*32767(no of node
    objects)*noofregions.

    I think this initial memory needs to be documented (documentation should do
    for now)or has to be fixed with some workarounds.

    So pls give your thoughts on this.

    Regards
    Ram



    -----Original Message-----
    From: Ramkrishna.S.Vasudevan
    Sent: Monday, May 14, 2012 11:48 AM
    To: dev@hbase.apache.org; 'lars hofhansl'
    Subject: RE: ANN: The third hbase 0.94.0 release candidate is available
    for download

    Hi
    We (it includes the test team here) 0.94 RC and carried out various
    operations on it.
    Puts, Scans, and all the restart scenarios (using kill -9 also).  Even
    the
    encoding stuffs were tested and carried out our basic test scenarios.
    Seems
    to work fine.

    Did not test rolling restart with 0.92. By this week we may try to do
    some
    performance comparison with 0.92.
    Also Lars and I agreed for a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as 0.94.0.

    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl <lhofhansl@yahoo.com>
    wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.  So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0 in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Ramkrishna.S.Vasudevan at May 14, 2012 at 3:35 pm
    +1 on adding release notes. A new RC is not required, and even my intention
    was not to ask for a new RC. Just documentation on this would be enough.


    Regards
    Ram
    -----Original Message-----
    From: Todd Lipcon
    Sent: Monday, May 14, 2012 8:48 PM
    To: dev@hbase.apache.org; lars hofhansl
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download
    On Mon, May 14, 2012 at 8:15 AM, lars hofhansl wrote:
    It's default off. I'd say we just say it's an experimental feature in
    the release notes.

    +1 for calling it experimental in notes and docs, and not removing it.
    Replication was in an experimental state for quite some time, too, and
    we didn't rip that out - I think shipping things off-by-default with
    clear labeling is one of the best ways to sand down rough edges.

    Are you saying we should have another RC?
    There was other stuff that went into 0.94 after I cut the RC, so that
    would potentially need to stabilize if I cut a new RC now.
    -- Lars

    ________________________________
    From: Ted Yu <yuzhihong@gmail.com>
    To: dev@hbase.apache.org
    Sent: Monday, May 14, 2012 7:17 AM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is
    available for download
    Thanks for sharing this information, Ramkrishna.

    Dictionary WAL compression makes replication not functional - see details
    in https://issues.apache.org/jira/browse/HBASE-5778

    I would vote for the removal of Dictionary WAL compression until we make it
    more robust and consuming much less memory.

    On Mon, May 14, 2012 at 6:59 AM, Ramkrishna.S.Vasudevan <
    ramkrishna.vasudevan@huawei.com> wrote:
    Hi

    One small observation after giving +1 on the RC.
    The WAL compression feature causes OOME and causes Full GC.

    The problem is, if we have 1500 regions and I need to create
    recovered.edits
    for each of the region (I don’t have much data in the regions
    (~300MB)).
    Now when I try to build the dictionary there is a Node object
    getting
    created.
    Each node object occupies 32 bytes.
    We have 5 such dictionaries.

    Initially we create indexToNodes array and its size is 32767.

    So now we have 32*5*32767 = ~5MB.

    Now I have 1500 regions.

    So 5MB*1500 = ~7GB.(Excluding actual data).  This seems to a very
    high
    initial memory foot print and this never allows me to split the logs
    and I
    am not able to make the cluster up at all.

    Our configured heap size was 8GB, tested in 3 node cluster with 5000
    regions, very less data( 1GB in hdfs cluster including replication),
    some
    small data is spread evenly across all regions.

    The formula is 32(Node object size)*5(No of dictionary)*32767(no of
    node
    objects)*noofregions.

    I think this initial memory needs to be documented (documentation
    should do
    for now)or has to be fixed with some workarounds.

    So pls give your thoughts on this.

    Regards
    Ram



    -----Original Message-----
    From: Ramkrishna.S.Vasudevan
    Sent: Monday, May 14, 2012 11:48 AM
    To: dev@hbase.apache.org; 'lars hofhansl'
    Subject: RE: ANN: The third hbase 0.94.0 release candidate is
    available
    for download

    Hi
    We (it includes the test team here) 0.94 RC and carried out
    various
    operations on it.
    Puts, Scans, and all the restart scenarios (using kill -9 also).
    Even
    the
    encoding stuffs were tested and carried out our basic test
    scenarios.
    Seems
    to work fine.

    Did not test rolling restart with 0.92. By this week we may try to
    do
    some
    performance comparison with 0.92.
    Also Lars and I agreed for a point release too.
    So I am +1 on the RC.

    Regards
    Ram
    -----Original Message-----
    From: lars hofhansl
    Sent: Sunday, May 13, 2012 10:53 PM
    To: dev@hbase.apache.org
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    OK, I'll change my tactic :)

    If there are no -1's by Wed, May 16th, I'll release RC4 as
    0.94.0.
    -- Lars



    ----- Original Message -----
    From: Stack <stack@duboce.net>
    To: lars hofhansl <lhofhansl@yahoo.com>
    Cc: "dev@hbase.apache.org" <dev@hbase.apache.org>
    Sent: Friday, May 11, 2012 10:39 PM
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    On Fri, May 11, 2012 at 10:26 PM, lars hofhansl
    <lhofhansl@yahoo.com>
    wrote:
    Thanks Stack.

    So that's two +1 (mine doesn't count I guess). And no -1.
    Why doesn't yours count?  Usually the RMs does, if they +1 it.
    So,
    that'd be 3x+1 + a non-binding +1.
    I talked to Ram offline, and we'll fix HBase with Hadoop 2.0.0
    in a
    0.94 point release.
    I would like to see a few more +1's before I declare this the
    official 0.94.0 release.
    You might be waiting a while (smile).  Fellas seem to be busy...

    Good on you Lars,
    St.Ack


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Jonathan Hsieh at May 16, 2012 at 4:05 pm
    [Apparently I'm a little late with this]

    +1 from me, with a +1 for a soon-following 0.94.1 release.

    On a 5-node cluster of vanilla hbase-0.94.0rc3, recompiled for and run on top of
    cdh4b2's variant of hadoop 0.23.1:
    - Ran TestLoadAndVerify (bigtop), TestAcidGuarantees, hbck, and
    PerformanceEvaluation for over 24 hours. (sometimes concurrently)
    - Ran a few CopyTable jobs to and from a cdh4b2's hbase-0.92.1 cluster.
    (proxy for get/put client version compatibility)
    - 'mvn clean apache-rat:check' fails but all violations are autogenerated
    files (majority are docs)

    Found a problem with hbck against clusters with >50 regions: HBASE-6018.
    There is a workaround for this issue, so it shouldn't block the release.

    There is at least one test case that fails consistently (TestImportExport)
    against hadoop 0.23.x, but this is due to Hadoop 0.23.x MR mini cluster
    incompatibilities/changes. Not enough to block the release.

    Jon.
    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl wrote:

    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

    HBase 0.94 is a performance release, and there are some interesting new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients should work with 0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x HBase up on this 0.94.0RC.

    The full list of changes is available here:

    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419

    Please take this RC for a spin, check out the doc, etc, and vote +1/-1 by
    May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars


    --
    // Jonathan Hsieh (shay)
    // Software Engineer, Cloudera
    // jon@cloudera.com
  • Ramkrishna.S.Vasudevan at May 16, 2012 at 4:29 pm
    See Inline..

    Regards
    Ram
    -----Original Message-----
    From: Jonathan Hsieh
    Sent: Wednesday, May 16, 2012 9:35 PM
    To: dev@hbase.apache.org; lars hofhansl
    Subject: Re: ANN: The third hbase 0.94.0 release candidate is available
    for download

    [Apparently I'm a little late with this]

    +1 on from me with a +1 for a soon following 0.94.1 release.

    On 5 node cluster of vanilla hbase-0.94.0rc3 recompiled for and on top
    of
    cdh4b2's variant of hadoop 0.23.1:
    - Ran TestLoadAndVerify (bigtop), TestAcidGuarantees, hbck, and
    PerformanceEvaluation for over 24 hours. (sometimes concurrently)
    - Ran a few CopyTable jobs to and from a cdh4b2's hbase-0.92.1 cluster.
    (proxy for get/put client version compatibility)
    - 'mvn clean apache-rat:check' fails but all violations are
    autogenerated
    files (majority are docs)

    Found a problem with hbck against clusters with >50 regions. HBASE-
    6018.
    There is a workaround for this issue so shouldn't block release.
    [Ram] We found the same problem yesterday. But just because we had a
    workaround, we did not raise an alarm on that.
    There are at least one test case that fails consistently
    (TestImportExport)
    against hadoop 0.23.x but this is due to Hadoop 0.23.x MR Mini cluster
    incompatibliities/changes. Not enough to block release.

    Jon.
    On Tue, May 1, 2012 at 4:26 PM, lars hofhansl wrote:

    The third 0.94.0 RC is available for download here:
    http://people.apache.org/~larsh/hbase-0.94.0-rc3/
    (My gpg key is available from pgp.mit.edu. Key id: 7CA45750)

    HBase 0.94 is a performance release, and there are some interesting new
    features as well.

    It is wire compatible with 0.92.x. 0.92 clients should work with 0.94
    servers and vice versa.

    You can do a rolling restart to get your 0.92.x HBase up on this 0.94.0RC.
    The full list of changes is available here:
    https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12316419
    Please take this RC for a spin, check out the doc, etc, and vote +1/-1 by
    May 8th on whether we should release this as 0.94.0.

    Thanks.

    -- Lars


    --
    // Jonathan Hsieh (shay)
    // Software Engineer, Cloudera
    // jon@cloudera.com

Discussion Overview
group: dev
categories: hbase, hadoop
posted: May 1, '12 at 11:26p
active: May 16, '12 at 4:29p
posts: 44
users: 11
website: hbase.apache.org
