Grokbase Groups HBase dev June 2010

On Tue, Jun 22, 2010 at 9:05 PM, Stack wrote:
On Tue, Jun 22, 2010 at 6:35 PM, Todd Lipcon wrote:

Quick update on the development branch 0.89.20100621.
(Todd you want to explain the version number or do you want me to?)
Sure, here's an explanation, geared towards a general user audience. If
people think this looks good, I'll post it on the wiki (and maybe a blog
post next week):

"The last few releases of HBase have been in lockstep with Hadoop releases
(e.g. HBase 0.X.* would work with Hadoop 0.X.*). However, Hadoop's release
process has slowed down, and we have the desire to release HBase on a
different timeline from Hadoop, with each HBase release potentially being
compatible with multiple versions of Hadoop. To signify this departure from
lockstep releases, and since we intend for the next release of HBase to
continue to support Hadoop 0.20, we don't want to call the next release

There's also a general sentiment that, feature-wise, HBase is nearing a 1.0
level. The project has implemented the majority of the features described in
the Bigtable paper plus many more. It's also starting to take a more central
role in the production infrastructures of many companies. So, we'd really
like to do a 1.0 release some time in the coming year (no dates, but maybe
early 2011?). However, we're not at 1.0 yet. There are still stability bugs
to iron out, and some features in progress that we'd like to have done for

So, given that, we decided on the dev list that the next stable release
would be called HBase 0.90. This indicates (a) a big step up from 0.20, (b)
we are nearing 1.0, and (c) we are not tied to Hadoop version numbering.

We're not ready to release 0.90 yet - there are a lot of blockers still open
affecting reliability and stability, and we haven't done extensive testing
of trunk under production workloads. But, the development community feels
there's a lot of stuff in trunk that's worth showing to the user community.
To that end, we are releasing a series of development builds cut from trunk
leading up to the 0.90 release. This development series will have version
numbers 0.89.YYYYMMDD, the last segment of the version number indicating the
date on which the release was branched from trunk. These releases won't have
followup patch releases, and will only go through basic cluster testing, but
provide usable snapshots of trunk development so that the community can
begin to work with the new code and provide early feedback based on their
use cases and testing. We're confident this will help make 0.90 the most
stable release yet.

The first release in the 0.89 series will be 0.89.20100621, due to be
released the week of 6/21/2010."

Is that wording clear, and does it walk the right line between "you should
help test this release" and "this release may not work great"?
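
For anyone scripting against these builds, the 0.89.YYYYMMDD scheme described in the draft could be handled roughly as in the sketch below. The function names (parse_dev_version, dev_version) are made up for illustration and are not part of HBase; the only assumption taken from the draft is that the last segment is the date the release was branched from trunk.

```python
from datetime import date, datetime


def parse_dev_version(version: str) -> date:
    """Return the branch date encoded in a 0.89.YYYYMMDD development version.

    Hypothetical helper for illustration only; not an HBase API.
    """
    parts = version.split(".")
    if len(parts) != 3 or parts[0] != "0" or parts[1] != "89":
        raise ValueError(f"not a 0.89 development build: {version!r}")
    stamp = parts[2]
    # Require exactly eight digits so that short stamps are rejected.
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError(f"expected YYYYMMDD as the last segment: {version!r}")
    # strptime raises ValueError for impossible dates such as 20101340.
    return datetime.strptime(stamp, "%Y%m%d").date()


def dev_version(branch_date: date) -> str:
    """Build the development version string for a given branch date."""
    return f"0.89.{branch_date:%Y%m%d}"


if __name__ == "__main__":
    print(parse_dev_version("0.89.20100621"))
    print(dev_version(date(2010, 6, 21)))
```

Because the date segment is zero-padded, plain string comparison of two 0.89 versions also orders them chronologically.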

I think the following patches still need to go in:
- HBASE-2767 (failed tests building with HDFS-1209)
Thanks, will commit momentarily.
- HBASE-2729 (fix bug when flush hits IOE)
Please paste patch into issue so can review ( is down)
Patch posted... sorry about I was being cheap and using
spot instances, but apparently spot instances are the first thing to get
shut down when ec2 has capacity issues. I'll upgrade it to a real instance
and send the bill to the Cloudera bosses :)

- I'd also like to disable the TestAcidGuarantees test in this branch, since
that is a known bug.

- I'd like to commit a short KNOWN_BUGS file which describes a few of the
open issues that we're currently working on. Certainly doesn't have to be an
exhaustive list of all open JIRAs, but just a few things that users may run
Yeah. Just call out the biggies and refer users to the short list
(ahem) of other issues we have against next major release.

Some performance issues were raised during testing at StumbleUpon -- any
luck figuring those out, Ryan/Stack? It would be good to address them for
the dev release, since it sounds like the RS barely makes progress when this
bug is triggered.

Yeah, it's a beaut. Sucks all resources for some period of time until
it gets over first flush, then it's good to go for a while at least.
Still trying to figure it out.
OK. I'll continue to do cluster testing and see if I can reproduce this one.
I'd love to fix it before release, but if we have to release with the bug I
think it's worth doing rather than delaying. Agreed? We'll call it out in
known bugs. Alternatively if it would fix it, we could patch out the CAS
spin loop in completeMemstoreInsert and mark a known bug that
read-your-writes consistency is lost under rare circumstances. Whichever you
think is better.

Given the above, I'd like to see if we can get the above two jiras reviewed
and committed later tonight, and I'll try to roll a release candidate before
I go to sleep. I think the correct way to release in this maven land is to
simply do an svn export and vote on that, and then separately do an
assembly:assembly tar as a binary release artifact (since assembly tar
doesn't include unpacked source, etc).
Why not just vote on the assembly? That's what we'd run. If you do a
site before assembly:assembly, you'll even have docs (mvn install site
assembly:assembly). The source is there to review if wanted. With
an svn export we'd have to build it ourselves.
The avro project had this discussion a bit recently, and what basically came
out of it is that Apache releases are source, not binary. It happens that in
the past our releases have had both source and binary in one tar, but it's
important that the actual *release* artifact contain the source. This allows
people to rebuild from the signed release tarball if they like, etc. Since
the mvn assembly tar isn't re-buildable, I think the release artifact has to
be a tarred svn export, and then we have a second release artifact which is
binary+site+docs like you say above. People can choose whether to download
the source release or the binary.

At least the above was my understanding - I'll ping Doug with this thread to
see if he can clarify/confirm.


Todd Lipcon
Software Engineer, Cloudera

Posted: Jun 23, '10 at 1:36a
Active: Jun 23, '10 at 11:36p


