I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

-- Owen


  • Eli Collins at Apr 30, 2011 at 2:21 am
    Hey Owen,

    I took a quick look at the changes in the branch (specifically the
    range of 200 or so changes where the first line of the commit doesn't
    reference a jira). Most of these look like backports of patches on
    jira; however, there also seem to be changes that don't correspond to
    changes in trunk or to patches on jiras. Some of these introduce new
    configuration options (eg hadoop.security.uid.cache.secs) or public
    classes (eg QueueProcessingStatistics) that don't exist in trunk.
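
    For reference, an option like that would normally surface as a plain
    core-site.xml property. A minimal sketch (the property name is the one
    mentioned above; the value and description are illustrative assumptions,
    not necessarily the branch's defaults):

    <!-- hypothetical core-site.xml entry; the value here is an example only -->
    <property>
      <name>hadoop.security.uid.cache.secs</name>
      <value>14400</value>
      <description>How long, in seconds, to cache uid-to-username lookups
      before asking the OS again (assumed semantics).</description>
    </property>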

    How do we ensure future releases won't violate compatibility with
    respect to this release? Do we plan to have jiras with patches against
    trunk for these changes, at least for the set of changes that affect
    public APIs? If so, should that come first?

    Thanks,
    Eli
    On Fri, Apr 29, 2011 at 4:09 PM, Owen O'Malley wrote:
    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Owen O'Malley at Apr 30, 2011 at 2:17 pm

    On Fri, Apr 29, 2011 at 7:21 PM, Eli Collins wrote:

    I took a quick look at the changes in the branch

    Thanks for taking a look. Please continue to inspect and test out the
    release and vote. I'm really excited that we'll have a release that has
    security and the "fred" user limits. Users desperately need these
    improvements. Furthermore, it is really important that Hadoop gets back on
    to a regular release cycle with releases coming out frequently. The current
    stable release of 0.20.2 was released a year ago, which is much much too
    long.

    How do we ensure future releases won't violate compatibility with
    respect to this release?

    We are still in catchup mode in terms of making sure that everything gets
    committed to trunk. Of course our users will correctly complain if the later
    releases have regressions relative to this release. Thanks for pointing out
    the issue.

    Do we plan to have jiras with patches against
    trunk for these changes, at least for the set of changes that affect
    public APIs?

    Yes, I'll work on ensuring the necessary patches get applied to trunk.

    If so, should that come first?
    The most important question is whether this release candidate is a usable
    replacement for, and improvement on, the current stable release 0.20.2. I
    believe it is a huge improvement and should be released.

    -- Owen
  • Eli Collins at May 1, 2011 at 2:19 am

    On Sat, Apr 30, 2011 at 7:17 AM, Owen O'Malley wrote:
    On Fri, Apr 29, 2011 at 7:21 PM, Eli Collins wrote:

    I took a quick look at the changes in the branch

    Thanks for taking a look. Please continue to inspect and test out the
    release and vote. I'm really excited that we'll have a release that has
    security and the "fred" user limits. Users desperately need these
    improvements. Furthermore, it is really important that Hadoop gets back on
    to a regular release cycle with releases coming out frequently. The current
    stable release of 0.20.2 was released a year ago, which is much much too
    long.
    Agree. Is branch-0.20 dead now? I.e., will the 0.20.3 and 0.20.4 fix
    versions never be released? Are all the patches that have been committed
    there since 0.20.2 (with the expectation that they'd be released) in
    branch-0.20-security-203? Users who saw jiras closed out with issues
    committed to branch-0.20 with fix version 0.20.3 probably expect those
    to come out in this release.
    How do we ensure future releases won't violate compatibility with
    respect to this release?

    We are still in catchup mode in terms of making sure that everything gets
    committed to trunk. Of course our users will correctly complain if the later
    releases have regressions relative to this release. Thanks for pointing out
    the issue.
    It seems like we need to do this before releasing 0.20.203, so that the
    upcoming 0.22 release isn't blocked by being a regression against the
    stable release (the sooner a release from trunk can replace a 0.20-based
    release, the better).
    Do we plan to have jiras with patches against
    trunk for these changes, at least for the set of changes that affect
    public APIs?

    Yes, I'll work on ensuring the necessary patches get applied to trunk.
    Awesome, thanks.
    If so, should that come first?
    The most important question is whether this release candidate is a usable
    replacement for, and improvement on, the current stable release 0.20.2. I
    believe it is a huge improvement and should be released.
    Definitely a huge improvement, thanks for all the work. I had a
    specific concern that it might block 0.22, and a general concern that
    we don't want to set a precedent of releasing code that didn't go
    through the normal code change process (patch review and vote on
    jira/list). Thank you for addressing this by getting the patches on trunk.

    Thanks,
    Eli
  • Arun C Murthy at May 2, 2011 at 2:21 am
    Eli,
    On Apr 30, 2011, at 7:19 PM, "Eli Collins" wrote:
    It seems like we need to do this before releasing 0.20.203, so that the
    upcoming 0.22 release isn't blocked by being a regression against the
    stable release (the sooner a release from trunk can replace a 0.20-based
    release, the better).
    I don't see the issue - we can just mark the appropriate jiras as blockers on 0.22 or 0.23 as necessary and release 0.20.203, correct? The RMs for the releases and the rest of us can help make that distinction. As everyone agrees we need to get back into the habit of making timely and progressive releases, both 0.20.203 & 0.22 are steps in the same direction.

    thanks,
    Arun
  • Eli Collins at May 2, 2011 at 6:00 am

    On Sun, May 1, 2011 at 7:20 PM, Arun C Murthy wrote:
    Eli,
    On Apr 30, 2011, at 7:19 PM, "Eli Collins" wrote:
    It seems like we need to do this before releasing 0.20.203, so that the
    upcoming 0.22 release isn't blocked by being a regression against the
    stable release (the sooner a release from trunk can replace a 0.20-based
    release, the better).
    I don't see the issue - we can just mark the appropriate jiras as blockers on 0.22 or 0.23 as necessary and release 0.20.203, correct? The RMs for the releases and the rest of us can help make that distinction. As everyone agrees we need to get back into the habit of making timely and progressive releases, both 0.20.203 & 0.22 are steps in the same direction.
    Marking those issues as blockers for the next release (0.22) slows
    down the next release. As you say, we should be doing things that help
    us make timely and progressive releases from trunk; this does the
    opposite, if I understand correctly.

    Thanks,
    Eli
  • Devaraj Das at May 1, 2011 at 3:12 am
    +1 based on some single node tests I did (with security ON).


    On 4/29/11 4:09 PM, "Owen O'Malley" wrote:

    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Nigel Daley at May 2, 2011 at 3:53 am
    I would like to see CI setup on this branch before we release anything from it. I've copied the 0.20 build config and tried running it on this branch, but getting a native compile failure: https://builds.apache.org/hudson/view/G-L/view/Hadoop/job/Hadoop-0.20.203-Build/1/console

    Nige
    On Apr 30, 2011, at 8:11 PM, Devaraj Das wrote:

    +1 based on some single node tests I did (with security ON).


    On 4/29/11 4:09 PM, "Owen O'Malley" wrote:

    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Owen O'Malley at May 2, 2011 at 4:07 pm

    On May 1, 2011, at 8:52 PM, Nigel Daley wrote:

    I would like to see CI setup on this branch before we release anything from it. I've copied the 0.20 build config and tried running it on this branch, but getting a native compile failure: https://builds.apache.org/hudson/view/G-L/view/Hadoop/job/Hadoop-0.20.203-Build/1/console
    An Apache CI build is a nice to have, but clearly isn't required.

    It looks to be failing on the standard libc functions. Which distribution and version of Linux is it? Which version of gcc and libc are you using? You are probably going to need to log in and build by hand to see what is going on in that environment.

    I compiled it on:
    RedHat Enterprise Linux Server release 5.4
    gcc 4.1.2
    automake 1.9.6
    autoconf 2.59
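
    For comparison, reproducing the native build by hand on the Hudson box
    would look roughly like the following. This is only a sketch: it assumes
    the stock ant build with the compile.native property and the
    compile-core-native target, and the path is a placeholder.

    # rebuild only the native libraries to surface the raw gcc/libc errors
    cd /path/to/release-0.20.203.0-rc0
    ant -Dcompile.native=true compile-core-native 2>&1 | tee native-build.log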

    -- Owen
  • Nigel Daley at May 3, 2011 at 4:39 am

    On May 2, 2011, at 9:07 AM, Owen O'Malley wrote:

    On May 1, 2011, at 8:52 PM, Nigel Daley wrote:

    I would like to see CI setup on this branch before we release anything from it. I've copied the 0.20 build config and tried running it on this branch, but getting a native compile failure: https://builds.apache.org/hudson/view/G-L/view/Hadoop/job/Hadoop-0.20.203-Build/1/console
    An Apache CI build is a nice to have, but clearly isn't required.
    It looks to be failing on the standard libc functions. Which distribution and version of Linux is it? Which version of gcc and libc are you using? You are probably going to need to log in and build by hand to see what is going on in that environment.

    I compiled it on:
    RedHat Enterprise Linux Server release 5.4
    gcc 4.1.2
    automake 1.9.6
    autoconf 2.59

    The failing Hudson job was using the exact same machine as the Hadoop 0.20 build, which passes fine.

    The build machine is using these:
    Ubuntu 9.04
    gcc 4.3.3
    automake 1.10.2
    autoconf 2.63

    FWIW, after today's commits to the branch, the build now progresses past this problem but fails during the eclipse-plugin compile. So I guess it was a real problem. I've now configured build failure messages on the branch to come to this list.

    Nige
  • Chris Douglas at May 1, 2011 at 11:55 pm
    +1

    Signature matches, md5/sha1 match. Also tried a basic HDFS upgrade
    from 0.20.2 to 0.20.203 with fresh configs on a single node: all OK,
    including rollback to 0.20.2. -C
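
    For anyone repeating that check, the single-node upgrade/rollback sequence
    would look roughly like the following sketch (standard 0.20-era upgrade
    commands; the install paths are placeholders):

    # stop the old 0.20.2 HDFS, then start 0.20.203 against the same
    # dfs.name.dir / dfs.data.dir with the upgrade flag
    /opt/hadoop-0.20.2/bin/stop-dfs.sh
    /opt/hadoop-0.20.203.0/bin/start-dfs.sh -upgrade
    /opt/hadoop-0.20.203.0/bin/hadoop dfsadmin -upgradeProgress status
    # to roll back instead of finalizing, stop the new version and restart
    # the old one with -rollback
    /opt/hadoop-0.20.203.0/bin/stop-dfs.sh
    /opt/hadoop-0.20.2/bin/start-dfs.sh -rollback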
    On Fri, Apr 29, 2011 at 4:09 PM, Owen O'Malley wrote:
    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Doug Cutting at May 2, 2011 at 5:58 pm

    On 04/29/2011 04:09 PM, Owen O'Malley wrote:
    I think everything is ready to go on the 0.20.203.0 release. It
    includes security and a lot of improvements in the capacity scheduler
    and JobTracker.
    This does not appear to include the 0.20-append work? So it's not
    advisable to use HBase with this revision, correct?
    The patch selection process for this branch did not appear to be a
    community process. A massive patch set was committed en-masse with no
    public discussion before or after about its specific composition.

    Long-term the users of a project benefit from a community that
    collaborates using open, interactive processes. If a particular set of
    patches, not created through such a process, is valuable to end users,
    then it can be distributed on github or elsewhere under a different
    name, but should not be granted the imprimatur of a community product.

    Doug
  • Eric Baldeschwieler at May 2, 2011 at 8:06 pm
    Hi folks,

    This strikes me as a bit odd. I think we have already discussed this at length and agreed that a release could proceed.

    Since then, Arun and Owen have worked actively to incorporate community feedback into this release.

    All parties making Hadoop releases other than Apache have already incorporated most of the patches in this release into their products, including Doug's organization. I don't see how Hadoop's users benefit from Apache not incorporating them into an Apache release.

    As previously discussed, all parties are welcome to champion alternative releases from Apache if they want to invest in making Apache Hadoop better.

    Thanks!!

    E14

    ---
    E14 - typing on glass
    On May 2, 2011, at 12:16 PM, "Ian Holsman" wrote:

    moving this thread to general@
    On May 3, 2011, at 3:58 AM, Doug Cutting wrote:

    Should we release
    http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?
    The patch selection process for this branch did not appear to be a
    community process. A massive patch set was committed en-masse with no
    public discussion before or after about its specific composition.
    guys...
    1. do we agree this is an issue?
    2. if it is, how do we get the communication & discussion on list?

    what do people think the major issues are that are stopping people from talking about stuff on list?
  • Eli Collins at May 2, 2011 at 9:02 pm
    Hey Eric,

    I don't have any objections to a release from
    branch-0.20-security-203. However, when I examined the specific patch
    set I noticed there are important implications with respect to
    compatibility (for 0.20.2 and 0.22), questions about the project model
    (eg not reviewing patches on jira before committing them, not having
    patches go through trunk, etc), and some open questions for users (eg
    is this the next dot release of the stable branch?).

    I agree this is a valuable artifact, but that doesn't mean it's OK to
    ignore compatibility concerns, etc.

    I've listed specific questions/comments here:
    http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201105.mbox/%[email protected]%3E

    Thanks,
    Eli

    On Mon, May 2, 2011 at 1:05 PM, Eric Baldeschwieler
    wrote:
    Hi folks,

    This strikes me as a bit odd. I think we have already discussed this at length and agreed that a release could proceed.

    Since then, Arun and Owen have worked actively to incorporate community feedback into this release.

    All parties making Hadoop releases other than Apache have already incorporated most of the patches in this release into their products, including Doug's organization. I don't see how Hadoop's users benefit from Apache not incorporating them into an Apache release.

    As previously discussed, all parties are welcome to champion alternative releases from Apache if they want to invest in making Apache Hadoop better.

    Thanks!!

    E14

    ---
    E14 - typing on glass
    On May 2, 2011, at 12:16 PM, "Ian Holsman" wrote:

    moving this thread to general@
    On May 3, 2011, at 3:58 AM, Doug Cutting wrote:

    Should we release
    http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?
    The patch selection process for this branch did not appear to be a
    community process.  A massive patch set was committed en-masse with no
    public discussion before or after about its specific composition.
    guys...
    1. do we agree this is an issue?
    2. if it is, how do we get the communication & discussion on list?

    what do people think the major issues are that are stopping people from talking about stuff on list?
  • Arun C Murthy at May 2, 2011 at 8:08 pm
    Doug,
    On May 2, 2011, at 10:58 AM, Doug Cutting wrote:
    The patch selection process for this branch did not appear to be a
    community process. A massive patch set was committed en-masse with no
    public discussion before or after about its specific composition.
    Let's review:

    # You proposed to release off the Yahoo security patchset first in
    April, 2010: http://s.apache.org/5Gv
    # I started this discussion again in Jan, 2011: http://s.apache.org/uf
    # We went through several iterations:
    - I first committed a jumbo patch upon which some reservations were
    expressed.
    - Owen went ahead and broke them up to commit individual patches to
    incorporate the provided feedback.
    # Roy clearly clarified the way forward: http://s.apache.org/tD4
    (which Owen has since incorporated by breaking the changes into individual
    patches).

    Your current stance, given the history, is surprising, to say the
    least... we have already discussed this. It is clear that the
    community (including downstream Apache projects like Pig, Hive and
    HCatalog) will substantially benefit from an Apache release of this
    improved codebase.

    thanks,
    Arun
  • Alan Gates at May 2, 2011 at 6:38 pm
    From the viewpoint of a downstream user, I'd like to see this
    released. Right now Hive 0.7 and soon HCatalog 0.1 have to depend on
    a Cloudera distribution because they need security. Having Apache
    products depend on 3rd party distributions of Apache products is
    bogus. The sooner this is out the sooner we can fix this.

    Alan.
    On Apr 29, 2011, at 4:09 PM, Owen O'Malley wrote:

    I think everything is ready to go on the 0.20.203.0 release. It
    includes security and a lot of improvements in the capacity
    scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Doug Cutting at May 2, 2011 at 7:04 pm

    On 05/02/2011 11:37 AM, Alan Gates wrote:
    From the viewpoint of a downstream user, I'd like to see this released.
    Right now Hive 0.7 and soon HCatalog 0.1 have to depend on a Cloudera
    distribution because they need security. Having Apache products depend
    on 3rd party distributions of Apache products is bogus. The sooner this
    is out the sooner we can fix this.
    Alan,

    Cloudera could upload its CDH3 patchset to a branch in Apache subversion
    and call a release vote on it and I would vote against it. The
    interactive community process is to me what makes it Apache.

    Releases should branch from trunk or use an existing release branch. A
    release branch should be open for patches from the general community.
    Neither was the case here. This is neither a subset nor a superset of
    the 0.20 branch that the community has invested in. The change log for
    this release includes around 500 changes, yet only 24 issues are assigned to it
    in Jira, the community's issue tracker.

    Yes, the current situation is bad, but shortcutting the community
    process doesn't fix it, it just hides it.

    Cheers,

    Doug
  • Eli Collins at May 2, 2011 at 7:17 pm

    On Fri, Apr 29, 2011 at 4:09 PM, Owen O'Malley wrote:
    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?
    Based on the discussion I still have the following questions:

    1. Does this release replace subsequent releases from branch-0.20? I.e.,
    is the goal to replace the 0.20.3 or 0.20.4 release with releases from
    the branch-0.20-security branches? If not, where does this release
    fit in? If so, I think we need to do the following before releasing
    from branch-0.20-security:

    - Make sure branch-0.20-security-203 contains the patches from 0.20.2;
    since this branch is based on 0.20.1, it's not clear that it doesn't
    regress against the current stable 0.20 release. Perhaps the best way
    to do this is via a rebase.

    - Make sure branch-0.20-security-203 (and future 0.20 based release
    branches) contain the patches that were checked in for 0.20.3 and
    0.20.4. These branches contain important bug fixes (eg HDFS-1258,
    HDFS-909, etc) that are not present in this branch, and should be. The
    expectation of people that checked in patches to branch-0.20 and the
    users who filed the jiras is that they be fixed in the next stable
    release.

    - Remove the 0.20.3 and 0.20.4 fix versions from jira to make it clear
    what the next release is.


    2. What are the compatibility implications? Specifically, do we need
    to block the next major release (0.22) on getting patches in this
    release committed to trunk? Should the pace of major version releases
    be slowed down by minor version releases?


    3. Patches normally go through jira, get reviewed, committed to trunk,
    and then merged to a release branch. Why not use the same process
    here? I'm concerned that we're setting a precedent that patches don't
    need to be reviewed and voted on.


    Given that we're releasing common, hdfs, and mapreduce, perhaps general@
    is a better place than common-dev@ for release discussion.

    Thanks,
    Eli
  • Tom White at May 2, 2011 at 7:32 pm

    On Mon, May 2, 2011 at 12:16 PM, Eli Collins wrote:
    On Fri, Apr 29, 2011 at 4:09 PM, Owen O'Malley wrote:
    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?
    Based on the discussion I still have the following questions:

    1. Does this release replace subsequent releases from branch-0.20? Ie
    is the goal to replace the 0.20.3 or 0.20.4 release with releases from
    the branch-0.20-security branches?  If not, where does this release
    fit in?  If so, I think we need to do the following before releasing
    from branch-0.20-security:

    - Make sure branch-0.20-security-203 contains the patches from 0.20.2;
    since this branch is based on 0.20.1, it's not clear that it doesn't
    regress against the current stable 0.20 release. Perhaps the best way
    to do this is via a rebase.
    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.

    HADOOP-5611
    HADOOP-5612
    HADOOP-5623
    HADOOP-5759
    HADOOP-6269
    HADOOP-6315
    HADOOP-6386
    HADOOP-6428
    HADOOP-6575
    HADOOP-6576
    HDFS-579
    HDFS-596
    HDFS-723
    HDFS-732
    HDFS-792
    MAPREDUCE-623
    MAPREDUCE-1070
    MAPREDUCE-1163
    MAPREDUCE-1251
    - Make sure branch-0.20-security-203 (and future 0.20 based release
    branches) contain the patches that were checked in for 0.20.3 and
    0.20.4. These branches contain important bug fixes (eg HDFS-1258,
    HDFS-909, etc) that are not present in this branch, and should be. The
    expectation of people that checked in patches to branch-0.20 and the
    users who filed the jiras is that they be fixed in the next stable
    release.
    These JIRAs are the ones committed to the 0.20 branch (for 0.20.3) but
    are not marked as being in 0.20.203.0:

    HADOOP-6724
    HADOOP-6833
    HADOOP-6881
    HADOOP-6923
    HADOOP-6928
    HADOOP-7116
    HDFS-1024
    HDFS-1041
    HDFS-1240
    HDFS-1258
    HDFS-1377
    HDFS-1404
    HDFS-1406
    HDFS-727
    HDFS-908
    HDFS-909
    MAPREDUCE-1280
    MAPREDUCE-1407
    MAPREDUCE-1734
    MAPREDUCE-1832
    MAPREDUCE-1880
    MAPREDUCE-2262

    Tom
    - Remove the 0.20.3 and 0.20.4 fix versions from jira to make it clear
    what the next release is.


    2. What are the compatibility implications?  Specifically, do we need
    to block the next major release (0.22) on getting patches in this
    release committed to trunk?  Should the pace of major version releases
    be slowed down by minor version releases?


    3. Patches normally go through jira, get reviewed, committed to trunk,
    and then merged to a release branch.  Why not use the same process
    here?  I'm concerned that we're setting a precedent that patches don't
    need to be reviewed and voted on.


    Given that we're releasing common, hdfs and mapreduce perhaps general@
    is a better place than common-dev@ for release discussion.

    Thanks,
    Eli
  • Stack at May 2, 2011 at 9:01 pm
    How hard would it be to get the patches Tom lists below into
    branch-0.20-security-203? I'd think it'd be an easier sell if it were
    a superset of everything in branch-0.20, especially since it bears its name.

    Otherwise, glad to see the release candidate.

    St.Ack
  • Arun C Murthy at May 2, 2011 at 10:02 pm

    On May 2, 2011, at 12:31 PM, Tom White wrote:
    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163

    # Fixed, missing in CHANGES.txt

    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428

    # Build, not necessary
    * MAPREDUCE-1251 (fixed)

    # Broken tests, fixed already
    * HADOOP-6575
    * HADOOP-6576
    * HDFS-792
    * MAPREDUCE-623

    I'll work on fixing the 'remaining' ones.

    thanks,
    Arun
  • Arun C Murthy at May 2, 2011 at 11:57 pm
    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:
    On May 2, 2011, at 12:31 PM, Tom White wrote:
    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. I'm not sure whether those two are relevant/
    necessary; I'll check with Cos. Other than that, hadoop-0.20.203 is now a
    superset of hadoop-0.20.2.

    thanks,
    Arun
  • Arun C Murthy at May 2, 2011 at 11:59 pm
    On May 2, 2011, at 4:56 PM, Arun C Murthy wrote:
    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:
    I just did a quick search, and these are the JIRAs that are in
    0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are relevant/
    necessary, I'll check with Cos. Other than that hadoop-0.20.203 now a
    superset of hadoop-0.20.2.
    I missed adding HADOOP-5759 to that list; I'll check with Amareshwari
    before committing.

    Arun
  • Ian Holsman at May 3, 2011 at 12:13 am

    On May 3, 2011, at 9:58 AM, Arun C Murthy wrote:


    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are relevant/
    necessary, I'll check with Cos. Other than that hadoop-0.20.203 now a
    superset of hadoop-0.20.2.
    Missed adding HADOOP-5759 to that list, I'll check with Amareshwari before committing.

    Arun
    Thanks for doing this so fast Arun.
  • Konstantin Shvachko at May 3, 2011 at 8:40 am
    I think it's a good idea to release hadoop-0.20.203. It moves Apache Hadoop a
    step forward.

    It looks like the technical difficulties are resolved now with Arun's latest
    commits.
    Being a superset of hadoop-0.20.2, it can be considered to be based on one of
    the official Apache releases.
    I don't think there was a lack of discussion on the lists about the issues
    included in the release candidate. Todd did a thorough review of the entire
    security branch. Many developers participated in the discussions.
    Agreeing with Stack, I wish HBase were considered a primary target for Hadoop
    support. But it is not realistic to have that in hadoop-0.20.203.
    I have some experience running a version of this release candidate on a
    large cluster. It works. I would add a couple of patches that make it run
    on Windows for me, like HADOOP-7110 and HADOOP-7126. But those are not blockers.

    Thanks,
    --Konstantin

    On Mon, May 2, 2011 at 5:12 PM, Ian Holsman wrote:

    On May 3, 2011, at 9:58 AM, Arun C Murthy wrote:


    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are relevant/
    necessary, I'll check with Cos. Other than that hadoop-0.20.203 now a
    superset of hadoop-0.20.2.
    Missed adding HADOOP-5759 to that list, I'll check with Amareshwari
    before committing.
    Arun
    Thanks for doing this so fast Arun.
  • Konstantin Shvachko at May 8, 2011 at 5:42 am
    -1 for rc1

    I downloaded and ran the test target 3 times.

    The first run failed because my umask defaults to 0002, which is a known
    problem (HADOOP-5050, committed to 0.21 but not 0.20).
    I set umask to 0022 and re-ran the test target twice. Both runs resulted in
    failures. Here is the list of failed tests:
    [junit] Test org.apache.hadoop.mapred.TestJobTrackerRestart FAILED (timeout)
    [junit] Test org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker FAILED
    [junit] Test org.apache.hadoop.mapred.TestJobTrackerSafeMode FAILED
    [junit] Test org.apache.hadoop.mapred.TestMiniMRMapRedDebugScript FAILED
    [junit] Test org.apache.hadoop.mapred.TestRecoveryManager FAILED
    [junit] Test org.apache.hadoop.mapred.TestTTMemoryReporting FAILED
    [junit] Test org.apache.hadoop.mapred.TestTaskTrackerLocalization FAILED
    [junit] Test org.apache.hadoop.hdfsproxy.TestHdfsProxy FAILED
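
    For anyone reproducing this, the run is essentially the following sketch
    (it assumes the stock top-level ant test target):

    # run the unit tests with a 0022 umask and pull out the failures
    umask 0022
    ant test 2>&1 | tee test-run.log
    grep "FAILED" test-run.log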

    I am in favor of releasing hadoop-0.20.203.
    We run a version of this release on a large cluster at eBay, so I know it
    works.
    I understand the controversy behind it. I regret it hasn't been developed in
    a true community way.
    I think it nevertheless adds value to Apache Hadoop.
    Let's just make sure it passes the tests.

    Thanks,
    --Konstantin


    On Tue, May 3, 2011 at 1:39 AM, Konstantin Shvachko wrote:

    I think its a good idea to release hadoop-0.20.203. It moves Apache Hadoop
    a step forward.

    Looks like the technical difficulties are resolved now with latest Arun's
    commits.
    Being a superset of hadoop-0.20.2 it can be considered based on one of the
    official Apache releases.
    I don't think there was a lack of discussions on the lists about the issues
    included in the release candidate. Todd did a thorough review of the entire
    security branch. Many developers participated in discussions.
    Agreeing with Stack I wish HBase was considered a primary target for Hadoop
    support. But it is not realistic to have it in hadoop-0.20.203.
    I have some experience running a version of this release candidate on a
    large cluster. It works. I would add a couple of patches, which make it run
    on Windows for me like HADOOP-7110, HADOOP-7126. But those are not blockers.

    Thanks,
    --Konstantin

    On Mon, May 2, 2011 at 5:12 PM, Ian Holsman wrote:

    On May 3, 2011, at 9:58 AM, Arun C Murthy wrote:


    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are relevant/
    necessary, I'll check with Cos. Other than that hadoop-0.20.203 now a
    superset of hadoop-0.20.2.
    Missed adding HADOOP-5759 to that list, I'll check with Amareshwari
    before committing.
    Arun
    Thanks for doing this so fast Arun.
  • Konstantin Boudnik at May 3, 2011 at 4:44 am

    On Mon, May 2, 2011 at 16:56, Arun C Murthy wrote:
    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:

    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are
    relevant/necessary, I'll check with Cos. Other than that hadoop-0.20.203
    now a superset of hadoop-0.20.2.
    I have looked somewhat more into these two JIRAs, and if I remember correctly
    this fix causes a rolling-port side effect in the TT; it has been reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am sure
    you can check internal Git to cross-verify my recollection).

    Considering the above, these might be better left out of the release and,
    perhaps, they should be reverted in trunk as well.

    Cos

    thanks,
    Arun
  • Arun C Murthy at May 3, 2011 at 5:19 am

    On May 2, 2011, at 9:43 PM, Konstantin Boudnik wrote:
    I have looked somewhat more into these two JIRAs and if I remember
    correctly
    this fix causes a rolling port side effect in TT and it has been
    reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I
    am sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release
    and,
    perhaps, they should be reverted in trunk as well.
    Thanks Cos.

    I looked and we aren't running with these in our internal clusters.
    So, based on this discussion I'll leave these out for now. We can add
    them back later if necessary in a future release.

    Arun
  • Konstantin Boudnik at May 3, 2011 at 5:25 am

    On Mon, May 2, 2011 at 22:18, Arun C Murthy wrote:
    On May 2, 2011, at 9:43 PM, Konstantin Boudnik wrote:

    I have looked somewhat more into these two JIRAs and if I remember
    correctly
    this fix causes a rolling port side effect in TT and it has been reverted
    in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am
    sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release and,
    perhaps, they should be reverted in trunk as well.
    Thanks Cos.

    I looked and we aren't running with these in our internal clusters. So,
    based on this discussion I'll leave these out for now. We can add them back
    later if necessary in a future release.
    +1
  • Koji Noguchi at May 3, 2011 at 6:22 pm

    except
    HADOOP-6386 and HADOOP-6428.
    causes a rolling port side effect in TT
    I remember bugging Cos and Rob to revert HADOOP-6386.
    https://issues.apache.org/jira/browse/HADOOP-6760?focusedCommentId=12867342&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12867342

    Koji
    On 5/2/11 9:43 PM, "Konstantin Boudnik" wrote:
    On Mon, May 2, 2011 at 16:56, Arun C Murthy wrote:


    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:

    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are
    relevant/necessary, I'll check with Cos. Other than that hadoop-0.20.203
    now a superset of hadoop-0.20.2.
    I have looked somewhat more into these two JIRAs and if I remember correctly
    this fix causes a rolling port side effect in TT and it has been reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release and,
    perhaps, they should be reverted in trunk as well.

    Cos

    thanks,
    Arun
  • Konstantin Boudnik at May 3, 2011 at 6:33 pm
    Yup, exactly right - it has been reverted in the trunk as well. Thanks
    for digging this up, Koji!
    On Tue, May 3, 2011 at 11:22, Koji Noguchi wrote:
    except
    HADOOP-6386 and HADOOP-6428.
    causes a rolling port side effect in TT
    I remember bugging Cos and Rob to revert HADOOP-6386.
    https://issues.apache.org/jira/browse/HADOOP-6760?focusedCommentId=12867342&
    page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#commen
    t-12867342

    Koji
    On 5/2/11 9:43 PM, "Konstantin Boudnik" wrote:
    On Mon, May 2, 2011 at 16:56, Arun C Murthy wrote:


    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:

    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are
    relevant/necessary, I'll check with Cos.  Other than that hadoop-0.20.203
    now a superset of hadoop-0.20.2.
    I have looked somewhat more into these two JIRAs and if I remember correctly
    this fix causes a rolling port side effect in TT and it has been reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release and,
    perhaps, they should be reverted in trunk as well.

    Cos

    thanks,
    Arun
  • Jakob Homan at May 3, 2011 at 7:52 pm
    Tested the RC on a single node cluster, kicked the tires. Looks good.
    +1 on its release.

    Regardless of how the RC got here, we only get benefit from releasing
    it. It represents a huge chunk of work from our contributors,
    provides needed features for our users and moves us one step closer to
    making regular releases again.

    -Jakob

    On Tue, May 3, 2011 at 11:33 AM, Konstantin Boudnik wrote:
    Yup, exactly right - it has been reverted in the trunk as well. Thanks
    for digging this up, Koji!
    On Tue, May 3, 2011 at 11:22, Koji Noguchi wrote:
    except
    HADOOP-6386 and HADOOP-6428.
    causes a rolling port side effect in TT
    I remember bugging Cos and Rob to revert HADOOP-6386.
    https://issues.apache.org/jira/browse/HADOOP-6760?focusedCommentId=12867342&
    page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#commen
    t-12867342

    Koji
    On 5/2/11 9:43 PM, "Konstantin Boudnik" wrote:
    On Mon, May 2, 2011 at 16:56, Arun C Murthy wrote:


    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:

    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are
    relevant/necessary, I'll check with Cos.  Other than that hadoop-0.20.203
    now a superset of hadoop-0.20.2.
    I have looked somewhat more into these two JIRAs and if I remember correctly
    this fix causes a rolling port side effect in TT and it has been reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release and,
    perhaps, they should be reverted in trunk as well.

    Cos

    thanks,
    Arun
  • Konstantin Boudnik at May 3, 2011 at 8:03 pm
    I have also built an instrumented cluster, and at least none of the bindings
    required for system testing are broken.
    --
    Take care,
    Konstantin (Cos) Boudnik
    On Tue, May 3, 2011 at 12:51, Jakob Homan wrote:
    Tested the RC on a single node cluster, kicked the tires.  Looks good.
    +1 on its release.

    Regardless of how the RC got here, we only get benefit from releasing
    it.  It represents a huge chunk of work from our contributors,
    provides needed features for our users and moves us one step closer to
    making regular releases again.

    -Jakob

    On Tue, May 3, 2011 at 11:33 AM, Konstantin Boudnik wrote:
    Yup, exactly right - it has been reverted in the trunk as well. Thanks
    for digging this up, Koji!
    On Tue, May 3, 2011 at 11:22, Koji Noguchi wrote:
    except
    HADOOP-6386 and HADOOP-6428.
    causes a rolling port side effect in TT
    I remember bugging Cos and Rob to revert HADOOP-6386.
    https://issues.apache.org/jira/browse/HADOOP-6760?focusedCommentId=12867342&
    page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#commen
    t-12867342

    Koji
    On 5/2/11 9:43 PM, "Konstantin Boudnik" wrote:
    On Mon, May 2, 2011 at 16:56, Arun C Murthy wrote:


    On May 2, 2011, at 3:01 PM, Arun C Murthy wrote:

    On May 2, 2011, at 12:31 PM, Tom White wrote:

    I just did a quick search, and these are the JIRAs that are in 0.20.2
    but appear not to be in 0.20.203.0.
    Thanks Tom.

    I did a quick analysis:

    # Remaining for 0.20.203
    * HADOOP-5611
    * HADOOP-5612
    * HADOOP-5623
    * HDFS-596
    * HDFS-723
    * HDFS-732
    * HDFS-579
    * MAPREDUCE-1070
    * HADOOP-6315
    * MAPREDUCE-1163
    * HADOOP-5759
    * HADOOP-6269
    * HADOOP-6386
    * HADOOP-6428
    Owen, Suresh and I have committed everything on this list except
    HADOOP-6386 and HADOOP-6428. Not sure which of the two are
    relevant/necessary, I'll check with Cos.  Other than that hadoop-0.20.203
    now a superset of hadoop-0.20.2.
    I have looked somewhat more into these two JIRAs and if I remember correctly
    this fix causes a rolling port side effect in TT and it has been reverted in
    0.20.200 (Y! Fred? release) because Ops weren't happy about this (I am sure
    you can check internal Git to cross-verify my recollection).

    Considering above, these might be better left outside of the release and,
    perhaps, they should be reverted in trunk as well.

    Cos

    thanks,
    Arun
  • Nigel Daley at May 3, 2011 at 8:34 pm
    Owen, any reason you're not building the eclipse plugin for this release? Instructions are here: http://wiki.apache.org/hadoop/HowToRelease

    n.

    On Apr 29, 2011, at 4:09 PM, Owen O'Malley wrote:

    I think everything is ready to go on the 0.20.203.0 release. It includes security and a lot of improvements in the capacity scheduler and JobTracker.

    Should we release http://people.apache.org/~omalley/hadoop-0.20.203.0-rc0/?

    -- Owen
  • Owen O'Malley at May 3, 2011 at 8:49 pm

    On May 3, 2011, at 1:33 PM, Nigel Daley wrote:

    Owen, any reason you're not building the eclipse plugin for this release? Instructions are here: http://wiki.apache.org/hadoop/HowToRelease
    Of course, I know (and have updated) the HowToRelease page. It looks like the eclipse-plugin was dropped from the list of contrib modules that are built by default.

    -- Owen
  • Nigel Daley at May 3, 2011 at 9:17 pm

    On May 3, 2011, at 1:48 PM, Owen O'Malley wrote:

    On May 3, 2011, at 1:33 PM, Nigel Daley wrote:

    Owen, any reason you're not building the eclipse plugin for this release? Instructions are here: http://wiki.apache.org/hadoop/HowToRelease
    Of course, I know (and have updated) the HowToRelease page. It looks like the eclipse-plugin was dropped off of the list of contrib modules to build by default.
    I believe it's only built if you have -Declipse.home= defined.

    Cheers,
    n.
  • Rottinghuis, Joep at May 4, 2011 at 2:15 am
    Yes, the Eclipse contrib is skipped unless eclipse.home is set.
    See: src/contrib/eclipse-plugin/build.xml lines 47-50

    <!-- Skip building if eclipse.home is unset. -->
    <target name="check-contrib" unless="eclipse.home">
      <property name="skip.contrib" value="yes"/>
      <echo message="eclipse.home unset: skipping eclipse plugin"/>
    </target>

    When this happens you should be able to see the string "skipping eclipse plugin" in the console output.
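
    For completeness, enabling the plugin build is just a matter of defining
    eclipse.home on the ant command line, roughly like this (the Eclipse path
    and the choice of target here are assumptions, not prescriptions):

    # point eclipse.home at a local Eclipse install so check-contrib above
    # does not set skip.contrib
    ant -Declipse.home=/usr/local/eclipse package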


    However, turning on the Eclipse build without any changes will result in a build failure on branch-0.20-security as well as in release-0.20.203.0-rc0.
    We resolved this on a similar internal branch by applying "alex-HADOOP-3744.patch", as attached to MAPREDUCE-1280.

    This leads me to two questions:
    1) How would I indicate that this Jira should be applied to this branch? Open a new jira? Re-open the existing jira and add affects/fix versions?
    2) How does one typically indicate which of the several patches is actually the one to be applied?

    Owen has already applied at least one of these.
    I'll try to reconcile the other patches we have applied to our internal branch and follow the recommended process to suggest them for the 203 release.

    Thanks,

    Joep

    -----Original Message-----
    From: Nigel Daley
    Sent: Tuesday, May 03, 2011 2:17 PM
    To: [email protected]
    Subject: Re: [VOTE] Release candidate 0.20.203.0-rc0

    On May 3, 2011, at 1:48 PM, Owen O'Malley wrote:

    On May 3, 2011, at 1:33 PM, Nigel Daley wrote:

    Owen, any reason you're not building the eclipse plugin for this release? Instructions are here: http://wiki.apache.org/hadoop/HowToRelease
    Of course, I know (and have updated) the HowToRelease page. It looks like the eclipse-plugin was dropped off of the list of contrib modules to build by default.
    I believe it's only built if you have -Declipse.home= defined.

    Cheers,
    n.
