Has anyone tried running the system tests on a 0.20.20x release? Why
don't we run these via Hudson?

After following the instructions on the wiki [1] and making a bunch of
additional fixes (setting dfs.datanode.ipc.address in the config,
using sbin instead of bin, copying libs into the FI build lib dir,
etc.) I was able to get the tests running; however, the tests seem to
have bitrotted.

The reason I ask is that it looks like the src/test/system tests are
only compiled and run via the test-system target, and it doesn't look
like Hudson or developers use that target, so we're not doing
anything to prevent people from breaking the tests. I tried to run
them to see if one of my changes would break them, but I can't imagine
most people will jump through all the above hoops.

On a related note, is there any way to run these against an existing
build/cluster? It looks like they require running on a build that's
been fault injected (i.e. they use custom protocol classes that are not
present in the normal tarball), which makes them much less useful.

Thanks,
Eli

1. http://wiki.apache.org/hadoop/HowToUseSystemTestFramework
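For readers hitting the same wall: the dfs.datanode.ipc.address fix mentioned above amounts to adding a property along these lines to hdfs-site.xml. The address/port value is an illustrative guess, not taken from the wiki:

```xml
<!-- Illustrative only: the datanode IPC address the system tests expect
     to be set explicitly; 0.0.0.0:50020 is an assumed value. -->
<property>
  <name>dfs.datanode.ipc.address</name>
  <value>0.0.0.0:50020</value>
</property>
```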


  • Konstantin Boudnik at Aug 22, 2011 at 4:35 am
System tests (Herriot-controlled) were part of the nightly testing of every
build for at least two of the .2xx releases. I really cannot comment on .203 and
after.

The normal procedure was to build normal bits and run the tests; then build
instrumented bits, deploy them to a 10-node cluster, and run the system tests.
The current state of the code is that the system tests require a source-code
workspace to be executed from. I have done some initial work on
workspace-independent testing, but I don't know whether it has been included in
the public releases of .203+ - I haven't really checked.
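The two-phase procedure described here could be sketched roughly as follows. Everything except the 'ant test-system' target (which appears later in this thread) is an assumption, and the real build/deploy commands are left as comments:

```shell
# Rough sketch of the two-phase procedure described above. Paths, the
# cluster size default, and the commented-out ant targets are assumptions.

WORKSPACE=${WORKSPACE:-$HOME/hadoop-src}   # assumed source-checkout path
CLUSTER_NODES=${CLUSTER_NODES:-10}         # per the post: a 10-node cluster

echo "phase 1: build normal bits in $WORKSPACE and run the regular tests"
# (cd "$WORKSPACE" && ant clean test)      # illustrative target name

echo "phase 2: build instrumented bits, deploy to $CLUSTER_NODES nodes,"
echo "         then drive the system tests from the workspace"
# (cd "$WORKSPACE" && ant test-system)
```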

At any rate, running the system tests is an easy task, and the wiki page
explains how to do it. Assembling an instrumented cluster, on the other hand,
requires certain knowledge of the release process and of producing the bits.
An instrumented cluster isn't fault-injected - it is just instrumented ;) Yes, it
contains a few extra helper API calls in a few classes, which is exactly what
makes them far more useful for testing purposes. Without those, a number of
testing scenarios would be impossible to implement, as I have explained on
many occasions.

For regular runs of the system tests, Roman and I created a regular
deployment of 0.22 cluster builds under Apache Hudson control a few months
ago. I don't know what's going on with this testing after the recent troubles
with the build machines.

    Hope it helps,
    Cos
  • Eli Collins at Aug 22, 2011 at 4:58 pm

    On Sun, Aug 21, 2011 at 9:34 PM, Konstantin Boudnik wrote:
System tests (Herriot-controlled) were part of the nightly testing of every
build for at least two of the .2xx releases. I really cannot comment on .203 and
after.
Owen - are you running the system tests on the 20x release candidates?
Do we know if the 20x releases pass the system tests?
The normal procedure was to build normal bits and run the tests; then build
instrumented bits, deploy them to a 10-node cluster, and run the system tests.
The current state of the code is that the system tests require a source-code
workspace to be executed from. I have done some initial work on
workspace-independent testing, but I don't know whether it has been included in
the public releases of .203+ - I haven't really checked.

At any rate, running the system tests is an easy task, and the wiki page
explains how to do it.
Running the system tests is actually not easy: those wiki instructions
are out of date, require all kinds of manual steps, and some of the
tests fail when run from just a local build (i.e. they require 3 DNs,
so you have to set up a cluster).
Assembling an instrumented cluster, on the other hand,
requires certain knowledge of the release process and of producing the bits.
An instrumented cluster isn't fault-injected - it is just instrumented ;) Yes, it
contains a few extra helper API calls in a few classes, which is exactly what
makes them far more useful for testing purposes. Without those, a number of
testing scenarios would be impossible to implement, as I have explained on
many occasions.
Could you point me to a thread that covers the few extra helper API
calls that are injected? I can't see what API would both be necessary
for a system test and also not be able to be included in the product itself.
If you're system testing an instrumented build, then you're not system
testing the product used by users.
For regular runs of the system tests, Roman and I created a regular
deployment of 0.22 cluster builds under Apache Hudson control a few months
ago. I don't know what's going on with this testing after the recent troubles
with the build machines.
How hard would it be to copy your 22 system test Jenkins job and adapt
it to use a 20x build? Seems like the test bits should mostly be the
same.

    Thanks,
    Eli
  • Konstantin Boudnik at Sep 8, 2011 at 4:15 am

    On Mon, Aug 22, 2011 at 09:58AM, Eli Collins wrote:
Running the system tests is actually not easy: those wiki instructions
are out of date, require all kinds of manual steps, and some of the
tests fail when run from just a local build (i.e. they require 3 DNs,
so you have to set up a cluster).
I will try once again: system tests ALWAYS require a cluster. This is why
they are called 'system' in the first place. The execution model, however,
requires a source-code workspace as well, so you can say 'ant test-system'
there: ant is used as the driver.

I am going to revisit the wiki to make sure it is up to date, but I don't
think it is as outdated as you say. The sources the system framework relies
upon are intact - that is guaranteed by test-patch. There may be changes in
how cluster deployment happens, but that is really not a concern of the
test framework: it explicitly states that it _expects an instrumented cluster
to be deployed_. This is a design constraint.
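That execution model ('ant is used as the driver', run from a workspace, against a cluster that is already deployed) might look roughly like this. The hadoop.conf.dir.deployed property name is an assumption from Herriot-era docs, not confirmed by this thread:

```shell
# Hypothetical invocation, assuming a source workspace and an already
# deployed, instrumented cluster. The hadoop.conf.dir.deployed property
# name is an assumption and may differ per release.
DEPLOYED_CONF=${DEPLOYED_CONF:-/etc/hadoop/conf}
if [ -f build.xml ] && command -v ant >/dev/null 2>&1; then
  ant test-system -Dhadoop.conf.dir.deployed="$DEPLOYED_CONF"
else
  echo "not in a Hadoop source workspace; nothing to run"
fi
```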
Assembling an instrumented cluster, on the other hand,
requires certain knowledge of the release process and of producing the bits.
An instrumented cluster isn't fault-injected - it is just instrumented ;) Yes, it
contains a few extra helper API calls in a few classes, which is exactly what
makes them far more useful for testing purposes. Without those, a number of
testing scenarios would be impossible to implement, as I have explained on
many occasions.
Could you point me to a thread that covers the few extra helper API
calls that are injected? I can't see what API would both be necessary
for a system test and also not be able to be included in the product itself.
If you're system testing an instrumented build, then you're not system
testing the product used by users.
No can do. We had that discussion with you face to face at least twice; I
imagine there's no thread about this. As for the APIs: there's JavaDoc and
there's source code - take a look. In essence, only control and monitoring
APIs are injected; they are neither a concern for a public API nor do they
undermine the legitimacy of such testing. If you are making such an assertion,
some hard evidence to back it up would be quite awesome to see.
For regular runs of the system tests, Roman and I created a regular
deployment of 0.22 cluster builds under Apache Hudson control a few months
ago. I don't know what's going on with this testing after the recent troubles
with the build machines.
    How hard would it be to copy your 22 system test Jenkins job to adapt
    it to use a 20x build? Seems like the test bits should mostly be the
    same.
Adapting something is usually a non-effort, conditional on the HW and time
resources available.

    Hope it helps.
    Cos
    Thanks,
    Eli
  • Eli Collins at Sep 8, 2011 at 6:01 am

    On Wed, Sep 7, 2011 at 9:14 PM, Konstantin Boudnik wrote:
I will try once again: system tests ALWAYS require a cluster. This is why
they are called 'system' in the first place. The execution model, however,
requires a source-code workspace as well, so you can say 'ant test-system'
there: ant is used as the driver.

I am going to revisit the wiki to make sure it is up to date, but I don't
think it is as outdated as you say.
It's the system tests that need to be fixed, e.g. they're referencing
scripts in bin that now live in sbin, etc. You'll discover this stuff
if you try to run them.
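A hedged sketch of the kind of bitrot check being described here: scanning the system-test sources for references to scripts that moved from bin/ to sbin/. The directory layout and script name are assumed for illustration, and a throwaway tree is created so the example is self-contained:

```shell
# Create a throwaway example tree so the grep has something to scan;
# in a real workspace you would point it at src/test/system instead.
demo=$(mktemp -d)
mkdir -p "$demo/src/test/system"
printf 'exec bin/hadoop-daemon.sh start datanode\n' \
  > "$demo/src/test/system/start-dn.sh"

# Flag files still referencing scripts that now live under sbin/ :
grep -rl 'bin/hadoop-daemon.sh' "$demo/src/test/system"
```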
How hard would it be to copy your 22 system test Jenkins job and adapt
it to use a 20x build? Seems like the test bits should mostly be the
same.
Adapting something is usually a non-effort, conditional on the HW and time
resources available.
    Will you volunteer to maintain the system tests? Currently they're
    not running as part of Hudson and are bit-rotting.

    Thanks,
    Eli

Discussion Overview
group: common-dev
categories: hadoop
posted: Aug 20, '11 at 12:30a
active: Sep 8, '11 at 6:01a
posts: 5
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Eli Collins: 3 posts
Konstantin Boudnik: 2 posts
