Add orthogonal fault injection mechanism/framework
--------------------------------------------------

Key: HADOOP-5974
URL: https://issues.apache.org/jira/browse/HADOOP-5974
Project: Hadoop Core
Issue Type: Test
Components: test
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik


It'd be great to have a fault injection mechanism for Hadoop.

Having such a solution in place will increase test coverage of error-handling and recovery mechanisms, reduce reproduction time, and increase the reproduction rate of problems.

Ideally, the system should be orthogonal to the current code and test base: faults should be injected at build time and be configurable, e.g. all faults could be turned off, or only some of them allowed to happen. Also, fault injection has to be kept separate from the production build.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Konstantin Boudnik (JIRA) at Jun 4, 2009 at 9:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716420#action_12716420 ]

    Konstantin Boudnik commented on HADOOP-5974:
    --------------------------------------------

    I would like to propose the following initial requirements for a Fault Injection (FI) solution for Hadoop:

    # Has to be orthogonal to the existing source code and test base: no direct modification of code or tests is needed; preferably based on a cross-cutting model
    # Fully detachable: faults can be inserted into or removed from the system without hassle; a separate build target introduces the faults with a single command, and removal should be equally easy
    # High level of fault abstraction: the faults' logic has to be implemented in a high-level language, e.g. Java
    # Existing unit/functional tests should be reused where possible
    # Fine-grained configuration at runtime: fully deterministic or random injection of the faults should be configurable at runtime through a configuration file or a set of system properties - no source code modification or re-compilation required
    # If an off-the-shelf solution is used, it should come under an Apache-compatible open-source license
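    To make requirement 5 concrete, here is a minimal sketch of a runtime probability gate driven by system properties. The property names follow the examples later in this thread (allFaultProbability, BlockReceiverFaultProbability); the helper class itself is hypothetical, not part of the actual patch:

```java
import java.util.Random;

// Hypothetical helper: gates woven faults on a per-name probability,
// configured purely through system properties (requirement 5).
public class FaultProbability {
    private static final Random RANDOM = new Random();

    // Probability in [0, 1] for a named fault point; a specific
    // <name>FaultProbability property overrides the global
    // allFaultProbability, and everything defaults to 0 (all faults off).
    public static double of(String name) {
        String percent = System.getProperty(name + "FaultProbability",
                System.getProperty("allFaultProbability", "0"));
        return Double.parseDouble(percent) / 100.0;
    }

    // Decide at runtime whether the woven fault should fire this time.
    public static boolean shouldFire(String name) {
        return RANDOM.nextDouble() < of(name);
    }
}
```

    No recompilation is needed to change behavior: the same woven build runs with faults fully off, fully on, or anywhere in between.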

  • Konstantin Boudnik (JIRA) at Jun 5, 2009 at 7:50 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716737#action_12716737 ]

    Konstantin Boudnik edited comment on HADOOP-5974 at 6/5/09 12:49 PM:
    ---------------------------------------------------------------------

    Here's an overall proposal for the framework layout:
    - AspectJ 1.6 should be used as the base framework
    - an additional set of classes needs to be developed to control and configure injection of the faults at runtime. In the first version of the framework, I'd recommend going with randomly injected faults (random in terms of when they happen, not their location in the application code)
    - the randomization level might be configured through system properties on the command line or set in a separate configuration file
    - to completely turn off fault injection for a class, the probability level has to be set to 0% ('zero'); setting it to 100% achieves the opposite effect
    - build.xml has to be extended with a new target ('injectfaults') that weaves the needed aspects in place after the normal compilation of the Java classes; JUnit targets will have to be modified to pass the new probability configuration parameters into the spawned JVM
    - the aspects' source code will be placed under test/src/aop; the package structure will mimic Hadoop's original one, e.g. an aspect for FSDataset belongs to org.apache.hadoop.hdfs.server.datanode

    Some examples of new build/test execution interface:

    To weave (build in) the aspects in place:
    - % ant injectfaults

    To execute the HDFS tests (turn everything off except BlockReceiver faults, which are set at the 10% level):
    - % ant run-test-hdfs -DallFaultProbability=0 -DBlockReceiverFaultProbability=10
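    As a rough illustration of what the 'injectfaults' target could look like in build.xml (a sketch only: the task attributes follow the standard AspectJ Ant task, but the paths and property names are assumptions, not the actual patch):

```xml
<!-- Hypothetical build.xml fragment: weave aspects from test/src/aop
     into the freshly compiled classes using AspectJ's iajc Ant task. -->
<target name="injectfaults" depends="compile">
  <taskdef resource="org/aspectj/tools/ant/taskdefs/aspectjTaskdefs.properties"
           classpath="${lib.dir}/aspectjtools-1.6.4.jar"/>
  <iajc sourceroots="${test.src.dir}/aop" inpath="${build.classes}"
        destdir="${build.classes}" source="1.6" target="1.6">
    <classpath refid="test.classpath"/>
  </iajc>
</target>
```

    Running the target after the normal compile leaves the production build untouched; only the injected build carries the woven faults.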



  • Konstantin Boudnik (JIRA) at Jun 8, 2009 at 7:32 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717373#action_12717373 ]

    Konstantin Boudnik commented on HADOOP-5974:
    --------------------------------------------

    My patch is pretty much ready and requires a couple of libraries to be added to the Hadoop project. These libraries aren't associated with any Apache project: they are under the Eclipse Public License and are distributed from their website.

    I'm not sure what the 'rule of thumb' is for adding libraries to the ivy configuration for Hadoop. Or should they be added statically, i.e. checked into the SVN repository? I assume the latter is generally a bad idea, which leaves us with the former option.

    Can any of the watchers comment on this, please?
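    If the Ivy route is taken, the dependencies would presumably be declared along these lines (a hypothetical ivy.xml fragment; the org/name coordinates match AspectJ's published POMs, but the rev and conf mapping are assumptions):

```xml
<!-- Hypothetical ivy.xml fragment pulling the AspectJ runtime and compiler -->
<dependency org="org.aspectj" name="aspectjrt" rev="1.6.4" conf="common->master"/>
<dependency org="org.aspectj" name="aspectjtools" rev="1.6.4" conf="common->master"/>
```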
  • Tsz Wo (Nicholas), SZE (JIRA) at Jun 8, 2009 at 8:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717385#action_12717385 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-5974:
    ------------------------------------------------
    % ant run-test-hdfs -DallFaultProbability=0 -DBlockReceiverFaultProbability=10
    A better naming convention might be something like fault.probability.*, e.g. fault.probability.datanode.BlockReceiver.
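    With a fault.probability.* scheme, the lookup could naturally fall back from the most specific key to a global one. A hedged sketch (only the key names come from the suggestion above; the helper class and its fallback behavior are assumptions):

```java
// Hypothetical resolver for a fault.probability.* naming scheme.
public class FaultProbabilityKeys {
    // Probability in [0, 1] for e.g. "datanode.BlockReceiver": the specific
    // fault.probability.datanode.BlockReceiver key wins, then the global
    // fault.probability default, then 0 (fault disabled).
    public static float lookup(String faultPoint) {
        String percent = System.getProperty("fault.probability." + faultPoint);
        if (percent == null) {
            percent = System.getProperty("fault.probability", "0");
        }
        return Float.parseFloat(percent) / 100f;
    }
}
```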
  • Konstantin Boudnik (JIRA) at Jun 8, 2009 at 8:26 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717391#action_12717391 ]

    Konstantin Boudnik commented on HADOOP-5974:
    --------------------------------------------

    Thanks for the suggestion, Nicholas. I like your way (prefixing with fault.probability) better, and I'm putting it into the patch right away.

    As for the suffix of the name, it'd be completely up to the aspect developers to name it. However, I agree that datanode.BlockReceiver would be more mnemonically appealing.


  • Konstantin Boudnik (JIRA) at Jun 8, 2009 at 9:24 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717422#action_12717422 ]

    Konstantin Boudnik commented on HADOOP-5974:
    --------------------------------------------

    It seems that none of the current Maven repos have AspectJ 1.6.4 in place. The latest version available is 1.5.4, which won't work because Hadoop is a Java 6 project.

    Any idea how to get the latest version of a library added to a Maven repository?
  • Giridharan Kesavan (JIRA) at Jun 9, 2009 at 5:40 am
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717569#action_12717569 ]

    Giridharan Kesavan commented on HADOOP-5974:
    --------------------------------------------

    We can file a JIRA with Codehaus with the location of the AspectJ jar file and its POM, so they can help us upload the latest version of AspectJ to the Maven repository.

    BTW, I see different AspectJ jar files in here - some of them are at version 1.5.4 and some are at version 1.6.4:
    http://www.mvnrepository.com/search.html?query=aspectj

    Could you please mention the name of the AspectJ jar that you are looking for?

  • Konstantin Boudnik (JIRA) at Jun 9, 2009 at 6:28 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717776#action_12717776 ]

    Konstantin Boudnik commented on HADOOP-5974:
    --------------------------------------------

    Great! Thanks for the pointer - I saw only 1.5.4 in there and somehow missed the latest version. It worked, so I will publish the patch shortly.

Discussion Overview
group: common-dev
categories: hadoop
posted: Jun 4, '09 at 9:41p
active: Jun 9, '09 at 6:28p
posts: 11
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion: Konstantin Boudnik (JIRA), 11 posts
