FAQ
tl;dr if we are going to add new frameworks, please make them a separate
patch. another attempt to bring up coding for testability/dep injection
again.

I've changed the subject because I really meant this to be a bigger
discussion about setting up our code to be more testable and using
conventions that allow us to do dependency injection (ala guava or in a way
we can use mockito) or deciding to to include this new-to-me
InjectionHandler frameworks from the 89-fb branch. Porting HBASE-9949's
testing code depended this Injector infrastructure that I found
distasteful.

This particular example smells funny and looked like another stovepiped
framework to me --
1) It added references and objects to our core code instead of having it
only present for our tests.
2) This made the core code more cumbersome to read.
3) Initial version used a brittle convention. Initially, the injection
point was identified by passing empty object array of particular size (0
size mean option 0, size 1 meant injection point 1, etc). Later changed to
a enum, As used in the 89-fb branch we have a new global pool of unrelated
enum/constant values that seemed brittle for further extension.

<soapbox>Generally tests should be able to setup custom objects with
constructors so we can use mockito or something similar in unit tests. If
done well we can instead of adding runtime injector objects to production
code. I'm making a plea to spend sometime tightening the code up and to
conveying more design in design. </soapbox>

Recently been spending time in the hlog area and will likely personally
start there improving comments/names, conveying design intent, and
hopefully improving it. ;)

Jon.
On Wed, Nov 20, 2013 at 9:48 AM, Ted Yu wrote:

As requested by Jonathan, I am posting the following approach for testing
fix of HBASE-9949 :


1. Introduce config parameter for StoreScanner implementation which would
be used by HStore for creating scanner
2. In TestStoreScanner, add StoreScanner implementation which extends
StoreScanner and set the above config parameter to this class.
3. Register custom ChangedReadersObserver through the following API of
HStore :

public void addChangedReaderObserver(ChangedReadersObserver o) {

The BEFORE_SEEK hook would be activated before Store.getScanner() is
called.
The AFTER_SEEK hook can be activated at the end of ctor of StoreScanner
wrapper created in #2
The custom ChangedReadersObserver would activate the COMPACT_COMPLETE hook.

Comments are welcome.


--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Search Discussions

  • Andrew Purtell at Nov 20, 2013 at 11:41 pm

    On Wed, Nov 20, 2013 at 3:27 PM, Jonathan Hsieh wrote:

    tl;dr if we are going to add new frameworks, please make them a separate
    patch
    +1 well said Jon



    --
    Best regards,

        - Andy

    Problems worthy of attack prove their worth by hitting back. - Piet Hein
    (via Tom White)
  • Enis Söztutar at Nov 21, 2013 at 12:38 am
    +1 on improving on testability / DI side of things.

    On Wed, Nov 20, 2013 at 3:27 PM, Jonathan Hsieh wrote:

    tl;dr if we are going to add new frameworks, please make them a separate
    patch. another attempt to bring up coding for testability/dep injection
    again.

    I've changed the subject because I really meant this to be a bigger
    discussion about setting up our code to be more testable and using
    conventions that allow us to do dependency injection (ala guava or in a way
    we can use mockito) or deciding to to include this new-to-me
    InjectionHandler frameworks from the 89-fb branch. Porting HBASE-9949's
    testing code depended this Injector infrastructure that I found
    distasteful.

    This particular example smells funny and looked like another stovepiped
    framework to me --
    1) It added references and objects to our core code instead of having it
    only present for our tests.
    2) This made the core code more cumbersome to read.
    3) Initial version used a brittle convention. Initially, the injection
    point was identified by passing empty object array of particular size (0
    size mean option 0, size 1 meant injection point 1, etc). Later changed to
    a enum, As used in the 89-fb branch we have a new global pool of unrelated
    enum/constant values that seemed brittle for further extension.

    <soapbox>Generally tests should be able to setup custom objects with
    constructors so we can use mockito or something similar in unit tests. If
    done well we can instead of adding runtime injector objects to production
    code. I'm making a plea to spend sometime tightening the code up and to
    conveying more design in design. </soapbox>

    Recently been spending time in the hlog area and will likely personally
    start there improving comments/names, conveying design intent, and
    hopefully improving it. ;)

    Jon.
    On Wed, Nov 20, 2013 at 9:48 AM, Ted Yu wrote:

    As requested by Jonathan, I am posting the following approach for testing
    fix of HBASE-9949 :


    1. Introduce config parameter for StoreScanner implementation which would
    be used by HStore for creating scanner
    2. In TestStoreScanner, add StoreScanner implementation which extends
    StoreScanner and set the above config parameter to this class.
    3. Register custom ChangedReadersObserver through the following API of
    HStore :

    public void addChangedReaderObserver(ChangedReadersObserver o) {

    The BEFORE_SEEK hook would be activated before Store.getScanner() is
    called.
    The AFTER_SEEK hook can be activated at the end of ctor of StoreScanner
    wrapper created in #2
    The custom ChangedReadersObserver would activate the COMPACT_COMPLETE hook.
    Comments are welcome.


    --
    // Jonathan Hsieh (shay)
    // Software Engineer, Cloudera
    // jon@cloudera.com
  • Michael Stack at Nov 22, 2013 at 8:54 pm

    On Wed, Nov 20, 2013 at 3:27 PM, Jonathan Hsieh wrote:

    tl;dr if we are going to add new frameworks, please make them a separate
    patch. another attempt to bring up coding for testability/dep injection
    again.

    I've changed the subject because I really meant this to be a bigger
    discussion about setting up our code to be more testable and using
    conventions that allow us to do dependency injection (ala guava or in a way
    we can use mockito) or deciding to to include this new-to-me
    InjectionHandler frameworks from the 89-fb branch. Porting HBASE-9949's
    testing code depended this Injector infrastructure that I found
    distasteful.

    This particular example smells funny and looked like another stovepiped
    framework to me --
    1) It added references and objects to our core code instead of having it
    only present for our tests.
    2) This made the core code more cumbersome to read.
    3) Initial version used a brittle convention. Initially, the injection
    point was identified by passing empty object array of particular size (0
    size mean option 0, size 1 meant injection point 1, etc). Later changed to
    a enum, As used in the 89-fb branch we have a new global pool of unrelated
    enum/constant values that seemed brittle for further extension.
    How did such a mess even get committed?
    St.Ack
  • Todd Lipcon at Nov 22, 2013 at 11:07 pm
    I haven't seen the particular "framework" in question, but I do think
    there's a continuum of reasonable choices here, ranging from full DI and
    using mocks/spys, to using AspectJ for AOP, to dding specific injections
    into the code at key points. I've experimented with probably the full range
    of options, and what I found to work best in HDFS was the addition of
    "FaultInjector" interfaces for each area of the code. For example:
    https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointFaultInjector.java
    example usage:
    https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java#L126

    I see Jon's point that this litters the code with explicit injection
    points. However, I disagree that that's a bad thing. Seeing a call to
    FaultInjector.foo() in the code makes a future developer more wary of where
    they need to be careful -- and ensures that, if they change the design of
    that area of the code, that they need to consider the same faults as the
    original. Additionally, contrary to the DI approach (which typically
    involves splitting classes into bunches of smaller units), the changes for
    this type of fault injection are well-contained: they add a line here or
    there, but they don't affect the higher-level design of the code, so you're
    free to code in a way that is more understandable instead of ending up with
    CheckpointDownloadStrategyFactoryFactories or something.

    Just one guy's opinion :)

    -Todd




    On Fri, Nov 22, 2013 at 12:54 PM, Stack wrote:
    On Wed, Nov 20, 2013 at 3:27 PM, Jonathan Hsieh wrote:

    tl;dr if we are going to add new frameworks, please make them a separate
    patch. another attempt to bring up coding for testability/dep injection
    again.

    I've changed the subject because I really meant this to be a bigger
    discussion about setting up our code to be more testable and using
    conventions that allow us to do dependency injection (ala guava or in a way
    we can use mockito) or deciding to to include this new-to-me
    InjectionHandler frameworks from the 89-fb branch. Porting HBASE-9949's
    testing code depended this Injector infrastructure that I found
    distasteful.

    This particular example smells funny and looked like another stovepiped
    framework to me --
    1) It added references and objects to our core code instead of having it
    only present for our tests.
    2) This made the core code more cumbersome to read.
    3) Initial version used a brittle convention. Initially, the injection
    point was identified by passing empty object array of particular size (0
    size mean option 0, size 1 meant injection point 1, etc). Later changed to
    a enum, As used in the 89-fb branch we have a new global pool of unrelated
    enum/constant values that seemed brittle for further extension.
    How did such a mess even get committed?
    St.Ack


    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Andrew Purtell at Nov 23, 2013 at 6:51 pm
    Personally I would try to mock before adding fault injection framework.
    (Guilty of doing that in a recent patch-in-progress, but I have come to my
    senses in time.) No objection to fault injection frameworks per se. Using
    HDFS as an example again, please correct me if I'm mistaken, there was an
    AOP fault injection framework once but it is currently disabled, a victim
    of the migration from Ant to Maven, and possibly will be removed. The
    trouble with testing frameworks is the added debt they accumulate over
    time, like everything else. If we commit to adding one, and also use it as
    much as possible, that would be fine with me. Either way, we should
    definitely discuss the new/proposed framework and commit it on its own
    JIRA. I'm concerned how one got through the back door on HBASE-9949.

    On Fri, Nov 22, 2013 at 3:06 PM, Todd Lipcon wrote:

    I haven't seen the particular "framework" in question, but I do think
    there's a continuum of reasonable choices here, ranging from full DI and
    using mocks/spys, to using AspectJ for AOP, to dding specific injections
    into the code at key points. I've experimented with probably the full range
    of options, and what I found to work best in HDFS was the addition of
    "FaultInjector" interfaces for each area of the code. For example:

    https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointFaultInjector.java
    example usage:

    https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java#L126

    I see Jon's point that this litters the code with explicit injection
    points. However, I disagree that that's a bad thing. Seeing a call to
    FaultInjector.foo() in the code makes a future developer more wary of where
    they need to be careful -- and ensures that, if they change the design of
    that area of the code, that they need to consider the same faults as the
    original. Additionally, contrary to the DI approach (which typically
    involves splitting classes into bunches of smaller units), the changes for
    this type of fault injection are well-contained: they add a line here or
    there, but they don't affect the higher-level design of the code, so you're
    free to code in a way that is more understandable instead of ending up with
    CheckpointDownloadStrategyFactoryFactories or something.

    Just one guy's opinion :)

    -Todd




    On Fri, Nov 22, 2013 at 12:54 PM, Stack wrote:
    On Wed, Nov 20, 2013 at 3:27 PM, Jonathan Hsieh wrote:

    tl;dr if we are going to add new frameworks, please make them a
    separate
    patch. another attempt to bring up coding for testability/dep
    injection
    again.

    I've changed the subject because I really meant this to be a bigger
    discussion about setting up our code to be more testable and using
    conventions that allow us to do dependency injection (ala guava or in a way
    we can use mockito) or deciding to to include this new-to-me
    InjectionHandler frameworks from the 89-fb branch. Porting
    HBASE-9949's
    testing code depended this Injector infrastructure that I found
    distasteful.

    This particular example smells funny and looked like another stovepiped
    framework to me --
    1) It added references and objects to our core code instead of having
    it
    only present for our tests.
    2) This made the core code more cumbersome to read.
    3) Initial version used a brittle convention. Initially, the
    injection
    point was identified by passing empty object array of particular size
    (0
    size mean option 0, size 1 meant injection point 1, etc). Later
    changed
    to
    a enum, As used in the 89-fb branch we have a new global pool of unrelated
    enum/constant values that seemed brittle for further extension.
    How did such a mess even get committed?
    St.Ack


    --
    Todd Lipcon
    Software Engineer, Cloudera


    --
    Best regards,

        - Andy

    Problems worthy of attack prove their worth by hitting back. - Piet Hein
    (via Tom White)
  • Steve Loughran at Nov 25, 2013 at 10:27 am

    On 23 November 2013 18:51, Andrew Purtell wrote:

    Personally I would try to mock before adding fault injection framework.
    (Guilty of doing that in a recent patch-in-progress, but I have come to my
    senses in time.) No objection to fault injection frameworks per se. Using
    HDFS as an example again, please correct me if I'm mistaken, there was an
    AOP fault injection framework once but it is currently disabled, a victim
    of the migration from Ant to Maven, and possibly will be removed.

    That and the fact that not only was it brittle, it was under-understood and
    so undermaintained -people got scared of it, and when it reported problems,
    the "blame the test framework" became the strategy.

    The
    trouble with testing frameworks is the added debt they accumulate over
    time, like everything else. If we commit to adding one, and also use it as
    much as possible, that would be fine with me. Either way, we should
    definitely discuss the new/proposed framework and commit it on its own
    JIRA. I'm concerned how one got through the back door on HBASE-9949.
    Another is "test frameworks may impose requirements on the underlying
    code". This exists in YARN where I couldn't make some of the service events
    of YARN-117 final as it screwed up Mockito.

    Things like Guice are very visible here, but if you can adapt your code to
    use it in other ways it may be acceptable.

    --
    CONFIDENTIALITY NOTICE
    NOTICE: This message is intended for the use of the individual or entity to
    which it is addressed and may contain information that is confidential,
    privileged and exempt from disclosure under applicable law. If the reader
    of this message is not the intended recipient, you are hereby notified that
    any printing, copying, dissemination, distribution, disclosure or
    forwarding of this communication is strictly prohibited. If you have
    received this communication in error, please contact the sender immediately
    and delete it from your system. Thank You.
  • Enis Söztutar at Nov 25, 2013 at 7:07 pm
    Similar to what Todd describes, I remember using our own co-processors at a
    couple of tests for doing fault injection. Since the coprocessors are
    embedded in a lot of places, just wrapping the FI into a coprocessor class
    worked for me.

    The problem with mocking is that, a lot of the tests we have want to change
    a single class (lets say Compactor), but have the whole mini cluster up and
    running. I've seen some use of FI in the 0.89-fb branch.

    Enis

    On Mon, Nov 25, 2013 at 2:27 AM, Steve Loughran wrote:
    On 23 November 2013 18:51, Andrew Purtell wrote:

    Personally I would try to mock before adding fault injection framework.
    (Guilty of doing that in a recent patch-in-progress, but I have come to my
    senses in time.) No objection to fault injection frameworks per se. Using
    HDFS as an example again, please correct me if I'm mistaken, there was an
    AOP fault injection framework once but it is currently disabled, a victim
    of the migration from Ant to Maven, and possibly will be removed.

    That and the fact that not only was it brittle, it was under-understood and
    so undermaintained -people got scared of it, and when it reported problems,
    the "blame the test framework" became the strategy.

    The
    trouble with testing frameworks is the added debt they accumulate over
    time, like everything else. If we commit to adding one, and also use it as
    much as possible, that would be fine with me. Either way, we should
    definitely discuss the new/proposed framework and commit it on its own
    JIRA. I'm concerned how one got through the back door on HBASE-9949.
    Another is "test frameworks may impose requirements on the underlying
    code". This exists in YARN where I couldn't make some of the service events
    of YARN-117 final as it screwed up Mockito.

    Things like Guice are very visible here, but if you can adapt your code to
    use it in other ways it may be acceptable.

    --
    CONFIDENTIALITY NOTICE
    NOTICE: This message is intended for the use of the individual or entity to
    which it is addressed and may contain information that is confidential,
    privileged and exempt from disclosure under applicable law. If the reader
    of this message is not the intended recipient, you are hereby notified that
    any printing, copying, dissemination, distribution, disclosure or
    forwarding of this communication is strictly prohibited. If you have
    received this communication in error, please contact the sender immediately
    and delete it from your system. Thank You.
  • Sergey Shelukhin at Nov 25, 2013 at 10:36 pm
    One thing that is needed for proper mocking is also good class boundaries.
    We don't have these right now for e.g. HStore/HRegion and splitting them
    into granular parts may be a lot of work.
    This creates two problems,
    1) Tests Enis mentions that substitute a class while running a mini cluster
    - because the classes are too big it's hard to test them in isolation, or
    it requires some more ugly plumbing in production code like special test
    ctors.
    But, this can be solved in the usual manner (and is already doable for
    compactor), by specifying class name in config. That's a bit similar to
    some DI frameworks, where you go to a central authority and ask for an
    implementation of such and such interface to use.
    2) Mocking may not work - for example, HBASE-8518 would be hard to fix with
    mocking right now because it's some HStore code calling some other HStore
    code. So there's nothing to mock.
    For such cases, some fault injection is needed now; it could be done using
    CPs but IMHO it's not any better than the FB framework from original JIRA -
    it's still arbitrary injection point, it's just masking as something else,
    a genuine coproc entry point (which we already have too many imho).

    Btw, integration tests could use a fault injection framework but it would
    be very different from what the FB one does, it won't muck with internals
    in such manner.

    TL; DR: to summarize, I think we can inject classes using config as a
    current solution for some tests that we want but cannot add cleanly, but
    ideally we should make classes more unit testable.
    Then we won't need fault injection framework for commit tests.


    On Mon, Nov 25, 2013 at 11:06 AM, Enis Söztutar wrote:

    Similar to what Todd describes, I remember using our own co-processors at a
    couple of tests for doing fault injection. Since the coprocessors are
    embedded in a lot of places, just wrapping the FI into a coprocessor class
    worked for me.

    The problem with mocking is that, a lot of the tests we have want to change
    a single class (lets say Compactor), but have the whole mini cluster up and
    running. I've seen some use of FI in the 0.89-fb branch.

    Enis


    On Mon, Nov 25, 2013 at 2:27 AM, Steve Loughran <stevel@hortonworks.com
    wrote:
    On 23 November 2013 18:51, Andrew Purtell wrote:

    Personally I would try to mock before adding fault injection framework.
    (Guilty of doing that in a recent patch-in-progress, but I have come to my
    senses in time.) No objection to fault injection frameworks per se.
    Using
    HDFS as an example again, please correct me if I'm mistaken, there was
    an
    AOP fault injection framework once but it is currently disabled, a
    victim
    of the migration from Ant to Maven, and possibly will be removed.

    That and the fact that not only was it brittle, it was under-understood and
    so undermaintained -people got scared of it, and when it reported problems,
    the "blame the test framework" became the strategy.

    The
    trouble with testing frameworks is the added debt they accumulate over
    time, like everything else. If we commit to adding one, and also use it as
    much as possible, that would be fine with me. Either way, we should
    definitely discuss the new/proposed framework and commit it on its own
    JIRA. I'm concerned how one got through the back door on HBASE-9949.
    Another is "test frameworks may impose requirements on the underlying
    code". This exists in YARN where I couldn't make some of the service events
    of YARN-117 final as it screwed up Mockito.

    Things like Guice are very visible here, but if you can adapt your code to
    use it in other ways it may be acceptable.

    --
    CONFIDENTIALITY NOTICE
    NOTICE: This message is intended for the use of the individual or entity to
    which it is addressed and may contain information that is confidential,
    privileged and exempt from disclosure under applicable law. If the reader
    of this message is not the intended recipient, you are hereby notified that
    any printing, copying, dissemination, distribution, disclosure or
    forwarding of this communication is strictly prohibited. If you have
    received this communication in error, please contact the sender
    immediately
    and delete it from your system. Thank You.
    --
    CONFIDENTIALITY NOTICE
    NOTICE: This message is intended for the use of the individual or entity to
    which it is addressed and may contain information that is confidential,
    privileged and exempt from disclosure under applicable law. If the reader
    of this message is not the intended recipient, you are hereby notified that
    any printing, copying, dissemination, distribution, disclosure or
    forwarding of this communication is strictly prohibited. If you have
    received this communication in error, please contact the sender immediately
    and delete it from your system. Thank You.
  • Jonathan Hsieh at Nov 26, 2013 at 5:55 pm

    On Mon, Nov 25, 2013 at 2:35 PM, Sergey Shelukhin wrote:

    ...
    TL; DR: to summarize, I think we can inject classes using config as a
    current solution for some tests that we want but cannot add cleanly,

    * but ideally we should make classes more unit testable. Then we won't
    need fault injection framework for commit tests.*
    +1. bolding mine.

    --
    // Jonathan Hsieh (shay)
    // Software Engineer, Cloudera
    // jon@cloudera.com
  • Andrew Purtell at Nov 23, 2013 at 6:41 pm

    On Fri, Nov 22, 2013 at 12:54 PM, Stack wrote:

    How did such a mess even get committed?

    Shall we revert the HBASE-9949 commit to trunk until we have consensus on
    the testing part? Seems both Jon and I, and probably Stack also given the
    above comment, have an issue with the testing bits that were committed.
    (Sorry I didn't come around to this issue before it went it.)

    I also posted the above to the issue.


    --
    Best regards,

        - Andy

    Problems worthy of attack prove their worth by hitting back. - Piet Hein
    (via Tom White)
  • Ted Yu at Nov 23, 2013 at 6:59 pm
    The testing bits were taken out through 9949.addendum
    The above took place before Jonathan started this thread.

    On Sun, Nov 24, 2013 at 2:40 AM, Andrew Purtell wrote:
    On Fri, Nov 22, 2013 at 12:54 PM, Stack wrote:

    How did such a mess even get committed?

    Shall we revert the HBASE-9949 commit to trunk until we have consensus on
    the testing part? Seems both Jon and I, and probably Stack also given the
    above comment, have an issue with the testing bits that were committed.
    (Sorry I didn't come around to this issue before it went it.)

    I also posted the above to the issue.


    --
    Best regards,

    - Andy

    Problems worthy of attack prove their worth by hitting back. - Piet Hein
    (via Tom White)

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupdev @
categorieshbase, hadoop
postedNov 20, '13 at 11:28p
activeNov 26, '13 at 5:55p
posts12
users8
websitehbase.apache.org

People

Translate

site design / logo © 2021 Grokbase