Grokbase Groups HBase dev August 2009
filters are not working correctly
---------------------------------

Key: HBASE-1790
URL: https://issues.apache.org/jira/browse/HBASE-1790
Project: Hadoop HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.21.0
Reporter: Matus Zamborsky


Filters used when scanning a table are not working correctly. For example, take a table with three rows:
1. rowkey = adminbackslash-nb0, desc:temp = "temp"
2. rowkey = adminbackslash-nb1, desc:temp = "temp"
3. rowkey = adminkleptoman, desc:temp = "temp"

If I scan all rows in the table without a filter, I get all the rows as expected. But applying a simple PrefixFilter with the parameter "adminbackslash" returns only the first row. I traced it down to the HRegion::nextInternal method, which does not output a row that passed the filter when the row right after it is rejected by the filter.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Matus Zamborsky (JIRA) at Aug 24, 2009 at 9:16 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matus Zamborsky updated HBASE-1790:
    -----------------------------------

    Attachment: hbase-1790.patch
  • stack (JIRA) at Aug 24, 2009 at 5:39 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1790:
    -------------------------

    Fix Version/s: 0.20.1

    Move to 0.20.1
  • stack (JIRA) at Aug 27, 2009 at 10:01 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1790:
    -------------------------

    Status: Patch Available (was: Open)
  • Dave Latham (JIRA) at Aug 28, 2009 at 5:28 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748882#action_12748882 ]

    Dave Latham commented on HBASE-1790:
    ------------------------------------

    We are using filters extensively in our deployment on 0.19 HEAD.
  • stack (JIRA) at Aug 28, 2009 at 5:30 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1790:
    -------------------------


    Bringing into 0.20.0.
  • stack (JIRA) at Aug 28, 2009 at 5:38 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748867#action_12748867 ]

    stack commented on HBASE-1790:
    ------------------------------

    Matus: Any chance of unit tests? Please add a patch for stateful filters. As to whether folks are using filters, the answer must be no... or at least not in any serious way.
  • Clint Morgan (JIRA) at Aug 28, 2009 at 7:42 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748961#action_12748961 ]

    Clint Morgan commented on HBASE-1790:
    -------------------------------------

    We use filters (mostly FilterList and ValueFilter) on 0.20 to answer AND/OR type criteria. Not much use of stateful filters except as needed in ValueFilter.

    For prefixing/start/stop rows we adjust the Scan.

    For paging, we use a wrapper on the Scan which skips over the first results and then stops scanning after pageNum results. I realize that's not ideal, as we unnecessarily pay transport on the first skipped rows and then pay the cost to get the N+1 row. I'd love to see PageFilter working!
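
The client-side paging wrapper described here could look roughly like the sketch below. It is only an illustration over a plain `Iterator`; the class name `ClientSidePagingSketch` and the `page` method are hypothetical and are not part of HBase or of the code used in that deployment.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical client-side paging helper; a stand-in for a wrapper around a scanner. */
final class ClientSidePagingSketch {

    /**
     * Discards the first {@code skip} results, then collects at most
     * {@code pageSize} results and stops reading from the scanner.
     */
    static <T> List<T> page(Iterator<T> scanner, int skip, int pageSize) {
        // Skipped rows are still fetched from the scanner, which is the
        // transport cost mentioned above.
        for (int i = 0; i < skip && scanner.hasNext(); i++) {
            scanner.next();
        }
        List<T> page = new ArrayList<>();
        while (page.size() < pageSize && scanner.hasNext()) {
            page.add(scanner.next());
        }
        return page;
    }
}
```

A server-side page filter would avoid shipping the skipped rows to the client at all, which is why a working PageFilter is attractive.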
  • Matus Zamborsky (JIRA) at Aug 29, 2009 at 12:10 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749038#action_12749038 ]

    Matus Zamborsky commented on HBASE-1790:
    ----------------------------------------

    Clint, thank you for the comment. I am glad that filters are used at least somewhere. Stack, I will look into unit tests tomorrow and I will also submit the patch. Dave, which filters are you using? Is it possible that filters are working correctly in 0.19?
  • stack (JIRA) at Aug 29, 2009 at 5:28 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1790:
    -------------------------

    Attachment: testfilter.patch

    Matus: Here is a TestFilter template that does the testing at the HRegion interface. It might help you get going (I was afraid you'd try to figure how best to test this and wander into the weeds and we'd never hear from you again).

    Regarding filters, the API and implementation changed in 0.20.0, so yes, they work for Dave and partially for others (such as Clint), but there seem to be issues here.

    Thanks for helping out.
  • Dave Latham (JIRA) at Aug 29, 2009 at 1:34 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749129#action_12749129 ]

    Dave Latham commented on HBASE-1790:
    ------------------------------------

    Matus, we use several custom filter implementations that commonly deserialize some of the data in the region server and filter out rows based on data in the row, column, or cell. They work great for us, and are a definite performance gain. It will definitely be important for them to be working in 0.20 before we migrate.
  • Jonathan Gray (JIRA) at Aug 29, 2009 at 2:27 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Attachment: HBASE-1790-v2.patch

    This adds a prefix test to stack's framework and then makes a change inside of HRegion.nextInternal() to return the current result if there is one (once we hit a filtered row key).

    What was happening was we put the second row in the result (List<KV>), iterated to the third, it said filter it, so we iterated the scanner, cleared the result, and returned. So even though the second KV should have been returned, the third KV forced a full clear of the result.

    I added a check that if we just hit a new row (currentRow != row) and there is something still in the result (!result.isEmpty()), then return w/o clearing.
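
The sequence described above can be reproduced with a small toy model. The sketch below is not the HRegion code: the class `NextInternalToy`, its methods, and the `fixed` flag are hypothetical, cells are plain `{rowKey, value}` string pairs, and the row filter is reduced to a `startsWith` check. It only illustrates why clearing the result on a rejected row loses the previous row, and how returning early on a row boundary avoids that.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of a row-at-a-time scan; hypothetical, not the real HRegion code. */
final class NextInternalToy {

    /**
     * Collects the cells of one row into result. Each cell is {rowKey, value}.
     * Returns true if the caller should call again.
     */
    static boolean nextRow(Deque<String[]> cells, String prefix,
                           List<String> result, boolean fixed) {
        String currentRow = null;
        while (!cells.isEmpty()) {
            String row = cells.peek()[0];
            boolean newRow = !row.equals(currentRow);
            // The fix: on a row boundary, hand back what was already collected.
            if (fixed && newRow && !result.isEmpty()) {
                return true;
            }
            // The row filter rejects this row: skip the cell and clear the result.
            // Without the check above, this wipes out the previous row's cells.
            if (newRow && !row.startsWith(prefix)) {
                cells.poll();
                result.clear();
                return !cells.isEmpty();
            }
            // Row boundary and the next row is accepted: return the finished row.
            if (newRow && currentRow != null) {
                return true;
            }
            currentRow = row;
            result.add(row + "=" + cells.poll()[1]);
        }
        return false; // input exhausted; result may still hold the last row
    }

    /** Drives nextRow until it reports that nothing is left. */
    static List<String> scan(List<String[]> table, String prefix, boolean fixed) {
        Deque<String[]> cells = new ArrayDeque<>(table);
        List<String> rows = new ArrayList<>();
        boolean more;
        do {
            List<String> result = new ArrayList<>();
            more = nextRow(cells, prefix, result, fixed);
            rows.addAll(result);
        } while (more);
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> table = Arrays.asList(
            new String[] {"adminbackslash-nb0", "temp"},
            new String[] {"adminbackslash-nb1", "temp"},
            new String[] {"adminkleptoman", "temp"});
        System.out.println("without fix: " + scan(table, "adminbackslash", false));
        System.out.println("with fix:    " + scan(table, "adminbackslash", true));
    }
}
```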

    I also modified the prefix filter to be more stateful so that it knows once it has passed its prefix. Once past the prefix, it returns true from filterAllRemaining(), which further cleans up the test case. Without that, when running this at the Region level, we end up returning an empty result list on the third call rather than the second call returning false (that we are done). Since we can actually determine that we are done in this case, because the prefix has been passed, adding it means the second call returns the correct result and also returns false, letting us know that there will be no more matches.
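
A stateful prefix filter of this kind might be sketched as follows. This is an illustration only, not the patched PrefixFilter: the class `PrefixFilterSketch` is hypothetical, `filterRowKey` takes a whole `byte[]` rather than the offset/length form, and it assumes row keys arrive in ascending unsigned byte order so that the first key sorting after the prefix means no later key can match.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stateful prefix filter; a simplified stand-in for the real class. */
final class PrefixFilterSketch {
    private final byte[] prefix;
    private boolean passedPrefix = false;

    PrefixFilterSketch(byte[] prefix) {
        this.prefix = prefix.clone();
    }

    /** Returns true if the row should be filtered out (it lacks the prefix). */
    boolean filterRowKey(byte[] row) {
        int cmp = compareToPrefix(row);
        if (cmp > 0) {
            // The row sorts after every key that starts with the prefix.
            passedPrefix = true;
        }
        return cmp != 0;
    }

    /** Returns true once the scan has moved past the prefix range. */
    boolean filterAllRemaining() {
        return passedPrefix;
    }

    /** Compares the leading bytes of row with the prefix, treating bytes as unsigned. */
    private int compareToPrefix(byte[] row) {
        int n = Math.min(row.length, prefix.length);
        for (int i = 0; i < n; i++) {
            int diff = (row[i] & 0xff) - (prefix[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // A row shorter than the prefix sorts before all keys with the prefix.
        return row.length < prefix.length ? -1 : 0;
    }

    public static void main(String[] args) {
        PrefixFilterSketch filter =
            new PrefixFilterSketch("adminbackslash".getBytes(StandardCharsets.UTF_8));
        for (String row : new String[] {"adminbackslash-nb0", "adminbackslash-nb1", "adminkleptoman"}) {
            boolean filtered = filter.filterRowKey(row.getBytes(StandardCharsets.UTF_8));
            System.out.println(row + " filtered=" + filtered
                + " allRemaining=" + filter.filterAllRemaining());
        }
    }
}
```

The early-out only works for filters whose decision depends on key order; a value-based filter has no comparable signal.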

    I had to update one test (I think I actually wrote it) that tested FilterList MUST_PASS_ONE because it was running against a key past the prefix and then going back to a key with the prefix. That's impossible behavior, so I slightly modified the test.

    Not sure whether this is totally right but in debugging it made sense. All tests I've run pass.
  • Matus Zamborsky (JIRA) at Aug 29, 2009 at 4:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matus Zamborsky updated HBASE-1790:
    -----------------------------------

    Attachment: 1790-3.patch

    Jonathan, I see what you tried to do, but if we need to make the PrefixFilter (or any other filter) stateful, then HRegion.nextInternal basically depends on the filter being able to tell that there will be no more accepted rows before filtering them. This is possible for alphanumeric filters like the prefix filter, but I think it will break down with ValueFilter on the last passed row, because ValueFilter can't tell whether there will be any more acceptable rows before applying the filter. So I rewrote the nextInternal and next methods. I also rewrote the test file TestFilter. There are four test methods now: without a filter, PrefixFilter (which doesn't need Jonathan's patch), PageFilter, and ValueFilter. I also added a new filter, StartPageFilter, which is the same as PageFilter except that you can specify how many rows should be skipped before the paging.

    I am developing on Windows, so I don't have the best conditions for running tests. Although this nextInternal implementation passes the TestFilter tests, I am not sure whether it will pass all the other tests. If anybody can try this, it would be great.

    While searching for all usages of the InternalScanner.next function, which basically returns the result of nextInternal in the case of HRegion and thus should return false if this is the last row, I came across this in HMerge:

    while(rootScanner.next(results)) {
      for(KeyValue kv: results) {
        HRegionInfo info = Writables.getHRegionInfoOrNull(kv.getValue());
        if (info != null) {
          metaRegions.add(info);
        }
      }
    }

    It looks like it expects rootScanner.next(results) to be true for every row, rather than for every row except the last. So Stack, should we correct this?
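
One way to handle a scanner whose next() fills the results and returns false on the last row is to process the results before testing the return value. The sketch below assumes exactly that contract; `DrainScannerSketch`, `RowScanner`, `drain`, and `over` are hypothetical names used for illustration and are not the HMerge or InternalScanner code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of draining a scanner that returns false on its last row. */
final class DrainScannerSketch {

    /** Stand-in for a scanner: fills results, returns false when no rows remain after this one. */
    interface RowScanner {
        boolean next(List<String> results);
    }

    /** Processes the results of every call, including the call that returned false. */
    static List<String> drain(RowScanner scanner) {
        List<String> all = new ArrayList<>();
        List<String> results = new ArrayList<>();
        boolean more;
        do {
            results.clear();
            more = scanner.next(results);
            all.addAll(results); // handle the results before checking 'more'
        } while (more);
        return all;
    }

    /** Builds a toy scanner over a list, one row per call. */
    static RowScanner over(List<String> rows) {
        Iterator<String> it = rows.iterator();
        return results -> {
            if (it.hasNext()) {
                results.add(it.next());
            }
            return it.hasNext();
        };
    }
}
```

With a plain `while (scanner.next(results))` loop over the same toy scanner, the body would never run for the final row.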
  • stack (JIRA) at Aug 31, 2009 at 8:31 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749616#action_12749616 ]

    stack commented on HBASE-1790:
    ------------------------------

    The piece you pasted from HMerge looks broken to me, Matus. Please add a fix to your patch.

    In pagefilter, please just remove rather than comment-out:

    {code}
    - rowsAccepted = 0;
    + //rowsAccepted = 0;
    {code}

    You do the same commenting out elsewhere in your patch. Please just remove the replaced code.

    The StartPageFilter needs to explain in its class javadoc why it's different from PageFilter. Or, better, can we not just have PageFilter do what StartPageFilter does (pass 0 for pageStart if you want PageFilter behavior and N if you want StartPageFilter behavior)?

    Just by way of FYI, fellas usually make the patch in the $HBASE_HOME dir. Yours was made in $HBASE_HOME/src. ... just for the future.

    Unfortunately, it seems as though this patch breaks other HBase tests. I'll try taking a look why in a few hours.
  • stack (JIRA) at Sep 1, 2009 at 5:55 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749775#action_12749775 ]

    stack commented on HBASE-1790:
    ------------------------------

    The tests that are failing are the following:

    {code}
    [junit] Test org.apache.hadoop.hbase.TestEmptyMetaInfo FAILED (timeout)
    [junit] Test org.apache.hadoop.hbase.TestHBaseCluster FAILED (timeout)
    [junit] Test org.apache.hadoop.hbase.TestInfoServers FAILED (timeout)
    {code}

    It's always a timeout. Must be getting wrong answers back. Will take a look in the morning.
  • Jonathan Gray (JIRA) at Sep 1, 2009 at 4:38 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Status: Open (was: Patch Available)

    Resuming work on this with stack
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 2:18 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray reassigned HBASE-1790:
    ------------------------------------

    Assignee: Jonathan Gray
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 2:18 am
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Attachment: HBASE-1790-v4.patch

    Adds new tests. Adds new filters. Makes a few changes to HRegion code and how filters are run. Modifies some existing tests to match updated semantics. Adds "early-out" to a few existing filters so that we don't require an additional round-trip when we know we are done.

    Also includes a modified version of Andrew's patch from HBASE-1807.

    More work to be done, not ready for review... Just checkpointing.
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 8:55 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Attachment: HBASE-1790-v5.patch

    Completely changes the KeyFilter stuff from apurtell patch over in HBASE-1807 but retains all the functionality besides family checking (after clarifying with user on list that he was in fact not looking for that).

    Adds more tests. New tests are in TestFilter. There are lots of them, though could use more FilterList tests.

    Adds a couple new classes to filters... BinaryComparator which is a WritableBAComparable implementing wrapper of Bytes.compareTo(). CompareFilter is a superclass for RowFilter, QualifierFilter, and ValueFilter classes.
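
For readers unfamiliar with the ordering such a comparator provides, here is a minimal sketch of a lexicographic byte-array comparison that treats bytes as unsigned. It is an assumption-level illustration of the kind of ordering involved; `UnsignedBytesSketch` is a hypothetical name and this is not the Bytes or BinaryComparator source.

```java
/** Hypothetical sketch of an unsigned, lexicographic byte-array comparison. */
final class UnsignedBytesSketch {

    /** Returns a negative, zero, or positive value as left sorts before, equal to, or after right. */
    static int compare(byte[] left, byte[] right) {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            // Mask to 0..255 so that 0x80 sorts after 0x7f instead of before it.
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // Equal up to the shorter length: the shorter array sorts first.
        return left.length - right.length;
    }
}
```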

    Existing ValueFilter has become SingleColumnValueFilter which is what it actually does. I actually removed the filterRowIfColumnMissing stuff and instead created a wrapping filter SkipFilter. You wrap a KV checking filter with it, and if any KVs in the row fail, the entire row is filtered out.
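
The wrapping behavior described for SkipFilter can be illustrated with a small sketch. It is not the patch's implementation: `SkipFilterSketch`, `observe`, and `keepRows` are hypothetical names, and the wrapped KV-checking filter is reduced to a `Predicate` that returns true when a cell passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of a row-skipping wrapper around a per-cell check. */
final class SkipFilterSketch<C> {
    private final Predicate<C> cellPasses;
    private boolean rowFailed = false;

    SkipFilterSketch(Predicate<C> cellPasses) {
        this.cellPasses = cellPasses;
    }

    /** Clears per-row state; called before each new row. */
    void reset() {
        rowFailed = false;
    }

    /** Feeds one cell to the wrapped check and remembers any failure. */
    void observe(C cell) {
        if (!cellPasses.test(cell)) {
            rowFailed = true;
        }
    }

    /** Returns true if the whole row should be dropped. */
    boolean filterRow() {
        return rowFailed;
    }

    /** Keeps only the rows in which every cell passes the wrapped check. */
    static <C> List<List<C>> keepRows(List<List<C>> rows, Predicate<C> cellPasses) {
        SkipFilterSketch<C> filter = new SkipFilterSketch<>(cellPasses);
        List<List<C>> kept = new ArrayList<>();
        for (List<C> row : rows) {
            filter.reset();
            for (C cell : row) {
                filter.observe(cell);
            }
            if (!filter.filterRow()) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

Keeping the skip-the-row policy in a wrapper means the per-cell filter itself stays free of row-level state.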

    A couple of changes in the HRegion code actually fix the filters.
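
    For reference, the behaviour the reporter expected from a prefix filter on the three example rows can be written down as a small check. This is a plain-Java model of the expected scan result only; it does not reproduce HRegion::nextInternal or the fix, and the names PrefixScanSketch and scanWithPrefix are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Models the expected result of a prefix-filtered scan; not HBase code.
class PrefixScanSketch {
    // Returns every row key that starts with the prefix, in scan order.
    static List<String> scanWithPrefix(List<String> sortedRowKeys, String prefix) {
        List<String> out = new ArrayList<>();
        for (String key : sortedRowKeys) {
            if (key.startsWith(prefix)) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "adminbackslash-nb0", "adminbackslash-nb1", "adminkleptoman");
        // Both "adminbackslash-nb*" rows should come back, not just the first.
        System.out.println(scanWithPrefix(rows, "adminbackslash"));
    }
}
```

    The reported bug is that the real scan returned only the first of the two matching rows.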
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 8:57 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Fix Version/s: 0.21.0
    Status: Patch Available (was: Open)

    Please review.

    Excuse the small extra-line formatting error in HRegion.
  • Andrew Purtell (JIRA) at Sep 2, 2009 at 9:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750670#action_12750670 ]

    Andrew Purtell commented on HBASE-1790:
    ---------------------------------------

    Nice Jon.

    Put CompareOp enum in CompareFilter now?

    SkipFilter javadoc is confusing. I think you are missing a "not" in "Any row which did not have the given value for the specified column will be emitted." Should it be "Any row which did not have the given value for the specified column will NOT be emitted."?

    There are some classes missing in the HbaseObjectWritable map? BinaryComparator, CompareFilter, QualifierFilter, RowFilter, SkipFilter?

    Tests look great.
  • stack (JIRA) at Sep 2, 2009 at 9:40 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750681#action_12750681 ]

    stack commented on HBASE-1790:
    ------------------------------

    Should CompareFilter be abstract? (It's useless standalone.) Others?

    Ugh. Do we have to make a new HBaseConfiguration when serializing? That looks way broke. Can we fix? We should use HBaseObjectWritable anyways? If I look in HOW, it's just using it to make a NullWritable.... do we even do this? I can change HOW so we create a HBC if we need to do a NullWritable?
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 10:04 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750696#action_12750696 ]

    Jonathan Gray commented on HBASE-1790:
    --------------------------------------

    Changed to a different example for SkipFilter; I ran myself in circles with the one that was there :) Thanks for the review, Andrew.

    Made CompareFilter abstract. Moved CompareOp to CompareFilter. Added new classes to HOW.

    Not sure what to do about HBC. Definitely agree we should not instantiate new ones in serialization. I'm open to whatever will work :)
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 10:12 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Attachment: HBASE-1790-v6.patch

    As described. Addresses everything but HBC/HOW issue.
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 11:02 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Attachment: HBASE-1790-v7.patch

    Addresses all issues.

    HOW read/write can now take null HBCs. The only issue is that if you pass it a Configurable, you need to pass it the HBC as well in order to have the configuration set on it automatically. Does not impact existing code outside the scope of this filter patch.
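
    The rule stated above (a null configuration is accepted, but a Configurable only gets its configuration set when one is supplied) might look roughly like this. This is a guess at the shape of the logic in plain Java, not the HbaseObjectWritable code: the Configurable interface here is a local stand-in, a Map stands in for the HBC, and NullConfSketch, instantiate, PlainFilter, and ConfiguredFilter are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model; a Map stands in for HBaseConfiguration.
class NullConfSketch {
    // Local stand-in for a "configurable" object contract.
    interface Configurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    static class PlainFilter { }

    static class ConfiguredFilter implements Configurable {
        private Map<String, String> conf;
        public void setConf(Map<String, String> conf) { this.conf = conf; }
        public Map<String, String> getConf() { return conf; }
    }

    // Creates the object; sets the configuration only when one was supplied
    // and the object is able to accept it. A null configuration is allowed.
    static <T> T instantiate(Supplier<T> factory, Map<String, String> confOrNull) {
        T obj = factory.get();
        if (confOrNull != null && obj instanceof Configurable) {
            ((Configurable) obj).setConf(confOrNull);
        }
        return obj;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.key", "example.value");
        // Non-configurable objects work with or without a configuration.
        PlainFilter plain = instantiate(PlainFilter::new, null);
        // A Configurable is only configured when the caller passes one in.
        ConfiguredFilter withConf = instantiate(ConfiguredFilter::new, conf);
        ConfiguredFilter withoutConf = instantiate(ConfiguredFilter::new, null);
        System.out.println(plain != null);
        System.out.println(withConf.getConf());
        System.out.println(withoutConf.getConf());
    }
}
```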
  • stack (JIRA) at Sep 2, 2009 at 11:17 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750723#action_12750723 ]

    stack commented on HBASE-1790:
    ------------------------------

    +1 on patch. It didn't apply properly (the last lines of the patch), but it looks good.
  • Jonathan Gray (JIRA) at Sep 2, 2009 at 11:25 pm
    [ https://issues.apache.org/jira/browse/HBASE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-1790:
    ---------------------------------

    Resolution: Fixed
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)

    Committed to trunk and branch. Thanks for the review, stack and Andrew.
    filters are not working correctly
    ---------------------------------

    Key: HBASE-1790
    URL: https://issues.apache.org/jira/browse/HBASE-1790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: filters
    Affects Versions: 0.20.0, 0.21.0
    Reporter: Matus Zamborsky
    Assignee: Jonathan Gray
    Fix For: 0.20.0, 0.21.0

    Attachments: 1790-3.patch, HBASE-1790-v2.patch, HBASE-1790-v4.patch, HBASE-1790-v5.patch, HBASE-1790-v6.patch, HBASE-1790-v7.patch, hbase-1790.patch, testfilter.patch


