FilterList of prefix and columnvalue not working properly with deletes and multiple values
------------------------------------------------------------------------------------------

Key: HBASE-1906
URL: https://issues.apache.org/jira/browse/HBASE-1906
Project: Hadoop HBase
Issue Type: Bug
Reporter: stack
Fix For: 0.20.2, 0.21.0


Attached are some unit tests from client and region that demonstrate the failing issues.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • stack (JIRA) at Oct 14, 2009 at 12:17 am
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1906:
    -------------------------

    Attachment: filterlist.patch
  • stack (JIRA) at Oct 14, 2009 at 5:43 am
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1906:
    -------------------------

    Attachment: 1906-v2.patch

    Just formatting clean up. No fix yet.
  • stack (JIRA) at Oct 14, 2009 at 5:55 am
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765412#action_12765412 ]

    stack commented on HBASE-1906:
    ------------------------------

    @jgray What seems to be happening is that we can exit the loop in HRegion#nextInternal without a call to filterRow. Without that call, results that filterRow would have removed are left in when we exit via:

    {code}
    if (filter != null && filter.filterRowKey(row, 0, row.length)) {
      if (!results.isEmpty() && !Bytes.equals(currentRow, row)) {
        return true;
      }
    {code}

    It's as though this test should be done first:

    {code}
    if (!Bytes.equals(currentRow, row)) {
    {code}

    ... before we see if a row should be filtered out based on its row key.

    If a row is filtered out by filterRowKey, we then need to run filterRow on the results already accumulated somehow.

    Will keep digging, but input, if any, is appreciated.

    That deletes can come out of the peek seems fine after looking at it some...

  • stack (JIRA) at Oct 14, 2009 at 5:48 pm
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1906:
    -------------------------

    Attachment: 1906-v3.patch

    Here's a fix. Most of the patch is just formatting changes (apart from the addition of two tests, one client-side and the other on HRegion). The fix is in HRegion#nextInternal. I halved its size; it was doing the same work twice using near-duplicate code. Importantly, there was a code path where we could exit with results without calling filterRow. The tests had filters that would rule out a whole row if filterRow was called.
  • stack (JIRA) at Oct 14, 2009 at 6:57 pm
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1906:
    -------------------------

    Attachment: 1906-v4.patch

    This patch passes all tests.
  • Jonathan Gray (JIRA) at Oct 14, 2009 at 7:35 pm
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765707#action_12765707 ]

    Jonathan Gray commented on HBASE-1906:
    --------------------------------------

    +1 for commit. Reviewed the patch but did not test it. If all existing (and new) filter tests pass, then it should be okay.

    New HRegion.nextInternal() looks great. Thanks for cleaning up that mess, stack.
  • stack (JIRA) at Oct 14, 2009 at 7:48 pm
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack resolved HBASE-1906.
    --------------------------

    Resolution: Fixed
    Assignee: stack
    Hadoop Flags: [Reviewed]

    Applied to branch and trunk.
  • stack (JIRA) at Oct 16, 2009 at 5:22 am
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766420#action_12766420 ]

    stack commented on HBASE-1906:
    ------------------------------

    Here is some more detail on this issue.

    The illustrative code put up a table with 5 column families and added values. It then set up a scanner that used a FilterList of two Filters against one of the column families. The first filter was a prefix filter. The second a test on the cell content. The behavior wanted was that only rows that matched the prefix and the supplied cell value should be returned.
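The intended result can be sketched without HBase at all. The snippet below is a minimal model, not the HBase API and not the attached tests: rows are plain maps, and `PrefixAndValueSketch`, `select`, `sampleTable`, and the row keys and column names are all hypothetical. It only shows which rows the scanner was expected to return when both conditions must hold.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the scan described above; this is not HBase code.
class PrefixAndValueSketch {

    // Returns the keys of rows that start with rowPrefix AND whose
    // given column holds wantedValue. Both conditions must pass.
    static List<String> select(Map<String, Map<String, String>> table,
                               String rowPrefix, String column, String wantedValue) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            boolean prefixOk = row.getKey().startsWith(rowPrefix);
            boolean valueOk = wantedValue.equals(row.getValue().get(column));
            if (prefixOk && valueOk) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    // Made-up sample data: row key -> (column -> value).
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("abc-1", Map.of("fam:a", "x", "fam:q", "wanted"));
        table.put("abc-2", Map.of("fam:a", "x", "fam:q", "other"));
        table.put("xyz-1", Map.of("fam:a", "x", "fam:q", "wanted"));
        return table;
    }
}
```

With this sample data, `select(sampleTable(), "abc", "fam:q", "wanted")` keeps only `abc-1`: `abc-2` has the prefix but the wrong cell value, and `xyz-1` has the value but the wrong prefix.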

    Before the fix was applied, we would do the right thing -- return rows that matched on prefix and cell value -- but then we'd also tack part of a row onto the result set: its row id would match the prefix filter but it would not have the required cell content. From that row we'd return all columns that sorted before the column holding the cell the filter was testing.

    The illustrating code then threw in deletes of the cell we were testing on, but we were still returning the partial row (IIRC).

    What was happening was that there was a code path whereby we could leave the internal next loop without calling the filter's filterRow method. This method, if given the chance, was knocking out rows that didn't match on both supplied filters. Skipping out without invoking it was letting out candidate results that should have been suppressed.
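The effect of the skipped call can be reproduced in a toy loop. This is a simplified sketch under stated assumptions, not HRegion code: `FilterRowSkipSketch`, `ToyFilter`, `Cell`, the `alwaysCallFilterRow` flag, and the sample data are hypothetical, and the loop returns row keys rather than partial rows. The only thing it models is that a row is released unchecked when the next row key is rejected by the row-key test.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a row-accumulating scan loop; all names here are hypothetical.
class FilterRowSkipSketch {

    static final class Cell {
        final String row, column, value;
        Cell(String row, String column, String value) {
            this.row = row; this.column = column; this.value = value;
        }
    }

    // Combines a row-key prefix test with a "row must contain this cell" test.
    static final class ToyFilter {
        final String prefix, column, wanted;
        boolean sawWanted = false;
        ToyFilter(String prefix, String column, String wanted) {
            this.prefix = prefix; this.column = column; this.wanted = wanted;
        }
        boolean filterRowKey(String row) { return !row.startsWith(prefix); } // true = exclude
        void observe(Cell c) {
            if (c.column.equals(column) && c.value.equals(wanted)) sawWanted = true;
        }
        boolean filterRow() { return !sawWanted; } // true = exclude the whole row
        void reset() { sawWanted = false; }
    }

    // alwaysCallFilterRow=false mimics the faulty exit path; true mimics the fix.
    static List<String> scan(List<Cell> cells, ToyFilter filter, boolean alwaysCallFilterRow) {
        List<String> returned = new ArrayList<>();
        String currentRow = null;
        int accumulated = 0; // cells gathered so far for currentRow
        for (Cell cell : cells) {
            if (!cell.row.equals(currentRow)) {
                if (accumulated > 0) {
                    // Faulty path: next row key is rejected, so we leave early
                    // with the accumulated row and never consult filterRow.
                    boolean skipCheck = filter.filterRowKey(cell.row) && !alwaysCallFilterRow;
                    if (skipCheck || !filter.filterRow()) returned.add(currentRow);
                }
                filter.reset();
                accumulated = 0;
                currentRow = cell.row;
            }
            if (filter.filterRowKey(cell.row)) continue; // row excluded by key
            filter.observe(cell);
            accumulated++;
        }
        if (accumulated > 0 && !filter.filterRow()) returned.add(currentRow);
        return returned;
    }

    static List<Cell> sampleCells() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("abc-1", "fam:a", "x"));
        cells.add(new Cell("abc-1", "fam:q", "wanted"));
        cells.add(new Cell("abc-2", "fam:a", "x"));
        cells.add(new Cell("abc-2", "fam:q", "other"));
        cells.add(new Cell("xyz-1", "fam:a", "x"));
        return cells;
    }
}
```

With the flag off, `abc-2` leaks into the output because the row after it (`xyz-1`) fails the prefix test, which triggers the early exit. With the flag on, only `abc-1` is returned.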

    Here is the old code:

    {code}
    1745 private boolean nextInternal() throws IOException {
    1746   // This method should probably be reorganized a bit... has gotten messy
    1747   KeyValue kv;
    1748   byte[] currentRow = null;
    1749   boolean filterCurrentRow = false;
    1750   while (true) {
    1751     kv = this.storeHeap.peek();
    1752     if (kv == null) {
    1753       return false;
    1754     }
    1755     byte [] row = kv.getRow();
    1756     if (filterCurrentRow && Bytes.equals(currentRow, row)) {
    1757       // filter all columns until row changes
    1758       this.storeHeap.next(results);
    1759       results.clear();
    1760       continue;
    1761     }
    1762     // see if current row should be filtered based on row key
    1763     if ((filter != null && filter.filterRowKey(row, 0, row.length)) ||
    1764         (oldFilter != null && oldFilter.filterRowKey(row, 0, row.length))) {
    1765       if(!results.isEmpty() && !Bytes.equals(currentRow, row)) {
    1766         return true;
    1767       }
    1768       this.storeHeap.next(results);
    1769       results.clear();
    1770       resetFilters();
    1771       filterCurrentRow = true;
    1772       currentRow = row;
    1773       continue;
    1774     }
    1775     if(!Bytes.equals(currentRow, row)) {
    1776       // Continue on the next row:
    1777       currentRow = row;
    1778       filterCurrentRow = false;
    1779       // See if we passed stopRow
    1780       if(stopRow != null &&
    1781           comparator.compareRows(stopRow, 0, stopRow.length,
    1782           currentRow, 0, currentRow.length) <= 0) {
    1783         return false;
    1784       }
    1785       // if there are _no_ results or current row should be filtered
    1786       if (results.isEmpty() || filter != null && filter.filterRow()) {
    1787         // make sure results is empty
    1788         results.clear();
    1789         resetFilters();
    1790         continue;
    1791       }
    1792       return true;
    1793     }
    1794     this.storeHeap.next(results);
    1795   }
    1796 }
    1797
    1798 public void close() {
    1799   storeHeap.close();
    1800 }
    {code}

    We would exit at #1766 without calling filter.filterRow rather than at #1792.

    The above method was rewritten so we don't skip out without calling filterRow.

    {code}
    private boolean nextInternal() throws IOException {
      byte [] currentRow = null;
      boolean filterCurrentRow = false;
      while (true) {
        KeyValue kv = this.storeHeap.peek();
        if (kv == null) return false;
        byte [] row = kv.getRow();
        boolean samerow = Bytes.equals(currentRow, row);
        if (samerow && filterCurrentRow) {
          // Filter all columns until row changes
          readAndDumpCurrentResult();
          continue;
        }
        if (!samerow) {
          // Continue on the next row:
          currentRow = row;
          filterCurrentRow = false;
          // See if we passed stopRow
          if (this.stopRow != null &&
              comparator.compareRows(this.stopRow, 0, this.stopRow.length,
              currentRow, 0, currentRow.length) <= 0) {
            return false;
          }
          if (hasResults()) return true;
        }
        // See if current row should be filtered based on row key
        if (this.filter != null && this.filter.filterRowKey(row, 0, row.length)) {
          readAndDumpCurrentResult();
          resetFilters();
          filterCurrentRow = true;
          currentRow = row;
          continue;
        }
        this.storeHeap.next(results);
      }
    }
    {code}
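The rewritten method calls two helpers, readAndDumpCurrentResult() and hasResults(), whose bodies are not quoted in this comment. Comparing the old and new listings suggests what they likely do: the first replaces the old "read the next KeyValue, then clear results" pair, and the second replaces the old "results empty or filterRow says drop" check. The sketch below is that inference only, not the committed code. The class name, `RowFilter`, the `Deque` standing in for the store heap, and the `resets` counter are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical reconstruction of the two helpers; not taken from the patch.
class ScannerHelpersSketch {

    interface RowFilter {
        boolean filterRow(); // true = exclude the accumulated row
    }

    final Deque<String> storeHeap = new ArrayDeque<>(); // stand-in for the KeyValue heap
    final List<String> results = new ArrayList<>();     // cells gathered for the current row
    final RowFilter filter;                             // may be null, as in the original
    int resets = 0;                                     // counts resetFilters() calls

    ScannerHelpersSketch(RowFilter filter) {
        this.filter = filter;
    }

    void resetFilters() {
        resets++;
    }

    // Assumed equivalent of the old "storeHeap.next(results); results.clear();" pair:
    // consume the head of the heap and throw away whatever was collected.
    void readAndDumpCurrentResult() {
        String kv = storeHeap.poll();
        if (kv != null) results.add(kv);
        results.clear();
    }

    // Assumed equivalent of the old end-of-row check: the row is returned only if
    // something was collected and filterRow does not exclude it. This is the one
    // place a finished row is released, so filterRow cannot be bypassed.
    boolean hasResults() {
        if (results.isEmpty() || (filter != null && filter.filterRow())) {
            results.clear();
            resetFilters();
            return false;
        }
        return true;
    }
}
```

If the helpers do look roughly like this, every row boundary in the new loop passes through hasResults(), which is what closes the old exit path.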
  • stack (JIRA) at Oct 16, 2009 at 6:14 am
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766435#action_12766435 ]

    stack commented on HBASE-1906:
    ------------------------------

    Before this fix, any filter that depends on filterRow could give odd results, because this final step in the filter process might not get called if a row has more than one column.
