compaction can return less versions then we should in some cases
----------------------------------------------------------------

Key: HBASE-855
URL: https://issues.apache.org/jira/browse/HBASE-855
Project: Hadoop HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.2.1, 0.18.0
Reporter: Billy Pearson
Assignee: Billy Pearson


Say we have a column with max versions = 3 and three records stored,
and we then insert a new record with an old timestamp.

During compaction, the new record with the old timestamp gets read first and can push out some of our
versions if the new record(s) with the old timestamp have an expired TTL.

This happens because we track the total number of times we see a row/column but do not reduce this count when the cell is expired.
Since we pass the cells in order from the newest HStoreFile first, the records passed first might not have the newest timestamps.

I have to wait for HBASE-834 to be committed, then I can add a patch for this bug. It will be a simple fix.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
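
The counting problem described in the report can be illustrated with a small standalone sketch. It is not the HBase compaction code: the class CompactionVersionSketch, the Cell type, and the methods compactBuggy, compactFixed, and demoCells are hypothetical names made up for illustration, and skipping expired cells before counting is only one plausible reading of the fix, not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionVersionSketch {

    // Hypothetical stand-in for one cell of a single row/column (not an HBase class).
    public static final class Cell {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    static boolean isExpired(Cell c, long now, long ttl) {
        return now - c.timestamp > ttl;
    }

    // Counts every cell it sees, expired or not, so an expired cell uses up a version slot.
    static List<Cell> compactBuggy(List<Cell> cells, int maxVersions, long now, long ttl) {
        List<Cell> kept = new ArrayList<>();
        int timesSeen = 0;
        for (Cell c : cells) {
            timesSeen++;
            if (timesSeen <= maxVersions && !isExpired(c, now, ttl)) {
                kept.add(c);
            }
        }
        return kept;
    }

    // Assumed fix: expired cells are skipped before they are counted.
    static List<Cell> compactFixed(List<Cell> cells, int maxVersions, long now, long ttl) {
        List<Cell> kept = new ArrayList<>();
        int timesSeen = 0;
        for (Cell c : cells) {
            if (isExpired(c, now, ttl)) {
                continue;
            }
            timesSeen++;
            if (timesSeen <= maxVersions) {
                kept.add(c);
            }
        }
        return kept;
    }

    // Cells in the order a compaction would read them: newest store file first.
    // The late insert carries an old, already expired timestamp but is read first.
    static List<Cell> demoCells() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(1_000L, "late insert, old timestamp"));
        cells.add(new Cell(9_000L, "v3"));
        cells.add(new Cell(8_000L, "v2"));
        cells.add(new Cell(7_000L, "v1"));
        return cells;
    }

    public static void main(String[] args) {
        long now = 10_000L;
        long ttl = 5_000L;
        System.out.println("buggy keeps " + compactBuggy(demoCells(), 3, now, ttl).size());
        System.out.println("fixed keeps " + compactFixed(demoCells(), 3, now, ttl).size());
    }
}
```

With max versions = 3, the buggy variant keeps only two of the three live versions because the expired cell consumed one slot, while the assumed fix keeps all three.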

  • Billy Pearson (JIRA) at Aug 29, 2008 at 7:51 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Billy Pearson updated HBASE-855:
    --------------------------------

    Affects Version/s: (was: 0.18.0)

    This also affects 0.18.0. We can fix it in 0.18.0 or 0.18.1; either way is fine with me.
  • Billy Pearson (JIRA) at Aug 29, 2008 at 6:38 pm
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Billy Pearson updated HBASE-855:
    --------------------------------

    Fix Version/s: 0.18.0, 0.2.1
  • Billy Pearson (JIRA) at Aug 30, 2008 at 3:10 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Billy Pearson updated HBASE-855:
    --------------------------------

    Attachment: 855-patch.txt

    This should fix a problem I saw in the max versions and TTL handling.

    stack commented on HBASE-834 that he is still seeing old versions of data in the .META. historian.

    I looked around and think I might have found a second bug that might be the problem, and included a fix for it in the patch.

    When we see a new row/column, we were using the variable timesSeen to track the number of times we have seen the same row/column,
    but we set it to 0 on the first cell of a new row/column.
    Then below we were checking the max versions limit like this:
    if (timesSeen <= family.getMaxVersions()

    With timesSeen set to 0, that would allow 1 more cell to pass on to the HStoreFile than the max versions setting permits.
    We should have been setting timesSeen to 1 instead of 0, since it is the first cell seen for that row/column.

    I think this might be the problem you are seeing, stack: when the record is
    deleted there could still be one old cell hidden in the table that will not show up until the newest cell is deleted.
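
The off-by-one described in this comment can be reproduced with a minimal sketch. The class TimesSeenSketch and its countKept method are hypothetical and are not taken from the patch; only the timesSeen counter and the `timesSeen <= maxVersions` comparison mirror the comment. The sketch assumes cells for the same row/column arrive consecutively.

```java
public class TimesSeenSketch {

    // Counts how many cells survive the max-versions check.
    // firstSeenValue is what timesSeen is set to on the first cell of a new row/column:
    // 0 reproduces the behaviour described in the comment, 1 is the proposed correction.
    static int countKept(String[] keys, int maxVersions, int firstSeenValue) {
        int kept = 0;
        int timesSeen = 0;
        String lastKey = null;
        for (String key : keys) {
            if (!key.equals(lastKey)) {
                // New row/column: reset the counter.
                timesSeen = firstSeenValue;
                lastKey = key;
            } else {
                timesSeen++;
            }
            if (timesSeen <= maxVersions) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Five versions of the same (made-up) row/column key.
        String[] keys = {"row1/col:a", "row1/col:a", "row1/col:a", "row1/col:a", "row1/col:a"};
        System.out.println("starting at 0 keeps " + countKept(keys, 3, 0));
        System.out.println("starting at 1 keeps " + countKept(keys, 3, 1));
    }
}
```

Starting the counter at 0 lets four cells through when max versions is 3; starting at 1 lets exactly three through.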

  • Billy Pearson (JIRA) at Aug 30, 2008 at 3:12 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Billy Pearson updated HBASE-855:
    --------------------------------

    Affects Version/s: 0.18.0
    Status: Patch Available (was: Open)

    The patch should apply to both trunk and branch.
  • stack (JIRA) at Aug 30, 2008 at 7:16 pm
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627234#action_12627234 ]

    stack commented on HBASE-855:
    -----------------------------

    Thanks Billy. TestCompactions is failing since we changed how compactions are done. I was looking at it yesterday and was thinking that 855 was not fully to blame. This new bug in our max versions logic might explain it. Let me test.
  • stack (JIRA) at Aug 31, 2008 at 4:22 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-855:
    ------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    Ran tests (took a while). It works. Applied to trunk and branch. Thanks for the patch Billy.
  • stack (JIRA) at Aug 31, 2008 at 4:38 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627273#action_12627273 ]

    stack commented on HBASE-855:
    -----------------------------

    This patch also fixes the failing compaction test.
  • stack (JIRA) at Aug 31, 2008 at 5:06 am
    [ https://issues.apache.org/jira/browse/HBASE-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627279#action_12627279 ]

    stack commented on HBASE-855:
    -----------------------------

    Hudson just passed; build #294.
