[hbase] Under extreme load, regions become extremely large and eventually cause region servers to become unresponsive
---------------------------------------------------------------------------------------------------------------------

Key: HADOOP-2731
URL: https://issues.apache.org/jira/browse/HADOOP-2731
Project: Hadoop Core
Issue Type: Bug
Components: contrib/hbase
Reporter: Bryan Duxbury


When attempting to write to HBase as fast as possible, HBase accepts puts at a reasonably high rate for a while, and then the rate begins to drop off, ultimately culminating in exceptions reaching client code. In my testing, I was able to write about 370 10KB records a second to HBase until I reached around 1 million rows written. At that point, a moderate to large number of exceptions - NotServingRegionException, WrongRegionException, region offline, etc. - began reaching the client code. This appears to be because the retry-and-wait logic in HTable runs out of retries and fails.

Looking at mapfiles for the regions from the command line shows that some of the mapfiles are between 1 and 2 GB in size, much more than the stated file size limit. From talking with Stack, one possible explanation is that the RegionServer is not choosing to compact files often enough, leading to many small mapfiles, which in turn leads to a few overlarge mapfiles. Then, when the time comes to do a split or "major" compaction, it takes an unexpectedly long time to complete these operations. This translates into errors for the client application.

If I back off the import process and give the cluster some quiet time, some splits and compactions clearly do take place, because the number of regions goes up and the number of mapfiles/region goes down. I can then begin writing again in earnest for a short period of time until the problem begins again.

Both Marc Harris and I have seen this behavior.



  • Billy Pearson (JIRA) at Jan 30, 2008 at 1:56 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563797#action_12563797 ]

    Billy Pearson commented on HADOOP-2731:
    ---------------------------------------

    I am talking about a theory I have about hot spots like this over at HADOOP-2615.

    Take a look at the new idea on the compaction process I had, see what you think, and comment if you like it or have anything better.
  • Billy Pearson (JIRA) at Jan 30, 2008 at 1:59 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563797#action_12563797 ]

    viper799 edited comment on HADOOP-2731 at 1/29/08 5:58 PM:
    ----------------------------------------------------------------

    I am talking about a theory I have about hot spots like this over at HADOOP-2615.

    Take a look at the new idea on the compaction process I had, see what you think, and comment if you like it or have anything better.
    I think this would solve this problem: if we only compacted a few of the newest map files at a time, the splitter would check the region more often to see if it needs a split, and do so if needed.
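    For illustration, a minimal sketch of how that selection might look (hypothetical names only; this is not actual HStore code):

    {code}
    // Hedged sketch of the suggestion above: cap each compaction at the N
    // newest mapfiles so compactions stay short and the splitter gets to
    // check the region more often. All names are illustrative.
    import java.util.List;

    class NewestFilesCompactionSketch {
      static final int MAX_FILES_PER_COMPACTION = 3;  // illustrative cap

      // Assumes 'mapFiles' is ordered oldest-first, as flushes are appended.
      static List<String> chooseFilesToCompact(List<String> mapFiles) {
        int from = Math.max(0, mapFiles.size() - MAX_FILES_PER_COMPACTION);
        return mapFiles.subList(from, mapFiles.size());  // the newest few only
      }
    }
    {code}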


  • Bryan Duxbury (JIRA) at Jan 30, 2008 at 2:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563811#action_12563811 ]

    Bryan Duxbury commented on HADOOP-2731:
    ---------------------------------------

    I've been taking a look at the RegionServer side of things to try and understand why a split wouldn't occur. Some code:

    {code}
    if (e.getRegion().compactIfNeeded()) {
      splitter.splitRequested(e);
    }
    {code}

    We only queue a region to be split if it has just been compacted. I assume the rationale here is that unless a compaction occurred, there'd be no reason to split in the first place. I'm not convinced that's true, however. A store will only compact if it has more mapfiles than the compaction threshold, which wasn't true for some of my regions - the individual mapfiles were 1.5 GiB each, but there were only two of them. As a result, compaction, and thus splitting, was skipped. Shouldn't we be testing whether the overall size of the mapfiles makes splitting necessary, rather than letting the compaction determine whether we do anything?

    Perhaps we should add an optional compaction. Instead of only testing HStore.needsCompaction, which just checks whether we are above the compaction threshold, maybe we should also have an isCompactable, which checks whether there is more than one mapfile. The optional compactions could happen behind the mandatory, threshold-based ones. Then, we could put an HStore on the compact queue whenever an event changes its number of mapfiles, with the constraint that if the store is already on the compact queue, we don't re-add it.

    If we did all of that, it would probably put us in the right state to keep the split thread doing exactly what it is doing right now, but splits would also happen during downtime.
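    For illustration, a hedged sketch of that flow, with made-up method names (none of this is the real HStore/HRegion API):

    {code}
    // Sketch of the proposal above: queue optional compactions when a store
    // has more than one mapfile, and drive the split decision off aggregate
    // store size rather than off whether a compaction just ran.
    // All names are illustrative, not the real HStore/HRegion API.
    import java.util.LinkedHashSet;
    import java.util.Set;

    class CompactSplitPolicySketch {
      static final long DESIRED_MAX_STORE_BYTES = 256L * 1024 * 1024;

      interface Store {
        int mapFileCount();
        long totalMapFileBytes();
        boolean overCompactionThreshold();  // "needsCompaction" in the text above
      }

      // Sets, so a store that is already queued is not re-added.
      private final Set<Store> compactQueue = new LinkedHashSet<Store>();
      private final Set<Store> splitQueue = new LinkedHashSet<Store>();

      void onMapFilesChanged(Store store) {
        if (store.overCompactionThreshold()) {
          compactQueue.add(store);          // mandatory, threshold-based compact
        } else if (store.mapFileCount() > 1) {
          compactQueue.add(store);          // optional compact ("isCompactable")
        }
        // The split decision is driven by size alone, independent of compaction.
        if (store.totalMapFileBytes() > DESIRED_MAX_STORE_BYTES) {
          splitQueue.add(store);
        }
      }
    }
    {code}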
  • stack (JIRA) at Jan 30, 2008 at 6:36 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split.patch

    One problem I've seen: a compaction runs and a split is queued (and needs to run because we exceed thresholds), but during the compaction we dump a bunch of flushes -- so many that we exceed the compaction threshold again, and another compaction happens to run before the split has had a chance to (HADOOP-2712 made it so this situation does not cascade -- the splitter will run after the second compaction -- but it's not enough).

    The attached patch puts the compactor and splitter threads back together again so that a new compaction will not run if a split is needed.
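
    For illustration, a rough sketch of that ordering under a single worker thread (assumed structure only; this is not the patch itself):

    {code}
    // Hedged sketch of the design above: one thread services requests, and it
    // never starts a compaction while a split is pending for the region.
    // Names and structure are illustrative, not the actual patch.
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    class CompactSplitWorkerSketch extends Thread {
      interface Region {
        boolean needsSplit();
        void split();
        boolean compactIfNeeded();  // returns true if a compaction ran
      }

      private final BlockingQueue<Region> requests =
          new LinkedBlockingQueue<Region>();

      void request(Region r) {
        requests.add(r);
      }

      @Override
      public void run() {
        try {
          while (!isInterrupted()) {
            Region r = requests.take();
            if (r.needsSplit()) {
              r.split();          // split wins; no new compaction runs first
            } else if (r.compactIfNeeded()) {
              requests.add(r);    // re-check: the compaction may make a split needed
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }
    {code}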

    Testing shows it needs more work. Putting it aside for the moment to see if HADOOP-2636 helps with this issue.
  • stack (JIRA) at Jan 30, 2008 at 7:21 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563869#action_12563869 ]

    stack commented on HADOOP-2731:
    -------------------------------

    HADOOP-2636 doesn't seem to help with this problem in particular.

    Running 8 clients doing PerformanceEvaluation, what I'm looking for is a steady number of files to compact on each run.

    Here are the first four compactions before the patch:
    {code}
    2008-01-30 07:03:06,803 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 530893190/info
    2008-01-30 07:04:25,345 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 530893190/info
    2008-01-30 07:06:35,573 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 530893190/info
    2008-01-30 07:11:14,999 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 9 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 560724365/info
    {code}

    A split ran in between the 3rd and 4th compaction.

    Here are the first four compactions after applying the patch:
    {code}
    2008-01-30 06:43:17,972 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 4 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 1984834473/info
    2008-01-30 06:44:54,734 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 1984834473/info
    2008-01-30 06:48:53,389 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 7 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 712183868/info
    2008-01-30 06:53:25,746 DEBUG org.apache.hadoop.hbase.HStore: started compaction of 9 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir for 712183868/info
    {code}
  • stack (JIRA) at Jan 31, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split-v8.patch

    Problem: we have no governor on flushes, so it's possible -- especially after all the performance improvements since 0.15.x -- to flush at a rate that overwhelms the rate at which we compact store files.

    The attached patch does not solve the overrun problem. It does its best to ameliorate it by making splits happen more promptly, and by making it so that, post-split, when a region is quiescent we'll split if store files exceed 256M -- even if there is only one of them.

    More detail on the patch:

    + HADOOP-2712 set a flag such that if a split was queued, we'd stop further compactions. It turns out that, more often than not, one last compaction could still start before the split ran. This patch puts the compacting and splitting thread back together; a check for whether a split is needed will always follow a compaction check (event-driven, this was not guaranteed when the threads were distinct). Also made it possible to split even though no compaction was run (removed the HADOOP-2712 addition -- it was too subtle).
    + Flushes could also get in the way of a split, so now flushes are blocked too when a split is queued.
    + On open, check if the region needs to be compacted (previously this check was only done after the first flush, which could be 20-30s out).
    + Made it so we split if > 256M, not if > 1.5 * 256M. Set the multiplier on flushes to 1 instead of 2 so we flush at 64M, not 64M plus some slop. This regularizes splits and flushes.
    + Made it so we'll split even if only one file is > 256M, and we'll compact even if there is only one file when it has references to the parent region.

    I tried Billy's suggestion of putting a cap on the number of mapfiles to compact in one go. We need to do more work, though, before we can make use of this technique, because regions that hold references are not splittable: I was compacting the N oldest files, then on second compactions would do the N oldest again, but the remainder could have references to parent regions, so the region couldn't be split. Meanwhile we'd accumulate flush files -- the region would never split and the count of flush files would overwhelm the compactor. We need to be smarter and, as Billy suggests, pick up the small files.
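
    For illustration, a hedged sketch of the split/compact criteria described above (made-up names; this is not the patch code):

    {code}
    // Sketch of the criteria above: split at > 256M of store data even when
    // only one file is over the limit, and compact a lone file when it still
    // holds references to the parent region. Illustrative names only.
    import java.util.List;

    class SplitCriteriaSketch {
      static final long MAX_STORE_BYTES = 256L * 1024 * 1024;  // no 1.5x slop

      static class StoreFile {
        long sizeBytes;
        boolean referencesParentRegion;  // left over from a previous split
      }

      static boolean shouldSplit(List<StoreFile> files) {
        long total = 0;
        for (StoreFile f : files) {
          total += f.sizeBytes;
        }
        return total > MAX_STORE_BYTES;  // true even with a single large file
      }

      static boolean shouldCompact(List<StoreFile> files, int compactionThreshold) {
        if (files.size() >= compactionThreshold) {
          return true;  // ordinary threshold-based compaction
        }
        // A lone file that references the parent region blocks splitting,
        // so compact it anyway to rewrite it without references.
        return files.size() == 1 && files.get(0).referencesParentRegion;
      }
    }
    {code}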
  • stack (JIRA) at Jan 31, 2008 at 6:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Summary: [hbase] Under load, regions become extremely large and eventually cause region servers to become unresponsive (was: [hbase] Under extreme load, regions become extremely large and eventually cause region servers to become unresponsive)

    Load doesn't have to be extreme.
  • stack (JIRA) at Jan 31, 2008 at 8:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split-v9.patch

    v9 adds one line to TestCompaction, setting the old config so it will pass.
  • stack (JIRA) at Jan 31, 2008 at 8:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Status: Patch Available (was: Open)

    Passes tests locally.

    I think this issue is a blocker, but I won't mark it so until Bryan Duxbury's testing says this patch is an improvement over the old behavior (and jimk said he'd review). Meantime, moving it to Hudson to make sure it's OK in case we end up committing.
  • Bryan Duxbury (JIRA) at Jan 31, 2008 at 9:53 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564518#action_12564518 ]

    Bryan Duxbury commented on HADOOP-2731:
    ---------------------------------------

    After about 45% of 1 million 10KB rows imported, the import started to slow down markedly. I did a little DFS digging to get a sense of the size of mapfiles:

    {code}
    [rapleaf@tf1 hadoop]$ bin/hadoop dfs -lsr / | grep test_table | grep "mapfiles/[^/]*/data" | grep -v compaction.dir | awk '{print $4}' | sort -n | awk '{print $1 / 1024 / 1024}'

    0.589743
    21.5422
    29.4829
    36.4409
    36.834
    54.6908
    56.6071
    60.0075
    61.7568
    64
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.3218
    65.3046
    68.1251
    68.9211
    71.2503
    73.2158
    73.9037
    77.5301
    82.1786
    83.0631
    83.1417
    88.94
    92.9497
    98.2762
    111.76
    112.399
    116.162
    119.337
    127.572
    128.496
    657.9
    760.569
    1261.14
    1564.22
    {code}

    (If you can't read awk: those are the sizes in megabytes of each mapfile in the DFS for my test table.)

    There are only 7 regions, and the biggest mapfile is almost 1.5 GiB. I will report again when the job has completed and the cluster has had a chance to cool down.
  • Bryan Duxbury (JIRA) at Jan 31, 2008 at 10:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564527#action_12564527 ]

    Bryan Duxbury commented on HADOOP-2731:
    ---------------------------------------

    I may have spoken too soon. After a bit of a slowdown around 45%, some splits burst through and the writing rate increased back to what I expected it would be.

    Right at the end of the job, I still have a bunch of big mapfiles:

    {code}
    [rapleaf@tf1 hadoop]$ bin/hadoop dfs -lsr / | grep test_table | grep "mapfiles/[^/]*/data" | grep -v compaction.dir | awk '{print $4}' | sort -n | awk '{print $1 / 1024 / 1024}'
    18.6529
    18.987
    20.3924
    25.5912
    30.4755
    32.5393
    57.0985
    60.0075
    60.2728
    61.7568
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2137
    64.2235
    64.2235
    64.2432
    64.2432
    69.8449
    75.3975
    76.5179
    77.766
    79.3581
    81.8543
    82.6503
    83.0631
    88.94
    90.8564
    92.5664
    97.2247
    101.814
    104.703
    105.116
    110.62
    113.814
    127.543
    128.427
    128.427
    128.427
    128.427
    128.516
    353.175
    367.907
    471.401
    474.664
    575.348
    657.9
    906.067
    921.349
    1578.89
    {code}

    25 minutes later, I've had a few more splits, getting me up to 23 regions overall, with only 40 mapfiles. Some of the files are still much larger than they should be.

    I definitely see this as an improvement. I don't think it gets us the whole way there yet, though.
  • Hadoop QA (JIRA) at Jan 31, 2008 at 10:37 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564529#action_12564529 ]

    Hadoop QA commented on HADOOP-2731:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374498/split-v9.patch
    against trunk revision 616796.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests -1. The patch failed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1717/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1717/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1717/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1717/console

    This message is automatically generated.
  • stack (JIRA) at Feb 1, 2008 at 4:29 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split-v10.patch
  • stack (JIRA) at Feb 1, 2008 at 4:31 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Status: In Progress (was: Patch Available)

    This version makes use of an old property that used to hold the split/compactor thread's wait interval, using 20 seconds instead of 15. It means we take on writes more slowly, but doing the math (the time flushes take, the intervals at which they run, how long a compaction generally takes, etc.), it should make things more robust.
  • stack (JIRA) at Feb 1, 2008 at 5:41 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack reassigned HADOOP-2731:
    -----------------------------

    Assignee: stack
  • stack (JIRA) at Feb 1, 2008 at 5:45 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split-v11.patch

    Patch that passes TTMR and TTI. Splits get cleaned up really fast now. Changed the multi-region maker so it will work even if the parent region has already been deleted by the time it goes looking for it.

    Also, chatting with Bryan: the numbers he pasted above were from a run that did not have split-v8.patch in place.
  • stack (JIRA) at Feb 1, 2008 at 5:45 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Status: Patch Available (was: In Progress)

    Trying Hudson again.
  • Hadoop QA (JIRA) at Feb 1, 2008 at 8:17 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564679#action_12564679 ]

    Hadoop QA commented on HADOOP-2731:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374528/split-v11.patch
    against trunk revision 616796.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests -1. The patch failed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1720/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1720/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1720/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1720/console

    This message is automatically generated.
  • Jim Kellerman (JIRA) at Feb 1, 2008 at 9:15 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564936#action_12564936 ]

    Jim Kellerman commented on HADOOP-2731:
    ---------------------------------------

    Reviewed patch. +1
  • stack (JIRA) at Feb 1, 2008 at 9:23 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Attachment: split-v12.patch

    I don't know why it failed in TTI. I tried it locally again and it's fine. I enabled mapred logging so we can see better why it failed.
  • stack (JIRA) at Feb 1, 2008 at 9:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Status: In Progress (was: Patch Available)

    The patch also incorporates Jim's review comments: removing the HConstant change and fixing a bad method name.
  • stack (JIRA) at Feb 1, 2008 at 9:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Status: Patch Available (was: In Progress)

    Retrying Hudson to get more info on why it failed.
  • Hadoop QA (JIRA) at Feb 1, 2008 at 11:02 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564969#action_12564969 ]

    Hadoop QA commented on HADOOP-2731:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374583/split-v12.patch
    against trunk revision 616796.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1723/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1723/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1723/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1723/console

    This message is automatically generated.
  • Bryan Duxbury (JIRA) at Feb 1, 2008 at 11:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564981#action_12564981 ]

    Bryan Duxbury commented on HADOOP-2731:
    ---------------------------------------

    I've applied the latest patch and tested again. My import job finished with a minimum of errors, and all my mapfiles are significantly smaller. On top of that, splits are happening much more frequently - 2-4x as often, I would say.

    There may still be other issues lurking around this area of functionality, but I would commit this patch.

    +1
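    (Editor's note: one quick way to sanity-check results like these from the command line is to look at the per-region mapfile sizes in HDFS. A rough sketch, assuming the HBase root directory on HDFS is /hbase (set by hbase.rootdir); the actual path and directory layout may differ in your deployment.)

        # Recursively list everything under the HBase root and eyeball mapfile sizes.
        hadoop fs -lsr /hbase

        # Summarize space used per directory under the HBase root; regions whose
        # mapfiles stay near the configured maximum suggest splits are keeping up.
        hadoop fs -dus /hbase/*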
  • stack (JIRA) at Feb 2, 2008 at 12:31 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Priority: Blocker (was: Major)

    Marking as a blocker because without this patch, folks' initial impression of HBase will be extremely negative: under even moderate load HBase struggles; if the load is sustained, HBase can become unresponsive; and if the loading completes, the cluster is left with a few big regions, often way in excess of the configured maximum size, that will never be broken down.
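    (Editor's note: the "configured maximum size" referred to here is the per-region store file size threshold that triggers a split. A sketch of how it would typically be set in hbase-site.xml; the property name (hbase.hregion.max.filesize) and the 256 MB value are assumptions based on the defaults of that era, so check hbase-default.xml for your release.)

        <!-- hbase-site.xml (illustrative; verify the property name against
             hbase-default.xml for your release) -->
        <property>
          <name>hbase.hregion.max.filesize</name>
          <!-- Maximum mapfile size in bytes before a region is split (256 MB). -->
          <value>268435456</value>
        </property>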
  • stack (JIRA) at Feb 2, 2008 at 12:45 am
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HADOOP-2731:
    --------------------------

    Resolution: Fixed
    Fix Version/s: 0.16.0
    Status: Resolved (was: Patch Available)

    Committed to TRUNK and backported to 0.16.0.
  • Hudson (JIRA) at Feb 2, 2008 at 12:40 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565068#action_12565068 ]

    Hudson commented on HADOOP-2731:
    --------------------------------

    Integrated in Hadoop-trunk #387 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/387/])
  • Billy Pearson (JIRA) at Feb 3, 2008 at 8:18 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565209#action_12565209 ]

    Billy Pearson commented on HADOOP-2731:
    ---------------------------------------

    A+ job on this patch, guys! This makes HBase much more stable than it was. Good job.
