Grokbase Groups HBase dev August 2008
FAQ
During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
----------------------------------------------------------------------------------------------------------------------

Key: HBASE-790
URL: https://issues.apache.org/jira/browse/HBASE-790
Project: Hadoop HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.2.0
Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
Reporter: Jonathan Gray
Fix For: 0.2.0


During a batch import, I have two processes importing into a single region.

The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.

Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:

2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v


As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.

All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.

The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Jonathan Gray (JIRA) at Aug 1, 2008 at 12:41 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-790:
    --------------------------------

    Attachment: regionserver-lockup.log

    Full regionserver log
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 1, 2008 at 2:29 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618918#action_12618918 ]

    stack commented on HBASE-790:
    -----------------------------

    Rong-en has been running a version hbase from around the same vintage as jgray's above and also sees the 20 minute blocks.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 1, 2008 at 8:09 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-790:
    ------------------------

    Attachment: 790.patch

    If we're going to block, make sure a flush has been requested. Also adding logging of flush request. The log has us just sitting around doing nothing; don't know if a flush request has come in.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 1, 2008 at 9:04 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619132#action_12619132 ]

    stack commented on HBASE-790:
    -----------------------------

    Committed the 790 above to see if it helps with this issue (Trying to reproduce). Also adds logging.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 2, 2008 at 12:11 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-790:
    ------------------------

    Attachment: 790-v2.patch

    Patch for jon gray to test.

    Usually block/unblock runs fine but then it hangs. Logging added by earlier patch shows no flush has been queued. Our mechanism for queuing flushes and subsequent clear of flush flag is faulty. This patch is a bit of a hack that does check before we go into HRegion synchronized method.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Andrew Purtell at Aug 2, 2008 at 12:30 am
    Since the commit for HBASE-751 went in I have not been able to reproduce the related problems we saw on our clusters.

    Great job Stack and Jim!

    - Andy
  • Rong-En Fan (JIRA) at Aug 2, 2008 at 7:34 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619200#action_12619200 ]

    Rong-En Fan commented on HBASE-790:
    -----------------------------------

    In my env, I have multiplier of 2 (HBASE-779) and see this blocking.

    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Rong-En Fan (JIRA) at Aug 2, 2008 at 8:04 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619201#action_12619201 ]

    Rong-En Fan commented on HBASE-790:
    -----------------------------------

    hmm.. look at the original description again. Not sure if I'm seeing the same issue as I did not see "Discarding update in my region server logs".

    What I saw is that during the bulk (MR or not), something the client just sits idle, and all region servers are also idle. thread dump shows client is waiting in the locationRegionInMeta method. After 10~20 mins later, the client continues without any exceptions. For MR load, I have to increase the task.timeout from 10m to 30m to prevent task tracker to kill the tasks...
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 2, 2008 at 5:46 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619263#action_12619263 ]

    stack commented on HBASE-790:
    -----------------------------

    Just tried reproducing the 'hanging blocking' on cluster here. Tried various loading combinations -- many column families, loading all of them concurrently and then just some -- but to no avail.

    Jon: Want to retry with HBASE-779 enabled? With the edit to your hbase-default.xml? In other words, raw TRUNK. If it still hangs, try with the above v2 patch? It adds extra signalling flush is needed just before we go into sync block.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Clint Morgan (JIRA) at Aug 3, 2008 at 4:16 am
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619298#action_12619298 ]

    Clint Morgan commented on HBASE-790:
    ------------------------------------

    +1 I had an upload task that was timing out on trunk, and this latest patch fixed it.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jim Kellerman (JIRA) at Aug 4, 2008 at 8:07 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jim Kellerman updated HBASE-790:
    --------------------------------

    Priority: Blocker (was: Major)

    HBASE-790 is the only blocker for hbase-0.2.0 at this point
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jonathan Gray (JIRA) at Aug 4, 2008 at 8:11 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619673#action_12619673 ]

    Jonathan Gray commented on HBASE-790:
    -------------------------------------

    -1 on trunk. Got same behavior with latest trunk, applying 790 patch v2 and testing again right now
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Andrew Purtell (JIRA) at Aug 4, 2008 at 8:25 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Andrew Purtell reassigned HBASE-790:
    ------------------------------------

    Assignee: Andrew Purtell
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 4, 2008 at 10:15 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619724#action_12619724 ]

    stack commented on HBASE-790:
    -----------------------------

    Looks like patch don't work.

    Looking in Jon's logs with the patch applied I see this when we hang up on items,06fcb4d2-b751-4584-8278-3f65f923e1f0,1217884873435:

    {code}
    2008-08-04 14:22:31,828 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,06fcb4d2-b751-4584-8278-3f65f923e1f0,1217884873435. Current region memcache size 64.0m
    2008-08-04 14:22:31,828 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on items,06fcb4d2-b751-4584-8278-3f65f923e1f0,1217884873435
    2008-08-04 14:22:34,447 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region items,06fcb4d2-b751-4584-8278-3f65f923e1f0,1217884873435 in 2619ms, sequence id=40529298, compaction requested=false
    {code}

    Last week we added the logging of when flushes are registered.

    Above see how we have 'Flush requested on...' in the midst of a flush start/finish.

    My guess is that some sequence of registering flushes and clearing flush flags is making it so we go into a block without a flush being registered.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jonathan Gray (JIRA) at Aug 4, 2008 at 11:27 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619740#action_12619740 ]

    Jonathan Gray commented on HBASE-790:
    -------------------------------------

    -1 on patch v2... same behavior as before
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Clint Morgan (JIRA) at Aug 4, 2008 at 11:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619745#action_12619745 ]

    Clint Morgan commented on HBASE-790:
    ------------------------------------

    Disregard my previous vote. My problems were related to another issue. Sorry for the noise.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jonathan Gray (JIRA) at Aug 5, 2008 at 5:45 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-790:
    --------------------------------

    Status: Patch Available (was: Open)

    After spending quite a bit of time running different load tests on v2 patch, it appears the problem did still exist.

    Stack pointed me towards looking into the possibility of distributed state. I dug in and it seemed that state was being spread across two different locations.

    To fix this, I moved the flushRequested boolean into the WriteState class. This allowed me to make it so its state can only be changed when WriteState was synchronized on.

    Upon further testing it seems that this does work and I'm no longer seeing this issue.

    I am +1 on 790 with this patch, and +1 on rolling 0.2.0 RC2
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jonathan Gray (JIRA) at Aug 5, 2008 at 5:45 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jonathan Gray updated HBASE-790:
    --------------------------------

    Attachment: 790-v3.patch
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 5, 2008 at 5:57 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619970#action_12619970 ]

    stack commented on HBASE-790:
    -----------------------------

    Patch looks good. Does it really fix the issue? Why'd you remove '- r.setLastFlushTime(now);' in Flusher?

    Let me try this patch up on our cluster.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jim Kellerman (JIRA) at Aug 5, 2008 at 5:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619971#action_12619971 ]

    Jim Kellerman commented on HBASE-790:
    -------------------------------------

    Reviewed patch. +1

    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Andrew Purtell (JIRA) at Aug 5, 2008 at 5:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619972#action_12619972 ]

    Andrew Purtell commented on HBASE-790:
    --------------------------------------

    I'm also testing this right now.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jim Kellerman (JIRA) at Aug 5, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619974#action_12619974 ]

    Jim Kellerman commented on HBASE-790:
    -------------------------------------

    Wait. Stack's right. Without setting the last flush time, the optional cache flush won't work.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jim Kellerman (JIRA) at Aug 5, 2008 at 6:11 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619978#action_12619978 ]

    Jim Kellerman commented on HBASE-790:
    -------------------------------------

    Ah, ok. lastFlushTime is set in internalFlushCache so is not needed in Flusher. +1
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Jonathan Gray (JIRA) at Aug 5, 2008 at 6:13 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619979#action_12619979 ]

    Jonathan Gray commented on HBASE-790:
    -------------------------------------

    The patch does appear to fix the issue. I have run an identical load test probably 15 times in the past week and last night it ran past 400 total regions without any issues. Previously it had been breaking at either 20 or 80 total regions.

    I will continue testing and also run a full test on production once we get the RC.

    I removed the setLastFlushTime in Flusher because it didn't make sense to be doing it twice.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Andrew Purtell (JIRA) at Aug 5, 2008 at 6:33 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619993#action_12619993 ]

    Andrew Purtell commented on HBASE-790:
    --------------------------------------

    +1
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 5, 2008 at 7:05 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620005#action_12620005 ]

    stack commented on HBASE-790:
    -----------------------------

    I ran a little test on cluster. Ran slightly faster with patch applied. Going to commit it.

    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • stack (JIRA) at Aug 5, 2008 at 7:07 pm
    [ https://issues.apache.org/jira/browse/HBASE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-790:
    ------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    Committed. Thanks for patch Jon.
    During import, single region blocks requests for >10 minutes, thread dumps, throws out pending requests, and continues
    ----------------------------------------------------------------------------------------------------------------------

    Key: HBASE-790
    URL: https://issues.apache.org/jira/browse/HBASE-790
    Project: Hadoop HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.2.0
    Environment: 11 node cluster. 1 master w/ namenodes and hmaster. 10 slaves w/ datanodes and regionservers. All are 2GHz quad core xeons, 4gb ram, raid 0.
    Reporter: Jonathan Gray
    Assignee: Andrew Purtell
    Priority: Blocker
    Fix For: 0.2.0

    Attachments: 790-v2.patch, 790-v3.patch, 790.patch, regionserver-lockup.log


    During a batch import, I have two processes importing into a single region.
    The behavior I saw was a regionserver with 2 regions of the table in question on it. The first region split, and the new regions were reassigned to another regionserver.
    Following that, inserting into the region that was left over began to block client requests. I am attaching the regionserver log; below is the specific problem area:
    2008-07-31 15:38:24,190 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in table locations for row <> and tableName .META.: location server 72.34.249.217:60020, location region name .META.,,1
    2008-07-31 15:38:24,194 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,12175434512
    2008-07-31 15:38:34,052 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 7 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:00,270 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@17b4239f, row => 02c241b4-9d32-452d-8dab-247f4af693eb, {column => content:title, value => '...', column => content:content, va
    org.apache.hadoop.hbase.NotServingRegionException: items,01beddd6-813b-4f2b-ac48-a0cef395cb7e,1217543451296
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1436)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1147)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
    2008-07-31 15:39:09,547 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:39:44,079 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 9 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:40:19,574 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296: Memcache size 64.0m is >= than blocking
    2008-07-31 15:49:09,130 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 1
    2008-07-31 15:49:09,144 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217543884691
    2008-07-31 15:49:09,146 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.212_1217535541159_60020/hlog.dat.1217544549145
    2008-07-31 16:03:09,060 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296. Current region memcache size 64.0m
    2008-07-31 16:03:09,467 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region items,8001eb31-98bb-4087-bd8d-e4b42805addb,1217543451296 'IPC Server handler 5 on 60020'
    2008-07-31 16:03:09,478 INFO org.apache.hadoop.ipc.Server: Process Thread Dump: Discarding call batchUpdate([B@4e727e0e, row => c08408b4-b68c-433c-ba3f-d46d3ba73288, {column => content:title, value => '...', column => content:content, v
    As you can see there was a 14 minute delay between updates being blocked, and the unblocking occurring.
    All the pending batchUpdates were thrown out (too old) and then importing proceeded normally.
    The same behavior repeated itself later on a different regionserver, and again after a while it unfroze, kicked out pending updates, and continued.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupdev @
categorieshbase, hadoop
postedAug 1, '08 at 12:38a
activeAug 5, '08 at 7:07p
posts28
users2
websitehbase.apache.org

2 users in discussion

stack (JIRA): 27 posts Andrew Purtell: 1 post

People

Translate

site design / logo © 2022 Grokbase