--------------------------------------------
Key: HADOOP-1221
URL: https://issues.apache.org/jira/browse/HADOOP-1221
Project: Hadoop
Issue Type: Bug
Components: dfs
Reporter: Koji Noguchi
We had a namenode stuck at 99% CPU, and it was responding slowly.
(dfs.namenode.handler.count was still set to 10.)
The ReplicationMonitor thread was using most of the CPU time.
jstack showed:
"org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor@1c7b0f4d" daemon prio=10 tid=0x0000002d90690800 nid=0x4855 runnable [0x0000000041941000..0x0000000041941b30]
   java.lang.Thread.State: RUNNABLE
        at java.util.AbstractList$Itr.remove(AbstractList.java:360)
        at org.apache.hadoop.dfs.FSNamesystem.blocksToInvalidate(FSNamesystem.java:2475)
        - locked <0x0000002a9f522038> (a org.apache.hadoop.dfs.FSNamesystem)
        at org.apache.hadoop.dfs.FSNamesystem.computeDatanodeWork(FSNamesystem.java:1775)
        at org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:1713)
        at java.lang.Thread.run(Thread.java:619)
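
The top frame, AbstractList$Itr.remove, suggests that blocks are being removed one at a time through an iterator. If the underlying list is array-backed (an assumption; the trace does not show the list type), each such removal shifts every later element, so draining k of n entries costs on the order of k*n element moves while the FSNamesystem lock is held. The sketch below is not the FSNamesystem code. The class and method names are hypothetical and only contrast the per-element pattern with a bulk removal of the same range.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names come from Hadoop.
final class IteratorRemoveSketch {

    // Takes up to `limit` entries from the front, one at a time, via the iterator.
    // On an ArrayList every it.remove() shifts all remaining elements left,
    // so each call is O(n) in the current list size.
    public static List<Long> drainOneByOne(List<Long> pending, int limit) {
        List<Long> taken = new ArrayList<Long>();
        Iterator<Long> it = pending.iterator();
        while (it.hasNext() && taken.size() < limit) {
            taken.add(it.next());
            it.remove();
        }
        return taken;
    }

    // Takes the same entries, but removes the whole front range in one call.
    // Clearing a subList view of an ArrayList shifts the tail only once.
    public static List<Long> drainInBulk(List<Long> pending, int limit) {
        int n = Math.min(limit, pending.size());
        List<Long> head = pending.subList(0, n);
        List<Long> taken = new ArrayList<Long>(head); // copy before clearing the view
        head.clear();
        return taken;
    }

    // Builds a list of fake block ids 0..count-1 for the demonstration.
    public static List<Long> makeBlocks(int count) {
        List<Long> blocks = new ArrayList<Long>(count);
        for (long id = 0; id < count; id++) {
            blocks.add(id);
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<Long> a = makeBlocks(200000);
        List<Long> b = makeBlocks(200000);

        long t0 = System.nanoTime();
        List<Long> takenA = drainOneByOne(a, 100000);
        long t1 = System.nanoTime();
        List<Long> takenB = drainInBulk(b, 100000);
        long t2 = System.nanoTime();

        // Both approaches remove the same entries; only the cost differs.
        System.out.println("same result: " + (takenA.equals(takenB) && a.equals(b)));
        System.out.println("one-by-one ms: " + (t1 - t0) / 1000000);
        System.out.println("bulk ms:       " + (t2 - t1) / 1000000);
    }
}
```

Whether a bulk removal, a linked structure, or a smaller per-iteration limit is the right fix depends on how blocksToInvalidate selects blocks, which the trace alone does not show.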