Hello,
I just wanted to make sure that I'm interpreting a series of common issues correctly.
I saw ZK expirations causing regionserver failures, and this in a GC log of one of the regionservers:
16237.033: [GC[YG occupancy: 22353 K (38336 K)]16245.298: [Rescan (parallel) , 0.0264040 secs]16245.325: [weak refs processing, 0.0000970 secs] [1 CMS-remark: 1465176K(3282456K)] 1487530K(3320792K), 0.0266760 secs] [Times: user=0.02 sys=0.01, real=8.29 secs]
5328.127: [GC[YG occupancy: 27822 K (38336 K)]5334.773: [Rescan (parallel) , 0.0156270 secs]5334.788: [weak refs processing, 0.0003130 secs] [1 CMS-remark: 1144288K(2375464K)] 1172111K(2413800K), 0.0161190 secs] [Times: user=0.02 sys=0.00, real=6.66 secs]
I noted the rather large delta between the user/sys times & the real times here:
[Times: user=0.02 sys=0.00, real=6.66 secs]
[Times: user=0.02 sys=0.01, real=8.29 secs]
So I'm assuming this is the second of the two common causes of GC issues?
That is, CPU or I/O bound M/R tasks are starving the GC of CPU time?
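A minimal sketch of the same check done mechanically, assuming the GC log uses the `[Times: user=... sys=..., real=... secs]` format shown above. The function name `stalled_pauses` and the thresholds `min_real` and `ratio` are hypothetical choices for illustration, not anything provided by the JVM or HBase. It flags pauses where wall-clock time is far larger than the CPU time the collector actually used:

```python
import re

# Matches the timing suffix printed at the end of each GC log entry.
TIMES_RE = re.compile(
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), real=(?P<real>[\d.]+) secs\]"
)

def stalled_pauses(lines, min_real=1.0, ratio=10.0):
    """Return (user, sys, real) for entries where real time dwarfs CPU time.

    min_real and ratio are arbitrary illustrative thresholds (assumptions):
    a pause is flagged when real >= min_real seconds and
    real > ratio * (user + sys).
    """
    hits = []
    for line in lines:
        m = TIMES_RE.search(line)
        if not m:
            continue
        user = float(m.group("user"))
        sys_ = float(m.group("sys"))
        real = float(m.group("real"))
        if real >= min_real and real > ratio * (user + sys_):
            hits.append((user, sys_, real))
    return hits

sample = [
    "[Times: user=0.02 sys=0.00, real=6.66 secs]",
    "[Times: user=0.02 sys=0.01, real=8.29 secs]",
    # Hypothetical healthy entry, added only for contrast (not from my logs).
    "[Times: user=0.30 sys=0.02, real=0.09 secs]",
]

for user, sys_, real in stalled_pauses(sample):
    print(f"real={real}s but only {user + sys_:.2f}s of CPU")
```

A flagged entry only says the GC threads spent most of the pause not running on a CPU; it does not by itself say what was keeping them off it.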
Just wanted to check that I was stringing the logic (and logs) together correctly.
Thanks!
Take care,
-stu