Can you show the output of "nodetool tpstats" on one of the affected nodes?
That will give some indication of where the trouble might be.
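
As a rough illustration of what to look for in that output, the sketch below picks out thread pools with pending or blocked tasks. It assumes the usual tpstats layout of a pool name followed by five numeric columns (Active, Pending, Completed, Blocked, All time blocked). The function name busy_pools and the sample figures are made up for the example and are not taken from the affected cluster.

```python
def busy_pools(tpstats_text):
    """Return {pool: (pending, blocked)} for pools with pending or blocked tasks."""
    busy = {}
    for line in tpstats_text.splitlines():
        fields = line.split()
        # Pool rows are assumed to be a name plus five integer columns:
        # Active, Pending, Completed, Blocked, All time blocked.
        # The header and the dropped-message section do not match and are skipped.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[1:]):
            continue
        name, _active, pending, _completed, blocked, _all_time = fields
        if int(pending) > 0 or int(blocked) > 0:
            busy[name] = (int(pending), int(blocked))
    return busy


# Hypothetical sample, shaped like tpstats output; the numbers are invented.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0         104232         0                 0
MutationStage                    32      4187        9912345         0                 0
FlushWriter                       1         5           2311         1               212
"""

if __name__ == "__main__":
    for pool, (pending, blocked) in busy_pools(SAMPLE).items():
        print(pool, "pending:", pending, "blocked:", blocked)
```

A large pending count on one stage for the unhealthy nodes, compared with a healthy node, would narrow down where the work is piling up.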

Patrick
On Tue, Apr 19, 2016 at 6:54 AM, sai krishnam raju potturi wrote:

Hi,
Do you see any hung processes, such as repairs, on those 3 nodes? What does
"nodetool netstats" show?

Thanks
Sai
On Tue, Apr 19, 2016 at 8:24 AM, Erik Forsberg wrote:

Hi!

I have this problem where 3 of my 84 nodes misbehave with too long GC
times, leading to them being marked as DN.

This happens when I load data to them using CQL from a hadoop job, so
quite a lot of inserts at a time. The CQL loading job is using
TokenAwarePolicy with fallback to DCAwareRoundRobinPolicy. Cassandra Java
driver version 2.1.7.1 is in use.

My other observation is that around the time the GC starts to work like
crazy, there is a lot of outbound network traffic from the troublesome
nodes. If a healthy node has around 25 Mbit/s in, 25 Mbit/s out, an
unhealthy one sees 25 Mbit/s in, 200 Mbit/s out.

So, something is iffy with these 3 nodes, but I have some trouble finding
out exactly what makes them differ.

This is Cassandra 2.0.13 (yes, old) using vnodes. Keyspace is using
NetworkTopologyStrategy with replication 2, in one datacenter.

One thing I know I'm doing wrong is that I have a slightly differing number
of hosts in each of my 6 chassis (one of them has 15 nodes, one has 13, and
the remaining have 14). Could what I'm seeing here be the effect of
that?

Other ideas on what could be wrong? Some kind of vnode imbalance? How can
I diagnose that? What metrics should I be looking at?
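
One possible way to check for such an imbalance is to compare each node's effective ownership, as reported in the "Owns" column of "nodetool status" for the keyspace, against the cluster average. The sketch below assumes those percentages have already been collected into a dictionary. The function name ownership_outliers, the 25% tolerance, and the sample addresses and values are all hypothetical.

```python
from statistics import mean


def ownership_outliers(owns_by_node, tolerance=0.25):
    """Return nodes whose ownership deviates from the mean by more than `tolerance` (relative)."""
    avg = mean(owns_by_node.values())
    return {
        node: pct
        for node, pct in owns_by_node.items()
        if abs(pct - avg) > tolerance * avg
    }


# Hypothetical ownership percentages keyed by node address.
SAMPLE_OWNS = {
    "10.0.0.1": 2.4,
    "10.0.0.2": 2.3,
    "10.0.0.3": 3.6,
    "10.0.0.4": 2.5,
}

if __name__ == "__main__":
    for node, pct in ownership_outliers(SAMPLE_OWNS).items():
        print(node, "owns", pct, "percent")
```

If the three troublesome nodes show up as outliers while the rest do not, that would point toward token or rack imbalance rather than a hardware or configuration difference.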

Thanks,
\EF

Discussion Overview
group: user @
categories: cassandra
posted: Apr 19, '16 at 12:24p
active: Apr 21, '16 at 12:21p
posts: 4
users: 3
website: cassandra.apache.org
irc: #cassandra
