Hi,

I'm running a nodetool rebuild to add a new DC to my cluster.
My config is:
DC1, 2 nodes per rack (2 racks), 70 GB per node
DC2, 2 nodes per rack (1 rack), 90 GB per node
DC3, 2 nodes per rack (1 rack) (*THIS IS THE NEW DC*)

What I did was bring up the 2 nodes in DC3 with bootstrap disabled
(auto_bootstrap: false), and then run a rebuild using DC2 as the source.

However, when I started, the load on both new nodes rapidly increased to
1.4 GB, according to nodetool status. It then grew slowly for 4 hours,
about 10 MB at a time. Then, suddenly, one node had 49.5 GB, and the other
followed soon after.
In the instance logs, the only stream messages I have are from when I
started the rebuild.

My question is: is it normal for Cassandra to accumulate this much data and
then send it all at once? I was expecting a more gradual, incremental
process.

thanks,

Felipe Esteves

Technology

felipe.esteves@b2wdigital.com

Tel.: (21) 3504-7162 ext. 57162


  • Jeff Jirsa at Feb 26, 2016 at 8:39 pm
    Cassandra is streaming it at a near-constant rate (if you had metrics for the network interface, you’d probably see that), but it doesn’t register in nodetool status until it completes all of the sstables for a column family. At that point, the -tmp-Data.db files get renamed to drop the -tmp, and they become live on the node.

    I suspect you have a table/CF that’s approximately 47-48 GB, and when it completed, its size in nodetool status jumped by that amount.
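
One rough way to watch the streamed data before it shows up in nodetool status is to add up the in-progress and live Data.db files on disk. The sketch below is only an illustration of the behaviour described above, not a Cassandra tool. It assumes in-progress files carry a "tmp" component in their hyphen-separated file name (for example ks-cf-tmp-ka-2-Data.db, as in the 2.x file layout), which may not hold for other versions, and the function name sstable_bytes is made up for this example.

```python
import os
from collections import defaultdict


def sstable_bytes(data_dir):
    """Sum Data.db sizes per directory, split into live and in-progress (tmp) bytes.

    Assumption: in-progress files have "tmp" as one hyphen-separated part of
    their name. A keyspace literally named "tmp" would be misclassified.
    """
    totals = defaultdict(lambda: {"live": 0, "tmp": 0})
    for root, dirs, files in os.walk(data_dir):
        # Skip snapshot and incremental-backup copies so they are not counted.
        dirs[:] = [d for d in dirs if d not in ("snapshots", "backups")]
        for name in files:
            if not name.endswith("-Data.db"):
                continue
            kind = "tmp" if "tmp" in name.split("-") else "live"
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file was renamed or removed between listing and stat.
                continue
            totals[root][kind] += size
    return dict(totals)
```

Calling sstable_bytes("/var/lib/cassandra/data") (the usual package-install data path, adjust for your setup) every few minutes during a rebuild should show the "tmp" total for the large table growing steadily while "live" stays flat, until the files are renamed. nodetool netstats on the rebuilding node also reports streaming progress while the rebuild runs.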



    From: Felipe Esteves
    Reply-To: "user@cassandra.apache.org"
    Date: Friday, February 26, 2016 at 11:48 AM
    To: "user@cassandra.apache.org"
    Subject: Nodetool Rebuild sending few big packets of data. Is it normal?

  • Felipe Esteves at Feb 26, 2016 at 9:28 pm
    Hi Jeff,

    Thanks for the info, you're right!

    Felipe Esteves

    Technology

    felipe.esteves@b2wdigital.com

    Tel.: (21) 3504-7162 ext. 57162

