Hello,

If you set a ttl and expire a column, I've read that this eventually turns
into a tombstone and will be cleaned out by the GC. Are expirations
considered a form of delete that still requires a node repair to be run in
gc_grace_period seconds? The operations guide says you have to run node
repair if you have deletes, so I'm trying to find out if we can upsert the
column with expirations using a ttl=1 to substitute for deletes. The node
repair operation is very intensive in our environment and causes a
significant performance degradation on the system.

Thanks

  • Sylvain Lebresne at Oct 12, 2011 at 7:18 am
    Unfortunately, expiring columns are no magic bullet. If you insert
    columns with ttl=1, you're roughly doing the same thing as deleting,
    so the exact same rule concerning repair applies.

    What can be said about repair and expiring columns (and that may or may
    not be helpful) is that if you have a column family in which every column
    you insert has a ttl > n (for some n, including n = infinity) and ttls are
    your only means of deletion for that CF (i.e., no deletes), then it would
    be enough to run repair on that column family only every gc_grace + n
    period of time (instead of every gc_grace period).
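As a rough illustration of the gc_grace + n rule above, here is a minimal sketch of the arithmetic. The function name and the example durations (a gc_grace of 10 days, a minimum ttl of 30 days) are assumptions chosen for illustration, not values from this thread.

```python
import math

SECONDS_PER_DAY = 86400


def repair_interval_seconds(gc_grace_seconds, min_ttl_seconds):
    """Longest allowed gap between repairs under the gc_grace + n rule.

    min_ttl_seconds is n: a lower bound on the ttl of every column written
    to the column family. Use math.inf if columns never expire.
    """
    if gc_grace_seconds < 0 or min_ttl_seconds < 0:
        raise ValueError("durations must be non-negative")
    return gc_grace_seconds + min_ttl_seconds


# Assumed example values, not taken from the thread.
gc_grace = 10 * SECONDS_PER_DAY
min_ttl = 30 * SECONDS_PER_DAY

interval = repair_interval_seconds(gc_grace, min_ttl)
print(interval / SECONDS_PER_DAY)  # 40.0 days between repairs

# With ttl=1 the bound collapses to roughly gc_grace, same as plain deletes.
print(repair_interval_seconds(gc_grace, 1) - gc_grace)  # 1
```

With n = infinity (no ttl and no deletes) the function returns infinity, which matches the case where nothing is ever deleted from the column family.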

    --
    Sylvain
    On Wed, Oct 12, 2011 at 3:49 AM, Terry Cumaranatunge wrote:
    Hello,

    If you set a ttl and expire a column, I've read that this eventually turns
    into a tombstone and will be cleaned out by the GC. Are expirations
    considered a form of delete that still requires a node repair to be run in
    gc_grace_period seconds? The operations guide says you have to run node
    repair if you have deletes, so I'm trying to find out if we can upsert the
    column with expirations using a ttl=1 to substitute deletes. The node repair
    operations is very intensive in our environment and causes a
    significant performance degradation on the system.

    Thanks
  • Jonathan Ellis at Oct 12, 2011 at 1:51 pm
    Well, the reason you'd want to run repair is to get the tombstone on
    nodes that missed the insert. And that would only be important if you
    sometimes generate inserts that would be otherwise shadowed by the
    tombstone, right?
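The scenario described above can be sketched as a toy model. This is not Cassandra code: the replicas are plain dictionaries, and the helper names (reconcile, read, purge_tombstones, repair, scenario) are hypothetical. It only shows why a replica that missed a tombstone can bring an old value back once the other replicas have purged that tombstone.

```python
TOMBSTONE = object()  # marker for a deleted or expired column


def reconcile(*cells):
    """Return the cell with the highest timestamp, ignoring missing ones."""
    present = [c for c in cells if c is not None]
    return max(present, key=lambda c: c[0]) if present else None


def read(replicas, key):
    """Merge the replicas' cells; a winning tombstone reads as no value."""
    winner = reconcile(*(r.get(key) for r in replicas))
    if winner is None or winner[1] is TOMBSTONE:
        return None
    return winner[1]


def purge_tombstones(replica):
    """Model compaction after gc_grace: tombstones are dropped for good."""
    for key in [k for k, (_, v) in replica.items() if v is TOMBSTONE]:
        del replica[key]


def repair(replicas, key):
    """Model repair: every replica ends up with the newest cell."""
    winner = reconcile(*(r.get(key) for r in replicas))
    if winner is not None:
        for r in replicas:
            r[key] = winner


def scenario(run_repair):
    a = {"col": (1, "old")}
    b = {"col": (1, "old")}
    a["col"] = (2, TOMBSTONE)  # replica b misses this write
    if run_repair:
        repair([a, b], "col")
    purge_tombstones(a)  # gc_grace has elapsed on both replicas
    purge_tombstones(b)
    return read([a, b], "col")


print(scenario(run_repair=True))   # None: the column stays deleted
print(scenario(run_repair=False))  # old: the shadowed value resurfaces
```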


    --
    Jonathan Ellis
    Project Chair, Apache Cassandra
    co-founder of DataStax, the source for professional Cassandra support
    http://www.datastax.com
  • Sylvain Lebresne at Oct 12, 2011 at 3:33 pm

    On Wed, Oct 12, 2011 at 3:51 PM, Jonathan Ellis wrote:
    Well, the reason you'd want to run repair is to get the tombstone on
    nodes that missed the insert.  And that would only be important if you
    sometimes generate inserts that would be otherwise shadowed by the
    tombstone, right?
    The initial question was about "can I use inserting with ttl=1 instead of
    issuing deletes", so that would be a case where you do shadow a previous
    version with a very small ttl and so repair is important.

    But you're right that if you only issue data with expiration (no deletes)
    and you
    * either do not overwrite columns
    * or are sure that when you do overwrite, the value you're overwriting has
    a ttl that is less than or equal to the ttl of the value you're
    overwriting with (+gc_grace to be precise)
    then yes, repair is not necessary because you can't have a shadowed value
    resurfacing. And that's the case Eric is talking about, btw.

    But again, using inserts with a tiny ttl in place of tombstones is not
    one of those situations, so repair is necessary as usual.
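The overwrite condition above can be written as a small check. This is a sketch under assumptions: the function name is hypothetical, ttls and gc_grace are in seconds, and a column written without a ttl is modelled as having an infinite ttl.

```python
import math

SECONDS_PER_DAY = 86400


def overwrite_needs_repair(old_ttl, new_ttl, gc_grace):
    """True if the overwritten value could outlive the new value's tombstone.

    Mirrors the condition in the post: repair can be skipped only when
    old_ttl <= new_ttl + gc_grace.
    """
    return old_ttl > new_ttl + gc_grace


gc_grace = 10 * SECONDS_PER_DAY  # assumed value for illustration

# Replacing a delete with ttl=1 on a column that had no ttl: repair needed.
print(overwrite_needs_repair(math.inf, 1, gc_grace))  # True

# Shrinking a 30-day ttl to 1 second: repair still needed.
print(overwrite_needs_repair(30 * SECONDS_PER_DAY, 1, gc_grace))  # True

# Rewriting with the same 7-day ttl: the old value cannot resurface.
print(overwrite_needs_repair(7 * SECONDS_PER_DAY, 7 * SECONDS_PER_DAY, gc_grace))  # False
```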

    --
    Sylvain

  • Eric Tamme at Oct 12, 2011 at 1:21 pm

    No - if you only use TTL to expire data, and no actual deletes or
    updates on the ttl, then you generally do not need to do a nodetool repair.

    I run two clusters that have rolling data sets relying on TTL that have
    been running for months without any issues and have never run nodetool
    repair.

    -Eric
