Hi guys,

I am using Redis 2.6.5 on Debian 6.0 "Squeeze" via the dotdeb.org repo. We
have a server with 2 redis instances running on it and a failover that is a
slave to each of those instances (ports 6379 and 6380). Since upgrading
from 2.4.15 to 2.6.5 only one of the slaves will sync with the master, 6380
works and 6379 does not. The problematic instance has several GB of data
while the working instance only has several hundred MB.

When I issue the slaveof command on the slave, it connects to the master and
reports that the link is down, as expected. The master shows the slave and
starts a bgsave, also as expected, but before it gets to send_bulk (which
takes about 90s) the master somehow loses the slave. The slave still shows
that it has a master but that the link is down, while the master shows zero
slaves. Around the time the bgsave completes, the master shows the slave again
but begins a new bgsave, and that process loops forever.

Seems like a genuine bug to me but perhaps something is misconfigured. Any
ideas or thoughts?

Cheers,
Sami

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To view this discussion on the web visit https://groups.google.com/d/msg/redis-db/-/PiriBqMa8UUJ.
To post to this group, send email to redis-db@googlegroups.com.
To unsubscribe from this group, send email to redis-db+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.


  • Salvatore Sanfilippo at Dec 4, 2012 at 10:12 am

    On Mon, Dec 3, 2012 at 11:17 PM, Sami Samhuri wrote:
    > Seems like a genuine bug to me but perhaps something is misconfigured. Any
    > ideas or thoughts?

    Hello Sami, could you please provide a few lines of the log from both the
    slave and the master around the time the SYNC process fails? Thank you.

    Salvatore

    --
    Salvatore 'antirez' Sanfilippo
    open source developer - VMware
    http://invece.org

    Beauty is more important in computing than anywhere else in technology
    because software is so complicated. Beauty is the ultimate defence
    against complexity.
    — David Gelernter

  • Sami Samhuri at Dec 4, 2012 at 6:51 pm
    Hi Salvatore,

    Of course, sorry for not thinking to include the logs. Here is the log from
    the master instance:

    [11414] 04 Dec 10:43:47.437 * Slave ask for synchronization
    [11414] 04 Dec 10:43:47.437 * Starting BGSAVE for SYNC
    [11414] 04 Dec 10:43:49.426 * Background saving started by pid 17133
    [11414] 04 Dec 10:45:17.016 # Client addr=10.17.124.52:53506 fd=380 age=90 idle=90 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16349 oll=5580 omem=135276512 events=r cmd=sync scheduled to be closed ASAP for overcoming of output buffer limits.
    [17133] 04 Dec 10:45:59.710 * DB saved on disk
    [17133] 04 Dec 10:46:00.211 * RDB: 203 MB of memory used by copy-on-write
    [11414] 04 Dec 10:46:01.152 * Background saving terminated with success
    [11414] 04 Dec 10:46:01.916 * Slave ask for synchronization
    [11414] 04 Dec 10:46:01.917 * Starting BGSAVE for SYNC
    [11414] 04 Dec 10:46:03.917 * Background saving started by pid 17453
    [17453] 04 Dec 10:47:37.305 * DB saved on disk
    [17453] 04 Dec 10:47:37.749 * RDB: 222 MB of memory used by copy-on-write
    [11414] 04 Dec 10:47:38.632 * Background saving terminated with success

    and the log from the slave instance:

    [18688] 04 Dec 10:43:46.215 * SLAVE OF redis1:6379 enabled (user request)
    [18688] 04 Dec 10:43:47.371 * Connecting to MASTER...
    [18688] 04 Dec 10:43:47.372 * MASTER <-> SLAVE sync started
    [18688] 04 Dec 10:43:47.430 * Non blocking connect for SYNC fired the event.
    [18688] 04 Dec 10:43:47.433 * Master replied to PING, replication can continue...
    [18688] 04 Dec 10:46:01.150 # I/O error reading bulk count from MASTER: Resource temporarily unavailable
    [18688] 04 Dec 10:46:01.891 * Connecting to MASTER...
    [18688] 04 Dec 10:46:01.892 * MASTER <-> SLAVE sync started
    [18688] 04 Dec 10:46:01.914 * Non blocking connect for SYNC fired the event.
    [18688] 04 Dec 10:46:01.915 * Master replied to PING, replication can continue...

    And there is the error, plain as day. It does not appear to be a bug. So I
    guess my question now is which output buffer that refers to and how can I
    tune it?

    Regards,
    Sami
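
As a rough way to see which limit the master's log line refers to, the client description can be parsed and its `omem` value (the output buffer size in bytes) compared against the slave limits. The sketch below assumes the master is running with the stock redis.conf setting `client-output-buffer-limit slave 256mb 64mb 60`; that value does not appear in the thread, so treat it as an assumption. The helper names are hypothetical.

```python
import re

# The client description from the master's log above, joined onto one line.
LOG_LINE = (
    "Client addr=10.17.124.52:53506 fd=380 age=90 idle=90 flags=S db=0 "
    "sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16349 oll=5580 "
    "omem=135276512 events=r cmd=sync"
)

# Assumed limits (not stated in the thread): the stock redis.conf line
# "client-output-buffer-limit slave 256mb 64mb 60".
HARD_LIMIT = 256 * 1024 * 1024
SOFT_LIMIT = 64 * 1024 * 1024
SOFT_SECONDS = 60


def parse_client_fields(line):
    """Return the key=value pairs of a client description as a dict."""
    return dict(re.findall(r"(\S+?)=(\S+)", line))


def exceeded_limits(fields):
    """Report whether the output buffer size (omem) is over each limit."""
    omem = int(fields["omem"])
    return {"hard": omem >= HARD_LIMIT, "soft": omem >= SOFT_LIMIT}


fields = parse_client_fields(LOG_LINE)
print("flags:", fields["flags"], "cmd:", fields["cmd"])
print("omem in MB:", round(int(fields["omem"]) / 2**20, 1))
print(exceeded_limits(fields))
```

Under that assumption the buffer of about 129 MB is below the hard limit but above the soft one, which would fit a slave connection (`flags=S`) being closed after staying over the soft limit for `SOFT_SECONDS` while the BGSAVE was still running.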
  • Sami Samhuri at Dec 4, 2012 at 6:54 pm
    Ah, I found a thread that already covers this
    issue: https://groups.google.com/forum/#!msg/redis-db/ix_pWabAvxY/IE0hzEVHMSAJ

    Looks like it's easy to configure properly. Sorry for the noise, I should
    have just read the log and googled the error in the first place.

    Cheers,
    Sami
  • Salvatore Sanfilippo at Dec 5, 2012 at 12:00 pm
    Hello Sami,

    Redis 2.6 has configurable limits for the amount of memory client
    buffers can use.
    In your case the replication process is a bit slow, so the default
    client buffer limit is not enough. You can easily fix the issue by
    editing redis.conf (or simply via the CONFIG SET command):

    client-output-buffer-limit slave 0 0 0

    This will completely disable the limits. Alternatively, you can
    just use a bigger limit until the replication succeeds.

    You can find full documentation in the self-commented redis.conf file
    in the Redis distribution.

    Cheers,
    Salvatore
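
For the "bigger limit" alternative, the directive takes a client class followed by a hard limit, a soft limit, and the number of seconds the soft limit may be exceeded. A minimal sketch of building both forms of the setting follows; the 512 MB / 256 MB / 120 s figures are purely illustrative assumptions, not values recommended in the thread, and `slave_buffer_limit` is a hypothetical helper. The setting belongs on the master, since that is where the slave's output buffer lives.

```python
MB = 1024 * 1024


def slave_buffer_limit(hard_bytes, soft_bytes, soft_seconds):
    """Build the value string for client-output-buffer-limit (slave class)."""
    if min(hard_bytes, soft_bytes, soft_seconds) < 0:
        raise ValueError("limits must be non-negative")
    return f"slave {hard_bytes} {soft_bytes} {soft_seconds}"


# "0 0 0" disables the limits entirely, as in the reply above.
unlimited = slave_buffer_limit(0, 0, 0)

# Illustrative larger limits (assumed values, tune to the dataset size).
larger = slave_buffer_limit(512 * MB, 256 * MB, 120)

# Line for redis.conf on the master:
print(f"client-output-buffer-limit {larger}")
# Equivalent runtime change from redis-cli on the master:
print(f'CONFIG SET client-output-buffer-limit "{larger}"')
```

A change made with CONFIG SET applies only to the running instance, so the same line would also need to go into redis.conf to survive a restart.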

Discussion Overview
group: redis-db
categories: redis
posted: Dec 4, '12 at 9:45a
active: Dec 5, '12 at 12:00p
posts: 5
users: 2
website: redis.io
irc: #redis