The buffer is useful when you need to hold on to updates on the source
cluster before starting CDCR: the source cluster might receive updates
in the meantime, and you want to be sure not to miss them.
To understand this better, you need to understand how CDCR cleans
transaction logs. When CDCR is started (with the START action), it
instantiates a log reader for each target cluster. The position of each
log reader tells CDCR which transaction logs it can clean. If all the
log readers are beyond a certain point, then CDCR can clean all the
transaction logs up to that point.
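To make the cleaning rule concrete, here is a toy model in Python. It is
only a sketch of the rule described above, not Solr's actual code: the
function name, the integer tlog ids and the reader names are all made up
for illustration, and a reader's position is assumed to be the id of the
tlog it is currently reading.

```python
def cleanable_tlogs(tlog_ids, reader_positions):
    """Toy model: a tlog can be cleaned once every log reader is past it.

    tlog_ids         -- iterable of integer tlog ids, oldest first (hypothetical)
    reader_positions -- dict mapping a target cluster name to the id of the
                        tlog its log reader is currently reading (hypothetical)
    """
    if not reader_positions:
        # Without any log reader this rule does not apply; the normal
        # update log cleaning procedure decides instead.
        raise ValueError("no log readers instantiated")
    # The slowest reader decides how far cleaning may go.
    slowest = min(reader_positions.values())
    return [t for t in tlog_ids if t < slowest]


# Two targets: one reader is at tlog 3, the other is still at tlog 1,
# so only tlog 0 is behind both readers.
example = cleanable_tlogs(range(5), {"target_a": 3, "target_b": 1})
```

The point of the sketch is that the slowest log reader holds back the
cleaning for everyone.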
However, there might be cases when the source cluster will be up without
any log readers instantiated:
1) The source cluster is started, but CDCR is not started yet.
2) The source cluster is started and CDCR is started, but the target
cluster was not accessible when CDCR was started. In this case, CDCR
will not be able to instantiate a log reader for this cluster.
In these two scenarios, if updates are received by the source cluster,
they might be cleaned out of the transaction log by the normal update
log cleaning procedure before a log reader exists to forward them.
That is where the buffer becomes useful. When you know that you will be
in one of these two scenarios while starting up your clusters and CDCR,
you can enable the buffer to be sure not to miss updates. Once the
source and target clusters are properly up and CDCR replication is
properly started, you can turn the buffer off. Leaving it enabled keeps
the transaction logs around, which is the disk growth you are seeing.
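As a rough sketch of that startup order, the snippet below builds the
request URLs for the CDCR API on the source cluster. The host name and
collection name are placeholders I made up, and the action names
(ENABLEBUFFER, START, STATUS, DISABLEBUFFER) are the ones I know from the
Solr reference guide, so please check them against your version. The
snippet only builds the URLs; it does not send anything.

```python
from urllib.parse import urlencode


def cdcr_url(base_url, collection, action):
    """Build the URL for one CDCR API request on a collection."""
    return f"{base_url}/{collection}/cdcr?{urlencode({'action': action})}"


def startup_sequence(base_url, collection):
    """Return the CDCR requests in the order suggested above."""
    actions = [
        "ENABLEBUFFER",   # keep updates while no log reader exists yet
        "START",          # start CDCR, which instantiates the log readers
        "STATUS",         # check that replication is actually running
        "DISABLEBUFFER",  # let transaction logs be cleaned again
    ]
    return [cdcr_url(base_url, collection, a) for a in actions]


# Placeholder source host and collection, for illustration only.
urls = startup_sequence("http://source-host:8983/solr", "mycollection")
```

You would issue these against the source cluster one at a time, and only
send the last one after the STATUS response shows replication is running.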
On 14/06/16 06:41, Bharath Kumar wrote:
I have set up cross data center replication using Solr 6. I want to know
why the buffer needs to be enabled on the source cluster. Even if the
buffer is not enabled, I am able to replicate the data between source
and target sites. What are the advantages of enabling the buffer on the
source site? If I enable the buffer, the transaction logs are never
deleted, and over a period of time we run out of disk space. Can you
please let me know why enabling the buffer is required?