Tom Lane wrote:
> Simon Riggs wrote:
>> - Write application to archive WAL files to tape, disk, or network
>> Probably need to do the first part, but I'm arguing not to do the copy
>
> I'd like to somehow see this handled by a user-supplied program or
> script. What we mainly need is to define a good API that lets the
> archiver program understand which WAL segment files to archive when.
>
>> B - Backing up WAL log files
>> Ordinarily, when old log segment files are no longer needed, they are
>> recycled (renamed to become the next segments in the numbered sequence).
>> This means that the data within them must be copied from there to
>> another location
>> AFTER postgres has closed that file
>> BEFORE it is renamed and recycled
>
> My inclination would be to change the backend code so that as soon as a
> WAL segment is completed, it is flagged as being ready to dump to tape
> (or wherever). Possibly the easiest way to do this is to rename the
> segment file somehow, perhaps "nnn" becomes "nnn.full". Then, after the
> archiver process has properly dumped the file, reflag it as being dumped
> (perhaps rename to "nnn.done"). Obviously there are any number of ways
> we could do this flagging, and depending on an OS rename facility might
> not be the best.
Yes, that would be the correct time to begin archive.
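As an aside, the rename-based flagging quoted above can be sketched in a few lines. This is a Python simulation for illustration only, not backend code; the segment name is made up, and the ".full"/".done" suffixes follow the proposal above:

```python
import os
import tempfile

def flag_segment_full(seg_path):
    # Backend side: a completed WAL segment "nnn" is renamed to
    # "nnn.full" to mark it ready for archiving.
    full_path = seg_path + ".full"
    os.rename(seg_path, full_path)
    return full_path

def flag_segment_done(full_path):
    # Archiver side: once the segment has been dumped, "nnn.full"
    # is renamed to "nnn.done" so the backend knows it may recycle it.
    done_path = full_path[:-len(".full")] + ".done"
    os.rename(full_path, done_path)
    return done_path

with tempfile.TemporaryDirectory() as d:
    seg = os.path.join(d, "000000010000000000000001")  # made-up segment name
    open(seg, "w").close()            # stand-in for a filled WAL segment
    full = flag_segment_full(seg)     # backend: segment complete
    done = flag_segment_done(full)    # archiver: segment dumped
    print(os.path.basename(done))
```

The point of the two renames is that each state transition is a single atomic filesystem operation, so a crash leaves the segment in exactly one well-defined state.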

The way the code is currently written, there is a slot in MoveOfflineLogs
which checks whether XLOG_archive_dir is set before entering a section
that is empty apart from a message. That routine doesn't get called
until we're about to recycle the files, which means we've lost our
window of opportunity to archive them. Making the number of files larger
doesn't affect the fact that it is called last. I'm going to ignore that
"hint", and any patch will include deletion of that code to avoid later
confusion.

The log switch and close occur during XLogWrite, when it is established
that there is no more room in the current log file for the current
record.

The file-flagging mechanism only allows a single archiver program to
operate, so I'll structure it as a new function XLogArchiveNotify() so
we can add in extra stuff later to improve/change things. That way we
have a home for the API.
A segment then can be recycled when it is both (a) older than the latest
checkpoint and (b) flagged as dumped. Note that this approach allows
dumping of a file to start before the first time at which it could be
recycled. In the event of a crash and restart, WAL replay has to be
able to find the flagged segments, so the flagging mechanism can't be
one that would make this impossible.
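The two-part recycle condition above can be expressed as a simple predicate. This is an illustrative sketch (segment numbers and the dumped-flag set are simplified stand-ins, not the actual backend representation):

```python
def can_recycle(seg_no, checkpoint_seg, dumped):
    # A segment may be recycled only when it is (a) older than the
    # latest checkpoint and (b) flagged as dumped by the archiver.
    return seg_no < checkpoint_seg and seg_no in dumped

# Segments 1..3 exist; the latest checkpoint is in segment 3;
# only segment 1 has been dumped so far.
dumped = {1}
recyclable = [n for n in (1, 2, 3) if can_recycle(n, 3, dumped)]
print(recyclable)
```

Note how segment 2 stays un-recyclable even though it is older than the checkpoint, because the archiver has not finished with it yet.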
The number of WAL logs is effectively tunable anyway, because it depends
on the number of checkpoint segments, so we can increase that if there
are issues with archival speed vs. transaction rate.

The rename is always safe because the log file names never wrap.

However, I'm loath to touch the files themselves, in case something
crashes somewhere and we are left with recovery failing because of an
unlocatable file. (To paraphrase one of the existing code comments, only
the truly paranoid survive.) A similar way is to have a "buddy" file,
which indicates whether the segment is full and ready for archival; i.e.,
when we close file "nnn" we also write a nearly empty file called
"nnn.full". That file can then be deleted later BY THE archiver once
archival has finished, allowing the segment to be recycled by
InstallXLogFileSegment(). (This would require at least 6 more file
descriptors, but I'm not sure if that's an issue.)

InstallXLogFileSegment() can check XLogArchiveBusy() to see whether
it is allowed to reuse a segment or must allocate a new one. In the
initial implementation this would just test whether "nnn.full" still
exists. This will allow a range of behaviour to be catered for, such as
long waits while manual tape mounts are requested by the archiver, etc.
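The buddy-file variant can be sketched with the XLogArchiveNotify()/XLogArchiveBusy() names proposed above. Again a Python simulation of the semantics, not the real C backend functions; the segment name is hypothetical:

```python
import os
import tempfile

def xlog_archive_notify(seg_path):
    # On log switch: write a nearly empty buddy file "nnn.full",
    # leaving the WAL segment itself untouched.
    open(seg_path + ".full", "w").close()

def xlog_archive_busy(seg_path):
    # Recycling side: the segment is still busy while its buddy
    # file exists, i.e. the archiver has not finished with it.
    return os.path.exists(seg_path + ".full")

def archiver_done(seg_path):
    # Archiver side: after copying the segment somewhere safe,
    # delete the buddy file so the segment may be recycled.
    os.remove(seg_path + ".full")

with tempfile.TemporaryDirectory() as d:
    seg = os.path.join(d, "000000010000000000000002")  # made-up name
    open(seg, "w").close()
    xlog_archive_notify(seg)
    busy_before = xlog_archive_busy(seg)  # archiver still working
    archiver_done(seg)
    busy_after = xlog_archive_busy(seg)   # now safe to recycle
    print(busy_before, busy_after)
```

Unlike the rename scheme, the segment file keeps its original name throughout, so a crash at any point leaves recovery able to locate every segment.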

So in summary, the API is:

Archiver initialises and waits on notify.
PostgreSQL initialises.
PostgreSQL fills a log file, switches and closes it, then calls
XLogArchiveNotify().
Archiver moves the log somewhere safe, then sets state such that...
...sometime later...
PostgreSQL checks XLogArchiveBusy() to see if it's safe to recycle the
file, and discovers the state set by the archiver.

The API is completely unintrusive on current tried-and-tested operation,
and leaves the archiver(s) free to act as they choose, outside of the
address space of PostgreSQL. That way we don't have to update regression
tests with destructive, non-manual crash tests to show that it works.

Clearly, we wouldn't want WAL logs to hang around too long, so we need
an initiation method for the archival process. Otherwise, we'll be
writing "nnn.full" notifications without anybody ever deleting them.
This could either be set at startup with an archive_log_mode parameter
(OK, the name's been used before, but if the cap fits, wear it), or by
setting a maximum limit on the number of archive logs, among a few other
ideas, none of which I like.

Hmmmm...any listeners got any ideas here? How do we want this to work?

Anybody want to write a more complex archiver process to act as more
than just a test harness?

Best regards,

Simon Riggs

Discussion Overview
group: pgsql-hackers-pitr
posted: Feb 10, 2004 at 8:24p
active: Feb 17, 2004 at 10:41p