Tom,

Thanks for your many comments and practical suggestions - for most of
which I should be able to bash something out once I've got my new dev
environment sorted.

I'll update the proposal into a design document with some of my earlier
blah taken out and all of your clarifications put in.

There are a few comments on specific points below:

Best Regards, Simon Riggs
Tom Lane writes:
>> - Write application to archive WAL files to tape, disk, or network
>> Probably need to do first part, but I'm arguing not to do the copy
>> to tape..
> I'd like to somehow see this handled by a user-supplied program or
> script. What we mainly need is to define a good API that lets the
> archiver program understand which WAL segment files to archive when.
>> B - Backing up WAL log files
>> Ordinarily, when old log segment files are no longer needed, they
>> are recycled (renamed to become the next segments in the numbered
>> sequence). This means that the data within them must be copied from
>> there to another location
>> AFTER postgres has closed that file
>> BEFORE it is renamed and recycled
> My inclination would be to change the backend code so that as soon as a
> WAL segment is completed, it is flagged as being ready to dump to tape
> (or wherever). Possibly the easiest way to do this is to rename the
> segment file somehow, perhaps "nnn" becomes "nnn.full". Then, after the
> archiver process has properly dumped the file, reflag it as being
> dumped (perhaps rename to "nnn.done"). Obviously there are any number
> of ways we could do this flagging, and depending on an OS rename
> facility might not be the best.
>
> A segment then can be recycled when it is both (a) older than the
> latest checkpoint and (b) flagged as dumped. Note that this approach
> allows dumping of a file to start before the first time at which it
> could be recycled. In the event of a crash and restart, WAL replay has
> to be able to find the flagged segments, so the flagging mechanism
> can't be one that would make this impossible.

That sort of API doesn't do much for my sense of truth-and-beauty, but
it will work and will let us get to the testing stage, beyond which we
will, I'm sure, discover many things. When that knowledge is gained *we*
can refactor.

Spawning new post to think through the API in more detail.
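To make the flagging handshake concrete, here is a minimal sketch of the archiver's half of it, assuming Tom's rename-based flagging ("nnn.full" while waiting, "nnn.done" once archived). The directory layout and function name are illustrative only, not part of the proposal:

```python
# Sketch of the archiver's half of the rename-based flagging handshake.
# Segment names and directory layout are assumptions for illustration.
import os
import shutil

def archive_ready_segments(xlog_dir, archive_dir):
    """Copy every WAL segment flagged '.full' into the archive, then
    reflag it '.done' so the backend may recycle it after the next
    checkpoint."""
    archived = []
    for name in sorted(os.listdir(xlog_dir)):
        if not name.endswith(".full"):
            continue
        seg = name[: -len(".full")]
        src = os.path.join(xlog_dir, name)
        shutil.copy2(src, os.path.join(archive_dir, seg))
        # Rename only after the copy has succeeded: a crash before this
        # point just means the same segment gets archived again later.
        os.rename(src, os.path.join(xlog_dir, seg + ".done"))
        archived.append(seg)
    return archived
```

Recycling then only has to check for the ".done" suffix plus the checkpoint horizon - exactly the two conditions (a) and (b) above.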
>> With full OS file backup, if the database is shutdown correctly, then
>> we will need a way to tell the database "you think you're up to date,
>> but you're not - I've added some more WAL files into the directories,
>> so roll forward on those now please".
> I do not think this is an issue either, because my vision of this does
> not include tar backups of shutdown databases. What will be backed up
> is a live database, therefore the postmaster will definitely know that
> it needs to perform WAL replay. What we will need is hooks to make
> sure that the full set of required log files is available.
OK, again let's go for it on that assumption.

Longer term, I would feel more comfortable with a specific "backup
state". Relying on a side-effect of crash recovery for disaster recovery
doesn't give me a warm feeling. BUT, that feeling is for later, not now.

> It's entirely possible that that set of log files exceeds available
> disk space, so it needs to be possible to run WAL replay
> incrementally, loading and then replaying additional log segments
> after deleting old ones. Possibly we could do this with some
> postmaster command-line switches. J. R. Nield's patch embodied an
> "interactive recovery" backend mode, which I didn't like in detail but
> the general idea is not necessarily wrong.
Again, yes, though I will for now aim at the assumption that recovery
can be completed within available disk space, with this as an immediate
add-on when we have something that works.
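A rough sketch of the incremental replay loop being agreed here, with fetch_segment() and replay_segment() as hypothetical stand-ins for the real restore and replay mechanics:

```python
# Sketch of incremental WAL replay: restore segments in batches small
# enough for the available disk, replay each batch, delete it, repeat.
# fetch_segment() and replay_segment() are hypothetical hooks.
import os

def incremental_replay(segment_names, staging_dir, fetch_segment,
                       replay_segment, batch_size=8):
    replayed = []
    for i in range(0, len(segment_names), batch_size):
        batch = segment_names[i:i + batch_size]
        paths = [fetch_segment(name, staging_dir) for name in batch]
        for path in paths:
            replay_segment(path)
            replayed.append(os.path.basename(path))
            os.remove(path)  # free the space before fetching more
    return replayed
```

The batch size is the knob that bounds disk usage; the simple case above (everything fits) is just batch_size >= the number of segments.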

That is also the basis for a "warm standby" solution: copy the tar to a
new system (similar to what you say), then repeatedly move new WAL logs
across to it, then start up in recover-only mode.

"Recover-only" mode would be initiated by a command line switch, as you
say. This would recover all of the WAL logs, then immediately shutdown
again.

The extension to that is what Oli Sennhauser has suggested, which is to
allow the second system to come up in read-only mode.
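The warm-standby loop could look roughly like this; the trigger-file convention and the apply_segment() hook are assumptions for illustration, not part of the proposal:

```python
# Sketch of a "warm standby" loop: repeatedly copy newly archived WAL
# segments to the standby and apply them, until a trigger file tells
# the standby to stop polling (and, eventually, come up for real).
# Paths, the trigger file, and apply_segment() are illustrative only.
import os
import shutil
import time

def standby_loop(archive_dir, standby_xlog, trigger_file,
                 apply_segment, poll_interval=1.0):
    seen = set()
    while True:
        for name in sorted(os.listdir(archive_dir)):
            if name in seen:
                continue
            dst = os.path.join(standby_xlog, name)
            shutil.copy2(os.path.join(archive_dir, name), dst)
            apply_segment(dst)
            seen.add(name)
        if os.path.exists(trigger_file):
            return sorted(seen)
        time.sleep(poll_interval)
```

A read-only standby, as Oli suggests, would replace the final shutdown with opening the database for queries.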

Best Regards, Simon Riggs

Discussion Overview
group: pgsql-hackers-pitr @ postgresql
posted: Feb 10, '04 at 8:24p
active: Feb 17, '04 at 10:41p
posts: 13
users: 6
website: postgresql.org
irc: #postgresql