Further design plans for PITR. As posted previously, Bruce and I recently
had a long discussion to iron out the major thinking and a good deal of
the detail as well.
In overview, the major change is the introduction of an ARCHIVER process
running under control of the Postmaster, similar to the Stats collector.

Due to personal commitments in late May and early June, these changes
will not be complete until mid/late June. Best I can do...
That includes the time needed for the fair amount of documentation
required for the code to be usefully tested during beta. The good news is
that there is little speculation in this design now; it is just a matter
of hanging the code in the right place. About half the code is waiting to
be remerged into this latest design.

I'll submit the code in pieces as well, so we can view progress, whether
or not those are incrementally committed.

Committers & all others interested: pls check this out and make any
comments or questions now...time for rework is now slipping fast.

Best Regards, Simon Riggs, 2nd Quadrant

...detail chatter follows
On Thu, 2004-05-06 at 05:38, Bruce Momjian wrote:
Simon Riggs wrote:
Bruce, was this OK with you...shall I post?

Some items occurred to me during write up...are you OK with those? Do
you want to alter anything before I post?
Looks good with a few adjustments:
Some additions and backtracks...
These choices should be offered as a single GUC, with mutually
exclusive values of
- CIRCULAR (named same as DB2 to illustrate that some xlogging does take
place, just not archive logging)
It would be nice to allow the external program to work if you specified
the program as '', but external isn't the same as running no program
because the external program will also do the flag file removal once it
is archived. I am a little worried about adding an external capability
when we don't have anyone ready to actually show someone wanting such an
external program. Not sure how to handle that -- add it in 7.5 and
see, or go with a boolean and see if we can get an external thing
working for 7.6.
OK, EXTERNAL will not be included in the 7.5 drop; I'm not certain it is
necessary now because of other changes in the design (below).
We always spawn an ARCHIVER process under postmaster, no matter what the
setting of the main GUC. That way, it can be started up if required.
Archive process id is stored in shared memory (or on disk as
I think shared memory, but I am not positive. I think shared memory
because the postmaster could potentially have to stop/restart it. I
will have to look at how the stats process is done.
Looks to me like this would have to be a disk file, e.g. archiver.pid
but I'll isolate that piece of code in case someone has a bright idea.
The archiver program updates its config values when someone changes
these values via postgresql.conf (and uses pg_ctl reload). These can
only be modified from postgresql.conf.
This would be PGC_SIGHUP. However, we need to make sure the archiver
sees those changes like the backends see such changes now.
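
To illustrate the PGC_SIGHUP behaviour discussed above, here is a minimal Python sketch (not the planned C code) of a process that re-reads its settings when it gets SIGHUP. The function names, the flag variable and the simplified "name = value" parser are all assumptions made for illustration; the real archiver would go through the existing GUC machinery.

```python
import signal

# Set by the signal handler, cleared by the main loop.
reload_requested = False


def on_sighup(signum, frame):
    """Signal handler: only record that a reload was requested."""
    global reload_requested
    reload_requested = True


def install_handler():
    """Register the handler; call once at archiver start (Unix only)."""
    signal.signal(signal.SIGHUP, on_sighup)


def read_settings(path):
    """Parse simple `name = value` lines; '#' starts a comment.

    Simplified on purpose: a '#' inside a quoted value is not handled.
    """
    settings = {}
    with open(path, encoding="utf-8") as conf:
        for line in conf:
            line = line.split("#", 1)[0].strip()
            if "=" not in line:
                continue
            name, value = line.split("=", 1)
            settings[name.strip()] = value.strip().strip("'")
    return settings


def maybe_reload(path, current):
    """Return fresh settings if a SIGHUP arrived, else the current ones."""
    global reload_requested
    if not reload_requested:
        return current
    reload_requested = False
    return read_settings(path)
```

The handler does nothing but set a flag, and the archiver's main loop would call maybe_reload() between units of work, so a reload never interrupts a transfer that is already running.
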
Basically, I think that we need to push user-level control of this
process down beyond the directory scanning code (that is pretty
standard), and allow them to call an arbitrary program to transfer the
logs. My idea is that the pitr_transfer program will get $1=WAL file
name and $2=pitr_location and the program can use those arguments to do
the transfer. We can even put a pitr_transfer.sample program in share
and document $1 and $2.
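
As a rough idea of what such a pitr_transfer.sample could look like, here is a sketch in Python, assuming the two-argument convention described above ($1 = WAL file name, $2 = pitr_location) and a plain local directory as the location. The ".partial" temporary suffix and the exit codes are my assumptions, not something agreed in this thread.

```python
import shutil
import sys
from pathlib import Path


def transfer(wal_file, pitr_location):
    """Copy one WAL file into the archive directory and return the new path."""
    source = Path(wal_file)
    dest_dir = Path(pitr_location)
    if not source.is_file():
        raise FileNotFoundError(f"WAL file not found: {source}")
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"archive location not available: {dest_dir}")
    # Copy under a temporary name first, then rename, so a partially
    # written file never appears under the final WAL file name.
    partial = dest_dir / (source.name + ".partial")
    shutil.copy2(source, partial)
    final = dest_dir / source.name
    partial.replace(final)
    return final


def main(argv):
    """argv[1] is the WAL file name, argv[2] is the pitr_location."""
    if len(argv) != 3:
        print("usage: pitr_transfer WAL_FILE PITR_LOCATION", file=sys.stderr)
        return 2
    try:
        transfer(argv[1], argv[2])
    except OSError as exc:
        print(f"pitr_transfer: {exc}", file=sys.stderr)
        return 1  # non-zero status reports the failed transfer to the caller
    return 0
```

To use it as a command, call sys.exit(main(sys.argv)) under the usual __main__ guard. The only contract the caller relies on is the exit status: zero for an archived file, non-zero for anything else.
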
- initdb needs to be altered to add the pg_rlog directory
Should we put the rlog directory as subdirectory of xlog? Seems so.
- code also required to note when xlog file switches occur during
extended recovery across a number of xlog files
...was accepted
- didn't discuss when we test for archive_dest and what happens then. We
know Informix, DB2 and Oracle all freeze if archive_dest is not
available. That's not an option at the moment...for the future. Right
now we can choose to either PANIC, ERROR or WARNING and so need a
GUC-specified policy to control that behaviour. (Suggest naming options
Yep, we can allow the admin to specify what happens if we can't archive.
Summary of additional GUCs required (names not discussed...still open!)
- wal_archive_mode = CIRCULAR (default) | ARCHIVE | EXTERNAL
- archive_dest = 'directory, user@host:/dir, etc' (no default)
- archive_program = 'cp, scp, etc' (no default, or scp?)
- wal_archive_error_policy = WARNING (default) | SHUTDOWN
I would remove the wal_ part because though it is implemented via WAL,
the actual process is archiving. WAL is just an implementation
So, in summary, we have 5 GUCs, all PGC_SIGHUP
- archive_mode = CIRCULAR (default) (==off)| ARCHIVE (==on)
- archive_dest = 'directory, user@host:/dir, etc' (no default)
- archive_program = 'cp, scp, etc' (no default)
- archive_error_policy = WARNING (default) | SHUTDOWN
- archive_debug
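
For reference, the five settings above can be modelled as below. This is only a Python sketch to make the allowed values and defaults explicit. The class name, the boolean type of archive_debug and the rule that ARCHIVE mode needs both archive_dest and archive_program are assumptions on my part, since none of these were settled in the discussion.

```python
from dataclasses import dataclass

ARCHIVE_MODES = ("CIRCULAR", "ARCHIVE")      # off / on
ERROR_POLICIES = ("WARNING", "SHUTDOWN")


@dataclass
class ArchiveSettings:
    archive_mode: str = "CIRCULAR"
    archive_dest: str = ""                # no default in the proposal
    archive_program: str = ""             # no default in the proposal
    archive_error_policy: str = "WARNING"
    archive_debug: bool = False           # type assumed; the thread gives none

    def problems(self):
        """Return a list of human-readable problems; empty means valid."""
        found = []
        if self.archive_mode not in ARCHIVE_MODES:
            found.append(f"archive_mode must be one of {ARCHIVE_MODES}")
        if self.archive_error_policy not in ERROR_POLICIES:
            found.append(f"archive_error_policy must be one of {ERROR_POLICIES}")
        # Assumption: archiving needs both a destination and a program.
        if self.archive_mode == "ARCHIVE":
            if not self.archive_dest:
                found.append("archive_dest must be set when archive_mode = ARCHIVE")
            if not self.archive_program:
                found.append("archive_program must be set when archive_mode = ARCHIVE")
        return found
```

With the defaults, ArchiveSettings().problems() is empty, which matches the intent that a server left in CIRCULAR mode needs no archive configuration at all.
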
- The GUC for recovery target maybe should be a postmaster command
switch? That way we wouldn't need to edit postgresql.conf before
recovery and we also wouldn't need to give it a name...
I like centralizing it all in GUC. Command-line parameters are pretty
hard to specify for one-time usage like this. However, if you set it
via GUC, and you don't modify the value and restart the postmaster, is
it going to honor that old xid? That would be a strange problem. I
guess we could fail to start if we don't find the specified xid in the
wal files.
Postmaster startup only; applies only if the server enters recovery
- recovery_target = 12345262 (default is NOT SET)

Does it stop before that xid or after that xid?
Recovery target supplied at recovery-time start of postmaster cannot
easily be supplied as a GUC or Postmaster startup switch. Suggestion is
to test for a file called:
which has something in it like this
That looks over-cooked, but I'll make it simple (believe me!)
After recovery completes, the file is renamed to:
This then avoids complications with interactions of crash recovery and
rollforward recovery. If we crash during recovery, it will restart
cleanly and continue. Once recovery completes, if we then crash, we
don't go back into rollforward recovery (unless we want to), which would
not be the case if we put a GUC in the postgresql.conf file directly
because we would need to re-edit it and send out a SIGHUP via pg_ctl
reload - which is guaranteed not to happen under stress at 4am.
No changes to postgresql.conf are required.
[No capability, for now, to rollforward when logspace > available disk,
but that can be a later addition]
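
The control-file idea can be sketched as follows, in Python rather than the eventual C. The two file names are placeholders I made up for the example (the real names are not decided here), and the mode strings are likewise just labels.

```python
from pathlib import Path

# Hypothetical names, for illustration only.
CONTROL_FILE = "rollforward.request"
DONE_FILE = "rollforward.request.done"


def startup_mode(data_dir):
    """Return 'rollforward' if the control file is present, else 'normal'.

    'normal' here covers ordinary startup, including plain crash recovery.
    """
    if (Path(data_dir) / CONTROL_FILE).exists():
        return "rollforward"
    return "normal"


def finish_recovery(data_dir):
    """Rename the control file once rollforward recovery has completed."""
    data_dir = Path(data_dir)
    (data_dir / CONTROL_FILE).replace(data_dir / DONE_FILE)
```

A crash in the middle of recovery leaves the control file in place, so the next start goes back into rollforward. A crash after finish_recovery() finds only the renamed file, so the server comes up normally without anyone touching postgresql.conf.
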

The ARCHIVER architecture is very similar to the Stats Collector. It is
started just before the Stats collector, and the postmaster will restart
it if it fails. I'll put all the code in one place, as we have with the
stats collector.

At startup, ARCHIVER will test archive capability: we write a test file
called [pgarch_startup_$pid_$date] to the xlog directory, then execute
the command once using that name as a parameter, which should then copy
the file to the archive location using the archive_program command. At
startup, failure of the archive_program will be a PANIC condition,
whereas once started, PostgreSQL will act according to
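
The startup test might look roughly like this. It is a Python sketch only: the exception class stands in for PANIC, and passing the test file and the archive destination as the last two arguments of the command is an assumed calling convention, not something fixed by this design.

```python
import os
import subprocess
import time
from pathlib import Path


class ArchiveStartupError(RuntimeError):
    """Stands in for the PANIC raised when the startup test fails."""


def check_archive_capability(xlog_dir, archive_program, archive_dest):
    """Write a test file in xlog_dir and run archive_program on it once.

    archive_program is an argv list; the test file path and archive_dest
    are appended as its last two arguments (an assumed convention).
    Returns the name of the test file that was handed to the program.
    """
    stamp = time.strftime("%Y%m%d%H%M%S")
    test_file = Path(xlog_dir) / f"pgarch_startup_{os.getpid()}_{stamp}"
    test_file.write_text("archiver startup test\n")
    command = [*archive_program, str(test_file), str(archive_dest)]
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:  # e.g. the program does not exist
        raise ArchiveStartupError(f"cannot run {command[0]}: {exc}") from exc
    finally:
        test_file.unlink()  # the test file is only needed for this check
    if result.returncode != 0:
        raise ArchiveStartupError(
            f"archive_program exited with {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return test_file.name
```

With this convention, check_archive_capability(xlog_dir, ["cp"], "/some/archive/dir") would exercise a plain cp into a local directory. The sketch only checks the exit status of the program; it does not verify that the file actually arrived, which would not be possible in general for remote destinations.
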

If ARCHIVER fails, it will be restarted by Postmaster. The archive_program
runs in its own process, so it shouldn't be able to touch PostgreSQL. It
will run in the (postgres) security context, so no permissions changes
are needed. archive_error_policy will only come into effect once the
archive directory runs out of space, i.e. after archive_program has
failed and the WARNING to restart it has been ignored by admins.

Since EXTERNAL is not being supported, the originally posted program
called pg_arch lives no more...c'est la vie

Final issues:
- need to know which signal to use from backend->ARCHIVER when an xlog
fills. Somebody let me know - not bothered which...?


group: pgsql-hackers
posted: Apr 26, '04 at 3:38p
active: May 11, '04 at 9:59p