I am not sure the following pg_ctl behaviour is really a bug, but I find it unexpected enough to
report.
I was testing synchronous replication in a test setup on a single machine. (After all, one could
have different instances on different arrays, right? If you think this is an unlikely use-case,
perhaps the following is not important.)
There are two installations of 9.1devel (git as of today):
primary: /var/data1/pg_stuff/pg_installations/pgsql.vanilla_1
standby: /var/data1/pg_stuff/pg_installations/pgsql.vanilla_2
The standby's data_directory is generated by pg_basebackup from vanilla_1.
The problem is the very first run of pg_ctl restart:
pg_ctl first correctly decides that the standby instance (=vanilla_2) isn't yet running:
pg_ctl: PID file "/var/data1/pg_stuff/pg_installations/pgsql.vanilla_2/data/postmaster.pid" does
not exist
This is OK and expected. But then it continues (in the logfile) with:
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 20519) running in data directory
"/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/data"?
So, complaints about the *other* instance. It doesn't happen once a successful start (with pg_ctl
start) has happened.
It starts fine when started right away with 'start' instead of 'restart'.
Also, if it has been started once, it will react to 'pg_ctl restart' without the errors.
I'll attach a shell-script, that provokes the error, see the 'restart' on the line with the
comment: 'HERE'
It would seem (see below) that pg_ctl's final decision about the standby, (that is has started up)
is wrong; the standby does *not* eventually start.
Below the output of the attached shell script. (careful - it deletes stuff)
(It still contains some debug lines, but I didn't want to change it too much.)
$ clear; ./split_vanilla.sh
PGPASSFILE=/home/rijkers/.pg_rijkers
waiting for server to shut down.... done
server stopped
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
removed `/var/data1/pg_stuff/archive_dir/000000010000000000000018'
removed `/var/data1/pg_stuff/archive_dir/000000010000000000000019'
removed `/var/data1/pg_stuff/archive_dir/000000010000000000000019.00000020.backup'
removed `/var/data1/pg_stuff/archive_dir/00000001000000000000001A'
/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/bin/pg_basebackup
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
BINDIR = /var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/bin
PGPORT=6564
PGPASSFILE=/home/rijkers/.pg_rijkers
PGDATA=/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/data
/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/bin/pg_ctl
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
UID PID PPID C STIME TTY STAT TIME CMD
rijkers 20519 1 20 17:19 pts/25 S+ 0:00
/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/bin/postgres -D
/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/data
rijkers 20521 20519 0 17:19 ? Ss 0:00 \_ postgres: writer process
rijkers 20522 20519 0 17:19 ? Ss 0:00 \_ postgres: wal writer process
rijkers 20523 20519 0 17:19 ? Ss 0:00 \_ postgres: autovacuum launcher process
rijkers 20524 20519 0 17:19 ? Ss 0:00 \_ postgres: archiver process
rijkers 20525 20519 0 17:19 ? Ss 0:00 \_ postgres: stats collector process
BINDIR = /var/data1/pg_stuff/pg_installations/pgsql.vanilla_2/bin
PGPORT=6664
PGPASSFILE=/home/rijkers/.pg_rijkers
PGDATA=/var/data1/pg_stuff/pg_installations/pgsql.vanilla_2/data
/var/data1/pg_stuff/pg_installations/pgsql.vanilla_2/bin/pg_ctl
pg_ctl: PID file "/var/data1/pg_stuff/pg_installations/pgsql.vanilla_2/data/postmaster.pid" does
not exist
Is server running?
starting server anyway
waiting for server to start............................................................... done
server started
-- logfile 1:
LOG: database system is shut down
LOG: database system was shut down at 2011-03-18 17:19:54 CET
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
-- logfile 2:
LOG: shutting down
LOG: database system is shut down
FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 20519) running in data directory
"/var/data1/pg_stuff/pg_installations/pgsql.vanilla_1/data"?
thanks,
Erik Rijkers