[PostgreSQL] Trying out replication: cp cannot stat log file during recovery

Henry C.
Apr 12, 2011 at 7:45 pm
Greets,

Pg 9.0.3.

I'm trying out Pg's built-in replication for the first time and noticed
something odd.

On the slave I see the following in the logs (after rsyncing all from master
to slave and firing up Pg on the slave):

...
restored log file "000000010000018E0000000E" from archive
restored log file "000000010000018E0000000F" from archive
consistent recovery state reached at 18E/10000000
restored log file "000000010000018E00000010" from archive
cp: cannot stat `/home/arc/000000010000018E00000011': No such file or directory
unexpected pageaddr 18D/91000000 in log file 398, segment 17, offset 0
cp: cannot stat `/home/arc/000000010000018E00000011': No such file or directory
streaming replication successfully connected to primary
...

/home/arc is an NFS mount from master and is where the WAL archive is kept
(yes, I'll move it eventually; for now I'm just testing).

Things seem to run fine up until (and after) log file
000000010000018E00000011. That particular file is definitely present. Why
would cp(1) fail to stat the file when it worked fine for all the others?

I notice from another mailing list post that 'unexpected pageaddr' is possibly
not that serious and is probably unrelated to the cp/stat error above.

However, since recovery seems to have skipped a log file, what would that mean
in terms of the slave being a true copy of master and integrity of the data?


thanks
Henry

2 responses

  • Fujii Masao at Apr 13, 2011 at 2:28 am

    On Wed, Apr 13, 2011 at 4:45 AM, Henry C. wrote:
    > Greets,
    >
    > Pg 9.0.3.
    >
    > I'm trying out Pg's built-in replication for the first time and noticed
    > something odd.
    >
    > On the slave I see the following in the logs (after rsyncing all from master
    > to slave and firing up Pg on the slave):
    >
    > ...
    > restored log file "000000010000018E0000000E" from archive
    > restored log file "000000010000018E0000000F" from archive
    > consistent recovery state reached at 18E/10000000
    > restored log file "000000010000018E00000010" from archive
    > cp: cannot stat `/home/arc/000000010000018E00000011': No such file or directory
    > unexpected pageaddr 18D/91000000 in log file 398, segment 17, offset 0
    > cp: cannot stat `/home/arc/000000010000018E00000011': No such file or directory
    > streaming replication successfully connected to primary
    > ...
    >
    > /home/arc is an NFS mount from master and is where the WAL archive is kept
    > (yes, I'll move it eventually; for now I'm just testing).
    >
    > Things seem to run fine up until (and after) log file
    > 000000010000018E00000011.  That particular file is definitely present.  Why
    > would cp(1) fail to stat the file when it worked fine for all the others?

    I guess that file didn't exist in the archive yet at the moment cp failed.
    It was archived after that, which is why you later saw it in the archive.

    > I notice from another mailing list post that 'unexpected pageaddr' is possibly
    > not that serious and is probably unrelated to the cp/stat error above.
    >
    > However, since recovery seems to have skipped a log file, what would that mean
    > in terms of the slave being a true copy of master and integrity of the data?

    When the standby fails to read the WAL file from the archive, it tries to read
    that from the master via replication connection. So the standby would not skip
    that file.

    Regards,

    --
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
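
The fallback described in the reply above can be pictured with a small Python sketch. This is only an illustration, not PostgreSQL's actual code: the function name fetch_segment and the stream_from_primary callback are hypothetical, and the copy step assumes a restore_command along the lines of 'cp /home/arc/%f %p', which the cp errors in the log suggest but the thread never shows.

```python
import os
import shutil


def fetch_segment(name, archive_dir, pg_xlog_dir, stream_from_primary):
    """Fetch WAL segment `name`: try the archive first, then the primary.

    Returns "archive" or "stream" to show which source supplied the file.
    `stream_from_primary` is a hypothetical callback standing in for the
    streaming replication connection.
    """
    archived = os.path.join(archive_dir, name)
    target = os.path.join(pg_xlog_dir, name)
    try:
        # Stand-in for the assumed restore_command (a plain cp from the archive).
        shutil.copyfile(archived, target)
        return "archive"
    except FileNotFoundError:
        # This is the point where cp reports
        # "cannot stat ...: No such file or directory".
        pass
    # The segment is not in the archive (yet): ask the primary for it instead,
    # so the segment is fetched another way rather than skipped.
    stream_from_primary(name, target)
    return "stream"
```

The point of the sketch is that a failed copy from the archive is not an error in itself; it only tells the standby to get the same WAL from the other source.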
  • Henry C. at Apr 13, 2011 at 8:19 am

    On Wed, April 13, 2011 04:28, Fujii Masao wrote:
    > When the standby fails to read the WAL file from the archive, it tries to
    > read that from the master via replication connection. So the standby would not
    > skip that file.

    Great, thanks. It looks like it's proceeding normally (if slowly) then.
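
As a side note on reading the log excerpt: the numbers in the 'unexpected pageaddr' line can be mapped back to WAL file names with a little arithmetic. The Python sketch below assumes the default 16 MB segment size and timeline 1 (taken from the leading 00000001 in the file names); the helper names are made up for illustration.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB WAL segments


def wal_file_name(timeline, log, seg):
    """Build a 24-hex-digit WAL file name from timeline, log id and segment."""
    return "%08X%08X%08X" % (timeline, log, seg)


def segment_of_location(location):
    """Split a location such as '18D/91000000' into (log id, segment number)."""
    log_hex, offset_hex = location.split("/")
    return int(log_hex, 16), int(offset_hex, 16) // SEGMENT_SIZE


# "log file 398, segment 17" is the file that cp could not find.
print(wal_file_name(1, 398, 17))   # 000000010000018E00000011

# The page address actually found, 18D/91000000, belongs to an older segment.
log, seg = segment_of_location("18D/91000000")
print(wal_file_name(1, log, seg))  # 000000010000018D00000091
```

So both messages concern the same segment, 000000010000018E00000011. The page address found in it belongs to older WAL, which would fit (this is an interpretation, not something confirmed in the thread) a local segment file that had not yet been filled with the expected data.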
