While I am working on pg_ctl, I saw this TODO item:

Have the postmaster write a random number to a file on startup that
pg_ctl checks against the contents of a pg_ping response on its initial
connection (without login)

This will protect against connecting to an old instance of the
postmaster in a different or deleted subdirectory.

http://archives.postgresql.org/pgsql-bugs/2009-10/msg00110.php
http://archives.postgresql.org/pgsql-bugs/2009-10/msg00156.php

Based on our new PQping(), do we ever want to implement this or should I
remove the TODO item? It seems this would require a server connection,
which is something we didn't want to force pg_ctl -w to do in case
authentication is broken.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

  • Tom Lane at Dec 27, 2010 at 9:03 pm

    Bruce Momjian writes:
    > While I am working on pg_ctl, I saw this TODO item:
    >
    > Have the postmaster write a random number to a file on startup that
    > pg_ctl checks against the contents of a pg_ping response on its initial
    > connection (without login)
    >
    > This will protect against connecting to an old instance of the
    > postmaster in a different or deleted subdirectory.
    >
    > http://archives.postgresql.org/pgsql-bugs/2009-10/msg00110.php
    > http://archives.postgresql.org/pgsql-bugs/2009-10/msg00156.php
    >
    > Based on our new PQping(), do we ever want to implement this or should I
    > remove the TODO item? It seems this would require a server connection,
    > which is something we didn't want to force pg_ctl -w to do in case
    > authentication is broken.

    Well, rereading that old thread makes me realize that what you just
    implemented is still pretty far short of what was discussed. In
    particular, this implementation entirely fails to cope with the
    possibility that a Windows postmaster is using a specialized
    listen_addresses setting that has to be taken into account in order to
    get a TCP connection. I wonder whether we should revert this patch and
    have another go at the idea of a separate postmaster.ports status file
    with a line for each active port.

    The business with a magic number can't be implemented unless we actually
    add a new separate pg_ping protocol. PQping() has removed a lot of the
    pressure to have that, namely all the authentication-failure problem
    cases. I'm not sure that the case where you're looking at an inactive
    data directory but there's a live postmaster someplace else with the
    same port number is important enough to justify new protocol all by
    itself.

    regards, tom lane
  • Bruce Momjian at Dec 28, 2010 at 6:28 pm

    Tom Lane wrote:
    > Bruce Momjian <bruce@momjian.us> writes:
    > > While I am working on pg_ctl, I saw this TODO item:
    > >
    > > Have the postmaster write a random number to a file on startup that
    > > pg_ctl checks against the contents of a pg_ping response on its initial
    > > connection (without login)
    > >
    > > This will protect against connecting to an old instance of the
    > > postmaster in a different or deleted subdirectory.
    > >
    > > http://archives.postgresql.org/pgsql-bugs/2009-10/msg00110.php
    > > http://archives.postgresql.org/pgsql-bugs/2009-10/msg00156.php
    > >
    > > Based on our new PQping(), do we ever want to implement this or should I
    > > remove the TODO item? It seems this would require a server connection,
    > > which is something we didn't want to force pg_ctl -w to do in case
    > > authentication is broken.
    >
    > Well, rereading that old thread makes me realize that what you just
    > implemented is still pretty far short of what was discussed. In
    > particular, this implementation entirely fails to cope with the
    > possibility that a Windows postmaster is using a specialized
    > listen_addresses setting that has to be taken into account in order to
    > get a TCP connection. I wonder whether we should revert this patch and
    > have another go at the idea of a separate postmaster.ports status file
    > with a line for each active port.

    I had forgotten about having to use TCP and needing to honor
    listen_addresses restrictions. We only need one valid listen address, so I
    went ahead and added a line for it to the postmaster.pid file.

    I am not sure what a separate file will buy us except additional files
    to open/manage.
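
As a rough illustration of the idea above (one usable address recorded so that pg_ctl can build a TCP connection string), here is a minimal C sketch. It assumes the address list is comma-separated, as in listen_addresses. The helper names first_listen_address() and build_conninfo() are hypothetical and are not taken from the patch; the resulting string is only the kind of conninfo one could hand to PQping().

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the first entry of a comma-separated address list into "host".
 * Returns 0 on success, -1 if the first entry is empty or does not fit.
 * Hypothetical helper for illustration, not PostgreSQL source.
 */
static int
first_listen_address(const char *addresses, char *host, size_t hostlen)
{
    const char *start = addresses;
    size_t      len;

    /* skip leading whitespace */
    while (*start == ' ' || *start == '\t')
        start++;

    /* length of the first entry, up to the next comma or end of string */
    len = strcspn(start, ",");

    /* trim trailing whitespace */
    while (len > 0 && (start[len - 1] == ' ' || start[len - 1] == '\t'))
        len--;

    if (len == 0 || len >= hostlen)
        return -1;

    memcpy(host, start, len);
    host[len] = '\0';
    return 0;
}

/*
 * Format a conninfo string from a host and port.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
build_conninfo(const char *host, int port, char *buf, size_t buflen)
{
    int         n = snprintf(buf, buflen, "host='%s' port=%d", host, port);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

With an input such as "192.168.1.5, localhost" and port 5432, the two helpers would produce host='192.168.1.5' port=5432.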

    > The business with a magic number can't be implemented unless we actually
    > add a new separate pg_ping protocol. PQping() has removed a lot of the
    > pressure to have that, namely all the authentication-failure problem
    > cases. I'm not sure that the case where you're looking at an inactive
    > data directory but there's a live postmaster someplace else with the
    > same port number is important enough to justify new protocol all by
    > itself.

    Yes, that was my calculus too. I realized that we create session ids by
    merging the process id and backend start time, so I went ahead and added
    the postmaster start time epoch to the postmaster.pid file. While there
    is no way to pass back the postmaster start time from PQping, I added
    code to pg_ctl to make sure the time in the postmaster.pid file is not
    _before_ pg_ctl started running. We only check PQping() after we have
    started the postmaster ourselves, so it fits our needs.
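
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the pid file is modeled as an in-memory string, the start time is assumed to sit on a line of its own as a decimal epoch value, and the line number passed in is arbitrary. The function names pidfile_start_time() and pidfile_is_fresh() are hypothetical and do not come from the patch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Return the epoch value found on line "lineno" (1-based) of a pid-file
 * image held in memory, or (time_t) -1 if that line is missing or is not
 * a plain decimal number.
 */
static time_t
pidfile_start_time(const char *contents, int lineno)
{
    const char *p = contents;
    char       *end;
    long long   val;
    int         i;

    /* advance to the start of the requested line */
    for (i = 1; i < lineno; i++)
    {
        p = strchr(p, '\n');
        if (p == NULL)
            return (time_t) -1;
        p++;
    }

    errno = 0;
    val = strtoll(p, &end, 10);
    if (end == p || errno != 0 || (*end != '\n' && *end != '\0'))
        return (time_t) -1;

    return (time_t) val;
}

/*
 * Nonzero if the recorded postmaster start time is valid and not before
 * the moment pg_ctl itself started running.
 */
static int
pidfile_is_fresh(time_t postmaster_start, time_t pg_ctl_start)
{
    return postmaster_start != (time_t) -1 &&
        postmaster_start >= pg_ctl_start;
}
```

A pid file left behind by an older postmaster would carry a start time earlier than pg_ctl's own, so pidfile_is_fresh() would return 0 for it.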

    Patch attached.
  • Bruce Momjian at Dec 29, 2010 at 6:44 pm

    Bruce Momjian wrote:
    > Yes, that was my calculus too. I realized that we create session ids by
    > merging the process id and backend start time, so I went ahead and added
    > the postmaster start time epoch to the postmaster.pid file. While there
    > is no way to pass back the postmaster start time from PQping, I added
    > code to pg_ctl to make sure the time in the postmaster.pid file is not
    > _before_ pg_ctl started running. We only check PQping() after we have
    > started the postmaster ourselves, so it fits our needs.

    Tom suggested that there might be clock skew between pg_ctl and the
    postmaster, so I added a 2-second slop in checking the postmaster start
    time. Tom also wanted the connection information to be output all at
    once, but that causes a problem with detecting pre-9.1 servers so I
    avoided it.
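
To make the slop concrete, here is a minimal sketch of a start-time comparison that tolerates two seconds of skew. The macro name START_TIME_SLOP and the function started_recently_enough() are hypothetical names chosen for this example; only the two-second allowance comes from the message above.

```c
#include <time.h>

/* allowed difference, in seconds, between the two timestamps */
#define START_TIME_SLOP 2

/*
 * Nonzero if the start time recorded in postmaster.pid is no more than
 * START_TIME_SLOP seconds before pg_ctl's own start time.
 */
static int
started_recently_enough(time_t pidfile_start, time_t pg_ctl_start)
{
    return pidfile_start >= pg_ctl_start - START_TIME_SLOP;
}
```

Under this rule a recorded start time two seconds before pg_ctl's is still accepted, while one three seconds before is treated as belonging to an older postmaster.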

    Updated patch attached.
  • Bruce Momjian at Dec 31, 2010 at 10:26 pm

    Bruce Momjian wrote:
    > Bruce Momjian wrote:
    > > Yes, that was my calculus too. I realized that we create session ids by
    > > merging the process id and backend start time, so I went ahead and added
    > > the postmaster start time epoch to the postmaster.pid file. While there
    > > is no way to pass back the postmaster start time from PQping, I added
    > > code to pg_ctl to make sure the time in the postmaster.pid file is not
    > > _before_ pg_ctl started running. We only check PQping() after we have
    > > started the postmaster ourselves, so it fits our needs.
    >
    > Tom suggested that there might be clock skew between pg_ctl and the
    > postmaster, so I added a 2-second slop in checking the postmaster start
    > time. Tom also wanted the connection information to be output all at
    > once, but that causes a problem with detecting pre-9.1 servers so I
    > avoided it.

    Patch applied, and TODO item removed because patch mostly detects if a
    stale postmaster created the postmaster.pid file. The TODO was:

    Allow pg_ctl to work properly with configuration files located outside
    the PGDATA directory)
  • Peter Eisentraut at Jan 1, 2011 at 8:24 pm

    On Fri, 2010-12-31 at 17:26 -0500, Bruce Momjian wrote:
    > Patch applied, and TODO item removed because patch mostly detects if a
    > stale postmaster created the postmaster.pid file. The TODO was:

    Please fix this new compiler warning:

    pg_ctl.c:1787: warning: implicit declaration of function ‘time’
  • Bruce Momjian at Jan 1, 2011 at 8:55 pm

    Peter Eisentraut wrote:
    > On Fri, 2010-12-31 at 17:26 -0500, Bruce Momjian wrote:
    > > Patch applied, and TODO item removed because patch mostly detects if a
    > > stale postmaster created the postmaster.pid file. The TODO was:
    >
    > Please fix this new compiler warning:
    >
    > pg_ctl.c:1787: warning: implicit declaration of function ‘time’

    Thanks, done.
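
For readers unfamiliar with this warning: it appears when time() is called without a visible declaration, and including <time.h> supplies one. The snippet below is a generic illustration only; that the actual fix was an added include in pg_ctl.c is an assumption, and current_epoch() is a hypothetical name.

```c
#include <time.h>               /* declares time() and time_t */

/*
 * Return the current time as seconds since the epoch. With <time.h>
 * included, the compiler sees a proper prototype for time().
 */
static time_t
current_epoch(void)
{
    return time(NULL);
}
```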

Discussion Overview
group: pgsql-hackers
categories: postgresql
posted: Dec 24, '10 at 2:58p
active: Jan 1, '11 at 8:55p
posts: 7
users: 3
website: postgresql.org...
irc: #postgresql