FAQ
Greetings,
I've got a postgresql-8.1.10 instance running on a Linux system. Over
the weekend it suddenly started rejecting client connection attempts
with a "too many connected" error. At the time, we were using the
default max_connections of 100. I've doubled this for the time being
until we can determine what/why its misbehaving.

The real problem that I'm having is that I can't seem to find any
obvious way to determine what the current number of connections are to
the server. Googling a bit suggests that running "select count(*)
from pg_stat_activity;" would provide the answer, however I'm very
suspicious of the accuracy of the results. At this very minute, it
claims that there are just 3 connections, however I'm fairly confident
that there must be a lot more, as we've go about 200 client systems
running an in hour app that are continuously hitting the database
(every 30-60 seconds), plus a few web apps that are seeing some
activity (even over the weekend). So just 3 seems wrong to me,
especially when I run 'netstat -a | grep -c postgr' and I get 183 back
(granted many are in a TIME_WAIT state, but shouldn't that still count
as a valid connection to the database?). For example, here's the
current output:

select * from pg_stat_activity ;
datid | datname | procpid | usesysid | usename |
current_query | query_start | backend_start |
client_addr | client_port
-------+---------+---------+----------+-----------+------------------------------+-------------+-------------------------------+-------------+-------------
16636 | nightly | 29579 | 16386 | lfriedman | <command string
not enabled> | | 2009-07-26 13:07:38.139417-07 |
-1
16636 | nightly | 22822 | 10 | postgres | <command string
not enabled> | | 2009-07-26 13:00:02.723319-07 |
-1
65770 | minidvs | 22662 | 16386 | lfriedman | <command string
not enabled> | | 2009-07-26 09:48:22.958256-07 |
-1
(3 rows)


Another thing that makes no sense is why the client_addr is seemingly
null (is there some postgresql.conf option that I should be setting or
changing?).

I'll be very happy to hear that I'm misunderstanding something.

Thanks in advance for guidance.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman netllama@gmail.com
LlamaLand https://netllama.linux-sxs.org

Search Discussions

  • Tom Lane at Jul 26, 2009 at 8:23 pm

    Lonni J Friedman writes:
    I've got a postgresql-8.1.10 instance running on a Linux system. Over
    the weekend it suddenly started rejecting client connection attempts
    with a "too many connected" error. ...
    The real problem that I'm having is that I can't seem to find any
    obvious way to determine what the current number of connections are to
    the server. Googling a bit suggests that running "select count(*)
    from pg_stat_activity;" would provide the answer, however I'm very
    suspicious of the accuracy of the results.
    pg_stat_activity should be reasonably trustworthy, modulo the fact that
    the display might be a fraction of a second out-of-date.
    especially when I run 'netstat -a | grep -c postgr' and I get 183 back
    (granted many are in a TIME_WAIT state, but shouldn't that still count
    as a valid connection to the database?).
    No, it doesn't. That's a recently-dead connection. It may still be of
    interest to the TCP stack, but Postgres has forgotten about it.
    Another thing that makes no sense is why the client_addr is seemingly
    null
    Expected for a Unix-socket connection.

    I think you've got a lot of extremely transient connections and you just
    happened to have a burst of them at the same instant. Aside from the
    question of not being sure where to set max_connections, this is a
    pretty bad idea from a performance standpoint. PG backends aren't all
    that lightweight, so it's not good to fire one up for just a single
    query, which is what it sounds like your app is doing. Consider using a
    connection pooler.

    regards, tom lane
  • Lonni J Friedman at Jul 27, 2009 at 11:21 pm
    First, thanks for replying! Comments/questions below.

    On Sun, Jul 26, 2009 at 1:23 PM, Tom Lanewrote:
    Lonni J Friedman <netllama@gmail.com> writes:
    I've got a postgresql-8.1.10 instance running on a Linux system.  Over
    the weekend it suddenly started rejecting client connection attempts
    with a "too many connected" error. ...
    The real problem that I'm having is that I can't seem to find any
    obvious way to determine what the current number of connections are to
    the server.  Googling a bit suggests that running "select count(*)
    from pg_stat_activity;" would provide the answer, however I'm very
    suspicious of the accuracy of the results.
    pg_stat_activity should be reasonably trustworthy, modulo the fact that
    the display might be a fraction of a second out-of-date.
    Hrmm, that's not what I'm seeing. I'm finding that connections
    continue to appear in the table long after I've terminated a remote
    pqsl connection. I'm talking minutes or even hours. In one case, I
    rebooted a system that was supposedly connected, and it continued to
    appear in the table (or rather the IP associated with the system).
    especially when I run 'netstat -a | grep -c postgr' and I get 183 back
    (granted many are in a TIME_WAIT state, but shouldn't that still count
    as a valid connection to the database?).
    No, it doesn't.  That's a recently-dead connection.  It may still be of
    interest to the TCP stack, but Postgres has forgotten about it.
    Another thing that makes no sense is why the client_addr is seemingly
    null
    Expected for a Unix-socket connection.

    I think you've got a lot of extremely transient connections and you just
    happened to have a burst of them at the same instant.  Aside from the
    question of not being sure where to set max_connections, this is a
    pretty bad idea from a performance standpoint.  PG backends aren't all
    that lightweight, so it's not good to fire one up for just a single
    query, which is what it sounds like your app is doing.  Consider using a
    connection pooler.
    Thanks, I'll definitely look into that as time allows.
  • Tom Lane at Jul 27, 2009 at 11:31 pm

    Lonni J Friedman writes:
    On Sun, Jul 26, 2009 at 1:23 PM, Tom Lanewrote:
    pg_stat_activity should be reasonably trustworthy, modulo the fact that
    the display might be a fraction of a second out-of-date.
    Hrmm, that's not what I'm seeing. I'm finding that connections
    continue to appear in the table long after I've terminated a remote
    pqsl connection. I'm talking minutes or even hours.
    What are you doing to "terminate" these remote connections? What it
    sounds like is the connected server process isn't being told about the
    termination, and so it sits there waiting for input that will never
    come. We do enable TCP keepalive if available, so unless your server
    is running a seriously obsolete OS, it will eventually figure out the
    client is gone --- but that takes order-of-hours with the standard TCP
    timeout settings.

    Between that and your unreasonably large number of TIME_WAIT
    connections, it definitely seems like you've got TCP-level connection
    reliability problems. TIME_WAIT state should go away pretty fast too
    if things are working properly at the network level. I wonder whether
    you have a router that is dropping connections it thinks are idle.
    Beyond that my TCP expertise does not extend.

    regards, tom lane
  • Lonni J Friedman at Jul 27, 2009 at 11:37 pm

    On Mon, Jul 27, 2009 at 4:31 PM, Tom Lanewrote:
    Lonni J Friedman <netllama@gmail.com> writes:
    On Sun, Jul 26, 2009 at 1:23 PM, Tom Lanewrote:
    pg_stat_activity should be reasonably trustworthy, modulo the fact that
    the display might be a fraction of a second out-of-date.
    Hrmm, that's not what I'm seeing.  I'm finding that connections
    continue to appear in the table long after I've terminated a remote
    pqsl connection.  I'm talking minutes or even hours.
    What are you doing to "terminate" these remote connections?  What it
    sounds like is the connected server process isn't being told about the
    termination, and so it sits there waiting for input that will never
    come.  We do enable TCP keepalive if available, so unless your server
    is running a seriously obsolete OS, it will eventually figure out the
    client is gone --- but that takes order-of-hours with the standard TCP
    timeout settings.
    Normally, just quitting from psql, but as part of today's experiment I
    rebooted the system that the table claimed was still connected. The
    server is running Linux with a reasonably recent 2.6.x kernel.
    Between that and your unreasonably large number of TIME_WAIT
    connections, it definitely seems like you've got TCP-level connection
    reliability problems.  TIME_WAIT state should go away pretty fast too
    if things are working properly at the network level.  I wonder whether
    you have a router that is dropping connections it thinks are idle.
    Beyond that my TCP expertise does not extend.
    The TIME_WAIT entries do go away fairly quickly, but that's not what
    I'm looking at now. I'm talking about the content of the
    pg_stat_activity table.
  • Tom Lane at Jul 28, 2009 at 2:02 am

    Lonni J Friedman writes:
    On Mon, Jul 27, 2009 at 4:31 PM, Tom Lanewrote:
    What are you doing to "terminate" these remote connections?  What it
    sounds like is the connected server process isn't being told about the
    termination, and so it sits there waiting for input that will never
    come.
    Normally, just quitting from psql, but as part of today's experiment I
    rebooted the system that the table claimed was still connected. The
    server is running Linux with a reasonably recent 2.6.x kernel.
    Hm, but what's the client-side OS? A reasonable OS should send a
    connection close notification (TCP RST) when the psql process dies,
    even if you managed to kill it in a way that prevented psql from
    closing the connection for itself. However, if that didn't happen
    for some reason, reboot would not make things better. It would just
    guarantee that the OS no longer had any memory of the connection either.

    It still sounds like your problems are fundamentally network-level
    problems and not Postgres problems... but it's hard to tell from
    here whether it's client-side software or network infrastructure
    doing it to you.

    regards, tom lane
  • Lonni J Friedman at Jul 28, 2009 at 2:17 am

    On Mon, Jul 27, 2009 at 7:02 PM, Tom Lanewrote:
    Lonni J Friedman <netllama@gmail.com> writes:
    On Mon, Jul 27, 2009 at 4:31 PM, Tom Lanewrote:
    What are you doing to "terminate" these remote connections?  What it
    sounds like is the connected server process isn't being told about the
    termination, and so it sits there waiting for input that will never
    come.
    Normally, just quitting from psql, but as part of today's experiment I
    rebooted the system that the table claimed was still connected.  The
    server is running Linux with a reasonably recent 2.6.x kernel.
    Hm, but what's the client-side OS?  A reasonable OS should send a
    connection close notification (TCP RST) when the psql process dies,
    even if you managed to kill it in a way that prevented psql from
    closing the connection for itself.  However, if that didn't happen
    for some reason, reboot would not make things better.  It would just
    guarantee that the OS no longer had any memory of the connection either.
    The client side is Linux too.
    It still sounds like your problems are fundamentally network-level
    problems and not Postgres problems... but it's hard to tell from
    here whether it's client-side software or network infrastructure
    doing it to you.
    ok, thanks



    --
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    L. Friedman netllama@gmail.com
    LlamaLand https://netllama.linux-sxs.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedJul 26, '09 at 8:11p
activeJul 28, '09 at 2:17a
posts7
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Lonni J Friedman: 4 posts Tom Lane: 3 posts

People

Translate

site design / logo © 2021 Grokbase