This thread was renamed. It used to be: "shared_buffers Question". The
old thread kind of died out. I'm hoping to get some more direction by
rephrasing the problem, along with some extra observations I've
recently made.

The core of the problem is that Postgres is filling up my hard drive
with swap files at the rate of around 3 to 7 GB per week (that's
Gigabytes, not Megabytes). At this rate it takes roughly two months to
fill up my 40 GB hard drive with swap files. When the hard drive is
full, Postgres crashes and all the 200 connected clients hang. It's not
pretty. I can see the swap files and watch them grow over time. They
are located on my Mac OS 10.3.2 eMac at /var/vm/. I know the swap files
come from Postgres processes because, if I stop the Postgres server,
the swap files go away and the hard drive space is freed up.
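To put numbers on that growth, a small script along these lines could be run periodically (the /var/vm path comes from the report above; the total_swap_bytes helper name is my own, purely illustrative):

```python
import os

def total_swap_bytes(vm_dir="/var/vm"):
    """Sum the sizes of the swapfileN files in vm_dir."""
    total = 0
    for name in os.listdir(vm_dir):
        if name.startswith("swapfile"):
            total += os.path.getsize(os.path.join(vm_dir, name))
    return total

if __name__ == "__main__":
    # One sample; run from cron to build a growth curve over days.
    print(total_swap_bytes())
```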

When I look at each individual Postgres back-end process in the
Activity Monitor, each one seems to take up a lot of memory. Processes
usually jump to 60.04 MB of "virtual memory" soon after they are first
created. However, "real memory" use ranges 1 MB to 10 MB per process.

I'm looking for advice on what settings to look at in order to solve
the problem of my hard drive filling up with swap files. My guess is
that it must be something I'm doing, since I would find it hard to
imagine that Postgres has an *innate* memory leak that big on OS X.
Here are the only settings I have customized for my Postgres
installation, in case it helps:

In /etc/profile:
ulimit -u 512 # increases the max processes per user (from 100 to 512)
at the shell level.
ulimit -n 40000 # increases the max number of open files allowed at the
shell level (used to be 8000).

In postgresql.conf:
max_connections = 200
shared_buffers = 2000

In /etc/rc:
sysctl -w kern.sysv.shmmax=167772160
sysctl -w kern.sysv.shmall=65536

In /etc/sysctl.conf:
kern.maxproc=2048
kern.maxprocperuid=512 # Turn up the maxproc per user
kern.maxfiles=40000 # Turn up max files
kern.maxfilesperproc=30000 # Turn up max files per process
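For context, the shared-memory arithmetic behind these settings works out roughly as follows (a back-of-envelope sketch; it assumes PostgreSQL's default 8 KB buffer page, and that kern.sysv.shmall is counted in 4 KB pages):

```python
# Shared memory implied by shared_buffers alone (default 8 KB pages).
shared_buffers = 2000               # from postgresql.conf
buffers_bytes = shared_buffers * 8192
print(buffers_bytes / 2**20)        # -> 15.625 (MB), in the ballpark of
                                    # the ~16.9M RSHRD top reports later

# Kernel limits set in /etc/rc.
shmmax = 167772160                  # max segment size, in bytes
shmall = 65536                      # total shm, in (assumed) 4 KB pages
print(shmmax / 2**20)               # -> 160.0 (MB)
print(shmall * 4096 / 2**20)        # -> 256.0 (MB)
```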


Thanks for any insight you can provide!

Joe

  • Alvaro Herrera at Aug 20, 2004 at 4:15 pm

    On Fri, Aug 20, 2004 at 10:25:20AM -0500, Joe Lester wrote:
    This thread was renamed. It used to be: "shared_buffers Question".
    The old thread kind of died out. I'm hoping to get some more direction
    by rephrasing the problem, along with some extra observations I've
    recently made.

    The core of the problem is that Postgres is filling up my hard drive
    with swap files at the rate of around 3 to 7 GB per week (that's
    Gigabytes not Megabytes).
    Maybe you said this before, but I didn't follow the previous thread.
    What do you mean by swap files? Where are those files located? Can you
    show a listing of them?

    I assume, of course, that your data proper does not grow linearly with
    those "swap files," does it?

    --
    Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
    If you don't know where you're going, you'll most likely end up somewhere else.
  • Joe Lester at Aug 20, 2004 at 4:46 pm
    Thanks for the response!
    Maybe you said this before, but I didn't follow the previous thread.
    What do you mean by swap files?
    By swap files I mean the files that contain the memory that the OS is
    maintaining on the hard disk, after physical RAM has overflowed.
    Where are those files located?
    I can see the swap files and watch them grow over time. They are
    located on my Mac OS 10.3.2 eMac at /var/vm/.
    Can you show a listing of them?
    Right now there are not that many since I just restarted the server
    recently. But already, they total 3 GB.

    officelink:/var/vm root# ls -lh
    drwx--x--x 14 root wheel 476B 17 Feb 2004 app_profile
    -rw------T 1 root wheel 64M 31 Jul 09:10 swapfile0
    -rw------T 1 root wheel 64M 31 Jul 15:13 swapfile1
    -rw------T 1 root wheel 128M 31 Jul 16:08 swapfile2
    -rw------T 1 root wheel 256M 3 Aug 13:20 swapfile3
    -rw------T 1 root wheel 512M 18 Aug 14:29 swapfile4
    -rw------T 1 root wheel 512M 18 Aug 16:24 swapfile5
    -rw------T 1 root wheel 512M 18 Aug 23:30 swapfile6
    -rw------T 1 root wheel 512M 19 Aug 12:59 swapfile7
    -rw------T 1 root wheel 512M 20 Aug 09:56 swapfile8
    I assume, of course, that your data proper does not grow linearly with
    those "swap files," does it?
    The total amount of disk space consumed by the swap files vastly
    exceeds the size of my postgres data directory, if that's what you're
    asking, which is about 300 MB or so.
  • Alvaro Herrera at Aug 20, 2004 at 4:57 pm
    On Fri, Aug 20, 2004 at 11:45:18AM -0500, Joe Lester wrote:

    Joe,
    Maybe you said this before, but I didn't follow the previous thread.
    What do you mean by swap files?
    By swap files I mean the files that contain the memory that the OS is
    maintaining on the hard disk, after physical RAM has overflowed.
    Wow, this would be the most spectacular memory leak I have ever seen.
    How do you know these files are related to the Postgres processes?
    Could we see a top extract, or some equivalent showing a process list
    and their memory sizes?

    --
    Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
    Hahaha! I was only being serious!
  • Joe Lester at Aug 20, 2004 at 6:10 pm

    Wow, this would be the most spectacular memory leak I have ever seen.
    How do you know these files are related to the Postgres processes?
    Because if I stop the Postgres server, the swap files go away and the
    hard drive space is freed up.
    Could we see a top extract, or some equivalent showing a process list
    and their memory sizes?
    Here's a partial top extract. As noted before, the VSIZE is very high
    on all.

    Processes: 231 total, 3 running, 228 sleeping... 300 threads 13:05:16
    Load Avg: 0.12, 0.07, 0.02 CPU usage: 6.3% user, 15.9% sys, 77.8% idle
    SharedLibs: num = 102, resident = 14.0M code, 1.21M data, 2.77M LinkEdit
    MemRegions: num = 9888, resident = 134M + 7.73M private, 32.9M shared
    PhysMem: 78.6M wired, 283M active, 141M inactive, 504M used, 7.82M free
    VM: 12.1G + 70.5M 465206(0) pageins, 1792391(0) pageouts

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14326 top 19.7% 0:03.13 1 17 26 364K 384K 740K 27.1M
    14293 bash 0.0% 0:00.13 1 12 18 124K 852K 796K 18.2M
    14291 sshd 0.0% 0:00.49 1 15 42 504K 884K 1.43M 30.0M
    14290 postgres 0.0% 0:00.31 1 9 32 860K 16.9M 6.90M 59.5M
    14289 ftpd 0.0% 0:00.02 1 15 34 152K 640K 772K 27.4M
    14264 postgres 0.0% 0:00.31 1 9 32 860K 16.9M 6.32M 59.5M
    14235 postgres 0.0% 0:01.36 1 9 33 880K 16.9M 9.62M 60.0M
    14224 postgres 0.0% 0:00.27 1 9 32 856K 16.9M 6.26M 59.5M
    14220 postgres 0.0% 0:00.24 1 9 32 840K 16.9M 6.50M 59.5M
    14218 postgres 0.0% 0:39.17 1 9 32 860K 16.9M 10.7M 59.5M
    14113 postgres 0.0% 0:00.21 1 9 32 648K 16.9M 4.69M 59.5M
    14075 postgres 0.0% 1:19.88 1 9 33 756K 16.9M 7.84M 60.0M
    14063 postgres 0.0% 0:00.46 1 9 33 680K 16.9M 7.38M 60.0M
    14061 postgres 0.0% 0:00.18 1 9 32 616K 16.9M 4.48M 59.5M
    14058 postgres 0.0% 0:00.60 1 9 33 720K 16.9M 6.53M 60.0M
    14055 postgres 0.0% 0:00.17 1 9 32 496K 16.9M 1.77M 59.5M
    14028 postgres 0.0% 0:00.18 1 9 32 448K 16.9M 1.68M 59.5M
    14015 postgres 0.0% 0:18.61 1 9 33 844K 16.9M 8.61M 60.0M
    14013 postgres 0.0% 0:19.65 1 9 33 784K 16.9M 7.21M 60.0M
    14009 postgres 0.0% 2:05.01 1 9 33 940K 16.9M 10.9M 60.0M
    13980 postgres 0.0% 0:00.47 1 9 32 804K 16.9M 2.70M 59.5M
    13444 postgres 0.0% 0:16.70 1 9 33 792K 16.9M 10.7M 60.0M
    13415 postgres 0.0% 0:02.03 1 9 33 936K 16.9M 6.74M 60.0M
    13409 postgres 0.0% 0:01.75 1 9 32 508K 16.9M 1.96M 59.5M
    13355 postgres 0.0% 0:01.75 1 9 33 884K 16.9M 5.50M 60.0M
    13339 postgres 0.0% 0:19.49 1 9 33 716K 16.9M 7.21M 60.0M
    13337 postgres 0.0% 0:01.93 1 9 33 848K 16.9M 3.56M 60.0M
    13321 postgres 0.0% 0:01.88 1 9 33 868K 16.9M 5.89M 60.0M
  • Jeff at Aug 20, 2004 at 6:28 pm

    On Aug 20, 2004, at 2:09 PM, Joe Lester wrote:
    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14326 top 19.7% 0:03.13 1 17 26 364K 384K 740K 27.1M
    14293 bash 0.0% 0:00.13 1 12 18 124K 852K 796K 18.2M
    14291 sshd 0.0% 0:00.49 1 15 42 504K 884K 1.43M 30.0M
    14290 postgres 0.0% 0:00.31 1 9 32 860K 16.9M 6.90M 59.5M
    On OS X the number to look at isn't VSIZE but RPRVT. VSIZE is the
    size in virtual memory (basically useless); RPRVT is the size of its
    "private memory", i.e. non-shared memory. It is a pretty good number
    to go by.

    But I believe the whole problem is that the eMac only has 512 MB of
    RAM and you simply don't have enough memory for that many connections.
    Even with each connection taking 2 MB of memory, that's 400 MB, and as
    you can see, most of them are taking 4-8 MB.

    If you cannot upgrade I'd recommend looking into something like pgpool
    so you can pool your connections (this results in having a lot fewer
    instances of postgres, thus saving tons of memory)
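    For reference, pooling along these lines is configured in pgpool's
    pgpool.conf; the fragment below is only a sketch with illustrative
    values, and the parameter names should be checked against the
    documentation for the pgpool version in use:

    ```
    # pgpool.conf (illustrative values, unverified for the 2004-era release)
    num_init_children = 32   # pool processes; caps concurrent backends
    max_pool = 4             # cached backend connections per pool process
    ```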

    If you watch top I'm guessing you'll see a TON of pageins/pageouts
    (watch the numbers change; it displays the difference in parentheses
    after updates).
  • Joe Lester at Aug 20, 2004 at 7:18 pm

    On OS X the number to look at isn't VSIZE but RPRVT. VSIZE is the
    size in virtual memory (basically useless); RPRVT is the size of its
    "private memory", i.e. non-shared memory. It is a pretty good number
    to go by.
    But I believe the whole problem is that the eMac only has 512 MB of
    RAM and you simply don't have enough memory for that many connections.
    Even with each connection taking 2 MB of memory, that's 400 MB, and as
    you can see, most of them are taking 4-8 MB.
    Thanks for the response, Jeff. The RPRVT, which you say is best to go
    by, shows only 4K-2MB per connection. That's less than the 4-8MB size
    you reference which I believe comes from the RSIZE column in top. So
    according to RSIZE I should add more RAM. But according to RPRVT I
    should be OK with 512 MB of RAM. I'm a little confused on which column
    to go by.

    Also, even if I need to add more RAM, it still strikes me as more than
    a little weird that a memory overflow situation gradually eats up all
    the space on my hard drive. That's the problem I'd really like to solve
    if possible.

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14378 postgres 0.0% 0:00.41 1 9 33 880K 17.8M 8.31M 60.0M
    14359 postgres 0.0% 0:00.90 1 9 32 860K 17.8M 4.12M 59.5M
    14346 postgres 0.0% 0:00.31 1 9 32 840K 17.8M 2.26M 59.5M
    14342 postgres 0.0% 0:01.08 1 9 33 880K 17.8M 7.86M 60.0M
    14290 postgres 0.0% 0:00.55 1 9 32 868K 17.8M 3.09M 59.5M
    14235 postgres 0.0% 0:03.39 1 9 33 880K 17.8M 7.68M 60.0M
    14224 postgres 0.0% 0:00.43 1 9 32 864K 17.8M 2.55M 59.5M
    14220 postgres 0.0% 0:00.40 1 9 33 888K 17.8M 3.02M 60.0M
    14218 postgres 0.0% 1:07.53 1 9 33 2.04M 17.8M 10.9M 60.0M
    14075 postgres 0.0% 1:40.98 1 9 33 872K 17.8M 10.5M 60.0M
    14063 postgres 0.0% 0:00.55 1 9 33 892K 17.8M 3.27M 60.0M
    If you cannot upgrade I'd recommend looking into something like pgpool
    so you can pool your connections (this results in having a lot fewer
    instances of postgres, thus saving tons of memory)
    Huh. Cool. Do you have any experience of what the performance hit would
    be for using pgpool?
  • Jeff at Aug 21, 2004 at 1:46 am

    On Aug 20, 2004, at 3:18 PM, Joe Lester wrote:

    Huh. Cool. Do you have any experience of what the performance hit
    would be for using pgpool?
    I've only had performance increases with pgpool. Then again, my
    connections are common of web apps (as web apps connect) - many many
    short lived connections.

    Given you have long lived connections.. I'm not sure if pgpool would
    help.
    It may, give it a whirl.

    however it seems Tom found the actual problem.
  • Bill Moran at Aug 20, 2004 at 5:02 pm

    Joe Lester wrote:

    Thanks for the response!
    Maybe you said this before, but I didn't follow the previous thread.
    What do you mean by swap files?
    By swap files I mean the files that contain the memory that the OS is
    maintaining on the hard disk, after physical RAM has overflowed.
    Where are those files located?
    I can see the swap files and watch them grow over time. They are
    located on my Mac OS 10.3.2 eMac at /var/vm/.
    Can you show a listing of them?
    Right now there are not that many since I just restarted the server
    recently. But already, they total 3 GB.

    officelink:/var/vm root# ls -lh
    drwx--x--x 14 root wheel 476B 17 Feb 2004 app_profile
    -rw------T 1 root wheel 64M 31 Jul 09:10 swapfile0
    -rw------T 1 root wheel 64M 31 Jul 15:13 swapfile1
    -rw------T 1 root wheel 128M 31 Jul 16:08 swapfile2
    -rw------T 1 root wheel 256M 3 Aug 13:20 swapfile3
    -rw------T 1 root wheel 512M 18 Aug 14:29 swapfile4
    -rw------T 1 root wheel 512M 18 Aug 16:24 swapfile5
    -rw------T 1 root wheel 512M 18 Aug 23:30 swapfile6
    -rw------T 1 root wheel 512M 19 Aug 12:59 swapfile7
    -rw------T 1 root wheel 512M 20 Aug 09:56 swapfile8
    I assume, of course, that your data proper does not grow linearly with
    those "swap files," does it?
    The total amount of disk space consumed by the swap files vastly
    exceeds the size of my postgres data directory, if that's what you're
    asking, which is about 300 MB or so.
    How many Postgres processes are running? Does this number increase
    with the memory usage?

    Simple fact of the matter is that you have apparently found a memory
    leak. How Mac OS X deals with swapping is (more or less) unimportant.

    I'm wondering, however, if you have a connection leak instead, i.e.
    is it possible that your client application is opening a whole bunch
    of connections and never closing them? You did show that you have
    a max # of connections of 200. That's pretty high, unless you've got
    a lot of RAM in that machine.

    A more accurate description of the problem would be:
    How many connections are actually open?
    How much memory is actually in use by Postgres processes? (The amount of
    swap in use is unimportant to the Postgres folks, it's an OS thing)

    --
    Bill Moran
    Potential Technologies
    http://www.potentialtech.com
  • Joe Lester at Aug 20, 2004 at 6:33 pm

    How many Postgres processes are running?
    146 right now. Most of them are idle at any given point in time.
    Does this number increase with
    the memory usage?
    No. There are 140 - 150 postgres processes running on average, but the
    memory usage keeps going up and up.
    Simple fact of the matter is that you have apparently found a memory
    leak. How Mac OS X deals with swapping is (more or less) unimportant.
    Wow, I sure hope not. I was really hoping to take care of the problem.
    I'm wondering, however, if you have a connection leak instead. i.e.
    is it possible that your client application is opening a whole bunch
    of connections and never closing them?
    No. The clients open only one connection (and hang onto it for dear
    life :-).
    You did show that you have
    a max # of connection of 200. That's pretty high, unless you've got
    a lot of RAM in that machine.
    I have 512 MB of RAM in the machine. The server is performing
    wonderfully. It's just that the swap files keep sprouting like weeds.
    A more accurate description of the problem would be:
    How many connections are actually open?
    146 right now. That's about average.
    How much memory is actually in use by Postgres processes? (The amount
    of swap in use is unimportant to the Postgres folks, it's an OS thing)
    This is where I could use some pointers. The following line is a top
    entry for a single postgres process. Hope that helps.

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14235 postgres 0.0% 0:01.36 1 9 33 880K 16.9M 9.62M 60.0M
  • Bill Moran at Aug 20, 2004 at 7:16 pm

    Joe Lester wrote:
    How many Postgres processes are running?
    146 right now. Most of them are idle at any given point in time.
    That's a lot for a machine with only 512M of RAM.
    I'm wondering, however, if you have a connection leak instead. i.e.
    is it possible that your client application is opening a whole bunch
    of connections and never closing them?
    No. The clients open only one connection (and hang onto it for dear
    life :-).
    If these clients aren't utilizing the database, it might be worthwhile
    to have them disconnect after a period of inactivity, and reconnect when
    things get busy again.
    You did show that you have
    a max # of connection of 200. That's pretty high, unless you've got
    a lot of RAM in that machine.
    I have 512 MB of RAM in the machine.
    That's not a lot of RAM. I have 512M in a machine that's only designed
    to handle 20 connections (although that's MS Windows + MSSQL ... but
    you get the idea ... we're talking a factor of 10 here)
    The server is performing
    wonderfully. It's just that the swap files keep sprouting like weeds.
    I would expect that if you ignore it for a while, eventually it will
    reach an equilibrium (where it's not increasing the amount of swap in
    use), but it will always hurt performance any time it has to page in
    or out.
    How much memory is actually in use by Postgres processes? (The amount
    of swap in use is unimportant to the Postgres folks, it's an OS thing)
    This is where I could use some pointers. The following line is a top
    entry for a single postgres process. Hope that helps.

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14235 postgres 0.0% 0:01.36 1 9 33 880K 16.9M 9.62M 60.0M
    Please don't wrap machine-generated output ... it makes it VERY difficult
    to understand.

    I'll defer this answer to Jeff, as he seems to know quite a bit more about
    how Darwin manages memory than I do.

    His recommendation to try pgpool was also good.

    --
    Bill Moran
    Potential Technologies
    http://www.potentialtech.com
  • Joe Lester at Aug 20, 2004 at 7:30 pm

    If these clients aren't utilizing the database, it might be worthwhile
    to have them disconnect after a period of inactivity, and reconnect
    when things get busy again.
    That's a good idea, except in the future, all the clients will be
    active most of the time. So, I'd like to get the server to the point
    where it can handle 150-200 client connections gracefully.
    I would expect that if you ignore it for a while, eventually it will
    reach an equilibrium (where it's not increasing the amount of swap in
    use), but it will always hurt performance any time it has to page in
    or out.
    Unfortunately, it does not reach an equilibrium. It just keeps eating
    disk space until it's all gone.
    Please don't wrap machine-generated output ... it makes it VERY
    difficult to understand.
    Please forgive me. What do you mean by "wrap machine-generated output"?
    I would love to oblige.
  • Bill Moran at Aug 20, 2004 at 8:01 pm

    Joe Lester wrote:

    If these clients aren't utilizing the database, it might be worthwhile
    to have them disconnect after a period of inactivity, and reconnect
    when things get busy again.
    That's a good idea, except in the future, all the clients will be
    active most of the time. So, I'd like to get the server to the point
    where it can handle 150-200 client connections gracefully.
    Ahh ...
    I would expect that if you ignore it for a while, eventually it will
    reach an equilibrium (where it's not increasing the amount of swap in
    use), but it will always hurt performance any time it has to page in
    or out.
    Unfortunately, it does not reach an equilibrium. It just keeps eating
    disk space until it's all gone.
    Well ... I was wrong, and per Tom's post, you've found a problem in
    Darwin/OS X.
    Please don't wrap machine-generated output ... it makes it VERY
    difficult to understand.
    Please forgive me. What do you mean by "wrap machine-generated output"?
    I would love to oblige.
    Well, you're wrapping everything. Notice how my part of the conversation
    above is ugly. It should look like this:
    Please don't wrap machine-generated output ... it makes it VERY difficult
    to understand.
    In the case of 'machine-generated output', I was specifically talking
    about top. You sent this:

    Processes: 231 total, 3 running, 228 sleeping... 300 threads
    13:05:16
    Load Avg: 0.12, 0.07, 0.02 CPU usage: 6.3% user, 15.9% sys, 77.8%
    idle
    SharedLibs: num = 102, resident = 14.0M code, 1.21M data, 2.77M
    LinkEdit
    MemRegions: num = 9888, resident = 134M + 7.73M private, 32.9M shared
    PhysMem: 78.6M wired, 283M active, 141M inactive, 504M used, 7.82M
    free
    VM: 12.1G + 70.5M 465206(0) pageins, 1792391(0) pageouts

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE
    VSIZE
    14326 top 19.7% 0:03.13 1 17 26 364K 384K 740K
    27.1M
    ...

    Notice how incredibly difficult it is to make sense of the top output.
    Compare with this:

    Processes: 231 total, 3 running, 228 sleeping... 300 threads 13:05:16
    Load Avg: 0.12, 0.07, 0.02 CPU usage: 6.3% user, 15.9% sys, 77.8% idle
    SharedLibs: num = 102, resident = 14.0M code, 1.21M data, 2.77M LinkEdit
    MemRegions: num = 9888, resident = 134M + 7.73M private, 32.9M shared
    PhysMem: 78.6M wired, 283M active, 141M inactive, 504M used, 7.82M free
    VM: 12.1G + 70.5M 465206(0) pageins, 1792391(0) pageouts

    PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
    14326 top 19.7% 0:03.13 1 17 26 364K 384K 740K 27.1M
    ...

    This is usually caused by a setting in your mail client that reads
    something like "wrap lines at 72 characters" being turned on.

    You should wrap your text at 72 chars when you're typing (so it
    displays readably on most mail programs), but it's not a good idea to
    arbitrarily wrap _all_ text in a message to any line length. Doing so
    usually ends up making a mess of some part of the message.

    On another note: sorry about leading you in the wrong direction on the
    problem, but I'm glad Tom was able to isolate it for you.

    --
    Bill Moran
    Potential Technologies
    http://www.potentialtech.com
  • Joe Lester at Aug 20, 2004 at 8:27 pm

    On Aug 20, 2004, at 3:01 PM, Bill Moran wrote:
    Please don't wrap machine-generated output ... it makes it VERY
    difficult to understand.
    This is usually caused by a setting in your mail client that reads
    something like "wrap lines at 72 characters" being turned on.

    You should wrap your text at 72 chars when you're typing (so it
    displays readably on most mail programs), but it's not a good idea to
    arbitrarily wrap _all_ text in a message to any line length. Doing so
    usually ends up making a mess of some part of the message.
    I'll try to be sensitive to that. Unfortunately, my mail client forces
    the text to wrap and provides no override preference. I'm using
    Mail.app, so if anyone knows of a workaround or solution, please let me
    know. Thanks.
  • Tom Lane at Aug 20, 2004 at 7:43 pm

    Bill Moran writes:
    Joe Lester wrote:
    I'm wondering, however, if you have a connection leak instead. i.e.
    is it possible that your client application is opening a whole bunch
    of connections and never closing them?
    No. The clients open only one connection (and hang onto it for dear
    life :-).
    If these clients aren't utilizing the database, it might be worthwhile
    to have them disconnect after a period of inactivity, and reconnect when
    things get busy again.
    If my theory is right, this would actually be counterproductive. The
    leak I think I'm seeing is associated with backend exit and so the way
    to slow it as much as possible is to prolong backend lifetime as much
    as possible. Joe, what is the mean lifetime of your connections anyway?
    I assume they don't stay up forever.

    regards, tom lane
  • Joe Lester at Aug 20, 2004 at 7:59 pm

    On Aug 20, 2004, at 2:43 PM, Tom Lane wrote:
    Bill Moran <wmoran@potentialtech.com> writes:
    Joe Lester wrote:
    I'm wondering, however, if you have a connection leak instead. i.e.
    is it possible that your client application is opening a whole bunch
    of connections and never closing them?
    No. The clients open only one connection (and hang onto it for dear
    life :-).
    If these clients aren't utilizing the database, it might be worthwhile
    to have them disconnect after a period of inactivity, and reconnect
    when
    things get busy again.
    If my theory is right, this would actually be counterproductive. The
    leak I think I'm seeing is associated with backend exit and so the way
    to slow it as much as possible is to prolong backend lifetime as much
    as possible. Joe, what is the mean lifetime of your connections
    anyway? I assume they don't stay up forever.
    They are "permanent connections", meaning that the same connection
    stays open on the server as long as the client application is running.
    And it's common for the clients to stay running for days at a time. I'd
    say the average length of a connection is 3 days.
  • Tom Lane at Aug 20, 2004 at 8:08 pm

    Joe Lester writes:
    And it's common for the clients to stay running for days at a time. I'd
    say the average length of a connection is 3 days.
    Uh-huh. Can you do anything to increase that?

    Another possibility for slowing the leakage rate (pending a real fix)
    is to decrease the size of your shared memory segment, ie, reduce
    shared_buffers and the other shared-memory sizing parameters as much
    as you can without killing performance.

    regards, tom lane
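    Tom's mitigation might look something like the following in
    postgresql.conf (values purely illustrative, not a tuning
    recommendation; wal_buffers is my example of another shared-memory
    sizing knob, so verify it against the docs for your version):

    ```
    shared_buffers = 1000   # halved from 2000: roughly 8 MB instead of 16 MB
    wal_buffers = 4         # another parameter that contributes to the segment
    ```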
  • Tom Lane at Aug 20, 2004 at 7:29 pm

    Joe Lester writes:
    Simple fact of the matter is that you have apparently found a memory
    leak. How Mac OS X deals with swapping is (more or less) unimportant.
    Wow, I sure hope not. I was really hoping to take care of the problem.
    I think what you've found is an OS X bug.

    I was able to replicate this behavior on OS X 10.3.5. All I did was
    start the postmaster and then start a continuous loop in a shell window:
    while true
    do
      psql -c "select count(*) from tenk1" regression
    done
    (tenk1 is just a test table with 10000 or so rows ... not very big.)
    After a few thousand iterations I have more swapfiles than I did before.
    The postmaster itself is certainly not leaking memory, and there are no
    backends lasting longer than a fraction of a second, but the machine is
    acting like it's got a problem. Watching top, I see the "free PhysMem"
    steadily decrease to zero and then bounce back up to about 645M (out of
    768M installed). Each time it bounces up, the VM pageouts count takes a
    jump (otherwise pageouts doesn't move), and there's also a burst of disk
    activity according to iostat. The cycle repeats every 45 seconds or so.
    Meanwhile, the entire system has become exceedingly sluggish (opening
    Safari is painful, for example, and even just switching front
    application is visibly slow).

    What I think is happening is that the system thinks that the memory
    associated with the exited backends is still in use, and hence
    faithfully hangs onto it and eventually swaps it out. Almost certainly,
    this bug is specifically triggered by our use of SysV shared memory.
    If Apple had this bug for every process spawned by OS X, they'd have
    noticed and fixed it long since ;-). But SysV shmem is not a popular
    feature on OS X (as evidenced by the ridiculously low default limits
    on it) and it's easy to imagine such a bug going unnoticed if it's
    only triggered by shmem usage.

    A crude estimate of the leakage rate I'm seeing is about 2.5MB per
    exited backend, which is in the same general area as the size of the
    shared memory segment, so it seems plausible that the system is somehow
    treating an "afterimage" of the shmem segment as live data that it needs
    to swap out.
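    That estimate can be cross-checked against the figures quoted earlier
    in the thread (back-of-envelope arithmetic only; the leak-per-exit
    premise is Tom's, the connection count and lifetime are Joe's):

    ```python
    # Assume the leak per backend exit is roughly the size of the shared
    # segment (~16 MB on Joe's server), with ~150 connections that each
    # live about 3 days on average.
    segment_mb = 16
    connections = 150
    lifetime_days = 3

    exits_per_week = connections / lifetime_days * 7        # -> 350.0
    leak_gb_per_week = exits_per_week * segment_mb / 1024   # ~5.5 GB
    print(round(exits_per_week), round(leak_gb_per_week, 1))  # -> 350 5.5
    ```

    which lands squarely in the observed 3-7 GB per week range.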

    BTW, I tried hacking the backend to forcibly shmdt() just before exit,
    just to see if that would work around the problem. No go; doesn't seem
    to change the behavior at all. I can't think of anything else we could
    do at the application level to dodge the problem.

    Time to file a bug report. With Apple, not with us.

    regards, tom lane
  • Joe Lester at Aug 20, 2004 at 7:40 pm

    On Aug 20, 2004, at 2:28 PM, Tom Lane wrote:
    I think what you've found is an OS X bug.

    I was able to replicate this behavior on OS X 10.3.5. All I did was
    start the postmaster and then start a continuous loop in a shell
    window:
    while true
    do
    psql -c "select count(*) from tenk1" regression
    done
    (tenk1 is just a test table with 10000 or so rows ... not very big.)
    After a few thousand iterations I have more swapfiles than I did
    before. The postmaster itself is certainly not leaking memory, and
    there are no backends lasting longer than a fraction of a second, but
    the machine is acting like it's got a problem. Watching top, I see the
    "free PhysMem" steadily decrease to zero and then bounce back up to
    about 645M (out of 768M installed). Each time it bounces up, the VM
    pageouts count takes a jump (otherwise pageouts doesn't move), and
    there's also a burst of disk activity according to iostat. The cycle
    repeats every 45 seconds or so. Meanwhile, the entire system has
    become exceedingly sluggish (opening Safari is painful, for example,
    and even just switching front application is visibly slow).
    Yes, my system also becomes increasingly sluggish in the manner you
    describe as the swap files increase in number and size.
    Time to file a bug report. With Apple, not with us.
    I will. Thanks Tom!

Discussion Overview
group: pgsql-general
categories: postgresql
posted: Aug 20, '04 at 3:25p
active: Aug 21, '04 at 1:46a
posts: 19
users: 5
website: postgresql.org
irc: #postgresql
