I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
seems to be writing data to disk at rates that I think are
disproportional to the update load imposed on the database. I am
looking for ways to determine the cause of this I/O.

As an example, here is a typical graph produced by Munin:

http://purefiction.net/paste/pg-iostat.png

Running an hourly iostat produces this output:

Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 43.50 0.21 0.04 427915 72736
sda 43.62 0.21 0.04 428183 88904
sda 43.74 0.21 0.05 428440 104877
sda 43.90 0.21 0.06 428808 124681
sda 44.06 0.21 0.07 429111 145447
sda 44.27 0.21 0.08 429532 170317
sda 44.46 0.21 0.09 429985 193594

In other words, it's reading about 400MB/hour and writing around
15-20GB/hour, or exactly 118GB during the last six hours. To
determine how well this correlates to the actual inserts and updates
being performed on the database, I ran a loop alongside iostat that
executed "select sum(n_tup_upd), sum(n_tup_ins) from
pg_stat_all_tables" against PostgreSQL every hour and output the
difference. Here are a few samples:
delta_upd | delta_ins |
+-----------+-----------+
7111 | 2343 |
7956 | 2302 |
7876 | 2181 |
9269 | 2477 |
8553 | 2205 |
For the write numbers to match the tuple numbers, each updated/
inserted tuple would have to average at least 1.5MB (15 GB divided by
10,000 tuples), which is not the case; the total size of the raw
tuples updated/inserted during the above session probably does not
exceed a couple of megabytes. Even considering overhead, page size,
MVCC, etc., this does not compute.

I have not narrowed this explicitly down to PostgreSQL, but since the
stock Linux kernel we are running does not offer per-process I/O
statistics, I cannot determine this for sure. However, except for the
PostgreSQL database, everything else on the box should be identical
to what we are running on other boxes, which are not exhibiting the
same kind of load. Oh, and I have PostgreSQL logging turned off.

Note that PostgreSQL's performance in itself seems fine, and
according to top/ps it's only very rarely in iowait.

Alexander.

Search Discussions

  • Tom Lane at Oct 2, 2006 at 3:50 pm

    Alexander Staubo writes:
    I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
    seems to be writing data to disk at rates that I think are
    disproportional to the update load imposed on the database. I am
    looking for ways to determine the cause of this I/O.
    Are you sure that iostat is to be trusted? The read numbers in
    particular look suspiciously uniform ... it would be a strange
    query load that would create a read demand changing less than 1%
    from hour to hour, unless perhaps that represented the disk's
    saturation point, which is not the case if you're not seeing
    obvious performance problems.

    regards, tom lane
  • Alexander Staubo at Oct 2, 2006 at 5:25 pm

    On Oct 2, 2006, at 17:50 , Tom Lane wrote:

    Alexander Staubo <alex@purefiction.net> writes:
    I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
    seems to be writing data to disk at rates that I think are
    disproportional to the update load imposed on the database. I am
    looking for ways to determine the cause of this I/O.
    Are you sure that iostat is to be trusted?
    No. :) But iostat reads directly from /dev/diskstats, which should be
    reliable. Of course, it still doesn't say anything about which
    process is doing the writing; for that I would need to install the
    atop kernel patches or similar.

    ...
    The read numbers in
    particular look suspiciously uniform ... it would be a strange
    query load that would create a read demand changing less than 1%
    from hour to hour, unless perhaps that represented the disk's
    saturation point, which is not the case if you're not seeing
    obvious performance problems.
    They are not uniform at all -- they correlate perfectly with the web
    traffic; it just so happens that the samples I quoted were from peak
    hours. Take a look at the Munin graph. (The spikes correspond to
    scheduled maintenance tasks like backups.)

    Alexander.
  • Alexander Staubo at Oct 2, 2006 at 6:20 pm
    I'm surprised that I have to reply to myself, since in hindsight this
    should be bloody obvious: It's the pgsql_tmp directory. I just
    monitored the file creation in that directory, and found PostgreSQL
    to be creating huge temporary, extremely short-lived files ranging
    from 1MB to 20MB in size. I increased work_mem to 25MB -- should be
    perfectly safe on a box with 4GB of RAM, I hope - and the iostat
    traffic seems to have dropped to near zero.

    I'm a bit baffled as to why PostgreSQL would ever be sorting 20MB of
    data in the first place, even with ~12 connections running queries
    concurrently, but I suppose I will have to look more closely at our
    query patterns.

    Alexander.
    On Oct 2, 2006, at 19:25 , Alexander Staubo wrote:
    On Oct 2, 2006, at 17:50 , Tom Lane wrote:

    Alexander Staubo <alex@purefiction.net> writes:
    I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
    seems to be writing data to disk at rates that I think are
    disproportional to the update load imposed on the database. I am
    looking for ways to determine the cause of this I/O.
    Are you sure that iostat is to be trusted?
    No. :) But iostat reads directly from /dev/diskstats, which should
    be reliable. Of course, it still doesn't say anything about which
    process is doing the writing; for that I would need to install the
    atop kernel patches or similar.

    ...
    The read numbers in
    particular look suspiciously uniform ... it would be a strange
    query load that would create a read demand changing less than 1%
    from hour to hour, unless perhaps that represented the disk's
    saturation point, which is not the case if you're not seeing
    obvious performance problems.
    They are not uniform at all -- they correlate perfectly with the
    web traffic; it just so happens that the samples I quoted were from
    peak hours. Take a look at the Munin graph. (The spikes correspond
    to scheduled maintenance tasks like backups.)

    Alexander.


    ---------------------------(end of
    broadcast)---------------------------
    TIP 9: In versions below 8.0, the planner will ignore your desire to
    choose an index scan if your joining column's datatypes do not
    match

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedOct 2, '06 at 3:37p
activeOct 2, '06 at 6:20p
posts4
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Alexander Staubo: 3 posts Tom Lane: 1 post

People

Translate

site design / logo © 2022 Grokbase