Dear all,

For the last few days, I have been researching PostgreSQL performance
tuning because of the slow speed of my server.
My application selects about 100,000 rows from a MySQL database, processes
them, and inserts them into 2 PostgreSQL tables using about 45 connections.

I set my PostgreSQL parameters in postgresql.conf as below (OS: Ubuntu,
RAM: 16 GB, PostgreSQL: 8.4.2):

max_connections = 80
shared_buffers = 2048MB
work_mem = 32MB
maintenance_work_mem = 512MB
fsync=off
full_page_writes=off
synchronous_commit=off
checkpoint_segments = 32
checkpoint_completion_target = 0.7
effective_cache_size = 4096MB


After this, I moved my pg_xlog directory to a separate directory outside
the data directory and symlinked it.


My application issues insert statements through PostgreSQL connections only.

Please let me know if I am missing any other important configuration.



Thanks


  • Dusan Misic at Aug 4, 2011 at 8:19 am
    Hi Adarsh,

    Have you set checkpoint_segments and checkpoint_completion_target the right
    way?

    Tuning these parameters is a MUST if you want good write performance.

    See this link for more information:
    http://www.postgresql.org/docs/current/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS
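
    As a rough illustration only (these numbers are invented for the example,
    not a recommendation for your hardware; the right values depend on your
    workload and disks), a write-heavy server is often configured along these
    lines:

    checkpoint_segments = 64               # more WAL between checkpoints
    checkpoint_completion_target = 0.9     # spread checkpoint I/O over more time
    checkpoint_timeout = 15min             # default is 5min

    Larger values also mean longer crash recovery, so check the docs above
    before copying anything.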

    Cheers,

    Dusan
    On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma wrote:

    Dear all,

    For the last few days, I have been researching PostgreSQL performance tuning
    because of the slow speed of my server.
    My application selects about 100,000 rows from a MySQL database, processes
    them, and inserts them into 2 PostgreSQL tables using about 45 connections.

    I set my PostgreSQL parameters in postgresql.conf as below (OS: Ubuntu,
    RAM: 16 GB, PostgreSQL: 8.4.2):

    max_connections = 80
    shared_buffers = 2048MB
    work_mem = 32MB
    maintenance_work_mem = 512MB
    fsync=off
    full_page_writes=off
    synchronous_commit=off
    checkpoint_segments = 32
    checkpoint_completion_target = 0.7
    effective_cache_size = 4096MB


    After this, I moved my pg_xlog directory to a separate directory outside the
    data directory and symlinked it.


    My application issues insert statements through PostgreSQL connections only.

    Please let me know if I am missing any other important configuration.



    Thanks



  • Willy-Bas Loos at Aug 4, 2011 at 8:34 am

    On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma wrote:
    After this, I moved my pg_xlog directory to a separate directory outside the
    data directory and symlinked it.
    (...)
    Please let me know if I am missing any other important configuration.
    Moving the pg_xlog to a different directory only helps when that
    directory is on a different harddisk (or whatever I/O device).

    HTH,

    WBL
    --
    "Patriotism is the conviction that your country is superior to all
    others because you were born in it." -- George Bernard Shaw
  • Scott Marlowe at Aug 4, 2011 at 8:46 am

    On Thu, Aug 4, 2011 at 2:34 AM, Willy-Bas Loos wrote:
    On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma wrote:
    After this, I moved my pg_xlog directory to a separate directory outside the
    data directory and symlinked it.
    (...)
    Please let me know if I am missing any other important configuration.
    Moving the pg_xlog to a different directory only helps when that
    directory is on a different harddisk (or whatever I/O device).
    Not entirely true. By simply being on a different mounted file
    system this moves the fsync calls on the pg_xlog directories off of
    the same file system as the main data store. Previous testing has
    shown improvements in performance from just using a different file
    system.

    That said, the only real solution to a heavy write load is a heavy
    duty IO subsystem, with lots of drives and battery backed cache.
  • Dusan Misic at Aug 4, 2011 at 9:03 am
    Scott is right. His answer solves the problem in the long run. Even if your
    write load increases, it will perform fast enough.

    For now, try increasing checkpoint_segments, restart Postgres for the new
    settings to take effect, and try again with your write load.

    If you are not satisfied with the write speed, then it is time to upgrade
    your storage system, i.e. to increase I/O performance.
    On Aug 4, 2011 10:46 AM, "Scott Marlowe" wrote:
    On Thu, Aug 4, 2011 at 2:34 AM, Willy-Bas Loos wrote:
    On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma wrote:
    After this, I moved my pg_xlog directory to a separate directory outside the
    data directory and symlinked it.
    (...)
    Please let me know if I am missing any other important configuration.
    Moving the pg_xlog to a different directory only helps when that
    directory is on a different harddisk (or whatever I/O device).
    Not entirely true. By simply being on a different mounted file
    system this moves the fsync calls on the pg_xlog directories off of
    the same file system as the main data store. Previous testing has
    shown improvements in performance from just using a different file
    system.

    That said, the only real solution to a heavy write load is a heavy
    duty IO subsystem, with lots of drives and battery backed cache.

  • Nicholson, Brad (Toronto, ON, CA) at Aug 4, 2011 at 12:50 pm

    -----Original Message-----
    From: p[email protected] [mailto:pgsql-performance-
    [email protected]] On Behalf Of Scott Marlowe
    Sent: Thursday, August 04, 2011 4:46 AM
    To: Willy-Bas Loos
    Cc: Adarsh Sharma; [email protected]
    Subject: Re: [PERFORM] Need to tune for Heavy Write

    Moving the pg_xlog to a different directory only helps when that
    directory is on a different harddisk (or whatever I/O device).
    Not entirely true. By simply being on a different mounted file
    system this moves the fsync calls on the pg_xlog directories off of
    the same file system as the main data store. Previous testing has
    shown improvements in performance from just using a different file
    system.
    Is this still the case for xfs or ext4, where fsync properly flushes only the correct blocks to disk, or was this referring to the good old ext3 "flush everything on fsync" issue?

    Brad.
  • Scott Marlowe at Aug 4, 2011 at 4:40 pm

    On Thu, Aug 4, 2011 at 6:41 AM, Nicholson, Brad (Toronto, ON, CA) wrote:
    -----Original Message-----
    From: p[email protected] [mailto:pgsql-performance-
    [email protected]] On Behalf Of Scott Marlowe
    Sent: Thursday, August 04, 2011 4:46 AM
    To: Willy-Bas Loos
    Cc: Adarsh Sharma; [email protected]
    Subject: Re: [PERFORM] Need to tune for Heavy Write

    Moving the pg_xlog to a different directory only helps when that
    directory is on a different harddisk (or whatever I/O device).
    Not entirely true.  By simply being on a different  mounted file
    system this moves the fsync calls on the pg_xlog directories off of
    the same file system as the main data store.  Previous testing has
    shown improvements in performance from just using a different file
    system.
    Is this still the case for xfs or ext4, where fsync properly flushes only the correct blocks to disk, or was this referring to the good old ext3 "flush everything on fsync" issue?
    Good question. One I do not know the answer to. Since I run my dbs
    with separate pg_xlog drive sets I've never been able to test that.
  • Dusan Misic at Aug 4, 2011 at 8:43 am
    To put it simply, you need to set checkpoint_segments much higher than your
    current value!

    Link: wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

    On Aug 4, 2011 6:57 AM, "Adarsh Sharma" wrote:
  • Craig Ringer at Aug 4, 2011 at 1:50 pm

    On 4/08/2011 12:56 PM, Adarsh Sharma wrote:
    Dear all,

    For the last few days, I have been researching PostgreSQL performance
    tuning because of the slow speed of my server.
    My application selects about 100,000 rows from a MySQL database, processes
    them, and inserts them into 2 PostgreSQL tables using about 45 connections.
    Why 45?

    Depending on your disk subsystem, that may be way too many for optimum
    throughput. Or too few, for that matter.

    Also, how are you doing your inserts? Are they being done in a single
    big transaction per connection, or at least in reasonable chunks? If
    you're doing stand-alone INSERTs autocommit-style you'll see pretty
    shoddy performance.

    Have you looked into using COPY to bulk load your data? Possibly using
    the libpq or jdbc copy APIs, or possibly using server-side COPY?
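
    For example, a rough sketch (the table, columns, and file path are invented
    here just to show the shape of it):

    COPY target_table (id, payload) FROM '/tmp/batch.csv' WITH CSV;

    The same thing can be streamed from the client with psql's \copy or the
    libpq/JDBC copy interfaces, which avoids needing the file on the server.
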
    fsync=off full_page_writes=off synchronous_commit=off
    !!!!

    I hope you don't want to KEEP that data if you have a hardware fault or
    power loss. Setting fsync=off is pretty much saying "I don't mind if you
    eat my data".

    Keep. Really. Really. Good. Backups.

    --
    Craig Ringer
  • Kevin Grittner at Aug 4, 2011 at 1:57 pm

    Adarsh Sharma wrote:

    Postgres : 8.4.2
    You should definitely update to a more recent bug-fix release:

    http://www.postgresql.org/support/versioning
    RAM : 16 GB
    effective_cache_size = 4096MB
    That should probably be more like 12GB to 15GB. It probably won't
    affect the load time here, but could affect other queries.
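
    For instance (just an illustration; anywhere in that range is reasonable
    for a 16GB machine that is mostly dedicated to the database):

    effective_cache_size = 12GB
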
    My application selects about 100,000 rows from a MySQL database,
    processes them, and inserts them into 2 PostgreSQL tables using about 45
    connections.
    How many cores do you have? How many disk spindles in what sort of
    array, with what sort of controller?

    Quite possibly you can improve performance dramatically by not
    turning loose a "thundering herd" of competing processes.

    Can you load the target table without indexes and then build the
    indexes?

    Can you use the COPY command (or at least prepared statements) for
    the inserts to minimize parse/plan time?
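
    As a sketch of the prepared-statement approach (table and column names are
    invented for the example):

    PREPARE ins (integer, text) AS
        INSERT INTO target_table (id, payload) VALUES ($1, $2);
    EXECUTE ins(1, 'first row');
    EXECUTE ins(2, 'second row');
    DEALLOCATE ins;

    That way each statement is parsed and planned once per connection rather
    than once per row.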

    An important setting you're missing is:

    wal_buffers = 16MB

    -Kevin
  • Scott Marlowe at Aug 4, 2011 at 3:07 pm

    On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner wrote:
    RAM : 16 GB
    effective_cache_size = 4096MB
    That should probably be more like 12GB to 15GB.  It probably won't
    affect the load time here, but could affect other queries.
    Actually on a heavily written database a large effective cache size
    makes things slower.
  • Jim Nasby at Aug 17, 2011 at 9:17 pm

    On Aug 4, 2011, at 10:07 AM, Scott Marlowe wrote:
    On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner
    wrote:
    RAM : 16 GB
    effective_cache_size = 4096MB
    That should probably be more like 12GB to 15GB. It probably won't
    affect the load time here, but could affect other queries.
    Actually on a heavily written database a large effective cache size
    makes things slower.
    effective_cache_size or shared_buffers? I can see why a large shared_buffers could cause problems, but what effect does effective_cache_size have on a write workload?
    --
    Jim C. Nasby, Database Architect [email protected]
    512.569.9461 (cell) http://jim.nasby.net
  • Samuel Gendler at Aug 4, 2011 at 5:45 pm

    On Wed, Aug 3, 2011 at 9:56 PM, Adarsh Sharma wrote:

    Dear all,

    For the last few days, I have been researching PostgreSQL performance tuning
    because of the slow speed of my server.
    My application selects about 100,000 rows from a MySQL database, processes
    them, and inserts them into 2 PostgreSQL tables using about 45 connections.

    It's already been mentioned, but is worth reinforcing, that if you are
    inserting 100,000 rows in 100,000 transactions, you'll see a huge
    performance improvement by doing many more inserts per transaction. Try
    doing at least 500 inserts in each transaction (though you can possibly go
    quite a bit higher than that without any issues, depending upon what other
    traffic the database is handling in parallel). You almost certainly don't
    need 45 connections in order to insert only 100,000 rows. I've got a crappy
    VM with 2GB of RAM in which inserting 100,000 relatively narrow rows
    requires less than 10 seconds if I do it in a single transaction on a single
    connection. Probably much less than 10 seconds, but the code I just tested
    with does other work while doing the inserts, so I don't have a pure test at
    hand.
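
    Roughly, the difference is one transaction commit per row versus one per
    batch of rows. A minimal sketch, with an invented table name:

    BEGIN;
    INSERT INTO target_table VALUES (1, 'a');
    INSERT INTO target_table VALUES (2, 'b');
    -- ... and so on, up to a few hundred rows ...
    COMMIT;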
  • Mark Kirkwood at Aug 4, 2011 at 10:32 pm

    On 05/08/11 05:40, Samuel Gendler wrote:

    On Wed, Aug 3, 2011 at 9:56 PM, Adarsh Sharma
    wrote:

    Dear all,

    For the last few days, I have been researching PostgreSQL
    performance tuning because of the slow speed of my server.
    My application selects about 100,000 rows from a MySQL database,
    processes them, and inserts them into 2 PostgreSQL tables using
    about 45 connections.


    It's already been mentioned, but is worth reinforcing, that if you are
    inserting 100,000 rows in 100,000 transactions, you'll see a huge
    performance improvement by doing many more inserts per transaction.
    Try doing at least 500 inserts in each transaction (though you can
    possibly go quite a bit higher than that without any issues, depending
    upon what other traffic the database is handling in parallel). You
    almost certainly don't need 45 connections in order to insert only
    100,000 rows. I've got a crappy VM with 2GB of RAM in which inserting
    100,000 relatively narrow rows requires less than 10 seconds if I do
    it in a single transaction on a single connection. Probably much less
    than 10 seconds, but the code I just tested with does other work while
    doing the inserts, so I don't have a pure test at hand.
    Also worth mentioning is that doing those 500 inserts in *fewer* than 500
    INSERT operations is likely to be a huge improvement, e.g.:

    INSERT INTO table VALUES (....),(....);

    instead of

    INSERT INTO table VALUES (....);
    INSERT INTO table VALUES (....);

    I'd be tempted to do all 500 row insertions in one INSERT statement as
    above. You might find that 1 connection doing this is fast enough (it is
    only doing 200 actual INSERT calls in that case to put in 100000 rows).

    regards

    Mark
