We are trying to optimize our Database server without spending a
fortune on hardware. Here is our current setup.

Main drive array: 8x 750 GB SATA II drives in a RAID 10 configuration;
this stores the OS, applications, and the PostgreSQL data. 3 TB array,
with a 2 TB partition for PostgreSQL.
Secondary drive array: 2x 36 GB 15,000 RPM SAS drives in a RAID 1
configuration holding the pg_xlog directory, with checkpoints set to use
about 18 GB max, so that massive numbers of small writes don't slow the
system down. A drive failure loses no data. Checkpoints will be another
matter; we hope to keep them under control with bgwriter tweaking.
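
For context, here is a minimal sketch of what such checkpoint/bgwriter
settings could look like in postgresql.conf. These are illustrative values
only, not the actual configuration from the post, and they assume the
default 16 MB WAL segments:

    # Illustrative postgresql.conf excerpt -- not the poster's actual values.
    # On 8.3, WAL kept on disk is roughly
    # (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB,
    # so ~400 segments works out to roughly the 18 GB mentioned above.
    checkpoint_segments          = 400
    checkpoint_timeout           = 30min
    checkpoint_completion_target = 0.7    # 8.3+: spreads checkpoint I/O out
    # Background writer settings to smooth dirty-buffer writes between
    # checkpoints:
    bgwriter_delay        = 200ms
    bgwriter_lru_maxpages = 500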

Now our "normal" activities are really fast. Scanning data, etc., all
runs pretty quickly. What is NOT fast is some of the massive queries. We
have some UPDATEs with joins that match a 100M-row table against a
200M-row table. Aside from hand-coding pl/pgsql that creates the matching
subfunction on the fly (which is under consideration) to bring the
matching joins down from the current O(n^2) to O(n), we are looking at
hardware as an option to help speed up these big batch queries, which
sometimes run for 5-6 days.
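
For reference, the statements in question are presumably of roughly the
following shape; the table and column names here are invented for
illustration:

    -- Hypothetical sketch of the kind of UPDATE-with-join described above:
    -- "target" stands in for the ~100M-row table, "source" for the
    -- ~200M-row one.
    UPDATE target AS t
       SET matched_value = s.value
      FROM source AS s
     WHERE t.match_key = s.match_key;
    -- Whether the planner can hash- or merge-join this, rather than falling
    -- into the nested-loop behaviour described as O(n^2) above, depends on
    -- the available indexes, work_mem, and how much of the sort/hash spills
    -- to the temp files discussed below.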

CPU is not a problem: 2x quad-core Xeons, never taxed beyond 13%, though
this will change as more of our database functions are brought over here
from the other servers.
RAM is not upgradable; we already have 48 GB in there.
work_mem shouldn't be the issue: the big processes get work_mem set to
10 GB, and if they are using temp tables, another 6-8 GB for
temp_buffers. maintenance_work_mem is set to 2 GB.
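
A minimal sketch of how the batch jobs might apply those per-session
settings (the values are taken from the description above; issuing them
with SET avoids raising the limits globally):

    -- Session-level memory settings along the lines described above, issued
    -- at the start of the big batch jobs rather than set in postgresql.conf.
    SET work_mem = '10GB';            -- per sort/hash node, so one query can
                                      -- use several multiples of this
    SET temp_buffers = '8GB';         -- only matters if the job uses temp
                                      -- tables; must be set before the
                                      -- session first touches one
    SET maintenance_work_mem = '2GB';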

However, the joins of two 50 GB tables really just can't be solved in
RAM without using drive space. My question is: can hardware speed
that up? Would putting in a 400 GB SAS drive (15,000 RPM) just to
handle PostgreSQL temp files help? Considering it would hold "in
process" queries and not "completed transactions", I see no reason to
mirror the drive. If it fails, we'd simply unmount it, replace it,
then remount it; in the meantime PostgreSQL could use the SATA space.

Would that speed things up, and if so, where in the drive mappings
should that partition go?
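
If such a drive were added, one way to point PostgreSQL's temp files at it
is a dedicated tablespace plus temp_tablespaces (available in 8.3 and
later). The tablespace name and path below are made up for illustration:

    -- Hypothetical sketch: dedicate the new drive to temp files (8.3+).
    -- The directory must already exist and be owned by the postgres user.
    CREATE TABLESPACE temp_sas LOCATION '/mnt/sas400/pgtemp';
    -- Route sorts/hashes that spill to disk, and temporary tables, there:
    SET temp_tablespaces = 'temp_sas';   -- per session, or set it in
                                         -- postgresql.conf as the default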

Thank you for your help. I'm mostly interested in whether I can speed
these things up from 5-6 days to under 1 day; otherwise I need to look at
optimizing the queries instead.

Alex


  • Scott Marlowe at Jan 28, 2008 at 3:20 pm

    On Jan 28, 2008 7:54 AM, Alex Hochberger wrote:
    We are trying to optimize our Database server without spending a
    fortune on hardware. Here is our current setup.

    Main drive array: 8x 750 GB SATA II drives in a RAID 10 configuration;
    this stores the OS, applications, and the PostgreSQL data. 3 TB array,
    with a 2 TB partition for PostgreSQL.
    Secondary drive array: 2x 36 GB 15,000 RPM SAS drives in a RAID 1
    configuration holding the pg_xlog directory, with checkpoints set to use
    about 18 GB max, so that massive numbers of small writes don't slow the
    system down. A drive failure loses no data. Checkpoints will be another
    matter; we hope to keep them under control with bgwriter tweaking.

    [snip]
    However, the joins of two 50 GB tables really just can't be solved in
    RAM without using drive space. My question is: can hardware speed
    that up? Would putting in a 400 GB SAS drive (15,000 RPM) just to
    handle PostgreSQL temp files help? Considering it would hold "in
    process" queries and not "completed transactions", I see no reason to
    mirror the drive. If it fails, we'd simply unmount it, replace it,
    then remount it; in the meantime PostgreSQL could use the SATA space.

    Would that speed things up, and if so, where in the drive mappings
    should that partition go?
    Do you have a maintenance window to experiment in? Try putting it on
    the pg_xlog array during one to see if it speeds up the selects. Then
    you'll know. I'm thinking it will help a little, but there's only so
    much you can do with 50 GB result sets.
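
    A minimal sketch of how that experiment might be run, assuming 8.3
    (which has temp_tablespaces and log_temp_files); the tablespace name
    and path are made up:

        -- Point temp files at a directory on the pg_xlog array for the test:
        CREATE TABLESPACE xlog_temp LOCATION '/var/lib/pgsql/xlog_array/pgtemp';
        SET temp_tablespaces = 'xlog_temp';
        SET log_temp_files = 0;   -- log every temp file created, so you can
                                  -- see how much actually spills to disk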
  • Merlin Moncure at Jan 29, 2008 at 2:56 am

    On Jan 28, 2008 8:54 AM, Alex Hochberger wrote:
    We are trying to optimize our Database server without spending a
    fortune on hardware. Here is our current setup.

    Main drive array: 8x 750 GB SATA II drives in a RAID 10 configuration;
    this stores the OS, applications, and the PostgreSQL data. 3 TB array,
    with a 2 TB partition for PostgreSQL.
    Secondary drive array: 2x 36 GB 15,000 RPM SAS drives in a RAID 1
    configuration holding the pg_xlog directory, with checkpoints set to use
    about 18 GB max, so that massive numbers of small writes don't slow the
    system down. A drive failure loses no data. Checkpoints will be another
    matter; we hope to keep them under control with bgwriter tweaking.

    Now our "normal" activities are really fast. Scanning data, etc., all
    runs pretty quickly. What is NOT fast is some of the massive queries. We
    have some UPDATEs with joins that match a 100M-row table against a
    200M-row table. Aside from hand-coding pl/pgsql that creates the matching
    subfunction on the fly (which is under consideration) to bring the
    matching joins down from the current O(n^2) to O(n), we are looking at
    hardware as an option to help speed up these big batch queries, which
    sometimes run for 5-6 days.
    Well, you have already put some thought into your hardware... but the
    awful truth is that SATA drives are just terrible at seeking once you
    start seeing significant numbers of page faults to disk, and you are
    getting killed on the sorting on top of it. Maybe the best plan of
    attack here is to post the relevant query and its EXPLAIN output.
    Perhaps there are some optimizations in indexing strategy or other
    query tactics (the pl/pgsql function smells suspicious, as you have
    already noted), and hopefully the giant sort can be optimized out. You
    have a nasty problem that may require some out-of-the-box thinking, so
    the more information you can provide, the better.
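
    For instance, along these lines (table and column names invented for
    illustration):

        -- Plain EXPLAIN shows the chosen plan without running the query:
        EXPLAIN
        UPDATE target AS t
           SET matched_value = s.value
          FROM source AS s
         WHERE t.match_key = s.match_key;

        -- EXPLAIN ANALYZE adds real row counts and timings, but it actually
        -- executes the UPDATE (which here could mean days), so wrap it in a
        -- transaction that gets rolled back:
        BEGIN;
        EXPLAIN ANALYZE UPDATE target AS t SET matched_value = s.value
          FROM source AS s WHERE t.match_key = s.match_key;
        ROLLBACK;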

    merlin
