----- Mensagem original -----
On 10-10-25 03:26 PM, André Volpato wrote:
On Mon, Oct 25, 2010 at 2:21 PM, André Volpato
wrote:
(...)
These times keep repeating after the second run, and I can
ensure AIX isn´t touching the disks anymore.
I´ve never seen this behaviour before. I heard about Direct I/O
and I was thinking about givng it a shot.
Any ideas?
I doubt disk/io is the problem.
Me either.
Like I said, AIX do not touch the storage when runing the query.
It became CPU-bound after data got into cache.
Have you confirmed that the hardware is ok on both servers?
The hardware was recently instaled and checked by the vendor team.
AIX box is on JS22:
PostgreSQL 8.4.4, AIX 5.3-9 64bits, SAN IBM DS3400, 8x450GB SAS 15K Raid-5
8GB RAM (DDR2 667)
# lsconf
System Model: IBM,7998-61X
Processor Type: PowerPC_POWER6
Processor Implementation Mode: POWER 6
Processor Version: PV_6
Number Of Processors: 4
Processor Clock Speed: 4005 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
Memory Size: 7680 MB
Debian box is on HS21:
PostgreSQL 8.4.4, Debian 4.3.2 64bits, SAN IBM DS3400, 5x300GB SAS 15K Raid-0
7GB RAM (DDR2 667)
We are forced to use RedHat on this machine, so we are virtualizing the Debian box.
# cpuinfo
processor : [0-7]
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
cpu MHz : 2500.148
cache size : 6144 KB
Have both OS's been tuned by people that know how to tune the
respective OS's? AIX is very different than Linux, and needs to be tuned
accordingly.
We´ve been tuning AIX for the last 3 weeks, and lots of tuneables got changed.
On Debian, we have far more experience, and it´s been a chalenge to understand how AIX works.
Most important tunes:
page_steal_method=1
lru_file_repage=0
kernel_heap_psize=64k
maxperm%=90
maxclient%=90
minperm%=20
Disk:
chdev -l hdisk8 -a queue_depth=24
chdev -l hdisk8 -a reserve_policy=no_reserve
chdev -l hdisk8 -a algorithm=round_robin
chdev -l hdisk8 -a max_transfer=0x400000
HBA:
chdev -l fcs0 -P -a max_xfer_size=0x400000 -a num_cmd_elems=1024
Postgres:
shared_buffers = 2304MB
effective_io_concurrency = 5
wal_sync_method = fdatasync
wal_buffers = 2MB
checkpoint_segments = 32
checkpoint_timeout = 10min
random_page_cost = 2.5
effective_cache_size = 7144MB
Like I said, there´s more but this is the most important.
On AIX can you trace why it is CPU bound? What else is taking the CPU
time, anything?|
We´re using iostat, svmon and vmstat to trace CPU, swap and IO activity.
On 'topas' we saw no disk activity at all, but we get a Wait% about 70%, and about 700 pages/s read in PageIn, no PageOut, no PgspIn and no PgspOut.
It´s a dedicated server, no process runing besides postgres.
Also, can you provide the output of pg_config from your AIX build?
# pg_config
BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/share/doc
HTMLDIR = /usr/local/pgsql/share/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR = /usr/local/pgsql/share/locale
MANDIR = /usr/local/pgsql/share/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--enable-integer-datetimes' '--with-readline' '--with-threads' '--with-zlib' '--with-html' 'CC=gcc -maix64' 'LDFLAGS=-Wl,-bbigtoc'
CC = gcc -maix64
CPPFLAGS =
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL =
LDFLAGS = -Wl,-bbigtoc -Wl,-blibpath:/usr/local/pgsql/lib:/usr/lib:/lib
LDFLAGS_SL = -Wl,-bnoentry -Wl,-H512 -Wl,-bM:SRE
LIBS = -lpgport -lz -lreadline -lld -lm
VERSION = PostgreSQL 8.4.4
--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
[]´s, Andre Volpato