|
|
Vacuum full since 15 hours
By Hervé Piedvache at Dec 9, 2007, 4:32 pm UTC
Hi, I have a small question ... right now on a 8.1.10 version of PostgreSQL I'm= doing a vacuum full verbose anaylze a table with 60 304 340 rows in 115579= pages and there were 16 835 144 unused item pointers inside and 5 index. After the first treatment of the index (appeared in the verbose) ...... More...
Hi,
I have a small question ... right now on a 8.1.10 version of PostgreSQL I'm doing a vacuum full verbose anaylze a table with 60 304 340 rows in 1155791 pages and there were 16 835 144 unused item pointers inside and 5 index.
After the first treatment of the index (appeared in the verbose) ... the vacuum is still working now since 15 hours ... There is none other activity on this server ... it's a replication server ... so the vacuum is locking most of the replication queries ... What can I do to get this doing quicker ...? It's a 8Gb server ... with a RAID 10 ... my maintenance_work_mem = 25600.
Any idea ? I'm sure that if I delete all the index an rebuild them by hand it'll be really quicker ! (done on another replication server ... took 20 min)
regards,
-- Hervé Piedvache
1 Reply
|
|
|
Combining two bitmap scans out performs a single regular index scan?
By m...@mark.mielke.cc at Dec 8, 2007, 10:07 pm UTC
For some accpac tables, I do synchronization by looking at the audtdate and audttime fields. These fields are quite annoying as they are decimal encoded dates and times stored as an integer. I do not have the freedom to "fix" this. To find records after a certain time, I must do one of: select *... More...
For some accpac tables, I do synchronization by looking at the audtdate and audttime fields. These fields are quite annoying as they are decimal encoded dates and times stored as an integer. I do not have the freedom to "fix" this.
To find records after a certain time, I must do one of:
select * from icpric where audtdate > ? or (audtdate = ? and audttime > ?)
Or:
select * from icpric where audtdate >= ? and (audtdate = ? or audttime > ?)
The fields are as follows:
audtdate | integer | not null audttime | integer | not null
I have an index as follows:
"icpric_audtdate_key" btree (audtdate, audttime)
The tables are properly analyzed and vacuumed. I am using PostgreSQL 8.2.5. The table has ~27,000 rows.
The first query generates this plan:
PCCYBER=# explain analyze select itemno, audtdate, audttime from icpric where audtdate > 20071207 or (audtdate = 20071207 and audttime > 23434145); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on icpric (cost=4.52..8.50 rows=2 width=28) (actual time=0.047..0.052 rows=4 loops=1) Recheck Cond: ((audtdate > 20071207) OR ((audtdate = 20071207) AND (audttime > 23434145))) -> BitmapOr (cost=4.52..4.52 rows=2 width=0) (actual time=0.037..0.037 rows=0 loops=1) -> Bitmap Index Scan on icpric_audtdate_key (cost=0.00..2.26 rows=1 width=0) (actual time=0.022..0.022 rows=3 loops=1) Index Cond: (audtdate > 20071207) -> Bitmap Index Scan on icpric_audtdate_key (cost=0.00..2.26 rows=1 width=0) (actual time=0.014..0.014 rows=1 loops=1) Index Cond: ((audtdate = 20071207) AND (audttime > 23434145)) Total runtime: 0.096 ms (8 rows)
Time: 0.786 ms
The second query generates this plan:
PCCYBER=# explain analyze select itemno, audtdate, audttime from icpric where audtdate >= 20071207 and (audtdate > 20071207 or audttime > 23434145); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Index Scan using icpric_audtdate_key on icpric (cost=0.00..4.27 rows=1 width=28) (actual time=0.266..0.271 rows=4 loops=1) Index Cond: (audtdate >= 20071207) Filter: ((audtdate > 20071207) OR (audttime > 23434145)) Total runtime: 0.299 ms (4 rows)
Time: 0.880 ms
Sample execution times:
PCCYBER=# select itemno, audtdate, audttime from icpric where audtdate > 20071207 or (audtdate = 20071207 and audttime > 23434145); itemno | audtdate | audttime --------------------+----------+---------- MB-AS-M2-CROSSHAIR | 20071207 | 23434154 PRT-EP-PHOTO R2400 | 20071208 | 1010323 PRT-EP-PHOTO R2400 | 20071208 | 1010339 PRT-EP-PHOTO R2400 | 20071208 | 1010350 (4 rows)
Time: 0.584 ms
PCCYBER=# select itemno, audtdate, audttime from icpric where audtdate >= 20071207 and (audtdate > 20071207 or audttime > 23434145); itemno | audtdate | audttime --------------------+----------+---------- MB-AS-M2-CROSSHAIR | 20071207 | 23434154 PRT-EP-PHOTO R2400 | 20071208 | 1010323 PRT-EP-PHOTO R2400 | 20071208 | 1010339 PRT-EP-PHOTO R2400 | 20071208 | 1010350 (4 rows)
Time: 0.831 ms
I can understand that this is a non-optimal query. What I don't understand is why two bitmap scans, combined together, should be able to out-perform a single index scan, when selecting a very small portion of the table. There are only four result rows. I am speculating that the index scan is loading the heap rows to determine whether the Filter: criteria matches, but I do not know if this makes sense? If it does make sense, would it be complicated to allow the filter to be processed from the index if all of the fields in the expression are available in the index?
Both of the queries execute in a satisfactory amount of time - so I really don't care which I use. I thought these results might be interesting to somebody?
The good thing is that bitmap scan seems to be well optimized.
Cheers, mark
-- Mark Mielke <mark@mielke.cc>
2 Replies
|
|
|
Cost-Based Vacuum Delay tuning
By Guillaume Cottenceau at Dec 8, 2007, 1:21 pm UTC
Hi, I'm currently trying to tune the Cost-Based Vacuum Delay in a 8.2.5 server. The aim is to reduce as much as possible the performance impact of vacuums on application queries, with the background idea of running autovacuum as much as possible[1]. My test involves vacuuming a large table, and... More...
Hi,
I'm currently trying to tune the Cost-Based Vacuum Delay in a 8.2.5 server. The aim is to reduce as much as possible the performance impact of vacuums on application queries, with the background idea of running autovacuum as much as possible[1].
My test involves vacuuming a large table, and measuring the completion time, as the vacuuming proceeds, of a rather long running application query (involving a table different from the one being vacuumed) which cannot fit entirely in buffers (and the completion time of the vacuum, because it needs not be too slow, of course).
I ran my tests with a few combinations of vacuum_cost_delay/vacuum_cost_limit, while keeping the other parameters set to the default from the 8.2.5 tarball:
vacuum_cost_page_hit = 1 vacuum_cost_page_miss = 10 vacuum_cost_page_dirty = 20
The completion time of the query is about 16 seconds in isolation. With a vacuuming proceeding, here are the results:
vacuum_cost_delay/vacuum_cost_limit (deactivated) 20/200 40/200 100/1000 150/1000 200/1000 300/1000
VACUUM ANALYZE time 54 s 112 s 188 s 109 s 152 s 190 s 274 s SELECT time 50 s 28 s 26 s 24 s 22 s 20 s 19 s
I have noticed that others (Alvaro, Joshua) suggest to set vacuum_cost_delay as low as 10 or 20 ms, however in my situation I'd like to lower the performance impact in application queries and will probably choose 150/1000 where "only" a +40% is seen - I'm curious if anyone else has followed the same path, or is there any outstanding flaw I've missed here? I'm talking outstanding, as of course any local decision may be different in the hope of favouring a different database/application behaviour.
Other than that, it's the results obtained with the design principle of Cost-Base Vacuum Delay, which I find a little surprising. Of course, I think it has been thought through a lot, and my observations are probably naive, but I'm going to throw my ideas anyway, who knows.
I'd think that it would be possible to lower yet again the impact of vacuuming on other queries, while keeping a vacuuming time with little overhead, if dynamically changing the delays related to database activity, rather than using fixed costs and delays. For example, before and after each vacuum sleep delay is completed, pg could:
- check the amount of currently running queries (pg_stat_activity), and continue sleeping if it is above a configured threshold; by following this path, databases with peak activities could use a threshold of 1 and have zero ressource comsumption for vacuuming during peaks, still having nearly no time completion overhead for vacuuming out of peaks (since the check is performed also before the sleep delay, which would be deactivated if no queries are running); if we can afford a luxury implementation, we could always have a maximum sleep time configuration, which would allow vacuuming to proceed a little bit even when there's no timeframe with low enough database activity
- alternatively, pg could make use of some longer term statistics (load average, IO statistics) to dynamically pause the vacuuming - this I guess is related to the host OS and probably more difficult to have working correctly with multiple disks and/or processes running - however, if you want high performance from PostgreSQL, you probably won't host other IO applications on the same disk(s)
While I'm at it, a different Cost-Based Vacuum Delay issue: VACUUM FULL also follows the Cost-Based Vacuum Delay tunings. While it makes total sense when you want to perform a query on another table, it becomes a problem when your query is waiting for the exclusive lock on the vacuumed table. Potentially, you will have the vacuuming proceeding "slowly" because of the Cost-Based Vacuum Delay, and a blocked application because the application queries are just waiting.
I'm wondering if it would not be possible to dynamically ignore (or lower, if it makes more sense?) the Cost-Based Vacuum Delay during vacuum full, if a configurable amount of queries are waiting for the lock?
(please save yourself from answering "you should never run VACUUM FULL if you're vacuuming enough" - as long as VACUUM FULL is available in PostgreSQL, there's no reason to not make it as practically usable as possible, albeit with low dev priority)
Ref: [1] inspired by http://developer.postgresql.org/~wieck/vacuum_cost/
-- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland
4 Replies
|
|
|
Measuring table and index bloat
By Greg Smith at Dec 8, 2007, 07:06 am UTC
One of those things that comes up regularly on this list in particular are people whose performance issues relate to "bloated" tables or indexes. What I've always found curious is that I've never seen a good way suggested to actually measure said bloat in any useful numeric terms--until today. Greg... More...
One of those things that comes up regularly on this list in particular are people whose performance issues relate to "bloated" tables or indexes. What I've always found curious is that I've never seen a good way suggested to actually measure said bloat in any useful numeric terms--until today.
Greg Sabino Mullane just released a Nagios plug-in for PostgreSQL that you can grab at http://bucardo.org/nagios_postgres/ , and while that is itself nice the thing I found most remarkable is the bloat check. The majority of that code is an impressive bit of SQL that anyone could use even if you have no interest in Nagios, which is why I point it out for broader attention. Look in check_postgres.pl for the "check_bloat" routine and the big statement starting at the aptly labled "This was fun to write" section. If you pull that out of there and replace $MINPAGES and $MINIPAGES near the end with real values, you can pop that into a standalone query and execute it directly. Results look something like this (reformatting for e-mail):
schemaname | tablename | reltuples | relpages | otta | tbloat | public | accounts | 2500000 | 41667 | 40382 | 1.0 |
wastedpages | wastedbytes | wastedsize | iname | ituples | 1285 | 10526720 | 10 MB | accounts_pkey | 2500000 |
ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize 5594 | 35488 | 0.2 | 0 | 0 | 0 bytes
I'd be curious to hear from those of you who have struggled with this class of problem in the past as to whether you feel this quantifies the issue usefully.
0 Replies
|
|
|
TB-sized databases
By Peter Koczan at Dec 7, 2007, 8:46 pm UTC
Hi all, I have a user who is looking to store 500+ GB of data in a database be more like 3-4 TB). He is wondering how well PostgreSQL scales with TB-sized databases and what can be done to help optimize them (mostly hardware and config parameters, maybe a little advocacy). I can't speak on that... More...
Hi all,
I have a user who is looking to store 500+ GB of data in a database (and when all the indexes and metadata are factored in, it's going to be more like 3-4 TB). He is wondering how well PostgreSQL scales with TB-sized databases and what can be done to help optimize them (mostly hardware and config parameters, maybe a little advocacy). I can't speak on that since I don't have any DBs approaching that size.
The other part of this puzzle is that he's torn between MS SQL Server (running on Windows and unsupported by us) and PostgreSQL (running on Linux...which we would fully support). If any of you have ideas of how well PostgreSQL compares to SQL Server, especially in TB-sized databases, that would be much appreciated.
We're running PG 8.2.5, by the way.
Peter
51 Replies
|
|
|
database tuning
By kelvan at Dec 7, 2007, 7:13 pm UTC
hi i need to know all the database overhead sizes and block header sizes etc etc as I have a very complex database to build and it needs to be speed tuned beyond reckoning I have gathered some relevant information form the documentation such as all the data type sizes and the RM block information... More...
hi i need to know all the database overhead sizes and block header sizes etc etc as I have a very complex database to build and it needs to be speed tuned beyond reckoning
I have gathered some relevant information form the documentation such as all the data type sizes and the RM block information but I don't have any information on INDEX blocks or other general overheads
http://www.peg.com/techpapers/monographs/space/space.html
http://www.postgresql.org/docs/8.1/static/datatype.html
I am using postgres 8.1 if anyone can post links to pages containing over head information and index block header information it would be most appreciated as I cannot seem to find anything
Regards
Kelvan
3 Replies
|
|
|
Trouble with LEFT JOIN using VIEWS.
By Piotr Gasidło at Dec 7, 2007, 3:36 pm UTC
Hello, I've just hit problem, that is unusual for me. quaker=> \d sites Table "public.sites" Column | Type | Modifiers id | integer | not null default nextval('sites_id_seq'::regclass) site_name | character varying | not null user_id | integer | not null extra | integer | Indexes: "sites_pkey"... More...
Hello,
I've just hit problem, that is unusual for me.
quaker=> \d sites Table "public.sites" Column | Type | Modifiers
-----------+-------------------+---------------------------------------------------- id | integer | not null default nextval('sites_id_seq'::regclass) site_name | character varying | not null user_id | integer | not null extra | integer | Indexes: "sites_pkey" PRIMARY KEY, btree (id) "sites_site_name_key_unique" UNIQUE, btree (site_name text_pattern_ops) "sites_user_id_key" btree (user_id)
quaker=> \d users Table "public.users" Column | Type | Modifiers
-----------+-------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) user_name | character varying | not null extra | integer | Indexes: "users_pkey" PRIMARY KEY, btree (id)
Both tables filled with 100k records of random data. In users id is in range from 1..100k, same in sites. In sites user_id is random, range from 1..150k.
I've created views:
quaker=> \d users_secure View "public.users_secure" Column | Type | Modifiers -----------+-------------------+----------- id | integer | user_name | character varying | View definition: SELECT users.id, users.user_name FROM users;
quaker=> \d users_secure_with_has_extra View "public.users_secure_with_has_extra" Column | Type | Modifiers -----------+-------------------+----------- id | integer | user_name | character varying | has_extra | boolean | View definition: SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra FROM users;
Now, when I do simple query to find all data for sites matching site_name like 'H3bh%' (there are at least one record in sites matching this condition).
quaker=> explain analyze select s.site_name,u.user_name from sites_secure s left join users_secure_with_has_extra u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3897.02..3897.03 rows=2 width=44) (actual time=430.326..430.331 rows=1 loops=1) -> Sort (cost=3897.02..3897.03 rows=2 width=44) (actual time=430.321..430.323 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop Left Join (cost=0.00..3897.01 rows=2 width=44) (actual time=290.103..430.301 rows=1 loops=1) Join Filter: ("inner".id = "outer".user_id) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.054 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Seq Scan on users (cost=0.00..1641.00 rows=100000 width=20) (actual time=0.007..245.406 rows=100000 loops=1) Total runtime: 430.432 ms (10 rows)
When I resign from LEFT JOIN users_secure_with_has_extra, and put JOIN instead I've got:
quaker=> explain analyze select s.site_name,u.user_name from sites_secure s join users_secure_with_has_extra u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.112..0.118 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.106..0.108 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual time=0.073..0.088 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.050 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.023 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.216 ms (10 rows)
As explain shows PostgreSQL is using index scan on users, instead of seq scan like in example above.
Now. When I use view with no has_extra field (important: field is a simple function on extra field) I get expectable results. Both using indexes.
quaker=> explain analyze select s.site_name,u.user_name from sites_secure s left join users_secure u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.111..0.117 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.105..0.107 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop Left Join (cost=0.00..9.04 rows=1 width=24) (actual time=0.072..0.087 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.043..0.049 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.022 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.216 ms (10 rows)
quaker=> explain analyze select s.site_name,u.user_name from sites_secure s join users_secure u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.109..0.115 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.104..0.106 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual time=0.071..0.086 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.042..0.048 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.018..0.021 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.214 ms (10 rows)
Why?
quaker=> select version(); version
----------------------------------------------------------------------------------------------------------------------- PostgreSQL 8.1.10 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1) (1 row)
1 Reply
|
|
|
Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)
By Robert Bernabe at Dec 6, 2007, 8:10 pm UTC
Hi All, I've been tasked to evaluate PG as a possible replacement of our MS SQL 2000 solution. Our solution is 100% stored procedure/function centric. It's a report generation system whose sole task is to produce text files filled with processed data that is post-processed by a secondary system.... More...
Hi All, I've been tasked to evaluate PG as a possible replacement of our MS SQL 2000 solution. Our solution is 100% stored procedure/function centric. It's a report generation system whose sole task is to produce text files filled with processed data that is post-processed by a secondary system. Basically options are selected via a web interface and all these parameters are passed unto the stored procedure and then the stored procedure would run and in the process call other stored procedures until eventually a single formatted text file is produced. I decided on Fedora Core 7 and the 8.3 Beta release of Enterprise DB PostgreSQL. I decided to port 1 stored procedure plus it's several support stored procedures into pl/pgsql from T-SQL and compare the performance by measuring how long each system takes to produce the text file. For this test, the output to the text file was discarded and the stored procedure/function would end once the final temporary table is filled with the information that is eventually dumped into the text file.
Windows 2000 Professional + MSDE (/MS SQL) Box vs. FC7 + EnterpriseDB PG Box
Note that both boxes have EXACTLY the same hardware (not VMWARE or anything) AMD X2 3800 2 G RAM DDR 400 80 G Barracuda Sata
The data was copied to the Linux box and checked lightly for consistency versus the windows box (number of tables / columns and records) and they all match. After data transfer to the Linux Box, I ran REINDEX and ANALYZE.
For the actual run the following tables were used and I'm displaying the results of analyze.
INFO: analyzing "public.AreaDefn" INFO: "AreaDefn": scanned 15 of 15 pages, containing 2293 live rows and 0 dead rows; 2293 rows in sample, 2293 estimated total rows INFO: analyzing "public.AreaDefn2" INFO: "AreaDefn2": scanned 30 of 30 pages, containing 3439 live rows and 0 dead rows; 3000 rows in sample, 3439 estimated total rows INFO: analyzing "public.Areas" INFO: "Areas": scanned 2 of 2 pages, containing 164 live rows and 0 dead rows; 164 rows in sample, 164 estimated total rows INFO: analyzing "public.Brands" INFO: "Brands": scanned 1 of 1 pages, containing 11 live rows and 0 dead rows; 11 rows in sample, 11 estimated total rows INFO: analyzing "public.Categories" INFO: "Categories": scanned 1 of 1 pages, containing 26 live rows and 0 dead rows; 26 rows in sample, 26 estimated total rows INFO: analyzing "public.CategoryDefn" INFO: "CategoryDefn": scanned 1 of 1 pages, containing 133 live rows and 0 dead rows; 133 rows in sample, 133 estimated total rows INFO: analyzing "public.CategoryDefn2" INFO: "CategoryDefn2": scanned 2 of 2 pages, containing 211 live rows and 0 dead rows; 211 rows in sample, 211 estimated total rows INFO: analyzing "public.CategorySets" INFO: "CategorySets": scanned 1 of 1 pages, containing 3 live rows and 0 dead rows; 3 rows in sample, 3 estimated total rows INFO: analyzing "public.CATemplateGroup" INFO: analyzing "public.Channels" INFO: "Channels": scanned 1 of 1 pages, containing 7 live rows and 0 dead rows; 7 rows in sample, 7 estimated total rows INFO: analyzing "public.ClientCodes" INFO: analyzing "public.Clients" INFO: "Clients": scanned 7 of 7 pages, containing 366 live rows and 0 dead rows; 366 rows in sample, 366 estimated total rows INFO: analyzing "public.Customers" INFO: "Customers": scanned 2 of 2 pages, containing 129 live rows and 0 dead rows; 129 rows in sample, 129 estimated total rows NFO: analyzing "public.Databases" INFO: "Databases": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows INFO: analyzing "public.DataSources" INFO: "DataSources": scanned 1 of 1 pages, containing 8 live rows and 0 dead rows; 8 rows in sample, 8 estimated total rows INFO: analyzing "public.DateToWeekConversion" INFO: "DateToWeekConversion": scanned 4 of 4 pages, containing 371 live rows and 0 dead rows; 371 rows in sample, 371 estimated total rows INFO: analyzing "public.Divisions" INFO: "Divisions": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows INFO: analyzing "public.MetricTable" INFO: "MetricTable": scanned 1 of 1 pages, containing 48 live rows and 0 dead rows; 48 rows in sample, 48 estimated total rows INFO: analyzing "public.Offtake" INFO: "Offtake": scanned 3000 of 13824 pages, containing 141000 live rows and 0 dead rows; 3000 rows in sample, 649728 estimated total rows INFO: analyzing "public.SKUs" INFO: "SKUs": scanned 3 of 3 pages, containing 73 live rows and 0 dead rows; 73 rows in sample, 73 estimated total rows INFO: analyzing "public.SMSDefaults" INFO: "SMSDefaults": scanned 1 of 1 pages, containing 43 live rows and 0 dead rows; 43 rows in sample, 43 estimated total rows INFO: analyzing "public.StandardPeriods" INFO: "StandardPeriods": scanned 1 of 1 pages, containing 8 live rows and 0 dead rows; 8 rows in sample, 8 estimated total rows INFO: analyzing "public.StandardUnits" INFO: "StandardUnits": scanned 1 of 1 pages, containing 9 live rows and 0 dead rows; 9 rows in sample, 9 estimated total rows INFO: analyzing "public.SubDataSources" INFO: "SubDataSources": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows INFO: analyzing "public.VolumeUnitDefn" INFO: "VolumeUnitDefn": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows INFO: analyzing "public.VolumeUnits" INFO: "VolumeUnits": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
as you can see the biggest one only has 600k records.
Here are the settings used for postgresql.conf ( will just list those that were modified) #--------------------------------------------------------------------------- # RESOURCE USAGE (except WAL) #---------------------------------------------------------------------------
# - Memory - #shared_buffers = 32MB shared_buffers = 128MB # min 128kB or max_connections*16kB # (change requires restart) #temp_buffers = 8MB # min 800kB temp_buffers = 32MB #max_prepared_transactions = 5 # can be 0 or more # (change requires restart) #max_prepared_transactions = 20 # Note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). #work_mem = 1MB # min 64kB work_mem = 2MB #maintenance_work_mem = 16MB # min 1MB maintenance_work_mem = 32MB #max_stack_depth = 2MB # min 100kB
Everything else was kept as is (given by the installer).
/etc/sysctl.conf settings below : kernel.shmmax = 1536000000 kernel.msgmni = 1024 fs.file-max = 8192 kernel.sem = "250 32000 32 1024"
The main stored function has approximately 1200 lines of pl/pgsql code. While it is running it calls 7 other support stored functions plus a small view.
The test basically was run two ways :
a) for the linux box, we used PG Admin III to run the core stored function and in the windows box we used query analyzer.
b) Created a small vb program that just calls the stored function for both boxes.
Since I'm a total newbie in PG, I was expecting dismal results in the initial run since our port of the code would not be optimized for PG and sure enough I got them.
Windows 2k Pro + MSDE - 4 seconds FC7 + postgresql-8.3-beta3 - 77 seconds
thinking that maybe the GUI of FC7 is hampering the load, I decided to run it again using runlevel 3. The result was approximately a 1-2 % gain on the FC7 but that is insignificant compared to the discrepancy I'm measuring so I decided to just use the w/GUI results. We noticed that the CPU for the linux box was tapped out (at least one of the cores) nearly 100% for the duration of the process. There was plenty of ram available and there was little to no disk access during the entire run.
I decided to further normalize the test and make the OS constant.
Windows 2k Pro + MSDE - 4 seconds Windows 2k Pro + postgresql-8.3-beta3 - 54 seconds
Turns out that for our code, running PG in windows is faster significantly. This was a surprise coming from all the research on the internet but it does jive with the results of this guy :
http://channel9.msdn.com/ShowPost.aspx?PostID=179064
Note that this guy used MS SQL 2005. Which we have proved to be 2-3 times slower than MS SQL 2000 and hence our research into other options. :)
Anyways I'd like to break up this request/begging for help into two parts.
1) First would be settings of postgresql.conf. Did I do it correctly? The sample data is so small....I based my settings on the recommendations researched for data centers.
2) Code optimization which I plan to start in another email thread since the discussions there would be more detailed.
Would it also make sense to optimize (as far as possible) everything (including the code) for windows first? The target here would be a linux OS but since the discrepancy is so big...the unified Windows OS might be a good place to start for now.
Many Thanks in advance.
Regards
8 Replies
|
|
|
autovacuum: recommended?
By Gábor Farkas at Dec 5, 2007, 11:55 pm UTC
hi, we are moving one database from postgresql-7.4 to postgresql-8.2.4. we have some cronjobs set up that vacuum the database (some tables more often, some tables less often). now, in pg82, there is the possibility of using the autovacuum. my question is: is it recommended to use it? or in other... More...
hi,
we are moving one database from postgresql-7.4 to postgresql-8.2.4.
we have some cronjobs set up that vacuum the database (some tables more often, some tables less often). now, in pg82, there is the possibility of using the autovacuum.
my question is: is it recommended to use it? or in other words, should i only use autovacuum? or it's better to use manual-vacuuming? which one is the "way of the future" :) ? or should i use both auto-vacuum and manual-vacuum?
in other words, i'd like to find out, if we should simply stay with the vacuuming-cronjobs, or should we move to using auto-vacuum? and if we should move, should we try to set it up the way that no manual-vacuuming is used anymore?
thanks, gabor
17 Replies
|
|
|
Optimizer Not using the Right plan
By Pallav Kalva at Dec 5, 2007, 12:49 pm UTC
Hi, Postgres 8.2.4 is not using the right plan for different values. From the below queries listing.addressvaluation table has 19million records , the other table listing.valuationchangeperiod is just lookup table with 3 records. If you can see the explain plans for the statements the first one... More...
Hi,
Postgres 8.2.4 is not using the right plan for different values.
From the below queries listing.addressvaluation table has 19million records , the other table listing.valuationchangeperiod is just lookup table with 3 records.
If you can see the explain plans for the statements the first one uses a bad plan for 737987 addressid search, does a index scan backward on the primary key "addressvaluationid" takes more time to execute and the same query for a different addressid (5851202) uses the correct optimal plan with index scan on "addressid" column which is way quicker.
Autovacuums usually vacuums these tables regularly, in fact I checked the pg_stat_user_tables the last vacuum/analyze on this table was last night. I did another manual vacuum analyze on the listing.addrevaluation table it uses the right plan for all the values now.
Can anyone explain me this wierd behavior ? why does it have different plans for different values and after doing manual vacuum analyze it works properly ? Are autovacuums not effective enough ?
Here are my autovacuum settings autovacuum_naptime = 120min autovacuum_vacuum_threshold = 500 autovacuum_analyze_threshold = 250 autovacuum_vacuum_scale_factor = 0.001 autovacuum_analyze_scale_factor = 0.001 autovacuum_freeze_max_age = 200000000 autovacuum_vacuum_cost_delay = -1 autovacuum_vacuum_cost_limit = -1 Here are the table structures
listing.addressvaluation Table "listing.addressvaluation" Column | Type | Modifiers ----------------------------+-----------------------------+--------------------------------------------------------------------------- addressvaluationid | integer | not null default nextval(('listing.addressvaluationseq'::text)::regclass) fkaddressid | integer | not null fkaddressvaluationsourceid | integer | not null sourcereference | text | createdate | timestamp without time zone | not null default ('now'::text)::timestamp(6) without time zone valuationdate | timestamp without time zone | not null valuationamount | numeric(14,2) | valuationhigh | numeric(14,2) | valuationlow | numeric(14,2) | valuationconfidence | integer | valuationchange | numeric(14,2) | fkvaluationchangeperiodid | integer | historycharturl | text | regionhistorycharturl | text | Indexes: "pk_addressvaluation_addressvaluationid" PRIMARY KEY, btree (addressvaluationid), tablespace "indexdata" "idx_addressvaluation_createdate" btree (createdate), tablespace "indexdata" "idx_addressvaluation_fkaddressid" btree (fkaddressid), tablespace "indexdata" "idx_addressvaluation_fkaddressid2" btree (fkaddressid), tablespace "indexdata" Foreign-key constraints: "fk_addressvaluation_address" FOREIGN KEY (fkaddressid) REFERENCES listing.address(addressid) "fk_addressvaluation_addressvaluationsource" FOREIGN KEY (fkaddressvaluationsourceid) REFERENCES listing.addressvaluationsource(addressvaluationsourceid) "fk_addressvaluation_valuationchangeperiod" FOREIGN KEY (fkvaluationchangeperiodid) REFERENCES listing.valuationchangeperiod(valuationchangeperiodid)
listing.valuationchangeperiod Table "listing.valuationchangeperiod" Column | Type | Modifiers -------------------------+---------+-------------------------------------------------------------------------------- valuationchangeperiodid | integer | not null default nextval(('listing.valuationchangeperiodseq'::text)::regclass) name | text | not null Indexes: "pk_valuationchangeperiod_valuationchangeperiodid" PRIMARY KEY, btree (valuationchangeperiodid), tablespace "indexdata" "uq_valuationchangeperiod_name" UNIQUE, btree (name), tablespace "indexdata"
For Addressid 737987 after autovacuum before manual vacuum analyze ------------------------------------------------------------------------------------------- explain select this_.addressvaluationid as addressv1_150_1_, this_.sourcereference as sourcere2_150_1_, this_.createdate as createdate150_1_, this_.valuationdate as valuatio4_150_1_, this_.valuationamount as valuatio5_150_1_, this_.valuationhigh as valuatio6_150_1_, this_.valuationlow as valuatio7_150_1_, this_.valuationconfidence as valuatio8_150_1_, this_.valuationchange as valuatio9_150_1_, this_.historycharturl as history10_150_1_, this_.regionhistorycharturl as regionh11_150_1_, this_.fkaddressid as fkaddre12_150_1_, this_.fkaddressvaluationsourceid as fkaddre13_150_1_, this_.fkvaluationchangeperiodid as fkvalua14_150_1_, valuationc2_.valuationchangeperiodid as valuatio1_197_0_, valuationc2_.name as name197_0_ from listing.addressvaluation this_ left outer join listing.valuationchangeperiod valuationc2_ on this_.fkvaluationchangeperiodid=valuationc2_.valuationchangeperiodid where this_.fkaddressid=737987 order by this_.addressvaluationid desc limit 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..678.21 rows=1 width=494) -> Nested Loop Left Join (cost=0.00..883026.09 rows=1302 width=494) -> Index Scan Backward using pk_addressvaluation_addressvaluationid on addressvaluation this_ (cost=0.00..882649.43 rows=1302 width=482) Filter: (fkaddressid = 737987) -> Index Scan using pk_valuationchangeperiod_valuationchangeperiodid on valuationchangeperiod valuationc2_ (cost=0.00..0.28 rows=1 width=12) Index Cond: (this_.fkvaluationchangeperiodid = valuationc2_.valuationchangeperiodid) (6 rows)
For Addressid 5851202 after autovacuum before manual vacuum analyze --------------------------------------------------------------------------------------------
select this_.addressvaluationid as addressv1_150_1_, this_.sourcereference as sourcere2_150_1_, this_.createdate as createdate150_1_, this_.valuationdate as valuatio4_150_1_, this_.valuationamount as valuatio5_150_1_, this_.valuationhigh as valuatio6_150_1_, this_.valuationlow as valuatio7_150_1_, this_.valuationconfidence as valuatio8_150_1_, this_.valuationchange as valuatio9_150_1_, this_.historycharturl as history10_150_1_, this_.regionhistorycharturl as regionh11_150_1_, this_.fkaddressid as fkaddre12_150_1_, this_.fkaddressvaluationsourceid as fkaddre13_150_1_, this_.fkvaluationchangeperiodid as fkvalua14_150_1_, valuationc2_.valuationchangeperiodid as valuatio1_197_0_, valuationc2_.name as name197_0_ from listing.addressvaluation this_ left outer join listing.valuationchangeperiod valuationc2_ on this_.fkvaluationchangeperiodid=valuationc2_.valuationchangeperiodid where this_.fkaddressid=5851202 order by this_.addressvaluationid desc limit 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=30.68..30.68 rows=1 width=494) -> Sort (cost=30.68..30.71 rows=11 width=494) Sort Key: this_.addressvaluationid -> Hash Left Join (cost=1.07..30.49 rows=11 width=494) Hash Cond: (this_.fkvaluationchangeperiodid = valuationc2_.valuationchangeperiodid) -> Index Scan using idx_addressvaluation_fkaddressid2 on addressvaluation this_ (cost=0.00..29.27 rows=11 width=482) Index Cond: (fkaddressid = 5851202) -> Hash (cost=1.03..1.03 rows=3 width=12) -> Seq Scan on valuationchangeperiod valuationc2_ (cost=0.00..1.03 rows=3 width=12) (9 rows)
After manual vacuum analyze for addressid 737987 ------------------------------------------------------------------
explain select this_.addressvaluationid as addressv1_150_1_, this_.sourcereference as sourcere2_150_1_, this_.createdate as createdate150_1_, this_.valuationdate as valuatio4_150_1_, this_.valuationamount as valuatio5_150_1_, this_.valuationhigh as valuatio6_150_1_, this_.valuationlow as valuatio7_150_1_, this_.valuationconfidence as valuatio8_150_1_, this_.valuationchange as valuatio9_150_1_, this_.historycharturl as history10_150_1_, this_.regionhistorycharturl as regionh11_150_1_, this_.fkaddressid as fkaddre12_150_1_, this_.fkaddressvaluationsourceid as fkaddre13_150_1_, this_.fkvaluationchangeperiodid as fkvalua14_150_1_, valuationc2_.valuationchangeperiodid as valuatio1_197_0_, valuationc2_.name as name197_0_ from listing.addressvaluation this_ inner join listing.valuationchangeperiod valuationc2_ on this_.fkvaluationchangeperiodid=valuationc2_.valuationchangeperiodid where this_.fkaddressid=737987 order by this_.addressvaluationid desc limit 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=31.24..31.24 rows=1 width=494) -> Sort (cost=31.24..31.27 rows=11 width=494) Sort Key: this_.addressvaluationid -> Hash Join (cost=1.07..31.05 rows=11 width=494) Hash Cond: (this_.fkvaluationchangeperiodid = valuationc2_.valuationchangeperiodid) -> Index Scan using idx_addressvaluation_fkaddressid on addressvaluation this_ (cost=0.00..29.83 rows=11 width=482) Index Cond: (fkaddressid = 737987) -> Hash (cost=1.03..1.03 rows=3 width=12) -> Seq Scan on valuationchangeperiod valuationc2_ (cost=0.00..1.03 rows=3 width=12) (9 rows)
Thanks! Pallav.
3 Replies
|
|
|
Bad query plans for queries on partitioned table
By Julian Mehnle at Dec 5, 2007, 11:26 am UTC
Hi all, I have a large database with e-mail meta-data (no bodies) for over 100 million messages. I am running PostgreSQL 8.2.4 on a server with 2GB of RAM (shared_buffers = 240MB, temp_buffers = 128MB, work_mem = 256MB, maintenance_work_mem = 256MB). I have the data split in two separate tables,... More...
Hi all,
I have a large database with e-mail meta-data (no bodies) for over 100 million messages. I am running PostgreSQL 8.2.4 on a server with 2GB of RAM (shared_buffers = 240MB, temp_buffers = 128MB, work_mem = 256MB, maintenance_work_mem = 256MB). I have the data split in two separate tables, "email" and "email_extras":
Table "public.email" Column | Type | Modifiers -------------------+-----------------------------+----------- id | bigint | not null load_id | integer | not null ts | timestamp without time zone | not null ip | inet | not null mfrom | text | not null helo | text |
Table "public.email_extras" Column | Type | Modifiers -------------------+-----------------------------+----------- id | bigint | not null ts | timestamp without time zone | not null size | integer | not null hdr_from | text |
Each of these tables has been partitioned equally based on the "ts" (timestamp) field into two dozen or so tables, each covering one week of messages. For testing purposes, I have only one week's partition filled for each of the "email" and "email_extras" tables (email_2007_week34 {,extras}).
Now if I perform the following simple join on the "email" and "email_ extras" tables ...
SELECT count(*) FROM email INNER JOIN email_extras USING (id, ts) WHERE mfrom <> hdr_from;
then I get the following horrendously inefficient plan:
QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=391396890.89..391396890.90 rows=1 width=0) -> Merge Join (cost=9338881.64..349156398.02 rows=16896197148 width=0) Merge Cond: ((public.email_extras.id = public.email.id) AND (public.email_extras.ts = public.email.ts)) Join Filter: (public.email.mfrom <> public.email_extras.hdr_from) -> Sort (cost=4592966.95..4658121.33 rows=26061752 width=48) Sort Key: public.email_extras.id, public.email_extras.ts -> Append (cost=0.00..491969.52 rows=26061752 width=48) -> Seq Scan on email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week13_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week14_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week15_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week16_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week17_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week18_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week19_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week20_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week21_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week22_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week23_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week24_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week25_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week26_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week27_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week28_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week29_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week30_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week31_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week32_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week33_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week34_extras email_extras (cost=0.00..491597.12 rows=26052512 width=33) -> Seq Scan on email_2007_week35_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week36_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week37_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week38_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week39_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Seq Scan on email_2007_week40_extras email_extras (cost=0.00..13.30 rows=330 width=48) -> Sort (cost=4745914.69..4811071.87 rows=26062872 width=48) Sort Key: public.email.id, public.email.ts -> Append (cost=0.00..644732.72 rows=26062872 width=48) -> Seq Scan on email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week13 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week14 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week15 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week16 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week17 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week18 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week19 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week20 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week21 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week22 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week23 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week24 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week25 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week26 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week27 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week28 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week29 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week30 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week31 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week32 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week33 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week34 email (cost=0.00..644349.12 rows=26052512 width=33) -> Seq Scan on email_2007_week35 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week36 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week37 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week38 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week39 email (cost=0.00..13.70 rows=370 width=48) -> Seq Scan on email_2007_week40 email (cost=0.00..13.70 rows=370 width=48) (68 rows)
However, if I restrict the query to just the partitions that actually do have data in them ...
SELECT count(*) FROM email_2007_week34 INNER JOIN email_2007_week34_extras USING (id, ts) WHERE mfrom <> hdr_from;
then I get a much better plan that uses a hash join:
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=4266338.94..4266338.95 rows=1 width=0) -> Hash Join (cost=1111362.80..4266336.07 rows=1145 width=0) Hash Cond: ((email_2007_week34.ts = email_2007_week34_extras.ts) AND (email_2007_week34.id = email_2007_week34_extras.id)) Join Filter: (email_2007_week34.mfrom <> email_2007_week34_extras.hdr_from) -> Seq Scan on email_2007_week34 (cost=0.00..644349.12 rows=26052512 width=33) -> Hash (cost=491597.12..491597.12 rows=26052512 width=33) -> Seq Scan on email_2007_week34_extras (cost=0.00..491597.12 rows=26052512 width=33) (7 rows)
Yes, I have `ANALYZE`d the database before running the queries.
How come the query planner gets thrown off that far by the simple table partitioning? What can I do to put the query planner back on the right track?
Julian.
7 Replies
|
|
|
RAID arrays and performance
By Matthew Wakeling at Dec 5, 2007, 06:47 am UTC
I have a question about how Postgres makes use of RAID arrays for performance, because we are considering buying a 12-disc array for performance reasons. I'm interested in how the performance scales with the number of discs in the array. Now, I know that for an OLTP workload (in other words, lots... More...
I have a question about how Postgres makes use of RAID arrays for performance, because we are considering buying a 12-disc array for performance reasons. I'm interested in how the performance scales with the number of discs in the array.
Now, I know that for an OLTP workload (in other words, lots of small parallel random accesses), the performance will scale almost with the number of discs. However, I'm more interested in the performance of individual queries, particularly those where Postgres has to do an index scan, which will result in a single query performing lots of random accesses to the disc system. Theoretically, this *can* scale with the number of discs too - my question is does it?
Does Postgres issue requests to each random access in turn, waiting for each one to complete before issuing the next request (in which case the performance will not exceed that of a single disc), or does it use some clever asynchronous access method to send a queue of random access requests to the OS that can be distributed among the available discs?
Any knowledgable answers or benchmark proof would be appreciated,
Matthew
-- "To err is human; to really louse things up requires root privileges." -- Alexander Pope, slightly paraphrased
24 Replies
|
|
|
Utilizing multiple cores for one query
By henk de wit at Dec 4, 2007, 1:24 pm UTC
I wonder whether the current versions of postgres (i.e. either 8.2 or 8.3) are able to utilize multiple cores for the execution of a single query? This is one thing that systems like SQL Server and Oracle have been able to do for quite some time. I haven't seen much in the documentation that hints... More...
I wonder whether the current versions of postgres (i.e. either 8.2 or 8.3) are able to utilize multiple cores for the execution of a single query?
This is one thing that systems like SQL Server and Oracle have been able to do for quite some time. I haven't seen much in the documentation that hints that this may be possible in PG, nor did I find much in the mailinglists about this. The only thing I found was a topic that discussed some patches that may eventually lead to a sequence scan being handled by multiple cores.
Could someone shed some light on the current or future abilities of PG for making use of multiple cores to execute a single query?
Thanks in advance
_________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/=
5 Replies
|
|
|
Training Recommendations
By Campbell, Lance at Dec 4, 2007, 08:15 am UTC
This is a multi-part message in MIME format. ------_=_NextPart_001_01C81FCB.13722DFE Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable PostgreSQL: 8.2.4 Does anyone have any companies they would recommend using for performance tuning training of PostgreSQL for... More...
This is a multi-part message in MIME format.
------_=_NextPart_001_01C81FCB.13722DFE Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
PostgreSQL: 8.2.4
Does anyone have any companies they would recommend using for performance tuning training of PostgreSQL for Linux? Or general DBA training?
Thanks,
Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
------_=_NextPart_001_01C81FCB.13722DFE Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">
<head> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"> <meta name=Generator content="Microsoft Word 11 (filtered medium)"> <o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PlaceName"/> <o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PlaceType"/> <o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="place"/> <!--[if !mso]> <style> st1\:*{behavior:url(#default#ieooui) } </style> <![endif]--> <style> <!-- /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0in; margin-bottom:.0001pt; font-size:12.0pt; font-family:"Times New Roman";} a:link, span.MsoHyperlink {color:blue; text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed {color:purple; text-decoration:underline;} span.EmailStyle17 {mso-style-type:personal-compose; font-family:Arial; color:windowtext;} @page Section1 {size:8.5in 11.0in; margin:1.0in 1.25in 1.0in 1.25in;} div.Section1 {page:Section1;} --> </style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>PostgreSQL: 8.2.4<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>Does anyone have any companies they would recommend using for performance tuning training of PostgreSQL for Linux? Or general DBA training?<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>Thanks,<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>Lance Campbell</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>Project Manager/Software Architect</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>Web Services at Public Affairs</span></font><o:p></o:p></p>
<p class=MsoNormal><st1:place w:st="on"><st1:PlaceType w:st="on"><font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>University</span></font></st1:PlaceType><font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'> of <st1:PlaceName w:st="on">Illinois</st1:PlaceName></span></font></st1:place><o:p></o:p></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>217.333.0382</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt; font-family:Arial'>http://webservices.uiuc.edu</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size: 12.0pt'><o:p> </o:p></span></font></p>
</div>
</body>
</html>
------_=_NextPart_001_01C81FCB.13722DFE--
9 Replies
|
|
|
Dealing with big tables
By Mindaugas at Dec 3, 2007, 10:14 pm UTC
Hello, Started to work with big tables (like 300GB) and performance problems sta= rted to appear. :( To simplify things - table has an index on From an index on To columns. A= nd it also have several other not indexed columns. There are 100000+ of dif= ferent values for From and the same for To. I... More...
Hello,
Started to work with big tables (like 300GB) and performance problems sta= rted to appear. :(
To simplify things - table has an index on From an index on To columns. A= nd it also have several other not indexed columns. There are 100000+ of dif= ferent values for From and the same for To.
I execute simple query "select * from bigtable where From=3D'something'".= Query returns like 1000 rows and takes 5++ seconds to complete. As far as = I understand the query is slow because: - first it has to retrieve pointers to rows with data from index. That go= es fast. - retrieve all the rows one by one. There we have 100% random read becaus= e rows with the same From is distributed evenly through all the 300GB and m= ost probably nothing is cached. So there we are limited by _one_ disk perfo= rmance independently of how many disks we have in storage? And in case stor= age consists of 15k rpm Cheetahs with 3.5ms average read seek time we shoul= d expect not more than ~285 rows per second?
I feel that I'm overlooking something here. But I'm new into data warehou= sing. :)
Also this query should greatly benefit from parallel execution or async I= O. Storage (seeks/second) scales almost linearly when it has a lot of disks= . And query is completely IO bound so it should scale well on single server.
And I cannot use some index organized table or table partitioned by From = :) because there are at least 2 similar indexes by which queries can be exe= cuted - From and To.
Ideas for improvement? Greenplum or EnterpriseDB? Or I forgot something f= rom PostgreSQL features.
Thanks,
Mindaugas
8 Replies
|
|
|
EXPLAIN ANALYZE time calculations
By Michael Glaesemann at Dec 3, 2007, 01:10 am UTC
I'd like to get confirmation that I'm correctly understanding the times given in EXPLAIN ANALYZE. Taking the example given in the Using Explain section of the docs, http://www.postgresql.org/docs/current/static/using-explain EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100... More...
I'd like to get confirmation that I'm correctly understanding the times given in EXPLAIN ANALYZE. Taking the example given in the Using Explain section of the docs,
http://www.postgresql.org/docs/current/static/using-explain
EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN ------------------------------------------------------------------------ ---------------------------------------------------------- Nested Loop (cost=2.37..553.11 rows=106 width=488) (actual time=1.392..12.700 rows=100 loops=1) -> Bitmap Heap Scan on tenk1 t1 (cost=2.37..232.35 rows=106 width=244) (actual time=0.878..2.367 rows=100 loops=1) Recheck Cond: (unique1 < 100) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.37 rows=106 width=0) (actual time=0.546..0.546 rows=100 loops=1) Index Cond: (unique1 < 100) -> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..3.01 rows=1 width=244) (actual time=0.067..0.078 rows=1 loops=100) Index Cond: ("outer".unique2 = t2.unique2) Total runtime: 14.452 ms
I'm interested in figuring out what percentage of the total runtime is spent in each node. Here are my calculations.
Nested loop: actual time: 12.700 ms/loop * 1 loop = 12.700 ms percent of total runtime: 88% percent spent in subnodes: 16% + 54% = 70% percent spent in node: 18%
Bitmap Heap Scan on tenk1: actual time: 2.367 ms/loop * 1 loop = 2.367 ms percent of total runtime: 16% time spent in subnodes: 4% time spent in node: 12%
Bitmap Heap Scan on tenk1_unique1: actual time: 0.546 ms/loop * 1 loop = 0.546 ms: 4% time spent in subnodes: 0% time spent in node: 4%
Index Scan total time: actual time: 0.078 ms/loop * 100 loops = 7.80 ms percent of total runtime: 54% percent spent in subnodes: 0% percent spent in node: 54%
executor overhead: 14.452 ms - 12.700 ms = 1.752 ms: 12%
Is this correct?
Michael Glaesemann grzm seespotcode net
3 Replies
|
|
|
PostgreSQL 8.2.5 slow performance on INSERT on Linux
By Beyers Cronje at Dec 2, 2007, 11:32 pm UTC
Hi all, I'm busy evaluating PostgreSQL and I'm having performance problems on one of my servers. I have a very simple one table database, and the client using Mono 1.2.5.1 is running a loop doing INSERTs on the table. Initially I tested this on my development PC, an old P4 system with 2GB RAM and... More...
Hi all,
I'm busy evaluating PostgreSQL and I'm having performance problems on one of my servers. I have a very simple one table database, and the client using Mono 1.2.5.1 is running a loop doing INSERTs on the table. Initially I tested this on my development PC, an old P4 system with 2GB RAM and 10,000 INSERTs took ~12 secs on average, which I was fairly satisfied with. I then moved everything over to our test server, a new Dell 1950 server with quad core Xeon processors, 4GB RAM and SCSI hdd expecting to see better performance, but instead performance dropped to ~44 secs for 10,000 INSERTs. This obviously is not acceptable. Both the PC and server are running the exact same PostgreSQL version, Mono version, client application and both tests were run under very low load and on an empty table. I noticed that CPU utilization on the Dell server is very low, 1-2% utilization, so obviously it's not a load problem. Only the test application is accessing the database.
So my question is, can anyone please give me some tips on what commands or tools I can use to try and pin down where exactly the performance drop is coming from? I'm obviously new to PostgreSQL so even basic checks can be relevant.
Kind regards
Beyers Cronje
3 Replies
|
|
|
Appending "LIMIT" to query drastically decreases performance
By cluster at Nov 30, 2007, 4:01 pm UTC
Can anyone explain the following odd behavior? I have a query that completes in about 90 ms. If I append LIMIT to the very end, eg. "LIMIT 500" the evaluation time increases to about 800 ms. How can performance get *worse* by giving the database the option to stop the evaluation earlier (when it... More...
Can anyone explain the following odd behavior? I have a query that completes in about 90 ms. If I append LIMIT to the very end, eg. "LIMIT 500" the evaluation time increases to about 800 ms. How can performance get *worse* by giving the database the option to stop the evaluation earlier (when it reaches the output 500 rows)?
I have pasted both queries together with output from explain analyze here: http://pastebin.com/m3c0d1896
3 Replies
|
|
|
GiST indexing tuples
By Matthew Wakeling at Nov 30, 2007, 1:45 pm UTC
Hi all. I'm wanting to write a new GiST index system, to improve performance on some queries I am running. I have had quite a look through the docs and code, and I'm not convinced that it is possible to do what I want. This is what I am wanting to index: CREATE INDEX range_index ON table(a, b)... More...
Hi all.
I'm wanting to write a new GiST index system, to improve performance on some queries I am running. I have had quite a look through the docs and code, and I'm not convinced that it is possible to do what I want. This is what I am wanting to index:
CREATE INDEX range_index ON table(a, b) USING fancy_new_index;
and then:
SELECT * FROM table WHERE a > 1 AND b < 4;
and have that answered by the index.
Now, generating an index format that can answer that particular arrangement of constraints is easy. I can do that. However, getting multiple values into the GiST functions is something I don't know how to do. As far as I can see, I would need to create a composite type and index that, like the contrib package seg does. This would change my SQL to:
CREATE INDEX range_index ON table(fancy_type(a, b)) USING fancy_index;
SELECT * FROM table WHERE fancy_type(a, b) &^£@! fancy_type(1, 4);
which I don't want to do.
So, has this problem been solved before? Is there an already-existing index that will speed up my query? Is there a way to get more than one value into a GiST index?
Thanks,
Matthew
-- If you let your happiness depend upon how somebody else feels about you, now you have to control how somebody else feels about you. -- Abraham Hicks
8 Replies
|
|
|
Configuring a Large RAM PostgreSQL Server
By Alex Hochberger at Nov 30, 2007, 05:50 am UTC
Does anyone have any white papers or basic guides for a large RAM server? We are consolidating two databases to enable better data-mining that currently run on a 4 GB and 2 GB machine. The data issues on the 4 GB machine are numerous, things like "create index" fail and update queries fail from out... More...
Does anyone have any white papers or basic guides for a large RAM server?
We are consolidating two databases to enable better data-mining that currently run on a 4 GB and 2 GB machine. The data issues on the 4 GB machine are numerous, things like "create index" fail and update queries fail from out of memory issues. Re-factoring the data is helping, but isn't finishing the job.
The new machine will have 48 GB of RAM, so figuring out starting points for the Shared Buffers and Work_mem/Maintenance_work_mem is going to be a crap shoot, since the defaults still seem to be based upon 256MB of RAM or less.
Usage: Most of the time, the database is being hit with a handle of poorly written and unoptimized queries from a Ruby on Rails app that is being refactored as a simple Ruby-DBI app since we need to support our legacy code but don't need the Rails environment, just a lot of SQL writes. Some stored procedures should streamline this. However, each transaction will do about 5 or 6 writes. Common Usage: we have a reporting tool that is also being refactored, but does a lot of aggregate queries. None of these take more than 500 ms after indexing on the 2 GB database, so assuming that more RAM should help and eliminate the problems. Problem Usage: we have a 20GB table with 120m rows that we are splitting into some sub-tables. Generally, we do large data pulls from here, 1 million - 4 million records at a time, stored in a new table for export. These queries are problematic because we are unable to index the database for the queries that we run because we get out of memory errors. Most of my cleanup has restored to FOR-IN loops via pl-pgsql to manage the data one row at a time. This is problematic because many of these scripts are taking 4-5 days to run. Other usage: we will import between 10k and 10m rows at one time out of CSVs into the big database table. I got my gig here because this was all failing and the data was becoming worthless. These imports involve a lot of writes.
Our simultaneous queries are small, and currently run acceptably. It's the big imports, data-mining pulls, and system manipulation were we routinely wait days on the query that we are looking to speed up.
Thanks, Alex
4 Replies
|
|
 | |