I have a problem with certain queries performance. Trouble is that
while their execution plan is pretty good and mostly their execution
is great as well, their FIRST execution time (that is after you mount
the database) is abysmal.

I realize that it happens due to the loading of data from the HD to
the memory/swap and it wouldn't be too bad if I just could make the
data stay in the memory, sadly, after a few minutes the data is back
on the HD and running the query again results the same bad
performance.

Here's a query for example, though as I said, this problem occurs in
different queries.

---------------------------------------------------------------------------------------

SELECT *
FROM bv_bookgenres, bv_books
WHERE bv_books.book_id = bv_bookgenres.book_id and genre_id = 987
ORDER BY vote_avg limit 10

---------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------

QUERY PLAN
Limit (cost=2601.16..2601.18 rows=10 width=193) (actual
time=4735.097..4735.107 rows=10 loops=1)
-> Sort (cost=2601.16..2601.70 rows=219 width=193) (actual
time=4735.092..4735.095 rows=10 loops=1)
Sort Key: bv_books.vote_avg
-> Nested Loop (cost=0.00..2592.64 rows=219 width=193)
(actual time=74.615..4719.147 rows=1877 loops=1)
-> Index Scan using i_bookgenres_genre_id on
bv_bookgenres (cost=0.00..1707.03 rows=218 width=4) (actual
time=74.540..2865.366 rows=1877 loops=1)
Index Cond: (genre_id = 987)
-> Index Scan using bv_books_pkey on bv_books
(cost=0.00..4.05 rows=1 width=193) (actual time=0.968..0.971 rows=1
loops=1877)
Index Cond: (bv_books.book_id = "outer".book_id)
Total runtime: 4735.726 ms

---------------------------------------------------------------------------------------

If I run the query again after it just finished running I would get
the following timing:

---------------------------------------------------------------------------------------

Limit (cost=3937.82..3937.84 rows=10 width=204)
-> Sort (cost=3937.82..3938.38 rows=223 width=204)
Sort Key: bv_books.vote_avg
-> Nested Loop (cost=0.00..3929.12 rows=223 width=204)
-> Index Scan using i_bookgenres_genre_id on
bv_bookgenres (cost=0.00..1731.94 rows=222 width=8)
Index Cond: (genre_id = 987)
-> Index Scan using bv_books_pkey on bv_books
(cost=0.00..9.88 rows=1 width=196)
Index Cond: (bv_books.book_id = "outer".book_id)

---------------------------------------------------------------------------------------

Before going on, I should say that I am running PostgreSQL on CoLinux
under Windows 2000. From what I read/tested, the CoLinux performance
on CoLinux are matching to the performance of VMWare. Yet, I'm still
wondering if it is a side effect of my development setup or if some of
my settings are indeed wrong.

With that said, here is the information of the tables:

---------------------------------------------------------------------------------------

CREATE TABLE bv_books
(
book_id serial NOT NULL,
book_name varchar(255) NOT NULL,
series_id int4,
series_index int2,
annotation_desc_id int4,
description_desc_id int4,
book_picture varchar(255) NOT NULL,
reviews_error int4 NOT NULL,
vote_avg float4 NOT NULL,
vote_count int4 NOT NULL,
book_genre int4[],
book_name_fulltext tsearch2.tsvector,
book_name_fulltext2 tsearch2.tsvector,
CONSTRAINT bv_books_pkey PRIMARY KEY (book_id),
CONSTRAINT fk_books_annotation_desc_id FOREIGN KEY
(annotation_desc_id) REFERENCES bv_descriptions (description_id) ON
UPDATE RESTRICT ON DELETE SET NULL,
CONSTRAINT fk_books_description_desc_id FOREIGN KEY
(description_desc_id) REFERENCES bv_descriptions (description_id) ON
UPDATE RESTRICT ON DELETE SET NULL,
CONSTRAINT fk_books_series_id FOREIGN KEY (series_id) REFERENCES
bv_series (series_id) ON UPDATE RESTRICT ON DELETE RESTRICT
)
WITH OIDS;

CREATE TABLE bv_bookgenres
(
book_id int4 NOT NULL,
genre_id int4 NOT NULL,
CONSTRAINT bv_bookgenres_pkey PRIMARY KEY (book_id, genre_id),
CONSTRAINT fk_bookgenres_book_id FOREIGN KEY (book_id) REFERENCES
bv_books (book_id) ON UPDATE RESTRICT ON DELETE CASCADE,
CONSTRAINT fk_bookgenres_genre_id FOREIGN KEY (genre_id) REFERENCES
bv_genres (genre_id) ON UPDATE RESTRICT ON DELETE RESTRICT
)
WITH OIDS;

---------------------------------------------------------------------------------------

As far as the data is concerned, there are around 170,000 rows in
bv_books and 940,000 in bv_bookgenres. There are also btree index on
all the relevant (to the query) fields.

I can live up with the fact that the data has to be loaded the first
time it is accessed, but is it possible to make it stick longer in the
memory? Is it the fact that CoLinux gets only 128MB of RAM? Or one of
my settings should be fixed?

Thanks

Search Discussions

  • Josh Berkus at Aug 28, 2004 at 9:33 pm
    Vitaly,
    I have a problem with certain queries performance. Trouble is that
    while their execution plan is pretty good and mostly their execution
    is great as well, their FIRST execution time (that is after you mount
    the database) is abysmal.
    This is a well-known problem. The general approach to this is to run a
    script to do select * queries against all important tables on system
    start-up.
    I realize that it happens due to the loading of data from the HD to
    the memory/swap and it wouldn't be too bad if I just could make the
    data stay in the memory, sadly, after a few minutes the data is back
    on the HD and running the query again results the same bad
    performance.
    This could be for a variety of reasons. On a standard platform (which yours
    most definitely is not), this would be due to database vacuuming, commits of
    large updates to your data, or another application using most of the system
    memory.
    Before going on, I should say that I am running PostgreSQL on CoLinux
    under Windows 2000. From what I read/tested, the CoLinux performance
    on CoLinux are matching to the performance of VMWare. Yet, I'm still
    wondering if it is a side effect of my development setup or if some of
    my settings are indeed wrong.
    Probably you will continue to get worse-than-normal performance from both.
    You simply can't expect performance PostgreSQL running on an emulation
    environment. If you could, we wouldn't have bothered with a Windows port.
    Speaking of which, have you started testing the Windows port? I'd be
    interested in your comparison of it against running on CoLinux.
    I can live up with the fact that the data has to be loaded the first
    time it is accessed, but is it possible to make it stick longer in the
    memory? Is it the fact that CoLinux gets only 128MB of RAM? Or one of
    my settings should be fixed?
    Well, mostly it's that you should start testing 8.0, and the Windows port.
    Not only should running native be better, but 8.0 (thanks to the work of Jan
    Wieck) is now able to take advantage of a large chunk of dedicated memory,
    which earlier versions were not. Also, "lazy vacuum" and the "background
    writer", also features of 8.0 and Jan's work, should prevent PostgreSQL from
    cleaning out its own cache completely. You should test this,
    *particularly* on Windows where we could use some more performance testing.

    --
    --Josh

    Josh Berkus
    Aglio Database Solutions
    San Francisco

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedAug 28, '04 at 5:42p
activeAug 28, '04 at 9:33p
posts2
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Vitaly Belman: 1 post Josh Berkus: 1 post

People

Translate

site design / logo © 2022 Grokbase