Jan Wieck wrote:

have you read src/backend/storage/buffer/README of current CVS tip?

The algorithm in the new replacement strategy is an attempt to figure
that SMALL_TABLE_THRESHOLD automatically. Do you see anything that can
be improved in that algorithm?

I've read src/backend/storage/buffer/README rev 1.6 as you suggest. The
new algorithm looks great - many thanks for implementing that.

I'm not able to improve on this for the general case. I especially like
the automatic management that it gives, allowing you to avoid additional
DBA-set parameters (and the coding required to add those options).

My concern was for DBT-3 performance and general Decision Support (DSS)
workloads, where a large proportion of table scans occur (though not on
the single-threaded DBT-3 test). The new strategy is much better than
the older one and is likely to have a positive effect in this area. I
don't think, right now, that anything further should be changed, in the
interests of stability.
For the record/for the future: My observation was that two commercial
databases focused on DSS use a strategy which in terms of the new ARC
implementation is effectively: "place blocks in T1 (RECENCY/RECYCLE
buffer) and NEVER promote them to T2 (FREQUENCY/KEEP buffer)" when they
do large object scans.
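To make that concrete, here is a minimal sketch of such a "T1-only" policy (purely illustrative C, not PostgreSQL code): a normally accessed block is promoted from T1 to T2 on its second touch, but a block flagged as part of a large scan never is, so the scan cannot flush hot pages out of T2. The list sizes and all names here are assumptions.

```c
#include <stdbool.h>

#define T1_SIZE 4               /* recency ("RECYCLE") list, assumed size */
#define T2_SIZE 4               /* frequency ("KEEP") list, assumed size  */

static int t1[T1_SIZE], t1_len = 0;
static int t2[T2_SIZE], t2_len = 0;

static bool in_list(const int *list, int len, int blk)
{
    for (int i = 0; i < len; i++)
        if (list[i] == blk)
            return true;
    return false;
}

static void remove_from(int *list, int *len, int blk)
{
    for (int i = 0; i < *len; i++)
        if (list[i] == blk)
        {
            for (int j = i + 1; j < *len; j++)
                list[j - 1] = list[j];
            (*len)--;
            return;
        }
}

static void push(int *list, int *len, int cap, int blk)
{
    if (*len == cap)            /* list full: evict the oldest entry */
    {
        for (int i = 1; i < cap; i++)
            list[i - 1] = list[i];
        (*len)--;
    }
    list[(*len)++] = blk;
}

/* Access a block; large_scan == true suppresses T1 -> T2 promotion. */
void access_block(int blk, bool large_scan)
{
    if (in_list(t2, t2_len, blk))
        return;                         /* already in the hot list */
    if (in_list(t1, t1_len, blk))
    {
        if (!large_scan)                /* second touch: promote to T2 */
        {
            remove_from(t1, &t1_len, blk);
            push(t2, &t2_len, T2_SIZE, blk);
        }
        return;                         /* large scan: never promote */
    }
    push(t1, &t1_len, T1_SIZE, blk);    /* first touch: recency only */
}
```

With this policy a repeated full-table scan cycles its blocks through T1 alone, leaving T2 for genuinely frequently accessed pages.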

In the new README, you note that:
    StrategyHintVacuum(bool vacuum_active)

        Because vacuum reads all relations of the entire database
        through the buffer manager, it can greatly disturb the
        buffer replacement strategy. This function is used by vacuum
        to inform that all subsequent buffer lookups are caused
        by vacuum scanning relations.
...I would say that scans of very large tables also "greatly disturb the
buffer replacement strategy", i.e. have exactly the same effect on the
cache as the Vacuum utility.

You'd clearly thought of the idea before me, though with regard to
vacuum rather than to large table scans.

If we know ahead of time that a large scan is going to have this effect,
why wait for the ARC to play its course? Why not take exactly the same
approach: have large scans call StrategyHint also. (Maybe rename it...?)
...of course, some extra code would be needed to establish that it IS a
large scan... ...large table lookup should perhaps wait until a shared
catalog cache is available...
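A sketch of what generalising the hint might look like (the names below are hypothetical; the only function that actually exists in the README is StrategyHintVacuum):

```c
/* Hypothetical generalisation of StrategyHintVacuum: the same kind of
 * flag, settable by any large sequential scan, so its reads get the
 * same treatment as vacuum's.  Illustrative only, not PostgreSQL API. */
#include <stdbool.h>

static bool strategy_hint_bulk_scan = false;

/* Analogue of StrategyHintVacuum(), called before and after a big scan. */
void StrategyHintBulkScan(bool scan_active)
{
    strategy_hint_bulk_scan = scan_active;
}

/* Inside the (sketched) replacement logic: only promote a re-touched
 * buffer from T1 to T2 when no bulk scan has announced itself. */
bool should_promote_to_t2(void)
{
    return !strategy_hint_bulk_scan;
}
```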

Anyway, this idea can wait at least until we have extensive performance
tuning on DBT-3 with 7.5. Thanks again for adding the new algorithm.

Best Regards, Simon

Simon Riggs wrote:
My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of the KEEP bufferpool, then we place the blocks into
the RECYCLE bufferpool. This can be decided immediately following
optimization, rather than including it within the optimizer decision
process, since we aren't going to change the way the statement
executes; we're just trying to stop it from having an adverse effect
on other current or future statements.
[additional test to set parameter, then work out where to note it]
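That threshold decision could be sketched as follows; the 25% figure is the fraction suggested above, and the helper name and pool identifiers are illustrative assumptions:

```c
/* Sketch of the proposed post-optimization check: if the table to be
 * scanned exceeds some fraction of the KEEP bufferpool, direct its
 * blocks to the RECYCLE bufferpool instead.  Not PostgreSQL code. */
enum pool { RECYCLE_POOL, KEEP_POOL };

enum pool choose_pool(long table_blocks, long keep_pool_blocks)
{
    const double fraction = 0.25;       /* suggested threshold */

    if ((double) table_blocks > fraction * (double) keep_pool_blocks)
        return RECYCLE_POOL;            /* large scan: spare the KEEP pool */
    return KEEP_POOL;                   /* small table: cache normally */
}
```

Because the decision uses only the table size and a fixed pool size, it can indeed be made after planning without feeding back into the optimizer's cost model.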

Discussion Overview
group: pgsql-hackers
posted: Dec 18, '03 at 5:15p
active: Jan 28, '04 at 12:31a