Simon,

thanks for the time to give this further thought.


Simon Riggs wrote:
If we know ahead of time that a large scan is going to have this effect,
why wait for the ARC to play its course, why not take exactly the same
action?
Have large scans call StrategyHint also. (Maybe rename it...?)...of
course, some extra code to establish it IS a large scan...
...large table lookup should wait until a shared catalog cache is
implemented
The problem with this is a) how to detect that something will be a large
scan, and b) how to decide what is a large scan in the first place.

Large sequential scans in warehousing are often part of more complex
join operations. And just because something returns a large number of
result rows doesn't mean that the input data was that much.

As for the definition of "large" itself, this depends on the size of the
buffer cache and the access pattern of the application. As you surely
have noticed, the usual sizes of B1+T1 = T2+B2 = C in the algorithm.
Buffers evicted from T1 are remembered in B1, and because of that even
repeated sequential scans of the same large relation will only cycle
through T1 blocks, never cause any turbulence in T2 or B2.

The only thing that will affect T2 and B2 dramatically by adjusting the
cache split point is multiple scanning of more than one significantly
large but smaller than C table. Scanning the same large but smaller C
table over and over will have it after the second scan in T2, where it
belongs. But having two tables A and B that are both just smaller C and
having an access pattern like A, A, B, B, A, A, ... will cause many B1
hits and thereby increase the target T1 size. And it must be exactly
that access pattern, because A, A, A, B, B, B, A, A, A, ... produces a
complete MISS on the first, a B1 hit on the second and a B2 hit on the
third scan, so it will up and down the split point evenly.

Honestly, I don't even know what type of application could possibly
produce such a screwed access pattern. And I am absolutely confident one
can find corner cases to wring down Oracles complicated configuration
harness more easily.
Anyway, this idea can wait at least until we have extensive performance
tuning on DBT-3 with 7.5. Thanks again for adding the new algorithm.
Everyone is always welcome to try and show that something can be
improved. And we are in the middle of the 7.5 development cycle, so feel
free to hack around.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

Search Discussions

Discussion Posts

Previous

Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 19 of 29 | next ›
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedDec 18, '03 at 5:15p
activeJan 28, '04 at 12:31a
posts29
users11
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase