Re: cache control? - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: cache control? |
Date | |
Msg-id | 40110F1F.4050605@Yahoo.com Whole thread Raw |
In response to | Re: cache control? ("Simon Riggs" <simon@2ndquadrant.com>) |
Responses |
Re: cache control?
|
List | pgsql-hackers |
Simon, thanks for the time to give this further thought. Simon Riggs wrote: > If we know ahead of time that a large scan is going to have this effect, > why wait for the ARC to play its course, why not take exactly the same > action? > Have large scans call StrategyHint also. (Maybe rename it...?)...of > course, some extra code to establish it IS a large scan... > ...large table lookup should wait until a shared catalog cache is > implemented The problem with this is a) how to detect that something will be a large scan, and b) how to decide what is a large scan in the first place. Large sequential scans in warehousing are often part of more complex join operations. And just because something returns a large number of result rows doesn't mean that the input data was that much. As for the definition of "large" itself, this depends on the size of the buffer cache and the access pattern of the application. As you surely have noticed, the usual sizes of B1+T1 = T2+B2 = C in the algorithm. Buffers evicted from T1 are remembered in B1, and because of that even repeated sequential scans of the same large relation will only cycle through T1 blocks, never cause any turbulence in T2 or B2. The only thing that will affect T2 and B2 dramatically by adjusting the cache split point is multiple scanning of more than one significantly large but smaller than C table. Scanning the same large but smaller C table over and over will have it after the second scan in T2, where it belongs. But having two tables A and B that are both just smaller C and having an access pattern like A, A, B, B, A, A, ... will cause many B1 hits and thereby increase the target T1 size. And it must be exactly that access pattern, because A, A, A, B, B, B, A, A, A, ... produces a complete MISS on the first, a B1 hit on the second and a B2 hit on the third scan, so it will up and down the split point evenly. Honestly, I don't even know what type of application could possibly produce such a screwed access pattern. And I am absolutely confident one can find corner cases to wring down Oracles complicated configuration harness more easily. > > Anyway, this idea can wait at least until we have extensive performance > tuning on DBT-3 with 7.5. Thanks again for adding the new algorithm. Everyone is always welcome to try and show that something can be improved. And we are in the middle of the 7.5 development cycle, so feel free to hack around. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
pgsql-hackers by date: