Re: Clock sweep not caching enough B-Tree leaf pages? - Mailing list pgsql-hackers
From | Atri Sharma |
---|---|
Subject | Re: Clock sweep not caching enough B-Tree leaf pages? |
Date | |
Msg-id | CAOeZVicEUNUk-YC0KaZvR37kZWL=otm0TcaV_gzPPEfAqWf9aQ@mail.gmail.com |
In response to | Re: Clock sweep not caching enough B-Tree leaf pages? (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Clock sweep not caching enough B-Tree leaf pages?
|
List | pgsql-hackers |
One idea I have for the eviction policy is to keep an ageing factor in each buffer and take it into consideration when choosing a buffer to evict.
Consider a case where a table is huge and spread across many pages. The query pattern resembles a time series, i.e. a set of rows is queried frequently for some time, making the corresponding page hot. Then the next set of rows is queried frequently, making that page hot, and so on.
Consider a new page entering the shared buffers with refcount 1 and usage_count 1. If that page is part of the workload described above, it is likely that it will not be used for a considerable time after it enters the buffers, but it will be used eventually.
Now, the current hypothetical situation is that we have three pages:
1) The page that was hot in the previous time window but is no longer hot, and is actually the correct candidate for eviction.
2) The current hot page (it won't be evicted for now anyway).
3) The new page which just got in and should not be evicted, since it can be hot soon (for this workload it will be hot in the next time window).
When the clock sweep algorithm next runs, it can pick the new page (3) as the one to evict, since page (1) may still have usage_count > 0, i.e. it may be 'cooling' but not 'cool' yet.
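To make the three-page scenario concrete, here is a toy model of the sweep in Python. It is only a sketch, not PostgreSQL code: the `Buffer` class, the `clock_sweep` function and the usage_count values 3 and 5 are made up for illustration, and I assume the new page has already been unpinned (refcount 0) by the time the sweep runs.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    usage_count: int
    refcount: int = 0  # > 0 means pinned, so the sweep must skip it


def clock_sweep(buffers, hand=0):
    """Toy clock sweep: return the buffer chosen as the victim.

    An unpinned buffer with usage_count 0 is evicted; any other unpinned
    buffer loses one usage_count and the hand moves on.
    """
    if all(b.refcount > 0 for b in buffers):
        raise RuntimeError("no unpinned buffers available")
    while True:
        buf = buffers[hand]
        if buf.refcount == 0:
            if buf.usage_count == 0:
                return buf
            buf.usage_count -= 1
        hand = (hand + 1) % len(buffers)


# Illustrative values only (assumptions, not measurements).
pool = [
    Buffer("old (cooling)", usage_count=3),    # page (1)
    Buffer("hot", usage_count=5, refcount=1),  # page (2), currently pinned
    Buffer("new", usage_count=1),              # page (3)
]
victim = clock_sweep(pool)
print(victim.name)
```

With these numbers the new page reaches usage_count 0 on the hand's second pass, while the old page still has usage_count 1 left, so the new page is the one that gets evicted.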
This can be changed by introducing an ageing factor that tracks how much time the current buffer has spent in shared buffers. If the time that the buffer has spent is large enough (relatively) and it is not hot currently, that means it has had its chance and can be evicted. This shall save the new page (3) from being evicted, since its time in shared buffers shall not be high enough to mandate eviction and it shall be given more chances.
Since gettimeofday() is an expensive call and hence cannot be made in the tight loop, we can instead count the number of clock sweeps the current buffer has seen (rather, survived). This shall give us a rough estimate of the relative age of the buffer.
When an eviction happens, all the candidates with refcount = 0 shall be taken. Then, among them, the one with the highest ageing factor shall be evicted.
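A rough sketch of that selection rule, again as a toy Python model rather than real buffer manager code. The `sweeps_survived` field and the `pick_victim` function are hypothetical names, the numbers are invented, and the tie-break on lower usage_count is my own assumption, since the rule above does not say what to do when ages are equal.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    refcount: int
    usage_count: int
    sweeps_survived: int  # hypothetical ageing factor: sweeps survived so far


def pick_victim(buffers):
    """Among unpinned buffers, return the one with the highest ageing factor."""
    candidates = [b for b in buffers if b.refcount == 0]
    if not candidates:
        return None  # everything is pinned, nothing can be evicted
    # Oldest first; ties broken in favour of the lower usage_count (assumption).
    return max(candidates, key=lambda b: (b.sweeps_survived, -b.usage_count))


# Illustrative values only.
pool = [
    Buffer("old (cooling)", refcount=0, usage_count=2, sweeps_survived=12),
    Buffer("hot", refcount=1, usage_count=5, sweeps_survived=4),
    Buffer("new", refcount=0, usage_count=1, sweeps_survived=0),
]
victim = pick_victim(pool)
print(victim.name)
```

Here the old page is chosen even though its usage_count is still above zero, and the new page survives. Note that in this literal form a hot page is protected only while it is pinned; the "not hot currently" condition would presumably need an extra check on usage_count.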
Of course, there may be better ways of doing the same, but I want to highlight the point (or possibility) of introducing an ageing factor to prevent eviction of relatively younger pages early in the eviction process.
The overhead isn't too big. We just need to add another attribute to the buffer header for the number of clock sweeps seen (rather, survived) and check it when an eviction is taking place. The existing spinlock for buffer headers should be sufficient to protect access to it. The access rules can be similar to those of usage_count.
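The bookkeeping could look roughly like the sketch below. It is a toy Python model: `BufferHeader`, `sweep_visit` and `load_new_page` are hypothetical names, a `threading.Lock` stands in for the buffer header spinlock, and resetting the counter when a new page is loaded is my assumption about how the age would be started.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class BufferHeader:
    refcount: int = 0
    usage_count: int = 0
    sweeps_survived: int = 0  # hypothetical new attribute
    # Stand-in for the existing buffer header spinlock.
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


def sweep_visit(hdr):
    """One visit by the clock hand. Returns True if the buffer can be evicted."""
    with hdr.lock:
        if hdr.refcount == 0 and hdr.usage_count == 0:
            return True
        if hdr.refcount == 0:
            hdr.usage_count -= 1
        # The buffer survived this visit; updated under the same lock
        # that protects usage_count.
        hdr.sweeps_survived += 1
        return False


def load_new_page(hdr):
    """Reuse the buffer for a new page: pinned once, used once, age reset."""
    with hdr.lock:
        hdr.refcount = 1
        hdr.usage_count = 1
        hdr.sweeps_survived = 0


hdr = BufferHeader()
load_new_page(hdr)
hdr.refcount = 0           # the backend unpins the page
first = sweep_visit(hdr)   # usage_count drops to 0, one sweep survived
second = sweep_visit(hdr)  # now unpinned with usage_count 0
print(first, second, hdr.sweeps_survived)
```

The only extra work per visit is one increment made while the header lock is already held, which is the reason I expect the overhead to be small.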
Thoughts and comments?
Regards,
Atri
--
Regards,
Atri
l'apprenant