Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers

From: Melanie Plageman
Subject: Re: Eager page freeze criteria clarification
Msg-id: CAAKRu_Z-ZcJm_b9OvwYNHEY6FhtdAdXOcCVK7iT=teOhi_HqGQ@mail.gmail.com
In response to: Re: Eager page freeze criteria clarification (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Eager page freeze criteria clarification
List: pgsql-hackers
On Wed, Dec 13, 2023 at 12:24 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Great results.

Thanks!

> On Sat, Dec 9, 2023 at 5:12 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > Values can be "removed" from the accumulator by simply decrementing its
> > cardinality and decreasing the sum and sum squared by a value that will
> > not change the mean and standard deviation of the overall distribution.
> > To adapt to a table's changing access patterns, we'll need to remove
> > values from this accumulator over time, but this patch doesn't yet
> > decide when to do this. A simple solution may be to cap the cardinality
> > of the accumulator to the greater of 1% of the table size, or some fixed
> > number of values (perhaps 200?). Even without such removal of values,
> > the distribution recorded in the accumulator will eventually skew toward
> > more recent data, albeit at a slower rate.
>
> I think we're going to need something here. Otherwise, after 6 months
> of use, changing a table's perceived access pattern will be quite
> difficult.
>
> I think one challenge here is to find something that doesn't decay too
> often and end up with cases where it basically removes all the data.
>
> As long as you avoid that, I suspect that the algorithm might not be
> terribly sensitive to other kinds of changes. If you decay after 200
> values or 2000 or 20,000, it will only affect how fast we can change
> our notion of the access pattern, and my guess would be that any of
> those values would produce broadly acceptable results, with some
> differences in the details. If you decay after 200,000,000 values or
> not at all, then I think there will be problems.

I'll add the decay logic and devise a benchmark that will exercise it.
I can test at least one or two of these ideas.

> > The goal is to keep pages frozen for at least target_freeze_duration.
> > target_freeze_duration is in seconds and pages only have a last
> > modification LSN, so target_freeze_duration must be converted to LSNs.
> > To accomplish this, I've added an LSNTimeline data structure, containing
> > XLogRecPtr, TimestampTz pairs stored with decreasing precision as they
> > age. When we need to translate the GUC value to LSNs, we linearly
> > interpolate it on this timeline. For the time being, the global
> > LSNTimeline is in PgStat_WalStats and is only updated by vacuum. There
> > is no reason it can't be updated with some other cadence and/or by some
> > other process (nothing about it is inherently tied to vacuum). The
> > cached translated value of target_freeze_duration is stored in each
> > table's stats. This is arbitrary as it is not a table-level stat.
> > However, it needs to be located somewhere that is accessible on
> > update/delete. We may want to recalculate it more often than once per
> > table vacuum, especially in case of long-running vacuums.
>
> This part sounds like it isn't quite baked yet. The idea of the data
> structure seems fine, but updating it once per vacuum sounds fairly
> unprincipled to me? Don't we want the updates to happen on a somewhat
> regular wall clock cadence?

Yes, this part was not fully baked. I actually discussed this with
Andres at PGConf EU last week, and he suggested that the background
writer update the LSNTimeline. He also suggested I propose the
LSNTimeline in a new thread. I could add a pageinspect function
returning the estimated time of last page modification given the page
LSN (so that the LSNTimeline is proposed along with a user of it).

- Melanie