| From | Melanie Plageman |
| --- | --- |
| Subject | Re: Add LSN <-> time conversion functionality |
| Date | |
| Msg-id | CAAKRu_YbbZGz-X_pm2zXJA+6A22YYpaWhOjmytqFL1yF_FCv6w@mail.gmail.com |
| In response to | Re: Add LSN <-> time conversion functionality (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| List | pgsql-hackers |
On Mon, Mar 18, 2024 at 1:29 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 2/22/24 03:45, Melanie Plageman wrote:
> > The attached v3 has a new algorithm. Now, LSNTimes are added from the
> > end of the array forward until all array elements have at least one
> > logical member (array length == volume). Once array length == volume,
> > new LSNTimes will result in merging logical members in existing
> > elements. We want to merge older members because those can be less
> > precise. So, the number of logical members per array element will
> > always monotonically increase starting from the beginning of the array
> > (which contains the most recent data) and going to the end. We want to
> > use all the available space in the array. That means that each LSNTime
> > insertion will always result in a single merge. We want the timeline
> > to be inclusive of the oldest data, so merging means taking the
> > smaller value of two LSNTime values. I had to pick a rule for choosing
> > which elements to merge. So, I choose the merge target as the oldest
> > element whose logical membership is < 2x its predecessor. I merge the
> > merge target's predecessor into the merge target. Then I move all of
> > the intervening elements down 1. Then I insert the new LSNTime at
> > index 0.
>
> I can't help but think about t-digest [1], which also merges data into
> variable-sized buckets (called centroids, which is a pair of values,
> just like LSNTime). But the merging is driven by something called "scale
> function" which I found like a pretty nice approach to this, and it
> yields some guarantees regarding accuracy. I wonder if we could do
> something similar here ...
>
> The t-digest is a way to approximate quantiles, and the default scale
> function is optimized for best accuracy on the extremes (close to 0.0
> and 1.0), but it's possible to use scale functions that optimize only
> for accuracy close to 1.0.
>
> This may be misguided, but I see similarity between quantiles and what
> LSNTimeline does - timestamps are ordered, and quantiles close to 0.0
> are "old timestamps" while quantiles close to 1.0 are "now".
>
> And t-digest also defines a pretty efficient algorithm to merge data in
> a way that gradually combines older "buckets" into larger ones.

I started taking a look at this paper and think the t-digest could be
applicable as a possible alternative data structure to the one I am
using to approximate page age for the actual opportunistic freeze
heuristic -- especially since the centroids are pairs of a mean and a
count. I couldn't quite understand how the t-digest is combining those
centroids. Since I am not computing quantiles over the LSNTimeStream,
though, I think I can probably do something simpler for this part of
the project.
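For anyone following along, the v3 merge rule quoted above works out to
roughly the following (a simplified sketch in the spirit of the python
simulator, not the actual patch code; the class and field names here are
illustrative):

```python
from dataclasses import dataclass

@dataclass
class LSNTime:
    lsn: int          # WAL position, as an integer
    time: float       # timestamp, e.g. epoch seconds
    members: int = 1  # how many logical members this entry represents

class LSNTimeStream:
    """Toy model of the v3 fixed-size array; index 0 is the newest entry."""

    def __init__(self, volume=64):
        self.volume = volume
        self.entries = []

    def insert(self, lsn, time):
        new = LSNTime(lsn, time)
        if len(self.entries) < self.volume:
            # Still filling the array: every entry keeps one logical member.
            self.entries.insert(0, new)
            return

        # Full: pick the merge target -- the oldest entry whose logical
        # membership is < 2x that of its (newer) predecessor.
        target = len(self.entries) - 1
        for i in range(len(self.entries) - 1, 0, -1):
            if self.entries[i].members < 2 * self.entries[i - 1].members:
                target = i
                break

        # Merge the predecessor into the target, keeping the smaller
        # (older) lsn/time so the stream stays inclusive of the oldest data.
        pred, tgt = self.entries[target - 1], self.entries[target]
        tgt.lsn = min(tgt.lsn, pred.lsn)
        tgt.time = min(tgt.time, pred.time)
        tgt.members += pred.members

        # Drop the merged-away slot (shifting the newer entries down one)
        # and insert the new LSNTime at index 0.
        del self.entries[target - 1]
        self.entries.insert(0, new)
```

So each insertion does exactly one merge, and membership grows toward
the old end of the array, which is where the 1, 2, 4, 8, ... capacity
argument below comes from.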
> >> - The LSNTimeline comment claims an array of size 64 is large enough to
> >> not need to care about filling it, but maybe it should briefly explain
> >> why we can never fill it (I guess 2^64 is just too many).

-- snip --

> I guess that should be enough for (2^64-1) logical members, because it's
> a sequence 1, 2, 4, 8, ..., 2^63. Seems enough.
>
> But now that I think about it, does it make sense to do the merging
> based on the number of logical members? Shouldn't this really be driven
> by the "length" of the time interval the member covers?

After reading this, I decided to experiment with a few different
algorithms in python and plot the unabbreviated LSNTimeStream against
various ways of compressing it. You can see the results if you run the
python code here [1].

What I found is that calculating the error introduced by dropping each
point, and then dropping the point whose removal would cause the least
additional error, produced more accurate results than combining the
oldest entries based on logical membership to fit some series.

This is inspired by what you said about using the length of segments to
decide which points to merge. In my case, I want to merge segments that
have a similar slope -- those which have a point that is essentially
redundant.

I loop through the LSNTimeStream and look for the point that we can drop
with the lowest impact on future interpolation accuracy. To do this, I
calculate the area of the triangle made by each point on the stream and
its adjacent points. The idea is that if you drop that point, the area
of that triangle is the amount of error you introduce for points
interpolated between the new pair (the dropped point's previous
neighbors).

This has some issues, but it seems more logical than just always
combining the oldest points. If you run the python simulator code,
you'll see that for the LSNTimeStream I generated, using this method
produces more accurate results than either randomly dropping points or
using the "combine oldest members" method.
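In case it's easier to see in code than in the gist, the core of that
triangle-area idea is roughly this (a simplified sketch, not the exact
simulator code; the helper names are mine):

```python
def triangle_area(p1, p2, p3):
    """Area of the triangle formed by three (lsn, time) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0

def drop_cheapest_point(stream):
    """Drop the interior point whose removal should introduce the least
    interpolation error, approximated by the area of the triangle it
    forms with its two neighbors (the endpoints are never dropped)."""
    if len(stream) <= 2:
        return stream
    areas = [
        (triangle_area(stream[i - 1], stream[i], stream[i + 1]), i)
        for i in range(1, len(stream) - 1)
    ]
    _, victim = min(areas)
    return stream[:victim] + stream[victim + 1:]

# Whenever the stream is over budget, drop the most redundant point.
stream = [(0, 0.0), (10, 1.0), (20, 2.0), (35, 2.5), (50, 5.0)]
stream = drop_cheapest_point(stream)  # drops the collinear (10, 1.0)
```

(The areas are only meaningful relative to each other, since LSNs and
timestamps are on very different scales.)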
It would be nice if we could do something with the accumulated error --
so we could use it to modify estimates when interpolating. I don't
really know how to keep it though. I thought I would just save the
calculated error in one or the other of the adjacent points after
dropping a point, but then what do we do with the error saved in a point
before it is dropped? Add it to the error value in one of the adjacent
points? If we did, what would that even mean? How would we use it?

- Melanie

[1] https://gist.github.com/melanieplageman/95126993bcb43d4b4042099e9d0ccc11