Re: Add LSN <-> time conversion functionality - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Add LSN <-> time conversion functionality
Date
Msg-id FD6CD2E9-27C5-4CBC-9AAE-23CEBDB6D15B@yesql.se
Whole thread Raw
In response to Re: Add LSN <-> time conversion functionality  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Add LSN <-> time conversion functionality
Re: Add LSN <-> time conversion functionality
List pgsql-hackers
> On 22 Feb 2024, at 03:45, Melanie Plageman <melanieplageman@gmail.com> wrote:
> On Fri, Feb 16, 2024 at 3:41 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:

>> - Not sure why we need 0001. Just so that the "estimate" functions in
>> 0002 have a convenient "start" point? Surely we could look at the
>> current LSNTimeline data and use the oldest value, or (if there's no
>> data) use the current timestamp/LSN?
>
> When there are 0 or 1 entries in the timeline you'll get an answer
> that could be very off if you just return the current timestamp or
> LSN. I guess that is okay?

I don't think that's a huge problem at such a young "lsn-age", but I might be
missing something.

>> - I wonder what happens if we lose the data - we know that if people
>> reset statistics for whatever reason (or just lose them because of a
>> crash, or because they're on a replica), bad things happen to
>> autovacuum. What's the (expected) impact on pruning?
>
> This is an important question. Because stats aren't crashsafe, we
> could return very inaccurate numbers for some time/LSN values if we
> crash. I don't actually know what we could do about that. When I use
> the LSNTimeline for the freeze heuristic it is less of an issue
> because the freeze heuristic has a fallback strategy when there aren't
> enough stats to do its calculations. But other users of this
> LSNTimeline will simply get highly inaccurate results (I think?). Is
> there anything we could do about this? It seems bad.

A complication with this over stats is that we can't recompute this in case of
a crash/corruption issue.  The simple solution is to consider this unlogged
data and start fresh at every unclean shutdown, but I have a feeling that won't
be good enough for basing heuristics on?

> Andres had brought up something at some point about, what if the
> database is simply turned off for awhile and then turned back on. Even
> if you cleanly shut down, will there be "gaps" in the timeline? I
> think that could be okay, but it is something to think about.

The gaps would represent reality, so there is nothing wrong per se with gaps,
but if they inflate the interval of a bucket which in turns impact the
precision of the results then that can be a problem.

> Just a note, all of my comments could use a lot of work, but I want to
> get consensus on the algorithm before I make sure and write about it
> in a perfect way.

I'm not sure "a lot of work" is accurate, they seem pretty much there to me,
but I think that an illustration of running through the algorithm in an
ascii-art array would be helpful.


Reading through this I think such a function has merits, not only for your
usecase but other heuristic based work and quite possibly systems debugging.

While the bucketing algorithm is a clever algorithm for degrading precision for
older entries without discarding them, I do fear that we'll risk ending up with
answers like "somewhere between in the past and even further in the past".
I've been playing around with various compression algorithms for packing the
buckets such that we can retain precision for longer.  Since you were aiming to
work on other patches leading up to the freeze, let's pick this up again once
things calm down.

When compiling I got this warning for lsntime_merge_target:

pgstat_wal.c:242:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
1 warning generated.

The issue seems to be that the can't-really-happen path is protected with an
Assertion, which is a no-op for production builds.  I think we should handle
the error rather than rely on testing catching it (since if it does happen even
though it can't, it's going to be when it's for sure not tested).  Returning an
invalid array subscript like -1 and testing for it in lsntime_insert, with an
elog(WARNING..), seems enough.


-    last_snapshot_lsn <= GetLastImportantRecPtr())
+    last_snapshot_lsn <= current_lsn)
I think we should delay extracting the LSN with GetLastImportantRecPtr until we
know that we need it, to avoid taking locks in this codepath unless needed.

I've attached a diff with the above suggestions which applies on op of your
patchset.

--
Daniel Gustafsson



Attachment

pgsql-hackers by date:

Previous
From: Anthonin Bonnefoy
Date:
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing
Next
From: Alvaro Herrera
Date:
Subject: Re: speed up a logical replica setup