Home > mailing lists

Re: Add LSN <-> time conversion functionality - Mailing list pgsql-hackers

From	Daniel Gustafsson
Subject	Re: Add LSN <-> time conversion functionality
Date	March 18, 2024 14:02:51
Msg-id	FD6CD2E9-27C5-4CBC-9AAE-23CEBDB6D15B@yesql.se Whole thread Raw
In response to	Re: Add LSN <-> time conversion functionality (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: Add LSN <-> time conversion functionality Re: Add LSN <-> time conversion functionality
List	pgsql-hackers

Tree view

> On 22 Feb 2024, at 03:45, Melanie Plageman <melanieplageman@gmail.com> wrote:
> On Fri, Feb 16, 2024 at 3:41 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:

>> - Not sure why we need 0001. Just so that the "estimate" functions in
>> 0002 have a convenient "start" point? Surely we could look at the
>> current LSNTimeline data and use the oldest value, or (if there's no
>> data) use the current timestamp/LSN?
>
> When there are 0 or 1 entries in the timeline you'll get an answer
> that could be very off if you just return the current timestamp or
> LSN. I guess that is okay?

I don't think that's a huge problem at such a young "lsn-age", but I might be
missing something.

>> - I wonder what happens if we lose the data - we know that if people
>> reset statistics for whatever reason (or just lose them because of a
>> crash, or because they're on a replica), bad things happen to
>> autovacuum. What's the (expected) impact on pruning?
>
> This is an important question. Because stats aren't crashsafe, we
> could return very inaccurate numbers for some time/LSN values if we
> crash. I don't actually know what we could do about that. When I use
> the LSNTimeline for the freeze heuristic it is less of an issue
> because the freeze heuristic has a fallback strategy when there aren't
> enough stats to do its calculations. But other users of this
> LSNTimeline will simply get highly inaccurate results (I think?). Is
> there anything we could do about this? It seems bad.

A complication with this over stats is that we can't recompute this in case of
a crash/corruption issue.  The simple solution is to consider this unlogged
data and start fresh at every unclean shutdown, but I have a feeling that won't
be good enough for basing heuristics on?

> Andres had brought up something at some point about, what if the
> database is simply turned off for awhile and then turned back on. Even
> if you cleanly shut down, will there be "gaps" in the timeline? I
> think that could be okay, but it is something to think about.

The gaps would represent reality, so there is nothing wrong per se with gaps,
but if they inflate the interval of a bucket which in turns impact the
precision of the results then that can be a problem.

> Just a note, all of my comments could use a lot of work, but I want to
> get consensus on the algorithm before I make sure and write about it
> in a perfect way.

I'm not sure "a lot of work" is accurate, they seem pretty much there to me,
but I think that an illustration of running through the algorithm in an
ascii-art array would be helpful.

Reading through this I think such a function has merits, not only for your
usecase but other heuristic based work and quite possibly systems debugging.

While the bucketing algorithm is a clever algorithm for degrading precision for
older entries without discarding them, I do fear that we'll risk ending up with
answers like "somewhere between in the past and even further in the past".
I've been playing around with various compression algorithms for packing the
buckets such that we can retain precision for longer.  Since you were aiming to
work on other patches leading up to the freeze, let's pick this up again once
things calm down.

When compiling I got this warning for lsntime_merge_target:

pgstat_wal.c:242:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
1 warning generated.

The issue seems to be that the can't-really-happen path is protected with an
Assertion, which is a no-op for production builds.  I think we should handle
the error rather than rely on testing catching it (since if it does happen even
though it can't, it's going to be when it's for sure not tested).  Returning an
invalid array subscript like -1 and testing for it in lsntime_insert, with an
elog(WARNING..), seems enough.

-    last_snapshot_lsn <= GetLastImportantRecPtr())
+    last_snapshot_lsn <= current_lsn)
I think we should delay extracting the LSN with GetLastImportantRecPtr until we
know that we need it, to avoid taking locks in this codepath unless needed.

I've attached a diff with the above suggestions which applies on op of your
patchset.

--
Daniel Gustafsson

Attachment

review.txt

pgsql-hackers by date:

From: Anthonin Bonnefoy
Date: 18 March 2024, 14:00:39
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing

From: Alvaro Herrera
Date: 18 March 2024, 14:09:23
Subject: Re: speed up a logical replica setup

Re: Add LSN <-> time conversion functionality - Mailing list pgsql-hackers

Attachment

Previous

Next