Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot
Date
Msg-id ZE9LiFc7JdNHokz/@paquier.xyz
Whole thread Raw
In response to Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot
List pgsql-bugs
On Fri, Apr 28, 2023 at 04:07:52PM +0900, Kyotaro Horiguchi wrote:
> For the the record, I'm not saying that it is dangerous to clear
> snapshots directly in the callback. In fact, as I mentioned earlier, I
> believe there is no issue with that.  But, I believe it is simpler
> that the actual work is separate from the callback path since we don't
> need to worry about when the guc-callback will be called.

pgstat_clear_backend_activity_snapshot() and pgstat_assert_is_up() are
the two points of contention here.  The former could be called in a
non-backend context, which would be an incorrect use of this API.  I
am not completely sure that pgstat_assert_is_up() would be always
correct, as well, though I have not been able to see a problem.

Anyway, sorry for the late replay, it took me some time to study this
subsystem and understand what's going on here (like why we have fixed
snapshots but these don't list a timestamp, for example).  At the end,
I think that I'm OK with your suggestion to just force a snapshot
cleanup each time the GUC is changed, mainly because that's simple to
understand, and do the clear based on the timing where the next
snapshot build would happen.  I would be curious to hear arguments if
there would be a need for a more complex mechanism, but that seems
overkill to me as this is mainly here when switching out of the
"snapshot" mode.  A commit/abort would force a cleanup of the
snapshot, clearing the flag, as well.

I would add a small note in the docs about this behavior.  Another
thing is to add a few pg_stat_get_snapshot_timestamp() after building
a snapshot for each mode tested.

While on it, I have noticed that postgresql.conf.sample does not list
the values available for stats_fetch_consistency.  These had better be
added to the sample file, no?

At the end, I am finishing with the attached, without the sample file
part.  Thoughts?
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Maxim Boguk
Date:
Subject: Re: BUG #17913: alter column set (n_distinct=...) on partition head doesn't work for declarative partitioned tables
Next
From: Michael Paquier
Date:
Subject: Re: pg_basebackup: errors on macOS on directories with ".DS_Store" files