Thread: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and swithing cache->snapshot

The following bug has been logged on the website:

Bug reference:      17804
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 15.2
Operating system:   Ubuntu 22.04
Description:

The following transaction:
BEGIN;
SET LOCAL stats_fetch_consistency = cache;
SELECT * FROM pg_stat_database;
SET LOCAL stats_fetch_consistency = snapshot;
SELECT pg_stat_get_function_calls(0);

raises an assertion failure with the following stack:
Core was generated by `postgres: law regression [local] SELECT
                        '.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/1355782' in core file too small.
#0  __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140663482824512) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140663482824512) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140663482824512) at
./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140663482824512, signo=signo@entry=6) at
./nptl/pthread_kill.c:89
#3  0x00007feec4a1a476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4  0x00007feec4a007f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000555cf0603315 in ExceptionalCondition (
    conditionName=0x555cf07ecf38 "pgStatLocal.snapshot.stats->members == 0",

    errorType=0x555cf07ec974 "FailedAssertion", fileName=0x555cf07ecc8f
"pgstat.c", lineNumber=948)
    at assert.c:69
#6  0x0000555cf045e815 in pgstat_build_snapshot () at pgstat.c:948
#7  0x0000555cf045e39b in pgstat_fetch_entry (kind=PGSTAT_KIND_FUNCTION,
dboid=16384, objoid=0)
    at pgstat.c:798
#8  0x0000555cf0461fae in pgstat_fetch_stat_funcentry (func_id=0) at
pgstat_function.c:242
#9  0x0000555cf0544c66 in pg_stat_get_function_calls (fcinfo=0x555cf164cb68)
at pgstatfuncs.c:384
#10 0x0000555cf01b7396 in ExecInterpExpr (state=0x555cf164ca80,
econtext=0x555cf164c780, 
    isnull=0x7ffe50dc040f) at execExprInterp.c:751
#11 0x0000555cf01b964e in ExecInterpExprStillValid (state=0x555cf164ca80,
econtext=0x555cf164c780, 
    isNull=0x7ffe50dc040f) at execExprInterp.c:1826
#12 0x0000555cf0217315 in ExecEvalExprSwitchContext (state=0x555cf164ca80,
econtext=0x555cf164c780, 
    isNull=0x7ffe50dc040f) at ../../../src/include/executor/executor.h:341
#13 0x0000555cf021738d in ExecProject (projInfo=0x555cf164ca78)
    at ../../../src/include/executor/executor.h:375
#14 0x0000555cf02175d7 in ExecResult (pstate=0x555cf164c668) at
nodeResult.c:136
#15 0x0000555cf01d022d in ExecProcNodeFirst (node=0x555cf164c668) at
execProcnode.c:464
#16 0x0000555cf01c34ee in ExecProcNode (node=0x555cf164c668) at
../../../src/include/executor/executor.h:259
#17 0x0000555cf01c63f5 in ExecutePlan (estate=0x555cf164c430,
planstate=0x555cf164c668, 
    use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true,
numberTuples=0, 
    direction=ForwardScanDirection, dest=0x555cf16519a0, execute_once=true)
at execMain.c:1636
#18 0x0000555cf01c3c25 in standard_ExecutorRun (queryDesc=0x555cf165bba0,
direction=ForwardScanDirection, 
    count=0, execute_once=true) at execMain.c:363
#19 0x0000555cf01c3a10 in ExecutorRun (queryDesc=0x555cf165bba0,
direction=ForwardScanDirection, count=0, 
    execute_once=true) at execMain.c:307
#20 0x0000555cf043cae6 in PortalRunSelect (portal=0x555cf15eeb90,
forward=true, count=0, dest=0x555cf16519a0)
    at pquery.c:924
#21 0x0000555cf043c704 in PortalRun (portal=0x555cf15eeb90,
count=9223372036854775807, isTopLevel=true, 
    run_once=true, dest=0x555cf16519a0, altdest=0x555cf16519a0,
qc=0x7ffe50dc0810) at pquery.c:768
#22 0x0000555cf0435590 in exec_simple_query (
    query_string=0x555cf1582570 "SELECT pg_stat_get_function_calls(0);") at
postgres.c:1250
#23 0x0000555cf043a4af in PostgresMain (dbname=0x555cf15aea18 "regression",
username=0x555cf15ae9f8 "law")
    at postgres.c:4593
#24 0x0000555cf035fd2b in BackendRun (port=0x555cf15a6130) at
postmaster.c:4511
#25 0x0000555cf035f5b2 in BackendStartup (port=0x555cf15a6130) at
postmaster.c:4239
#26 0x0000555cf035b53f in ServerLoop () at postmaster.c:1806
#27 0x0000555cf035ac9c in PostmasterMain (argc=3, argv=0x555cf157da30) at
postmaster.c:1478
#28 0x0000555cf024ebdd in main (argc=3, argv=0x555cf157da30) at main.c:202

It appeared with the stats_fetch_consistency introduction (5891c7a8e).

Looks like pgstat_fetch_entry() can fill up the snapshot.stats structure
and it's contents preserved when switching the consistency mode,
what pgstat_build_snapshot() doesn't expect.


At Wed, 22 Feb 2023 06:00:01 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in 
> The following transaction:
> BEGIN;
> SET LOCAL stats_fetch_consistency = cache;
> SELECT * FROM pg_stat_database;
> SET LOCAL stats_fetch_consistency = snapshot;
> SELECT pg_stat_get_function_calls(0);
> 
> raises an assertion failure with the following stack:
> Core was generated by `postgres: law regression [local] SELECT              
>                         '.
> Program terminated with signal SIGABRT, Aborted.

Thanks for the report!  We didn't consider the case that variable is
changed in-transaction.

Perhaps we should reject changes to fetch consistency once we the
snapshot hash is created.  I can think of two ways to do this. One
approach is to perform that check at GUC-level. However, this would
result in an ERROR meassage that isn't very informative. Addition to
that, it seems like aborting the transaction would be too severe a
punishment for this operation.

ERROR:  invalid value for parameter "stats_fetch_consistency": "cache"
DETAIL:  pgstat_fetch_consistency cannot be changed in this context.
ERROR:  current transaction is aborted, commands ignored until end of transactio

The attached is an alternative solution. If pgstat_fetch_consistency
is different from the existing snapshot, pgstat_fetch_entry() will
issue a warning then continue with the old snapshot mode instead. To
make this work, I modified pgstat_prep_snapshot to determine the
actual consistency mode to use, and then the subsequent code uses the
returned mode.

WARNING:  ignored in-transaction change of statistics consistency

The patch allows the transitions only none->cahce and none->snapshot.
To dislable all types of transitions, it seems that we need either an
additional boolean in the struct PgStat_LocalState or an additional
enumerator in the enyum PgStat_FetchConsistency to keep track of
whether pgstat_prep_snapshot() is called during the current
transaction.

What are your thoughts on this?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment
Hello Kyotaro-san,
24.02.2023 10:08, Kyotaro Horiguchi wrote:
> Thanks for the report!  We didn't consider the case that variable is
> changed in-transaction.
>
>
> What are your thoughts on this?
What if we implement assign_pgstat_fetch_consistency() and execute
pgstat_clear_snapshot() in it when the mode switch happens (to avoid
possible inconsistency between the GUC and the current pgstat state)?
If snapshots in different modes contain different data sets, will such
behaviour be unexpected for a user?
For example, if a user performs:
SET LOCAL stats_fetch_consistency = snapshot;
{some-actions-1}
SET LOCAL stats_fetch_consistency = none;
{some-actions-2}
SET LOCAL stats_fetch_consistency = snapshot;
{some-actions-3}
Should she expect that some-actions-1, some-actions-2 will be
reflected in the snapshot at the end?

Best regards,
Alexander



Hello,

24.02.2023 19:00, Alexander Lakhin wrote:
> Hello Kyotaro-san,
> 24.02.2023 10:08, Kyotaro Horiguchi wrote:
>> Thanks for the report!  We didn't consider the case that variable is
>> changed in-transaction.
>>
>>
>> What are your thoughts on this?
> What if we implement assign_pgstat_fetch_consistency() and execute
> pgstat_clear_snapshot() in it when the mode switch happens (to avoid
> possible inconsistency between the GUC and the current pgstat state)?

Please look at the attached patch, which implements that idea and also
includes a simple test. The behavior change doesn't affect any of existing
tests and seems natural to me, so I haven't added a note about it in the docs.

Best regards,
Alexander
Attachment
At Thu, 27 Apr 2023 13:00:00 +0300, Alexander Lakhin <exclusion@gmail.com> wrote in
> Hello,
>
> 24.02.2023 19:00, Alexander Lakhin wrote:
> > Hello Kyotaro-san,
> > 24.02.2023 10:08, Kyotaro Horiguchi wrote:
> >> Thanks for the report!  We didn't consider the case that variable is
> >> changed in-transaction.
> >>
> >>
> >> What are your thoughts on this?
> > What if we implement assign_pgstat_fetch_consistency() and execute
> > pgstat_clear_snapshot() in it when the mode switch happens (to avoid
> > possible inconsistency between the GUC and the current pgstat state)?

Thank you for the patch. (Sorry, I forgot about this..)

> Please look at the attached patch, which implements that idea and also
> includes a simple test. The behavior change doesn't affect any of
> existing
> tests and seems natural to me, so I haven't added a note about it in
> the docs.

I tried to keep existing snapshots as long as possible, but
considering how rarely users change the variable and the lack of
necessity to to preserve the snapshots, it's simply easier to clear
them with each variable change.

Nonetheless, I'm a bit concerned about making a direct call to
pgstat_clear_snapshot() from the assign callback, it might be fine for
now, but I worry that it could an issue later on.

So, how about just settin a trigger that causes a snapshot clearing
prior to the next use, like the attached?

As for the test, I can't come up with a better one, but I think the
comment should explain its intention more clearly.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment
Hello Kyotaro-san,

28.04.2023 04:41, Kyotaro Horiguchi wrote:
> Nonetheless, I'm a bit concerned about making a direct call to
> pgstat_clear_snapshot() from the assign callback, it might be fine for
> now, but I worry that it could an issue later on.
>
> So, how about just settin a trigger that causes a snapshot clearing
> prior to the next use, like the attached?

I'm agree with the change — it's still rather simple and fixes the issue.
Maybe you would also find appropriate to update the comment for
pgstat_get_stat_snapshot_timestamp().
Perhaps add something like?:
Note that the snapshot may be cleared here, if requested.

> As for the test, I can't come up with a better one, but I think the
> comment should explain its intention more clearly.

I'd hesitated to mention a crash in the comment because I've seen
assertion failures only (and no crash with a non-assert build), so I
thought that the test should illustrate the new behavior in the first place.
But I'm not opposed to your comment change, maybe it's more prominent indeed.

Thank you for the fix!

Best regards,
Alexander



On Fri, Apr 28, 2023 at 07:00:01AM +0300, Alexander Lakhin wrote:
> 28.04.2023 04:41, Kyotaro Horiguchi wrote:
>> Nonetheless, I'm a bit concerned about making a direct call to
>> pgstat_clear_snapshot() from the assign callback, it might be fine for
>> now, but I worry that it could an issue later on.
>>
>> So, how about just settin a trigger that causes a snapshot clearing
>> prior to the next use, like the attached?
>
> I'm agree with the change — it's still rather simple and fixes the issue.
> Maybe you would also find appropriate to update the comment for
> pgstat_get_stat_snapshot_timestamp().
> Perhaps add something like?:
> Note that the snapshot may be cleared here, if requested.

FWIW, I was looking at this patch and this proposal of relying on a
static flag to conditionally clear a snapshot looks rather brittle by
design to me because this relies on quite a few assumptions that the
snapshot will always be cleared when necessary, as proved by the two
code paths of pgstat.c patched where each gain a check on
clear_existing_snapshot.  The AtEOXacts for pgstat when doing an
abort/commit/prepare would guarantee that the flag is cleared, still
that's not really exciting.  What is the problem in forcing a snapshot
cleanup each time stats_fetch_consistency is switched to a new value?
Somebody using SET LOCAL on that means, at least it sounds to me as
such, that they don't really care about the current snapshot contents
so they'd better be wiped out.  And the cleanup timing does not really
seem to matter much.

>> As for the test, I can't come up with a better one, but I think the
>> comment should explain its intention more clearly.
>
> I'd hesitated to mention a crash in the comment because I've seen
> assertion failures only (and no crash with a non-assert build), so I
> thought that the test should illustrate the new behavior in the first place.
> But I'm not opposed to your comment change, maybe it's more prominent indeed.

If the issue is fixed, there would be no crash.
--
Michael

Attachment
At Fri, 28 Apr 2023 13:37:23 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Apr 28, 2023 at 07:00:01AM +0300, Alexander Lakhin wrote:
> > 28.04.2023 04:41, Kyotaro Horiguchi wrote:
> >> Nonetheless, I'm a bit concerned about making a direct call to
> >> pgstat_clear_snapshot() from the assign callback, it might be fine for
> >> now, but I worry that it could an issue later on.
> >> 
> >> So, how about just settin a trigger that causes a snapshot clearing
> >> prior to the next use, like the attached?
> > 
> > I'm agree with the change ― it's still rather simple and fixes the issue.
> > Maybe you would also find appropriate to update the comment for
> > pgstat_get_stat_snapshot_timestamp().
> > Perhaps add something like?:
> > Note that the snapshot may be cleared here, if requested.
> 
> FWIW, I was looking at this patch and this proposal of relying on a
> static flag to conditionally clear a snapshot looks rather brittle by
> design to me because this relies on quite a few assumptions that the
> snapshot will always be cleared when necessary, as proved by the two
> code paths of pgstat.c patched where each gain a check on

pgstat_get_stat_snapshot_timestmp() necessarily requires that. Just
for seeming consistency.  The only significant part is
gstat_prep_snapshot, which is always called when the caller wants a
snapshot.

> clear_existing_snapshot.  The AtEOXacts for pgstat when doing an
> abort/commit/prepare would guarantee that the flag is cleared, still
> that's not really exciting.  What is the problem in forcing a snapshot
> cleanup each time stats_fetch_consistency is switched to a new value?

Just I wanted not to do that much in the guc callback including memory
context operations. If it is compeltly safe, I don't object just
clearing snapshots in the callback.

> Somebody using SET LOCAL on that means, at least it sounds to me as
> such, that they don't really care about the current snapshot contents
> so they'd better be wiped out.  And the cleanup timing does not really
> seem to matter much.
> 
> >> As for the test, I can't come up with a better one, but I think the
> >> comment should explain its intention more clearly.
> > 
> > I'd hesitated to mention a crash in the comment because I've seen
> > assertion failures only (and no crash with a non-assert build), so I
> > thought that the test should illustrate the new behavior in the first place.
> > But I'm not opposed to your comment change, maybe it's more prominent indeed.
> 
> If the issue is fixed, there would be no crash.

If crash is not appropriate there, I am fine with "changing the
variable clears snapshot".

Regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



At Fri, 28 Apr 2023 15:04:04 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Fri, 28 Apr 2023 13:37:23 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> > FWIW, I was looking at this patch and this proposal of relying on a
> > static flag to conditionally clear a snapshot looks rather brittle by
> > design to me because this relies on quite a few assumptions that the
> > snapshot will always be cleared when necessary, as proved by the two
> > code paths of pgstat.c patched where each gain a check on
> 
> pgstat_get_stat_snapshot_timestmp() necessarily requires that. Just

Sorry, this is actually the opposite.

pgstat_get_stat_snapshot_timestmp() doesn't necessarily requires that. Just

> for seeming consistency.  The only significant part is
> gstat_prep_snapshot, which is always called when the caller wants a
> snapshot.

Regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Fri, Apr 28, 2023 at 03:04:04PM +0900, Kyotaro Horiguchi wrote:
> Just I wanted not to do that much in the guc callback including memory
> context operations. If it is compeltly safe, I don't object just
> clearing snapshots in the callback.

I vaguely recalled some memory context deletions done in one of the
assign callbacks, like the one for the plan resets, but it doesn't
seem to be the case, looking closely..  Hm.
--
Michael

Attachment
At Fri, 28 Apr 2023 15:48:16 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Apr 28, 2023 at 03:04:04PM +0900, Kyotaro Horiguchi wrote:
> > Just I wanted not to do that much in the guc callback including memory
> > context operations. If it is compeltly safe, I don't object just
> > clearing snapshots in the callback.
> 
> I vaguely recalled some memory context deletions done in one of the
> assign callbacks, like the one for the plan resets, but it doesn't
> seem to be the case, looking closely..  Hm.

For the the record, I'm not saying that it is dangerous to clear
snapshots directly in the callback. In fact, as I mentioned earlier, I
believe there is no issue with that.  But, I believe it is simpler
that the actual work is separate from the callback path since we don't
need to worry about when the guc-callback will be called.

Regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Fri, Apr 28, 2023 at 04:07:52PM +0900, Kyotaro Horiguchi wrote:
> For the the record, I'm not saying that it is dangerous to clear
> snapshots directly in the callback. In fact, as I mentioned earlier, I
> believe there is no issue with that.  But, I believe it is simpler
> that the actual work is separate from the callback path since we don't
> need to worry about when the guc-callback will be called.

pgstat_clear_backend_activity_snapshot() and pgstat_assert_is_up() are
the two points of contention here.  The former could be called in a
non-backend context, which would be an incorrect use of this API.  I
am not completely sure that pgstat_assert_is_up() would be always
correct, as well, though I have not been able to see a problem.

Anyway, sorry for the late replay, it took me some time to study this
subsystem and understand what's going on here (like why we have fixed
snapshots but these don't list a timestamp, for example).  At the end,
I think that I'm OK with your suggestion to just force a snapshot
cleanup each time the GUC is changed, mainly because that's simple to
understand, and do the clear based on the timing where the next
snapshot build would happen.  I would be curious to hear arguments if
there would be a need for a more complex mechanism, but that seems
overkill to me as this is mainly here when switching out of the
"snapshot" mode.  A commit/abort would force a cleanup of the
snapshot, clearing the flag, as well.

I would add a small note in the docs about this behavior.  Another
thing is to add a few pg_stat_get_snapshot_timestamp() after building
a snapshot for each mode tested.

While on it, I have noticed that postgresql.conf.sample does not list
the values available for stats_fetch_consistency.  These had better be
added to the sample file, no?

At the end, I am finishing with the attached, without the sample file
part.  Thoughts?
--
Michael

Attachment
On Mon, May 01, 2023 at 02:18:00PM +0900, Michael Paquier wrote:
> At the end, I am finishing with the attached, without the sample file
> part.  Thoughts?

FYI, I am planning to get this stuff done next week after the next set
of minor releases gets tagged (time was a bit scarce here this week to
get that properly done).  If there are no objections, of course.
--
Michael

Attachment
On Sat, May 06, 2023 at 12:28:51PM +0900, Michael Paquier wrote:
> FYI, I am planning to get this stuff done next week after the next set
> of minor releases gets tagged (time was a bit scarce here this week to
> get that properly done).  If there are no objections, of course.

Okay, so applied and backpatched down to 15 after a second review of
the whole.
--
Michael

Attachment