Thread: Snapshot related assert failure on skink

Snapshot related assert failure on skink

From
Andres Freund
Date:
Hi,

I just saw a BF failure on skink (valgrind) that asserts out.

Check the 002_compare_backups failure in:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-04%2017%3A35%3A01

TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)"), File:
"../pgsql/src/backend/storage/ipc/procarray.c",Line: 2132, PID: 3115649
 
postgres: pitr2: bf postgres [local] SELECT(ExceptionalCondition+0x5f) [0x6e45e0]
postgres: pitr2: bf postgres [local] SELECT(+0x467c3d) [0x56fc3d]
postgres: pitr2: bf postgres [local] SELECT(GetSnapshotData+0x60) [0x570fbd]
postgres: pitr2: bf postgres [local] SELECT(GetNonHistoricCatalogSnapshot+0x53) [0x7285c5]
postgres: pitr2: bf postgres [local] SELECT(GetCatalogSnapshot+0x28) [0x72969b]
postgres: pitr2: bf postgres [local] SELECT(systable_beginscan+0xc9) [0x252602]
postgres: pitr2: bf postgres [local] SELECT(+0x5c1e56) [0x6c9e56]
postgres: pitr2: bf postgres [local] SELECT(+0x5c20e9) [0x6ca0e9]
postgres: pitr2: bf postgres [local] SELECT(SearchCatCache+0x18) [0x6ca71c]
postgres: pitr2: bf postgres [local] SELECT(SearchSysCache+0x21) [0x6deeae]
postgres: pitr2: bf postgres [local] SELECT(GetSysCacheOid+0x23) [0x6df219]
postgres: pitr2: bf postgres [local] SELECT(get_namespace_oid+0x2f) [0x2dbae0]
postgres: pitr2: bf postgres [local] SELECT(+0x1d3c13) [0x2dbc13]
postgres: pitr2: bf postgres [local] SELECT(+0x1d3e1b) [0x2dbe1b]
postgres: pitr2: bf postgres [local] SELECT(+0x1d3e72) [0x2dbe72]
postgres: pitr2: bf postgres [local] SELECT(FuncnameGetCandidates+0xaf) [0x2ddef1]
postgres: pitr2: bf postgres [local] SELECT(func_get_detail+0x96) [0x4bf521]
postgres: pitr2: bf postgres [local] SELECT(ParseFuncOrColumn+0x670) [0x4c13b9]
postgres: pitr2: bf postgres [local] SELECT(+0x3b1123) [0x4b9123]
postgres: pitr2: bf postgres [local] SELECT(+0x3b12ec) [0x4b92ec]
postgres: pitr2: bf postgres [local] SELECT(+0x3b2c60) [0x4bac60]
postgres: pitr2: bf postgres [local] SELECT(+0x3b12dc) [0x4b92dc]
postgres: pitr2: bf postgres [local] SELECT(transformExpr+0x20) [0x4b8faa]
postgres: pitr2: bf postgres [local] SELECT(transformTargetEntry+0x7c) [0x4cf4f9]
postgres: pitr2: bf postgres [local] SELECT(transformTargetList+0x96) [0x4cf5a6]
postgres: pitr2: bf postgres [local] SELECT(+0x39a5b9) [0x4a25b9]
postgres: pitr2: bf postgres [local] SELECT(transformStmt+0x141) [0x4a304b]
postgres: pitr2: bf postgres [local] SELECT(+0x39c390) [0x4a4390]
postgres: pitr2: bf postgres [local] SELECT(transformTopLevelStmt+0x19) [0x4a43cc]
postgres: pitr2: bf postgres [local] SELECT(parse_analyze_fixedparams+0x56) [0x4a4424]
postgres: pitr2: bf postgres [local] SELECT(pg_analyze_and_rewrite_fixedparams+0x40) [0x59b021]
postgres: pitr2: bf postgres [local] SELECT(+0x4937f6) [0x59b7f6]
postgres: pitr2: bf postgres [local] SELECT(PostgresMain+0x940) [0x59db54]
postgres: pitr2: bf postgres [local] SELECT(BackendMain+0x59) [0x596b38]
postgres: pitr2: bf postgres [local] SELECT(postmaster_child_launch+0xeb) [0x4ee282]
postgres: pitr2: bf postgres [local] SELECT(+0x3e9271) [0x4f1271]
postgres: pitr2: bf postgres [local] SELECT(+0x3eadb6) [0x4f2db6]
postgres: pitr2: bf postgres [local] SELECT(PostmasterMain+0x118b) [0x4f43b2]
postgres: pitr2: bf postgres [local] SELECT(main+0x1e2) [0x421da2]
/lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x5a27ca8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x5a27d65]
postgres: pitr2: bf postgres [local] SELECT(_start+0x21) [0x1eb421]
2025-03-04 19:11:37.941 UTC [3104796][postmaster][:0] LOG:  client backend (PID 3115649) was terminated by signal 6:
Aborted
2025-03-04 19:11:37.941 UTC [3104796][postmaster][:0] DETAIL:  Failed process was running: SELECT NOT
pg_is_in_recovery();
2025-03-04 19:11:37.943 UTC [3104796][postmaster][:0] LOG:  terminating any other active server processes


The code directly in question hasn't changed in a few years. I haven't seen
this crash before.  The only recent-ish changes in the wider area are:

commit 952365cded6
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   2024-12-23 12:42:39 +0200
 
    Remove unnecessary GetTransactionSnapshot() calls

commit 578a7fe7b6f8484f6d7caa2fda288abb3fe87aa0
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   2024-12-21 23:42:39 +0200
 
    Update TransactionXmin when MyProc->xmin is updated

but neither is immediately yelling "here" to me.


I don't really see how pg_combinebackup could lead to this issue either.


One potentially interesting bit is that the statement is running *during* a
shutdown checkpoint:

2025-03-04 19:11:37.377 UTC [3106505][checkpointer][:0] LOG:  checkpoint starting: end-of-recovery immediate wait
2025-03-04 19:11:37.454 UTC [3115649][client backend][2/2:0] LOG:  statement: SELECT NOT pg_is_in_recovery();
TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)"), File:
"../pgsql/src/backend/storage/ipc/procarray.c",Line: 2132, PID: 3115649
 
...
2025-03-04 19:11:37.585 UTC [3106505][checkpointer][:0] LOG:  checkpoint complete: wrote 0 buffers (0.0%), wrote 3 SLRU
buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.039 s, sync=0.010 s, total=0.198 s; sync files=2,
longest=0.006 s, average=0.004 s; distance=16384 kB, estimate=16384 kB; lsn=0/5000060, redo lsn=0/5000060
 


I wonder if the problem is related to a transaction first using a snapshot
started in recovery and then a "normal" snapshot?

Greetings,

Andres Freund



Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:
On 3/4/25 23:25, Andres Freund wrote:
> Hi,
> 
> I just saw a BF failure on skink (valgrind) that asserts out.
> 
> Check the 002_compare_backups failure in:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-04%2017%3A35%3A01
> 
> TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)"), File:
"../pgsql/src/backend/storage/ipc/procarray.c",Line: 2132, PID: 3115649
 
> postgres: pitr2: bf postgres [local] SELECT(ExceptionalCondition+0x5f) [0x6e45e0]
> postgres: pitr2: bf postgres [local] SELECT(+0x467c3d) [0x56fc3d]
> postgres: pitr2: bf postgres [local] SELECT(GetSnapshotData+0x60) [0x570fbd]
> postgres: pitr2: bf postgres [local] SELECT(GetNonHistoricCatalogSnapshot+0x53) [0x7285c5]
> postgres: pitr2: bf postgres [local] SELECT(GetCatalogSnapshot+0x28) [0x72969b]
> postgres: pitr2: bf postgres [local] SELECT(systable_beginscan+0xc9) [0x252602]
> postgres: pitr2: bf postgres [local] SELECT(+0x5c1e56) [0x6c9e56]
> postgres: pitr2: bf postgres [local] SELECT(+0x5c20e9) [0x6ca0e9]
> postgres: pitr2: bf postgres [local] SELECT(SearchCatCache+0x18) [0x6ca71c]
> postgres: pitr2: bf postgres [local] SELECT(SearchSysCache+0x21) [0x6deeae]
> postgres: pitr2: bf postgres [local] SELECT(GetSysCacheOid+0x23) [0x6df219]
> postgres: pitr2: bf postgres [local] SELECT(get_namespace_oid+0x2f) [0x2dbae0]
> postgres: pitr2: bf postgres [local] SELECT(+0x1d3c13) [0x2dbc13]
> postgres: pitr2: bf postgres [local] SELECT(+0x1d3e1b) [0x2dbe1b]
> postgres: pitr2: bf postgres [local] SELECT(+0x1d3e72) [0x2dbe72]
> postgres: pitr2: bf postgres [local] SELECT(FuncnameGetCandidates+0xaf) [0x2ddef1]
> postgres: pitr2: bf postgres [local] SELECT(func_get_detail+0x96) [0x4bf521]
> postgres: pitr2: bf postgres [local] SELECT(ParseFuncOrColumn+0x670) [0x4c13b9]
> postgres: pitr2: bf postgres [local] SELECT(+0x3b1123) [0x4b9123]
> postgres: pitr2: bf postgres [local] SELECT(+0x3b12ec) [0x4b92ec]
> postgres: pitr2: bf postgres [local] SELECT(+0x3b2c60) [0x4bac60]
> postgres: pitr2: bf postgres [local] SELECT(+0x3b12dc) [0x4b92dc]
> postgres: pitr2: bf postgres [local] SELECT(transformExpr+0x20) [0x4b8faa]
> postgres: pitr2: bf postgres [local] SELECT(transformTargetEntry+0x7c) [0x4cf4f9]
> postgres: pitr2: bf postgres [local] SELECT(transformTargetList+0x96) [0x4cf5a6]
> postgres: pitr2: bf postgres [local] SELECT(+0x39a5b9) [0x4a25b9]
> postgres: pitr2: bf postgres [local] SELECT(transformStmt+0x141) [0x4a304b]
> postgres: pitr2: bf postgres [local] SELECT(+0x39c390) [0x4a4390]
> postgres: pitr2: bf postgres [local] SELECT(transformTopLevelStmt+0x19) [0x4a43cc]
> postgres: pitr2: bf postgres [local] SELECT(parse_analyze_fixedparams+0x56) [0x4a4424]
> postgres: pitr2: bf postgres [local] SELECT(pg_analyze_and_rewrite_fixedparams+0x40) [0x59b021]
> postgres: pitr2: bf postgres [local] SELECT(+0x4937f6) [0x59b7f6]
> postgres: pitr2: bf postgres [local] SELECT(PostgresMain+0x940) [0x59db54]
> postgres: pitr2: bf postgres [local] SELECT(BackendMain+0x59) [0x596b38]
> postgres: pitr2: bf postgres [local] SELECT(postmaster_child_launch+0xeb) [0x4ee282]
> postgres: pitr2: bf postgres [local] SELECT(+0x3e9271) [0x4f1271]
> postgres: pitr2: bf postgres [local] SELECT(+0x3eadb6) [0x4f2db6]
> postgres: pitr2: bf postgres [local] SELECT(PostmasterMain+0x118b) [0x4f43b2]
> postgres: pitr2: bf postgres [local] SELECT(main+0x1e2) [0x421da2]
> /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x5a27ca8]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x5a27d65]
> postgres: pitr2: bf postgres [local] SELECT(_start+0x21) [0x1eb421]
> 2025-03-04 19:11:37.941 UTC [3104796][postmaster][:0] LOG:  client backend (PID 3115649) was terminated by signal 6:
> Aborted
> 2025-03-04 19:11:37.941 UTC [3104796][postmaster][:0] DETAIL:  Failed process was running: SELECT NOT
> pg_is_in_recovery();
> 2025-03-04 19:11:37.943 UTC [3104796][postmaster][:0] LOG:  terminating any other active server processes
> 
> 
> The code directly in question hasn't changed in a few years. I haven't seen
> this crash before.  The only recent-ish changes in the wider area are:
> 
> commit 952365cded6
> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date:   2024-12-23 12:42:39 +0200
>  
>     Remove unnecessary GetTransactionSnapshot() calls
> 
> commit 578a7fe7b6f8484f6d7caa2fda288abb3fe87aa0
> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date:   2024-12-21 23:42:39 +0200
>  
>     Update TransactionXmin when MyProc->xmin is updated
> 
> but neither is immediately yelling "here" to me.
> 
> 
> I don't really see how pg_combinebackup could lead to this issue either.
> 
> 
> One potentially interesting bit is that the statement is running *during* a
> shutdown checkpoint:
> 
> 2025-03-04 19:11:37.377 UTC [3106505][checkpointer][:0] LOG:  checkpoint starting: end-of-recovery immediate wait
> 2025-03-04 19:11:37.454 UTC [3115649][client backend][2/2:0] LOG:  statement: SELECT NOT pg_is_in_recovery();
> TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)"), File:
"../pgsql/src/backend/storage/ipc/procarray.c",Line: 2132, PID: 3115649
 
> ...
> 2025-03-04 19:11:37.585 UTC [3106505][checkpointer][:0] LOG:  checkpoint complete: wrote 0 buffers (0.0%), wrote 3 SLRU
> buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.039 s, sync=0.010 s, total=0.198 s; sync files=2,
> longest=0.006 s, average=0.004 s; distance=16384 kB, estimate=16384 kB; lsn=0/5000060, redo lsn=0/5000060
 
> 
> 
> I wonder if the problem is related to a transaction first using a snapshot
> started in recovery and then a "normal" snapshot?
> 

Hi,

Alexander Lakhin pointed me to this thread after I posted about hitting
the same assert [1].

[1]
https://www.postgresql.org/message-id/3aee3535-184d-423e-a5eb-05161e8b5617%40vondra.me

It very much seems like the same issue - the backtrace matches exactly,
except for the query executed, of course. The exact mechanism may be a
bit different - in my stress test (restarting the primary/standby in
different ways, while running pgbench on both instances, etc.) there's
no end-of-recovery checkpoint, for example.

In my tests, 100% of the failures hit the assert immediately after these
messages in the server log (all failures are on the standby):

  LOG:  database system is ready to accept read-only connections
  LOG:  started streaming WAL from primary at 2/86000000 on timeline 1

And by "immediately" I mean it's the very next thing in the log. It
seems something happens (to the RecentXmin?) when opening the standby
for read-only queries that breaks an existing snapshot ... I'm not sure
if the end-of-recovery checkpoint does something similar.

I'm still fiddling with the script, trying to increase the probability
of the (apparent) race condition. On one machine (old Xeon) I can hit it
very easily/reliably, while on a different machine (new Ryzen) it's very
rare. I don't know if that's due to difference in speed of the CPU, or
fewer cores, ... I guess it changes the timing just enough.

I've also tried running the stress test on PG17, and I've yet to see a
single failure there. Not even on the xeon machine, which hits it
reliably on 18. So this seems to be a PG18-only issue.

If needed, I can try adding more logging, or test a patch.


regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:
On 3/17/25 12:36, Tomas Vondra wrote:
> ...
> 
> I'm still fiddling with the script, trying to increase the probability
> of the (apparent) race condition. On one machine (old Xeon) I can hit it
> very easily/reliably, while on a different machine (new Ryzen) it's very
> rare. I don't know if that's due to difference in speed of the CPU, or
> fewer cores, ... I guess it changes the timing just enough.
> 
> I've also tried running the stress test on PG17, and I'm yet to see a
> single failure there. Not even on the xeon machine, that hits it
> reliably on 18. So this seems to be a PG18-only issue.
> 

And of course, the moment I sent this, I got a failure on 17 too. But
it seems much harder to hit (compared to 18).

regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Thomas Munro
Date:
On Tue, Mar 18, 2025 at 12:59 AM Tomas Vondra <tomas@vondra.me> wrote:
> On 3/17/25 12:36, Tomas Vondra wrote:
> > I'm still fiddling with the script, trying to increase the probability
> > of the (apparent) race condition. On one machine (old Xeon) I can hit it
> > very easily/reliably, while on a different machine (new Ryzen) it's very
> > rare. I don't know if that's due to difference in speed of the CPU, or
> > fewer cores, ... I guess it changes the timing just enough.
> >
> > I've also tried running the stress test on PG17, and I'm yet to see a
> > single failure there. Not even on the xeon machine, that hits it
> > reliably on 18. So this seems to be a PG18-only issue.
> >
>
> And of course, the moment I sent this, I got a failure on 17 too. But
> it seems much harder to hit (compared to 18).

Could there be a connection to this commit?

commit 119c23eb9819213551cbe7e7665c8b493c59ceee
Author: Nathan Bossart <nathan@postgresql.org>
Date:   Tue Sep 5 13:59:06 2023 -0700

    Replace known_assigned_xids_lck with memory barriers.



Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:

On 3/17/25 13:18, Thomas Munro wrote:
> On Tue, Mar 18, 2025 at 12:59 AM Tomas Vondra <tomas@vondra.me> wrote:
>> On 3/17/25 12:36, Tomas Vondra wrote:
>>> I'm still fiddling with the script, trying to increase the probability
>>> of the (apparent) race condition. On one machine (old Xeon) I can hit it
>>> very easily/reliably, while on a different machine (new Ryzen) it's very
>>> rare. I don't know if that's due to difference in speed of the CPU, or
>>> fewer cores, ... I guess it changes the timing just enough.
>>>
>>> I've also tried running the stress test on PG17, and I'm yet to see a
>>> single failure there. Not even on the xeon machine, that hits it
>>> reliably on 18. So this seems to be a PG18-only issue.
>>>
>>
>> And of course, the moment I sent this, I got a failure on 17 too. But
>> it seems much harder to hit (compared to 18).
> 
> Could there be a connection to this commit?
> 
> commit 119c23eb9819213551cbe7e7665c8b493c59ceee
> Author: Nathan Bossart <nathan@postgresql.org>
> Date:   Tue Sep 5 13:59:06 2023 -0700
> 
>     Replace known_assigned_xids_lck with memory barriers.

Doesn't seem to be the case. I reverted this (on master), and I still
get the assert failures (roughly the same number / loop).


regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:
I kept stress-testing this, and while the frequency massively increased
on PG18, I managed to reproduce this all the way back to PG14. I see
~100x more corefiles on PG18.

That is not a proof the issue was introduced in PG14, maybe it's just
the assert that was added there or something. Or maybe there's another
bug in PG18, making the impact worse.

But I'd suspect this is a bug in

commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
Author: Andres Freund <andres@anarazel.de>
Date:   Mon Aug 17 21:07:10 2020 -0700

    snapshot scalability: cache snapshots using a xact completion counter.

    Previous commits made it faster/more scalable to compute snapshots. But
    not building a snapshot is still faster. Now that GetSnapshotData() does
    not maintain RecentGlobal* anymore, that is actually not too hard:

    ...

which is the main snapshot-related change in PG14. I added some logging
to all places changing TransactionXmin and RecentXmin in procarray.c,
and all the failures look like this:

------------------------
1742348688.896 2025-03-19 02:44:48.896 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:1] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotData TransactionXmin = 209485

1742348688.896 2025-03-19 02:44:48.896 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:2] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotData RecentXmin = 209485

1742348688.902 2025-03-19 02:44:48.902 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:3] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotDataReuse TransactionXmin = 209485

1742348688.902 2025-03-19 02:44:48.902 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:4] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotDataReuse RecentXmin = 209485

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:5] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotData TransactionXmin = 209508

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:6] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:7] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotData RecentXmin = 209508

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:8] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:9] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotDataReuse RecentXmin = 209485 at character 22

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:10] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;
TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin)"), File: "procarray.c", Line: 2136, PID: 2001770
------------------------

These are all logs for pgbench backend PID 2001770. First, there's a
call to GetSnapshotData and then to GetSnapshotDataReuse, both setting
the two variables to 209485.

    LOG:  GetSnapshotData TransactionXmin = 209485
    LOG:  GetSnapshotData RecentXmin = 209485
    LOG:  GetSnapshotDataReuse TransactionXmin = 209485
    LOG:  GetSnapshotDataReuse RecentXmin = 209485

I believe this is some very early setup, because both log lines have

    [%b:%a] = [client backend:[unknown]]

And then we get another pair of these calls, where GetSnapshotData
advances the XIDs to 209508. But the following GetSnapshotDataReuse call
updates only RecentXmin, setting it back to 209485, which precedes
TransactionXmin.

Clearly, GetSnapshotDataReuse skips setting TransactionXmin because
there is still a valid XID in MyProc->xmin:

    if (!TransactionIdIsValid(MyProc->xmin))
        MyProc->xmin = TransactionXmin = snapshot->xmin;

I don't know if that's correct, and it seems a bit weird that RecentXmin
moves backwards.
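
For context, the rest of the reuse path looks roughly like this (a
simplified sketch of GetSnapshotDataReuse() in procarray.c, paraphrased
rather than quoted exactly; curXactCompletionCount stands for the shared
xactCompletionCount read under ProcArrayLock):

    /* paraphrased - see GetSnapshotDataReuse() for the real code */
    if (snapshot->snapXactCompletionCount == 0 ||
        curXactCompletionCount != snapshot->snapXactCompletionCount)
        return false;   /* never built, or a xact completed since then */

    /* ... the MyProc->xmin check quoted above, then: */
    RecentXmin = snapshot->xmin;
    Assert(TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin));

So the only gate for reusing a cached snapshot is the xactCompletionCount
comparison; whether TransactionXmin moves depends only on MyProc->xmin
being unset, while RecentXmin is always overwritten from the cached
snapshot - which is how it can end up going backwards here.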

But as I mentioned before, it's a bit weird that this happens right
after the replica gets ready to handle read-only queries:

------------------------
1742348688.825 2025-03-19 02:44:48.825 CET [1981188] [startup:]
[67da2188.1e3b04:9] [2025-03-19 02:44:40 CET] [138/0/0] LOG:  consistent
recovery state reached at 1/5FA884E8

1742348688.825 2025-03-19 02:44:48.825 CET [1981188] [startup:]
[67da2188.1e3b04:10] [2025-03-19 02:44:40 CET] [138/0/0] LOG:  record
with incorrect prev-link 0/810 at 1/5FA884E8

1742348688.825 2025-03-19 02:44:48.825 CET [1981023] [postmaster:]
[67da2188.1e3a5f:5] [2025-03-19 02:44:40 CET] [/0] LOG:  database system
is ready to accept read-only connections

1742348688.837 2025-03-19 02:44:48.837 CET [2001731] [walreceiver:]
[67da2190.1e8b43:1] [2025-03-19 02:44:48 CET] [/0] LOG:  started
streaming WAL from primary at 1/5F000000 on timeline 1
------------------------

But there are multiple other concurrent pgbench backends, and those
don't have the same issue. None of them seems to advance the XID to 209508.


regards

-- 
Tomas Vondra



Re: Snapshot related assert failure on skink

From
Heikki Linnakangas
Date:
On 19/03/2025 04:22, Tomas Vondra wrote:
> I kept stress-testing this, and while the frequency massively increased
> on PG18, I managed to reproduce this all the way back to PG14. I see
> ~100x more corefiles on PG18.
> 
> That is not a proof the issue was introduced in PG14, maybe it's just
> the assert that was added there or something. Or maybe there's another
> bug in PG18, making the impact worse.
> 
> But I'd suspect this is a bug in
> 
> commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
> Author: Andres Freund <andres@anarazel.de>
> Date:   Mon Aug 17 21:07:10 2020 -0700
> 
>      snapshot scalability: cache snapshots using a xact completion counter.
> 
>      Previous commits made it faster/more scalable to compute snapshots.
> But not
>      building a snapshot is still faster. Now that GetSnapshotData() does not
>      maintain RecentGlobal* anymore, that is actually not too hard:
> 
>      ...

Looking at the code, shouldn't ExpireAllKnownAssignedTransactionIds() 
and ExpireOldKnownAssignedTransactionIds() update xactCompletionCount? 
This can happen during hot standby:

1. Backend acquires snapshot A with xmin 1000
2. Startup process calls ExpireOldKnownAssignedTransactionIds(),
3. Backend acquires snapshot B with xmin 1050
4. Backend releases snapshot A, updating TransactionXmin to 1050
5. Backend acquires new snapshot, calls GetSnapshotDataReuse(), reusing 
snapshot A's data.

Because xactCompletionCount is not updated in step 2, the 
GetSnapshotDataReuse() call will reuse the snapshot A. But snapshot A 
has a lower xmin.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:
On 3/19/25 08:17, Heikki Linnakangas wrote:
> On 19/03/2025 04:22, Tomas Vondra wrote:
>> I kept stress-testing this, and while the frequency massively increased
>> on PG18, I managed to reproduce this all the way back to PG14. I see
>> ~100x more corefiles on PG18.
>>
>> That is not a proof the issue was introduced in PG14, maybe it's just
>> the assert that was added there or something. Or maybe there's another
>> bug in PG18, making the impact worse.
>>
>> But I'd suspect this is a bug in
>>
>> commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
>> Author: Andres Freund <andres@anarazel.de>
>> Date:   Mon Aug 17 21:07:10 2020 -0700
>>
>>      snapshot scalability: cache snapshots using a xact completion
>> counter.
>>
>>      Previous commits made it faster/more scalable to compute snapshots.
>> But not
>>      building a snapshot is still faster. Now that GetSnapshotData()
>> does not
>>      maintain RecentGlobal* anymore, that is actually not too hard:
>>
>>      ...
> 
> Looking at the code, shouldn't ExpireAllKnownAssignedTransactionIds()
> and ExpireOldKnownAssignedTransactionIds() update xactCompletionCount?
> This can happen during hot standby:
> 
> 1. Backend acquires snapshot A with xmin 1000
> 2. Startup process calls ExpireOldKnownAssignedTransactionIds(),
> 3. Backend acquires snapshot B with xmin 1050
> 4. Backend releases snapshot A, updating TransactionXmin to 1050
> 5. Backend acquires new snapshot, calls GetSnapshotDataReuse(), reusing
> snapshot A's data.
> 
> Because xactCompletionCount is not updated in step 2, the
> GetSnapshotDataReuse() call will reuse the snapshot A. But snapshot A
> has a lower xmin.
> 

Could be. As an experiment I added xactCompletionCount advance to the
two functions you mentioned, and I ran the stress test again. I haven't
seen any failures so far, after ~1000 runs. Without the patch this
produced ~200 failures/core files.
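
For the record, the experimental change is nothing sophisticated - it
just bumps the counter while those functions already hold ProcArrayLock
exclusively, roughly along these lines (a sketch, not a polished patch;
the counter lives in TransamVariables on master and in ShmemVariableCache
on the older branches):

    void
    ExpireOldKnownAssignedTransactionIds(TransactionId xid)
    {
        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

        /*
         * Removing entries changes what a freshly built snapshot would
         * contain, so invalidate the cached snapshots as well.
         * (TransamVariables on master, ShmemVariableCache on old branches.)
         */
        TransamVariables->xactCompletionCount++;

        /* ... existing KnownAssignedXids cleanup ... */
        KnownAssignedXidsRemovePreceding(xid);

        LWLockRelease(ProcArrayLock);
    }

with the same one-liner added to ExpireAllKnownAssignedTransactionIds().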


regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:
On 3/19/25 13:27, Tomas Vondra wrote:
> On 3/19/25 08:17, Heikki Linnakangas wrote:
>> On 19/03/2025 04:22, Tomas Vondra wrote:
>>> I kept stress-testing this, and while the frequency massively increased
>>> on PG18, I managed to reproduce this all the way back to PG14. I see
>>> ~100x more corefiles on PG18.
>>>
>>> That is not a proof the issue was introduced in PG14, maybe it's just
>>> the assert that was added there or something. Or maybe there's another
>>> bug in PG18, making the impact worse.
>>>
>>> But I'd suspect this is a bug in
>>>
>>> commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
>>> Author: Andres Freund <andres@anarazel.de>
>>> Date:   Mon Aug 17 21:07:10 2020 -0700
>>>
>>>      snapshot scalability: cache snapshots using a xact completion
>>> counter.
>>>
>>>      Previous commits made it faster/more scalable to compute snapshots.
>>> But not
>>>      building a snapshot is still faster. Now that GetSnapshotData()
>>> does not
>>>      maintain RecentGlobal* anymore, that is actually not too hard:
>>>
>>>      ...
>>
>> Looking at the code, shouldn't ExpireAllKnownAssignedTransactionIds()
>> and ExpireOldKnownAssignedTransactionIds() update xactCompletionCount?
>> This can happen during hot standby:
>>
>> 1. Backend acquires snapshot A with xmin 1000
>> 2. Startup process calls ExpireOldKnownAssignedTransactionIds(),
>> 3. Backend acquires snapshot B with xmin 1050
>> 4. Backend releases snapshot A, updating TransactionXmin to 1050
>> 5. Backend acquires new snapshot, calls GetSnapshotDataReuse(), reusing
>> snapshot A's data.
>>
>> Because xactCompletionCount is not updated in step 2, the
>> GetSnapshotDataReuse() call will reuse the snapshot A. But snapshot A
>> has a lower xmin.
>>
> 
> Could be. As an experiment I added xactCompletionCount advance to the
> two functions you mentioned, and I ran the stress test again. I haven't
> seen any failures so far, after ~1000 runs. Without the patch this
> produced ~200 failures/core files.
> 

I kept stress-testing this (without the fix), trying to figure out why
the frequency of failures got so much higher on PG18. While discussing
this off-list with Andres, he was wondering if maybe there's a second
independent bug on PG18, with the same symptoms.

It took quite a bit of time, but I managed to narrow this down to ~mid
December 2024, commit 952365cded6. The following table shows the number
of assert failures on a single run of the stress test, on 4 different
machines.

                   xeon      ryzen     rpi5-32    rpi5-64
  -------------------------------------------------------
  1585ff7387d       247         52          69         28
  952365cded6       199         30           -          -
  7ec4b9ff80d         1          7           -          -
  7f97b4734f9         3         17           -          -
  578a7fe7b6f         -          -           -          -
  db448ce5ad3         -          -           -          -
  1f81b48a9d5         -          -           -          -
  d5a7bd5670c         0         11           1          0

There's a fair amount of randomness - partly because the bug is a race
condition, and thus time-sensitive, and partly because the test suite
itself is randomized.

But I think it clearly shows a change around these commits. The rpi5
machines take much longer, so I only have the two results for now.

But it seems it changed in 952365cded6, which is:

    commit 952365cded635e54c4177399c0280cb7a5e34c11
    Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
    Date:   Mon Dec 23 12:42:39 2024 +0200

    Remove unnecessary GetTransactionSnapshot() calls

    In get_database_list() and get_subscription_list(), the
    GetTransactionSnapshot() call is not required because the catalog
    table scans use the catalog snapshot, which is held until the end of
    the scan. See table_beginscan_catalog(), which calls
    RegisterSnapshot(GetCatalogSnapshot(relid)).

    ...

I'm not claiming the commit is buggy - it might be, but I think it's
more likely it just made the preexisting bug easier to hit. I base this
on the observation that incrementing the xactCompletionCount made the
assert failures go away.


regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Heikki Linnakangas
Date:
On 21/03/2025 12:28, Tomas Vondra wrote:
> But it seems it changed in 952365cded6, which is:
> 
>      commit 952365cded635e54c4177399c0280cb7a5e34c11
>      Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
>      Date:   Mon Dec 23 12:42:39 2024 +0200
> 
>      Remove unnecessary GetTransactionSnapshot() calls
> 
>      In get_database_list() and get_subscription_list(), the
>      GetTransactionSnapshot() call is not required because the catalog
>      table scans use the catalog snapshot, which is held until the end of
>      the scan. See table_beginscan_catalog(), which calls
>      RegisterSnapshot(GetCatalogSnapshot(relid)).
> 
>      ...
> 
> I'm not claiming the commit is buggy - it might be, but I think it's
> more likely it just made the preexisting bug easier to hit. I base this
> on the observation that incrementing the xactCompletionCount made the
> assert failures go away.

That commit removed three GetTransactionSnapshot() calls: one in
autovacuum, one in the logical replication launcher, and one in
InitPostgres(). It must be the one in InitPostgres() that makes the
difference, because the other two are not executed in a standby.

I was able to reproduce this and to catch it in action with 'rr'. After 
commit 952365cded6, the sequence of events looks like this:

1. At backend startup, GetCatalogSnapshot() sets 
CatalogSnapshotData.xmin=1000, xactCompletionCount=10
2. On first query, GetTransactionSnapshot() sets 
CurrentSnapshotData.xmin=1002, xactCompletionCount=10
3. GetCatalogSnapshot() is called again. It tries to reuse the snapshot 
with xmin=1000 and hits the assertion.

Before commit 952365cded6, which removed the GetTransactionSnapshot() 
call from InitPostgres, this usually happens instead:

0. At backend startup, GetTransactionSnapshot() sets 
CurrentSnapshotData.xmin=1000, xactCompletionCount=10
1. At backend startup, GetCatalogSnapshot() sets 
CatalogSnapshotData.xmin=1000, xactCompletionCount=10
2. On first query, GetTransactionSnapshot() reuses the snapshot with
CurrentSnapshotData.xmin=1000, xactCompletionCount=10
3. GetCatalogSnapshot() is called again. It successfully reuses the 
snapshot  with xmin=1000

In other words, the GetTransactionSnapshot() call initializes
CurrentSnapshotData with a snapshot that has an earlier xmin. The first
GetCatalogSnapshot() call happens almost immediately after that, so
they're very likely - but not guaranteed - to get the same snapshot.
When I removed the GetTransactionSnapshot() call, the gap between the 
first GetTransactionSnapshot() and GetCatalogSnapshot() calls in the 
backend became much longer, making this more likely to happen.

In any case, the missing "xactCompletionCount++" in 
ExpireAllKnownAssignedTransactionIds() and 
ExpireOldKnownAssignedTransactionIds() is a clear bug and explains this. 
I will prepare and commit patches to fix that.

Thanks for the testing!

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Snapshot related assert failure on skink

From
Andres Freund
Date:
Hi,

On 2025-03-19 09:17:23 +0200, Heikki Linnakangas wrote:
> On 19/03/2025 04:22, Tomas Vondra wrote:
> > I kept stress-testing this, and while the frequency massively increased
> > on PG18, I managed to reproduce this all the way back to PG14. I see
> > ~100x more corefiles on PG18.
> > 
> > That is not a proof the issue was introduced in PG14, maybe it's just
> > the assert that was added there or something. Or maybe there's another
> > bug in PG18, making the impact worse.
> > 
> > But I'd suspect this is a bug in
> > 
> > commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
> > Author: Andres Freund <andres@anarazel.de>
> > Date:   Mon Aug 17 21:07:10 2020 -0700
> > 
> >      snapshot scalability: cache snapshots using a xact completion counter.
> > 
> >      Previous commits made it faster/more scalable to compute snapshots.
> > But not
> >      building a snapshot is still faster. Now that GetSnapshotData() does not
> >      maintain RecentGlobal* anymore, that is actually not too hard:
> > 
> >      ...

Thanks for debugging and analyzing this!


> Looking at the code, shouldn't ExpireAllKnownAssignedTransactionIds() and
> ExpireOldKnownAssignedTransactionIds() update xactCompletionCount? This can
> happen during hot standby:
> 
> 1. Backend acquires snapshot A with xmin 1000
> 2. Startup process calls ExpireOldKnownAssignedTransactionIds(),
> 3. Backend acquires snapshot B with xmin 1050
> 4. Backend releases snapshot A, updating TransactionXmin to 1050
> 5. Backend acquires new snapshot, calls GetSnapshotDataReuse(), reusing
> snapshot A's data.
> 
> Because xactCompletionCount is not updated in step 2, the
> GetSnapshotDataReuse() call will reuse the snapshot A. But snapshot A has a
> lower xmin.

I've swapped a lot of the KnownAssigned* code out:

Am I right in understanding that the only scenario (when in
STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds() would
"legally" remove a transaction, rather than the commit / abort records doing
so, is if the primary crash-restarted while transactions were ongoing?

Those transactions won't have commit/abort records and thus won't trigger
ExpireTreeKnownAssignedTransactionIds(), which otherwise would have updated
->xactCompletionCount?


When writing the snapshot caching patch, I tried to make sure that all the
places that maintain ->latestCompletedXid also update
->xactCompletionCount. Afaict that's still the case. Which, I think, means
that we're also missing calls to MaintainLatestCompletedXidRecovery()?

If latestCompletedXid is incorrect, visibility determinations end up wrong...

Greetings,

Andres Freund



Re: Snapshot related assert failure on skink

From
Heikki Linnakangas
Date:
On 21/03/2025 17:16, Andres Freund wrote:
> Am I right in understanding that the only scenario (when in
> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds() would
> "legally" remove a transaction, rather than the commit / abort records doing
> so, is if the primary crash-restarted while transactions were ongoing?
> 
> Those transactions won't have a commit/abort records and thus won't trigger
> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have updated
> ->xactCompletionCount?

Correct.

> When writing the snapshot caching patch, I tried to make sure that all the
> places that maintain ->latestCompletedXid also update
> ->xactCompletionCount. Afaict that's still the case. Which, I think, means
> that we're also missing calls to MaintainLatestCompletedXidRecovery()?

Yep, I was just looking at that too.

> If latestCompletedXid is incorrect visibility determinations end up wrong...

I think it happens to work, because the transactions are effectively 
aborted. latestCompletedXid is used to initialize xmax in 
GetSnapshotData. If, for example, latestCompletedXid is incorrectly set 
to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000 
still correctly considers XID 1001 as "not visible". As soon as a 
transaction commits, latestCompletedXid is fixed.
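
To spell that out, the xmax check in XidInMVCCSnapshot() is roughly
(paraphrasing, not the exact source):

    /* paraphrased from XidInMVCCSnapshot() */
    if (TransactionIdFollowsOrEquals(xid, snapshot->xmax))
        return true;    /* treated as still in progress => not visible */

So with xmax incorrectly left at 1000, XID 1001 falls into that branch
and is treated as in-progress - which is the right answer for an aborted
transaction anyway.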

AFAICS we could skip updating latestCompletedXid on aborts altogether 
and rename it to latestCommittedXid. But it's hardly worth optimizing 
for aborts.

For the same reason, I believe the assertion failure we're discussing 
here is also harmless. Even though the reused snapshot has a smaller 
xmin than expected, all those transactions aborted and are thus not 
visible anyway.

In any case, let's be tidy and fix both latestCompletedXid and 
xactCompletionCount.

-- 
Heikki Linnakangas
Neon (https://neon.tech)



Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:

On 3/23/25 17:43, Heikki Linnakangas wrote:
> On 21/03/2025 17:16, Andres Freund wrote:
>> Am I right in understanding that the only scenario (when in
>> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds()
>> would
>> "legally" remove a transaction, rather than the commit / abort records
>> doing
>> so, is if the primary crash-restarted while transactions were ongoing?
>>
>> Those transactions won't have a commit/abort records and thus won't
>> trigger
>> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have
>> updated
>> ->xactCompletionCount?
> 
> Correct.
> 
>> When writing the snapshot caching patch, I tried to make sure that all
>> the
>> places that maintain ->latestCompletedXid also update
>> ->xactCompletionCount. Afaict that's still the case. Which, I think,
>> means
>> that we're also missing calls to MaintainLatestCompletedXidRecovery()?
> 
> Yep, I was just looking at that too.
> 
>> If latestCompletedXid is incorrect visibility determinations end up
>> wrong...
> 
> I think it happens to work, because the transactions are effectively
> aborted. latestCompletedXid is used to initialize xmax in
> GetSnapshotData. If, for example, latestCompletedXid is incorrectly set
> to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000
> still correctly considers XID 1001 as "not visible". As soon as a
> transaction commits, latestCompletedXid is fixed.
> 
> AFAICS we could skip updating latestCompletedXid on aborts altogether
> and rename it to latestCommittedXid. But it's hardly worth optimizing
> for aborts.
> 
> For the same reason, I believe the assertion failure we're discussing
> here is also harmless. Even though the reused snapshot has a smaller
> xmin than expected, all those transactions aborted and are thus not
> visible anyway.
> 
> In any case, let's be tidy and fix both latestCompletedXid and
> xactCompletionCount.
> 

Thanks for looking into this and pushing the fix.

Would it make sense to add a comment documenting this reasoning about
not handling aborts? Otherwise someone will get to rediscover this in
the future ...


regards

-- 
Tomas Vondra




Re: Snapshot related assert failure on skink

From
Heikki Linnakangas
Date:
On 24/03/2025 16:56, Tomas Vondra wrote:
> 
> 
> On 3/23/25 17:43, Heikki Linnakangas wrote:
>> On 21/03/2025 17:16, Andres Freund wrote:
>>> Am I right in understanding that the only scenario (when in
>>> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds()
>>> would
>>> "legally" remove a transaction, rather than the commit / abort records
>>> doing
>>> so, is if the primary crash-restarted while transactions were ongoing?
>>>
>>> Those transactions won't have a commit/abort records and thus won't
>>> trigger
>>> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have
>>> updated
>>> ->xactCompletionCount?
>>
>> Correct.
>>
>>> When writing the snapshot caching patch, I tried to make sure that all
>>> the
>>> places that maintain ->latestCompletedXid also update
>>> ->xactCompletionCount. Afaict that's still the case. Which, I think,
>>> means
>>> that we're also missing calls to MaintainLatestCompletedXidRecovery()?
>>
>> Yep, I was just looking at that too.
>>
>>> If latestCompletedXid is incorrect visibility determinations end up
>>> wrong...
>>
>> I think it happens to work, because the transactions are effectively
>> aborted. latestCompletedXid is used to initialize xmax in
>> GetSnapshotData. If, for example, latestCompletedXid is incorrectly set
>> to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000
>> still correctly considers XID 1001 as "not visible". As soon as a
>> transaction commits, latestCompletedXid is fixed.
>>
>> AFAICS we could skip updating latestCompletedXid on aborts altogether
>> and rename it to latestCommittedXid. But it's hardly worth optimizing
>> for aborts.
>>
>> For the same reason, I believe the assertion failure we're discussing
>> here is also harmless. Even though the reused snapshot has a smaller
>> xmin than expected, all those transactions aborted and are thus not
>> visible anyway.
>>
>> In any case, let's be tidy and fix both latestCompletedXid and
>> xactCompletionCount.
>>
> 
> Thanks for looking into this and pushing the fix.
> 
> Would it make sense to add a comment documenting this reasoning about
> not handling aborts? Otherwise someone will get to rediscover this in
> the future ...

Well, we're currently _not_ relying on the reasoning about aborts not 
changing visibility. So it's not like someone will forget the reasoning 
and have a bug as a result. I guess someone could rediscover that 
reasoning and think that they can skip advancing those counters on 
aborts as an optimization, re-introducing the assertion failure. But I 
believe that was not why we had this bug, it was just a simple omission.

So the comment would look like this:

"In principle, we would not need to advance xactCompletionCount and 
latestCompletedXid because aborting transactions don't change 
visibility. But that could make xmin within a transaction go backwards, 
if a snapshot with an older xmin but same xactCompletionCount is reused, 
triggering the assertion in GetSnapshotDataReuse()."

If we add that, I guess it should go into ProcArrayEndTransaction() 
which currently just notes that it's used for commits and aborts 
interchangeably. Do you think it's worth it? Want to propose a patch?

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Snapshot related assert failure on skink

From
Tomas Vondra
Date:

On 3/24/25 16:25, Heikki Linnakangas wrote:
> On 24/03/2025 16:56, Tomas Vondra wrote:
>>
>>
>> On 3/23/25 17:43, Heikki Linnakangas wrote:
>>> On 21/03/2025 17:16, Andres Freund wrote:
>>>> Am I right in understanding that the only scenario (when in
>>>> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds()
>>>> would
>>>> "legally" remove a transaction, rather than the commit / abort records
>>>> doing
>>>> so, is if the primary crash-restarted while transactions were ongoing?
>>>>
>>>> Those transactions won't have a commit/abort records and thus won't
>>>> trigger
>>>> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have
>>>> updated
>>>> ->xactCompletionCount?
>>>
>>> Correct.
>>>
>>>> When writing the snapshot caching patch, I tried to make sure that all
>>>> the
>>>> places that maintain ->latestCompletedXid also update
>>>> ->xactCompletionCount. Afaict that's still the case. Which, I think,
>>>> means
>>>> that we're also missing calls to MaintainLatestCompletedXidRecovery()?
>>>
>>> Yep, I was just looking at that too.
>>>
>>>> If latestCompletedXid is incorrect visibility determinations end up
>>>> wrong...
>>>
>>> I think it happens to work, because the transactions are effectively
>>> aborted. latestCompletedXid is used to initialize xmax in
>>> GetSnapshotData. If, for example, latestCompletedXid is incorrectly set
>>> to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000
>>> still correctly considers XID 1001 as "not visible". As soon as a
>>> transaction commits, latestCompletedXid is fixed.
>>>
>>> AFAICS we could skip updating latestCompletedXid on aborts altogether
>>> and rename it to latestCommittedXid. But it's hardly worth optimizing
>>> for aborts.
>>>
>>> For the same reason, I believe the assertion failure we're discussing
>>> here is also harmless. Even though the reused snapshot has a smaller
>>> xmin than expected, all those transactions aborted and are thus not
>>> visible anyway.
>>>
>>> In any case, let's be tidy and fix both latestCompletedXid and
>>> xactCompletionCount.
>>>
>>
>> Thanks for looking into this and pushing the fix.
>>
>> Would it make sense to add a comment documenting this reasoning about
>> not handling aborts? Otherwise someone will get to rediscover this in
>> the future ...
> 
> Well, we're currently _not_ relying on the reasoning about aborts not
> changing visibility. So it's not like someone will forget the reasoning
> and have a bug as a result. I guess someone could rediscover that
> reasoning and think that they can skip advancing those counters on
> aborts as an optimization, re-introducing the assertion failure. But I
> believe that was not why we had this bug, it was just a simple omission.
> 
> So the comment would look like this:
> 
> "In principle, we would not need to advance xactCompletionCount and
> latestCompletedXid because aborting transactions don't change
> visibility. But that could make xmin within a transaction go backwards,
> if a snapshot with an older xmin but same xactCompletionCount is reused,
> triggering the assertion in GetSnapshotDataReuse()."
> 
> If we add that, I guess it should go into ProcArrayEndTransaction()
> which currently just notes that it's used for commits and aborts
> interchangeably. Do you think it's worth it? Want to propose a patch?
> 

I think the wording you suggested above is reasonable - I certainly
don't have a better one. I don't know how likely it is that someone
breaks this, but I think it makes sense to mention it because the
assert is non-trivial to hit.

regards

-- 
Tomas Vondra