Re: Snapshot related assert failure on skink - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Snapshot related assert failure on skink
Date
Msg-id cb473aa0-1347-4454-9b46-016cba29f007@vondra.me
Whole thread Raw
In response to Re: Snapshot related assert failure on skink  (Tomas Vondra <tomas@vondra.me>)
Responses Re: Snapshot related assert failure on skink
List pgsql-hackers
I kept stress-testing this, and while the frequency massively increased
on PG18, I managed to reproduce this all the way back to PG14. I see
~100x more corefiles on PG18.

That is not a proof the issue was introduced in PG14, maybe it's just
the assert that was added there or something. Or maybe there's another
bug in PG18, making the impact worse.

But I'd suspect this is a bug in

commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c
Author: Andres Freund <andres@anarazel.de>
Date:   Mon Aug 17 21:07:10 2020 -0700

    snapshot scalability: cache snapshots using a xact completion counter.

    Previous commits made it faster/more scalable to compute snapshots.
But not
    building a snapshot is still faster. Now that GetSnapshotData() does not
    maintain RecentGlobal* anymore, that is actually not too hard:

    ...

which is the main snapshot-related change in PG14. I added some logging
to all places changing TransactionXmin and RecendXmin in procarray.c,
and all the failures look like this:

------------------------
1742348688.896 2025-03-19 02:44:48.896 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:1] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotData TransactionXmin = 209485

1742348688.896 2025-03-19 02:44:48.896 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:2] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotData RecentXmin = 209485

1742348688.902 2025-03-19 02:44:48.902 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:3] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotDataReuse TransactionXmin = 209485

1742348688.902 2025-03-19 02:44:48.902 CET [2001770] [client
backend:[unknown]] [67da2190.1e8b6a:4] [2025-03-19 02:44:48 CET]
[13/1/0] LOG:  GetSnapshotDataReuse RecentXmin = 209485

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:5] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotData TransactionXmin = 209508

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:6] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:7] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotData RecentXmin = 209508

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:8] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:9] [2025-03-19 02:44:48 CET] [13/2/0]
LOG:  GetSnapshotDataReuse RecentXmin = 209485 at character 22

1742348688.922 2025-03-19 02:44:48.922 CET [2001770] [client
backend:pgbench] [67da2190.1e8b6a:10] [2025-03-19 02:44:48 CET] [13/2/0]
STATEMENT:  SELECT abalance FROM pgbench_accounts WHERE aid = 987279;
TRAP: failed Assert("TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin)"), File: "procarray.c", Line: 2136, PID: 2001770
------------------------

These are all logs for pgbench backend PID 2001770. First, there's a
call GetSnapshotData and then GetSnapshotDataReuse, which sets both
variables to 209485.

    LOG:  GetSnapshotData TransactionXmin = 209485
    LOG:  GetSnapshotData RecentXmin = 209485
    LOG:  GetSnapshotDataReuse TransactionXmin = 209485
    LOG:  GetSnapshotDataReuse RecentXmin = 209485

I believe this is some very early setup, because both log lines have

    [%b:%a] = [client backend:[unknown]]

And then we get another pair of these calls, where GetSnapshotData
advances the XIDs to 209508. But the following GetSnapshotDataReuse call
updates only RecentXmin to 209485, and that is before TrasactionXmin.

Clearly, GetSnapshotDataReuse skips setting the TransactionXmin because
it still has valid XID in MyProc->xmin:

    if (!TransactionIdIsValid(MyProc->xmin))
        MyProc->xmin = TransactionXmin = snapshot->xmin;

I don't know if that's correct, and it seems a bit weird the RecentXmin
moves backwards.

But as I mentioned before, it's a bit weird this happens right after the
replica gets ready to handle read-only queries:

------------------------
1742348688.825 2025-03-19 02:44:48.825 CET [1981188] [startup:]
[67da2188.1e3b04:9] [2025-03-19 02:44:40 CET] [138/0/0] LOG:  consistent
recovery state reached at 1/5FA884E8

1742348688.825 2025-03-19 02:44:48.825 CET [1981188] [startup:]
[67da2188.1e3b04:10] [2025-03-19 02:44:40 CET] [138/0/0] LOG:  record
with incorrect prev-link 0/810 at 1/5FA884E8

1742348688.825 2025-03-19 02:44:48.825 CET [1981023] [postmaster:]
[67da2188.1e3a5f:5] [2025-03-19 02:44:40 CET] [/0] LOG:  database system
is ready to accept read-only connections

1742348688.837 2025-03-19 02:44:48.837 CET [2001731] [walreceiver:]
[67da2190.1e8b43:1] [2025-03-19 02:44:48 CET] [/0] LOG:  started
streaming WAL from primary at 1/5F000000 on timeline 1
------------------------

But there are multiple other concurrent pgbench backends, and those
don't have the same issue. None of them seems to advance the XID to 209508.


regards

-- 
Tomas Vondra



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Release freeze April 8
Next
From: Michael Paquier
Date:
Subject: Re: Add Pipelining support in psql