Thanks for your feedback!
I applied the patch to print more information about the error. Here’s what I got:
2022-11-23 20:33:03 UTC [638 test_database]: [5458] ERROR: new multixact has more than one updating member: 0 2[248477 (nokeyupd), 248645 (nokeyupd)]
2022-11-23 20:33:03 UTC [638 test_database]: [5459] STATEMENT: UPDATE warehouse1
SET w_ytd = w_ytd + 498
WHERE w_id = 5
I then inspected the WAL and I found the log records for these 2 transactions:
…
rmgr: MultiXact len (rec/tot): 54/ 54, tx: 248477, lsn: 0/66DB82A8, prev 0/66DB8260, desc: CREATE_ID 133 offset 265 nmembers 2: 248477 (nokeyupd) 248500 (keysh)
rmgr: Heap len (rec/tot): 70/ 70, tx: 248477, lsn: 0/66DB82E0, prev 0/66DB82A8, desc: HOT_UPDATE off 20 xmax 133 flags 0x20 IS_MULTI EXCL_LOCK ; new off 59 xmax 132, blkref #0: rel 1663/16384/16385 blk 422
rmgr: Transaction len (rec/tot): 34/ 34, tx: 248477, lsn: 0/66DBA710, prev 0/66DBA6D0, desc: ABORT 2022-11-23 20:33:03.712298 UTC
…
rmgr: Transaction len (rec/tot): 34/ 34, tx: 248645, lsn: 0/66DBB060, prev 0/66DBB020, desc: ABORT 2022-11-23 20:33:03.712388 UTC
Attached please find the relevant portion of the WAL.
Thanks for your help on this!
Dimos
[ServiceNow]
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Thursday, 24. November 2022 at 10:49
To: Peter Geoghegan <pg@bowt.ie>
Cc: Dimos Stamatakis <dimos.stamatakis@servicenow.com>, pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency
[External Email]
On 2022-Nov-23, Peter Geoghegan wrote:
> On Wed, Nov 23, 2022 at 2:54 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > Something like the attached. It would result in output like this:
> > WARNING: new multixact has more than one updating member: 0 2[17378 (keysh), 17381 (nokeyupd)]
> >
> > Then it should be possible to trace (in pg_waldump output) the
> > operations of each of the transactions that have any status in the
> > multixact that includes some form of "upd".
>
> That seems very useful.
Okay, pushed to all branches.
> Separately, I wonder if it would make sense to add additional
> defensive checks to FreezeMultiXactId() for this. There is an
> assertion that should catch the presence of multiple updaters in a
> single Multi when it looks like we have to generate a new Multi to
> carry the XID members forward (typically something we only need to do
> during a VACUUM FREEZE). We could at least make that
> "Assert(!TransactionIdIsValid(update_xid));" line into a defensive
> "can't happen" ereport(). It couldn't hurt, at least -- we already
> have a similar relfrozenxid check nearby, added after the "freeze the
> dead" bug was fixed.
Hmm, agreed. I'll see about that separately.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"I'm always right, but sometimes I'm more right than other times."
(Linus Torvalds)