Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency - Mailing list pgsql-hackers

From Dimos Stamatakis
Subject Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency
Date
Msg-id CO2PR0801MB231058F0579FB6762CECCAF4E20F9@CO2PR0801MB2310.namprd08.prod.outlook.com
Whole thread Raw
In response to Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers

Thanks for your feedback!

I applied the patch to print more information about the error. Here’s what I got:

 

2022-11-23 20:33:03 UTC [638 test_database]: [5458] ERROR:  new multixact has more than one updating member: 0 2[248477 (nokeyupd), 248645 (nokeyupd)]

2022-11-23 20:33:03 UTC [638 test_database]: [5459] STATEMENT:  UPDATE warehouse1

                          SET w_ytd = w_ytd + 498

                        WHERE w_id = 5

 

I then inspected the WAL and I found the log records for these 2 transactions:

 

rmgr: MultiXact   len (rec/tot):     54/    54, tx:     248477, lsn: 0/66DB82A8, prev 0/66DB8260, desc: CREATE_ID 133 offset 265 nmembers 2: 248477 (nokeyupd) 248500 (keysh)

rmgr: Heap        len (rec/tot):     70/    70, tx:     248477, lsn: 0/66DB82E0, prev 0/66DB82A8, desc: HOT_UPDATE off 20 xmax 133 flags 0x20 IS_MULTI EXCL_LOCK ; new off 59 xmax 132, blkref #0: rel 1663/16384/16385 blk 422

rmgr: Transaction len (rec/tot):     34/    34, tx:     248477, lsn: 0/66DBA710, prev 0/66DBA6D0, desc: ABORT 2022-11-23 20:33:03.712298 UTC

rmgr: Transaction len (rec/tot):     34/    34, tx:     248645, lsn: 0/66DBB060, prev 0/66DBB020, desc: ABORT 2022-11-23 20:33:03.712388 UTC

 

Attached please find the relevant portion of the WAL.

Thanks for your help on this!

 

Dimos

[ServiceNow]

 

 

From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Thursday, 24. November 2022 at 10:49
To: Peter Geoghegan <pg@bowt.ie>
Cc: Dimos Stamatakis <dimos.stamatakis@servicenow.com>, pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency

[External Email]

On 2022-Nov-23, Peter Geoghegan wrote:

> On Wed, Nov 23, 2022 at 2:54 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > Something like the attached. It would result in output like this:
> > WARNING: new multixact has more than one updating member: 0 2[17378 (keysh), 17381 (nokeyupd)]
> >
> > Then it should be possible to trace (in pg_waldump output) the
> > operations of each of the transactions that have any status in the
> > multixact that includes some form of "upd".
>
> That seems very useful.

Okay, pushed to all branches.

> Separately, I wonder if it would make sense to add additional
> defensive checks to FreezeMultiXactId() for this. There is an
> assertion that should catch the presence of multiple updaters in a
> single Multi when it looks like we have to generate a new Multi to
> carry the XID members forward (typically something we only need to do
> during a VACUUM FREEZE). We could at least make that
> "Assert(!TransactionIdIsValid(update_xid));" line into a defensive
> "can't happen" ereport(). It couldn't hurt, at least -- we already
> have a similar relfrozenxid check nearby, added after the "freeze the
> dead" bug was fixed.

Hmm, agreed. I'll see about that separately.

--
Álvaro Herrera PostgreSQL Developer —
https://www.EnterpriseDB.com/
"I'm always right, but sometimes I'm more right than other times."
(Linus Torvalds)

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: indentation in _hash_pgaddtup()
Next
From: Daniel Gustafsson
Date:
Subject: Re: TAP output format in pg_regress