Re: IPC/MultixactCreation on the Standby server - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: IPC/MultixactCreation on the Standby server
Msg-id 6eb048a2-239d-4a47-984c-7e5f5e826cc5@iki.fi
In response to Re: IPC/MultixactCreation on the Standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: IPC/MultixactCreation on the Standby server
List pgsql-hackers
On 30/11/2025 14:15, Andrey Borodin wrote:
> On 29 Nov 2025, at 00:51, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I didn't understand why the 'kill9' and 'poll_start' stuff is
>> needed. We have plenty of tests that kill the server with regular
>> "$node->stop('immediate')", and restart the server normally. The
>> checkpoint in the middle of the tests seems unnecessary too. I
>> removed all that, and the test still seems to work. Was there a
>> particular reason for them?
> 
> In current shutdown sequence test seems to be reproducing corruption
> without checkpointing. I recollect that in July standby deadlock was
> reachable without checkpoint, but corruption was not. But now it
> seems test is working.

Ok.

>> I moved the wraparound test to a separate test file and commit.
>> More test coverage is good, but it's quite separate from the
>> bugfix and the wraparound related test shares very little with the
>> other test. The wraparound test needs a little more cleanup: use
>> plain perl instead of 'dd' and 'rm' for the file operations, for
>> example. (I did that with the tests in the 64-bit mxoff patches,
>> so we could copy from there.)
> 
> PFA test version without dd and rm.

Thanks! I will focus on the main patch and TAP test now, but will commit
the wraparound test separately afterwards. At a quick glance, it looks
good now.

> Did I get you right, that we do not backport wraparound test,
> backport fixes for 001_multixact.pl test down to 17 where it
> appeared?

Yes, that's my plan. Except that 001_multixact.pl appeared in v18, not v17.

> First two patches are v13 intact, second pair is my suggestions.

Thanks, here's a new set of patches, now with backpatched versions for 
all the branches. As you said, there were a number of differences 
between branches:

- On master, don't include the compatibility hacks for reading WAL
generated with older minor versions, because WAL is not compatible
across major versions anyway.

- REL_18_STABLE didn't have the SimpleLruZeroAndWritePage() function 
(introduced in commit c616785516).

- REL_17_STABLE didn't have the 001_multixact.pl TAP test. So I didn't 
backport the new TAP test to v17 and below either.

- REL_16_STABLE used 32-bit SLRU page numbers, didn't have bank locks, 
and used a simple sleep-loop instead of the condition variable.

- REL_15_STABLE and REL_14_STABLE: no additional conflicts compared to
REL_16_STABLE.

All of those conflicts were pretty straightforward to handle, but it's 
enough code churn for silly mistakes to slip in, especially when the TAP 
test didn't apply. So if you have a chance, please help to review and 
test each of these backpatched versions too.

In addition to the backpatching, I did some more cosmetic cleanups to 
the TAP test.

- Heikki

