Re: Fix bug in multixact Oldest*MXactId initialization and access - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Fix bug in multixact Oldest*MXactId initialization and access
Date
Msg-id 366b6f95-613d-4ef8-82e4-bb2fd06d9c4d@iki.fi
Whole thread Raw
In response to Re: Fix bug in multixact Oldest*MXactId initialization and access  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Responses Re: Fix bug in multixact Oldest*MXactId initialization and access
List pgsql-hackers
On 27/02/2026 07:00, Yura Sokolov wrote:
> 26.02.2026 23:44, Sami Imseih пишет:
>>> (Sorry, I've used Google Translate to write this sentence).
>>>
>>> When you write assert, you protect yourself from shooting your leg far in
>>> the future. Believe me.
>>
>> The issue to me seems that we a few code paths that rely on the
>> total proc calculation in several places, and changing one and
>> forgetting to change another is what broke things.
> 
> If there were asserts, then tests would have failed.
> Period.
> 
> There would no the bug. There would no this discussion.
> If only array bounds were checked with asserts.

Yeah, more asserts == good, usually.

Here's my version of this. It's the same basic idea, some minor changes:

- Instead of the MaxChildren, TotalProcs, TotalXactProcs variables, I 
added FIRST_PREPARED_XACT_PROC_NUMBER. Rationale: The variables still 
required you to know the layout of the PGPROCs, i.e. you still had to 
know that aux processes come after regular backends and dummy pgprocs 
after aux processes. An FIRST_PREPARED_XACT_PROC_NUMBER is more explicit 
in the places where it's needed.

   I have been thinking of adding variables like that anyway, because 
the current calculations with MaxBackends et al. are just so confusing. 
So I'm vaguely in favor of doing something like that in the future. But 
I think FIRST_PREPARED_XACT_PROC_NUMBER is more clear for this patch, 
and those variable names as done here (MaxBackends, MaxChildren, 
Totalprocs and TotalXactProcs) were still super confusing.

- I added a slightly different the set of Getter/Setter functions: 
MyOldestMemberMXactIdSlot(), 
PreparedXactOldestMemberMXactIdSlot(ProcNumber), and 
MyOldestVisibleMXactIdSlot(). I think this is slightly less error-prone, 
making it harder to pass proc number to wrong function (although the 
assertions would catch such misuses pretty quick anyway). And these 
functions return a pointer to the slot, so we don't need separate getter 
and setter functions. That's a matter of taste, having a little less 
code looks nicer to me.

What do you think?

I've been pondering what kind of issues this bug could cause. Because 
the OldestVisibleMXactId array is allocated just after the 
OldestMemberMXactId array, you don't get a buffer overflow, but a 
prepared xact can overwrite a backend's value in the 
OldestVisibleMXactId array. That can surely cause trouble later, but not 
sure what exactly, and it seems pretty hard to write a repro script for it.

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: Portable StaticAssertExpr
Next
From: Amit Kapila
Date:
Subject: Re: [PATCH] Support automatic sequence replication