On 27/02/2026 07:00, Yura Sokolov wrote:
> 26.02.2026 23:44, Sami Imseih пишет:
>>> (Sorry, I've used Google Translate to write this sentence).
>>>
>>> When you write assert, you protect yourself from shooting your leg far in
>>> the future. Believe me.
>>
>> The issue to me seems that we a few code paths that rely on the
>> total proc calculation in several places, and changing one and
>> forgetting to change another is what broke things.
>
> If there were asserts, then tests would have failed.
> Period.
>
> There would no the bug. There would no this discussion.
> If only array bounds were checked with asserts.
Yeah, more asserts == good, usually.
Here's my version of this. It's the same basic idea, some minor changes:
- Instead of the MaxChildren, TotalProcs, TotalXactProcs variables, I
added FIRST_PREPARED_XACT_PROC_NUMBER. Rationale: The variables still
required you to know the layout of the PGPROCs, i.e. you still had to
know that aux processes come after regular backends and dummy pgprocs
after aux processes. An FIRST_PREPARED_XACT_PROC_NUMBER is more explicit
in the places where it's needed.
I have been thinking of adding variables like that anyway, because
the current calculations with MaxBackends et al. are just so confusing.
So I'm vaguely in favor of doing something like that in the future. But
I think FIRST_PREPARED_XACT_PROC_NUMBER is more clear for this patch,
and those variable names as done here (MaxBackends, MaxChildren,
Totalprocs and TotalXactProcs) were still super confusing.
- I added a slightly different the set of Getter/Setter functions:
MyOldestMemberMXactIdSlot(),
PreparedXactOldestMemberMXactIdSlot(ProcNumber), and
MyOldestVisibleMXactIdSlot(). I think this is slightly less error-prone,
making it harder to pass proc number to wrong function (although the
assertions would catch such misuses pretty quick anyway). And these
functions return a pointer to the slot, so we don't need separate getter
and setter functions. That's a matter of taste, having a little less
code looks nicer to me.
What do you think?
I've been pondering what kind of issues this bug could cause. Because
the OldestVisibleMXactId array is allocated just after the
OldestMemberMXactId array, you don't get a buffer overflow, but a
prepared xact can overwrite a backend's value in the
OldestVisibleMXactId array. That can surely cause trouble later, but not
sure what exactly, and it seems pretty hard to write a repro script for it.
- Heikki