Re: POC: make mxidoff 64 bits - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: POC: make mxidoff 64 bits
Msg-id d29ac46d-4761-401c-b073-46884426c13a@iki.fi
In response to Re: POC: make mxidoff 64 bits  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: POC: make mxidoff 64 bits
List pgsql-hackers
On 26/11/2025 17:50, Heikki Linnakangas wrote:
> On 26/11/2025 17:23, Maxim Orlov wrote:
>> On Tue, 25 Nov 2025 at 13:07, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> GetOldMultiXactIdSingleMember() currently asserts that the offset is
>>> never zero, but it should try to do something sensible in that case
>>> instead of just failing.
>>
>> Correct me if I'm wrong, but we added the assertion that offsets are
>> never 0, based on the idea that case #2 will never take place during an
>> update. If this isn't the case, this assertion could be removed.
>> The rest of the function appears to work correctly.
>>
>> I even think that, as an experiment, we could randomly reset some of the
>> offsets to zero and nothing would happen, except that some data would
>> be lost.
> 
> +1
> 
>> The most sensible thing we can do is give the user a warning, right?
>> Something like, "During the update, we encountered some weird offset
>> that shouldn't have been there, but there's nothing we can do about it,
>> just take note."
> 
> Yep, makes sense.

I read through the SLRU reading codepath, looking for all the things 
that could go wrong (not sure I got them all):

1. An SLRU file does not exist
2. An SLRU file is too short, i.e. a page does not exist
3. The offset in 'offsets' page is 0
4. The offset in 'offsets' page looks invalid, i.e. it's greater than 
nextOffset or smaller than oldestOffset.
5. The offset is out of order compared to its neighbors
6. The multixid has no members
7. The multixid has an invalid (0) member
8. A multixid has more than one updating member
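
To make cases 3 and 4 concrete, here is a minimal sketch of how a single entry from an 'offsets' page could be classified. The names OffsetCheck and classify_offset are hypothetical, not from the patch. It assumes the old cluster's 32-bit offsets can wrap around, so the range check uses modular arithmetic; whether that is the right treatment is an assumption on my part.

```c
#include <stdint.h>

/* Hypothetical result codes for checking one entry of an 'offsets' page. */
typedef enum
{
    OFFSET_OK,
    OFFSET_ZERO,            /* case 3: the offset is 0 */
    OFFSET_OUT_OF_RANGE     /* case 4: outside oldestOffset..nextOffset */
} OffsetCheck;

/*
 * Classify a 32-bit member offset read from the old cluster.
 *
 * Assumption: offsets may wrap around, so "smaller than oldestOffset or
 * greater than nextOffset" is evaluated as a distance from oldest_offset
 * in unsigned 32-bit arithmetic rather than with plain comparisons.
 */
OffsetCheck
classify_offset(uint32_t offset, uint32_t oldest_offset, uint32_t next_offset)
{
    uint32_t    distance = (uint32_t) (offset - oldest_offset);
    uint32_t    range = (uint32_t) (next_offset - oldest_offset);

    if (offset == 0)
        return OFFSET_ZERO;

    /* Valid offsets lie at most 'range' steps past oldest_offset. */
    if (distance > range)
        return OFFSET_OUT_OF_RANGE;

    return OFFSET_OK;
}
```

With oldest_offset = 10 and next_offset = 100, an offset of 50 is accepted, while 5 and 150 are both reported as out of range.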

Some of those situations are theoretically possible if there was a 
crash. We don't follow the WAL-before-data rule for these SLRUs. 
Instead, we piggyback on the WAL-before-data of the heap page that would 
reference the multixid. In other words, we rely on the fact that if a 
multixid write is missed or torn because of a crash, that multixid will 
not be referenced from anywhere and will never be read.

However, that doesn't hold for pg_upgrade. pg_upgrade will try to read 
all the multixids. So we need to make the multixact reading code 
tolerant of the situations that could be present after a crash. I think 
the right philosophy here is that we try to read all the old multixids, 
and do our best to interpret them the same way that the old server 
would. For those situations that can legitimately be present if the old 
server crashed at some point, be silent. For cases that should not 
happen, even if there was a crash, print a warning. For example, I think 
an SLRU file should never be missing (1) or truncated (2). But a zero 
offset (3) and a multixid with no members (6) can happen.
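
A rough sketch of that policy as a lookup, only to show the shape. The enum and function names are hypothetical. Only the four cases I classified above are filled in; the rest are deliberately left as undecided rather than guessed.

```c
/* Hypothetical labels for the eight situations in the list above. */
typedef enum
{
    MXACT_SLRU_FILE_MISSING = 1,    /* (1) */
    MXACT_SLRU_PAGE_MISSING,        /* (2) */
    MXACT_OFFSET_ZERO,              /* (3) */
    MXACT_OFFSET_INVALID,           /* (4) */
    MXACT_OFFSET_OUT_OF_ORDER,      /* (5) */
    MXACT_NO_MEMBERS,               /* (6) */
    MXACT_INVALID_MEMBER,           /* (7) */
    MXACT_MULTIPLE_UPDATERS         /* (8) */
} MultiXactAnomaly;

typedef enum
{
    REPORT_SILENT,      /* can legitimately follow a crash: say nothing */
    REPORT_WARNING,     /* should not happen even after a crash: warn */
    REPORT_UNDECIDED    /* not classified yet in this discussion */
} ReportLevel;

/* Map an anomaly to how loudly the upgrade should report it. */
ReportLevel
report_level_for(MultiXactAnomaly anomaly)
{
    switch (anomaly)
    {
        case MXACT_SLRU_FILE_MISSING:
        case MXACT_SLRU_PAGE_MISSING:
            return REPORT_WARNING;

        case MXACT_OFFSET_ZERO:
        case MXACT_NO_MEMBERS:
            return REPORT_SILENT;

        default:
            return REPORT_UNDECIDED;
    }
}
```

Keeping the decision in one place like this would make it easy to review each case separately as we work through the remaining ones.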

Perhaps we should check that all the files exist and have the correct 
sizes in the pre-check stage, and abort the upgrade early if anything is 
missing. That would be pretty cheap to check.
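
Something along these lines is what I have in mind for the per-file check, as a sketch only. The function name and the SKETCH_* constants are made up for illustration. It assumes the default 8 kB block size and 32 pages per SLRU segment, and that every segment except the newest one must be completely full; the real check would have to take those from the old cluster.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Assumed defaults; a real check would read these from the old cluster. */
#define SKETCH_BLCKSZ               8192
#define SKETCH_PAGES_PER_SEGMENT    32

/*
 * Return true if the SLRU segment file at 'path' exists, is a regular
 * file, and has a plausible size.
 *
 * Assumption: only the newest segment may be partially filled, and even
 * that one must contain a whole number of pages.
 */
bool
slru_segment_looks_sane(const char *path, bool is_last_segment)
{
    struct stat st;
    off_t       full_size = (off_t) SKETCH_BLCKSZ * SKETCH_PAGES_PER_SEGMENT;

    if (stat(path, &st) != 0)
        return false;           /* case 1: file does not exist */

    if (!S_ISREG(st.st_mode))
        return false;           /* not a regular file */

    if (st.st_size % SKETCH_BLCKSZ != 0)
        return false;           /* torn or partial page at the end */

    if (is_last_segment)
        return st.st_size > 0 && st.st_size <= full_size;

    return st.st_size == full_size;     /* case 2: file too short */
}
```

The pre-check would call this for every segment between the oldest and the newest multixid, and abort before touching anything if any call returns false. It is one stat() per file, so the cost is negligible.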

- Heikki



