Re: Rework SLRU I/O errors handle - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Rework SLRU I/O errors handle
Date
Msg-id aa7908db-b88e-4bb6-a2bc-5b3cee9a28cb@iki.fi
Whole thread Raw
In response to Re: Rework SLRU I/O errors handle  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Rework SLRU I/O errors handle
List pgsql-hackers
On 06/03/2026 18:22, Heikki Linnakangas wrote:
> On 06/03/2026 17:33, Álvaro Herrera wrote:
>> I'm not a fan of the split.  I think it all these patches should be
>> pushed as a single commit, and avoid introducing xact_errmsg_for_io_error
>> as an exposed function.  I think that doesn't make a lot of sense.  Each
>> SLRU should have a correct and appropriate error reporting callback.
> 
> Agreed.
> 
>> The comment added in 0005 is bogus too.  It mentions 
>> InvalidTransactionId,
>> but the problem is not that the value is 0 but rather that we get no
>> pointer.  Also, in all other callbacks the pointer is asserted to not be
>> NULL, so why don't we do the same here, and avoid an error message
>> that's not going to help anyone?  I see however that in the patch we're
>> passing a NULL to SlruReportIOError(), which means if you get an IO
>> error with any SLRU other than xact, you're going to get either a crash
>> on the assertion, or (on non-debug builds) a crash on dereferencing the
>> NULL pointer.
> 
> Hmm, I thought we could just never pass a NULL pointer, but if a Write 
> fails, slru.c has no context where to pull that opaque_data. Perhaps we 
> should just not call the callback in that case.
> 
> I'm starting to feel that what SlruReportIOError() currently uses as the 
> errdetail, could well be the main error message, and the callback could 
> provide the errdetail. I.e. swap the errmsg and errdetail we have now. 
> That way, we can just leave out the errdetail for Write failures. The 
> current errmsg we have for Write failures is pretty bad: "ERROR: could 
> not access status of transaction 0". What's currently the errdetail, 
> e.g. "Could not read from file \"%s\" at offset %d: %m", is a lot more 
> informative.

Attached version that does that.

As another data point, we have a lot of error messages like "could not 
open file \"%s\": %m" in the source tree. slru.c was the only place 
where that was in the errdetail() part.

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Rework SLRU I/O errors handle
Next
From: Andrew Dunstan
Date:
Subject: Re: Allow specifying NULL default in pg_proc.dat for "any" arguments