Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt - Mailing list pgsql-hackers

From Yura Sokolov
Subject Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt
Date
Msg-id ad5d9d90-6a97-4b42-8fb2-561b99d7f44b@postgrespro.ru
Whole thread Raw
In response to Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Responses Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt
List pgsql-hackers
Good day,

27.03.2026 12:04, Yura Sokolov wrote:
> 26.03.2026 23:29, Andres Freund wrote:
>> Hi,
>>
>> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
>>> UnlockBufHdrExt does:
>>>
>>>    buf_state |= set_bits;
>>>    buf_state &= ~unset_bits;
>>>    buf_state &= ~BM_LOCKED;
>>>
>>> TerminateBufferIO unconditionally does:
>>>
>>>    unset_flag_bits |= BM_IO_ERROR;
>>>
>>> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
>>> to set BM_IO_ERROR with call to TerminateBufferIO.
>>>
>>> It was found with proprietary code that was triggered on
>>> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
>>
>> That's clearly not right.  Care to write a patch?  I think we should add a
>> test for this in src/test/modules/test_aio too. As we don't rely on things
>> like BM_IO_ERROR this is quite easy to not notice.
> I thought, fix is too small to go the way of patches.
> I don't mind if you just push it.
> 
> But if I can save your time on writing test, I'll try.
> 
> ...
> 
> I believe, proper way is to add assertion in UnlockBufHdrExt:
> 
>     Assert(!(set_bits & unset_bits));
> 
> And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
> set_flag_bits.
> 
> Is it ok?

Here patchset is.

I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
failure.

I've also added FlushBuffer failure test in second patch (v1-002) using
injection point. I don't know if 001-aio.pl is a good place for since write
is not async yet.

v1-003 just mades DebugPrintBufferRefcount prettier.

-- 
regards
Yura Sokolov aka funny-falcon
Attachment

pgsql-hackers by date:

Previous
From: SATYANARAYANA NARLAPURAM
Date:
Subject: Re: [Proposal] pg_stat_wal_records – per-record-type WAL generation statistics
Next
From: Payal Singh
Date:
Subject: Review - Patch for pg_bsd_indent: improve formatting of multiline comments