On 3/5/26 21:25, Andres Freund wrote:
> Hi,
>
> On 2026-03-05 10:52:03 -0800, Noah Misch wrote:
>> On Thu, Mar 05, 2026 at 12:10:11PM -0500, Andres Freund wrote:
>>> Tomas encountered a crash with the index prefetching patchset. One of the
>>> patches included therein is a generalization of the gistGetFakeLSN()
>>> mechanism, which is then used by other indexes as well. That triggered an
>>> occasional, hard to locally reproduce, ERROR or PANIC in CI, about
>>>
>>> ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000
>>
>>> To be safe, this code would need to use a version of GetXLogInsertRecPtr()
>>> that does use XLogBytePosToEndRecPtr() instead of XLogBytePosToRecPtr().
>>
>> I agree. Thanks for diagnosing it. Feel free to move forward with that
>> strategy, or let me know if you'd like me to do it.
>
> I'd appreciate if you could do it.
>
Here's a fix for master (and backpatching). It introduces a new function
GetXLogInsertEndRecPtr() and then uses that in gistGetFakeLSN(). I still
need to test this a bit more, but it did fix the issue in our dev branch
(where we saw regular failures). So I'm 99% sure it's fine.

After writing the fix I had the idea to grep for GetXLogInsertRecPtr()
calls that might have a similar issue (the result being passed to
XLogFlush), and sure enough walsender does this:

    XLogFlush(GetXLogInsertRecPtr());

Which AFAICS has the same issue, right? Funnily enough, this is a very
new call, from 2026/03/06. Before 6eedb2a5fd88 walsender might not flush
far enough, now it may be flushing too far ;-) AFAIK it should call the
same GetXLogInsertEndRecPtr() once we have it.
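
To illustrate why the two conversions disagree exactly at a page boundary,
here is a simplified sketch. It is not the code from xlog.c: the sketch_*
names are made up, and it assumes 8kB WAL pages with a 24-byte short page
header while ignoring the long header on the first page of each segment.

```c
#include <stdint.h>

/* Assumed constants for this sketch only (long page headers are ignored). */
#define SKETCH_BLCKSZ     8192u   /* WAL page size */
#define SKETCH_SHORT_PHD  24u     /* short page header size */
#define SKETCH_USABLE     (SKETCH_BLCKSZ - SKETCH_SHORT_PHD)

/*
 * Map a "usable byte position" to the place where the NEXT record would
 * start. At a page boundary this skips over the page header, so the result
 * points past bytes that have not been written yet.
 */
uint64_t
sketch_bytepos_to_recptr(uint64_t bytepos)
{
    uint64_t fullpages = bytepos / SKETCH_USABLE;
    uint64_t bytesleft = bytepos % SKETCH_USABLE;

    return fullpages * SKETCH_BLCKSZ + SKETCH_SHORT_PHD + bytesleft;
}

/*
 * Map a "usable byte position" to the END of the previous record. At a
 * page boundary this stays at the start of the page, without the header.
 */
uint64_t
sketch_bytepos_to_endrecptr(uint64_t bytepos)
{
    uint64_t fullpages = bytepos / SKETCH_USABLE;
    uint64_t bytesleft = bytepos % SKETCH_USABLE;

    if (bytesleft == 0)
        return fullpages * SKETCH_BLCKSZ;

    return fullpages * SKETCH_BLCKSZ + SKETCH_SHORT_PHD + bytesleft;
}
```

In this simplified model, a byte position of 3561 * 8168 (exactly 3561 full
pages) maps to 0x01BD2018 with the first function and to 0x01BD2000 with the
second, which is the same 0x18 gap as in the error message above. Away from
a page boundary both functions return the same value.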
regards
--
Tomas Vondra