On 3/17/25 13:18, Thomas Munro wrote:
> On Tue, Mar 18, 2025 at 12:59 AM Tomas Vondra <tomas@vondra.me> wrote:
>> On 3/17/25 12:36, Tomas Vondra wrote:
>>> I'm still fiddling with the script, trying to increase the probability
>>> of the (apparent) race condition. On one machine (old Xeon) I can hit it
>>> very easily/reliably, while on a different machine (new Ryzen) it's very
>>> rare. I don't know if that's due to difference in speed of the CPU, or
>>> fewer cores, ... I guess it changes the timing just enough.
>>>
>>> I've also tried running the stress test on PG17, and I'm yet to see a
>>> single failure there. Not even on the Xeon machine, which hits it
>>> reliably on 18. So this seems to be a PG18-only issue.
>>>
>>
>> And of course, the moment I sent this, I got a failure on 17 too. But
>> it seems much harder to hit (compared to 18).
>
> Could there be a connection to this commit?
>
> commit 119c23eb9819213551cbe7e7665c8b493c59ceee
> Author: Nathan Bossart <nathan@postgresql.org>
> Date: Tue Sep 5 13:59:06 2023 -0700
>
> Replace known_assigned_xids_lck with memory barriers.

Doesn't seem to be the case. I reverted this (on master), and I still
get the assert failures (roughly the same number / loop).

regards

--
Tomas Vondra