Hi,

On 2024-04-15 10:54:16 -0400, Robert Haas wrote:
> On Fri, Apr 12, 2024 at 3:33 PM Andres Freund <andres@anarazel.de> wrote:
> > Here's a patch implementing this approach. I confirmed that before we trigger
> > the stuck spinlock logic very quickly and after we don't. However, if most
> > sleeps are interrupted, it can delay the stuck spinlock detection a good
> > bit. But that seems much better than triggering it too quickly.
>
> +1 for doing something about this. I'm not sure if it goes far enough,
> but it definitely seems much better than doing nothing.

One thing I started to be worried about is whether a patch ought to prevent
the timeout used by perform_spin_delay() from increasing when
interrupted. Otherwise a few signals can trigger quite long waits.

But as I can't quite see a way to make this accurate in the backbranches, I
suspect something like what I posted is still a good first version.
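
To make the idea concrete, here is a rough sketch of "don't grow the delay
when the sleep was interrupted". It is not the posted patch and not the real
s_lock.c code: SketchDelayStatus, sketch_next_delay(), the plain doubling and
the constants are all assumptions made up for illustration, and it sidesteps
the question of how to detect an interrupted sleep reliably.

```c
#include <stdbool.h>

/* Illustrative bounds only; assumed values, not taken from the patch. */
#define SKETCH_MIN_DELAY_USEC 1000L
#define SKETCH_MAX_DELAY_USEC 1000000L

/* Hypothetical stand-in for the per-lock delay state. */
typedef struct SketchDelayStatus
{
    long        cur_delay_usec; /* 0 means we have not slept yet */
} SketchDelayStatus;

/*
 * Pick the length of the next sleep.
 *
 * "interrupted" says whether the previous sleep was cut short (for example
 * by a signal). Only a sleep that ran to completion is allowed to grow the
 * delay, so a burst of signals cannot inflate the wait.
 */
static long
sketch_next_delay(SketchDelayStatus *status, bool interrupted)
{
    if (status->cur_delay_usec == 0)
        status->cur_delay_usec = SKETCH_MIN_DELAY_USEC;
    else if (!interrupted)
    {
        /* Simplified backoff: double, then wrap back to the minimum. */
        status->cur_delay_usec *= 2;
        if (status->cur_delay_usec > SKETCH_MAX_DELAY_USEC)
            status->cur_delay_usec = SKETCH_MIN_DELAY_USEC;
    }

    return status->cur_delay_usec;
}
```

With this shape, a sequence of completed/interrupted/completed sleeps yields
1000, 1000, 2000 usec rather than 1000, 2000, 4000.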
> Given your findings, I'm honestly kind of surprised that I haven't seen
> problems of this type more frequently.

Same. I did a bunch of searches for the error, but found surprisingly
little.

I think in practice most spinlocks just aren't contended enough to reach
perform_spin_delay(). And we have improved some on that over time.

Greetings,
Andres Freund