Hi,

On 2023-06-21 21:50:39 -0700, Noah Misch wrote:
> On Wed, Jun 21, 2023 at 03:12:08PM -0700, Andres Freund wrote:
> > When vac_truncate_clog() returns early
> ...
> > we haven't released the lwlock that we acquired earlier
>
> > Until there's some cause for the session to call LWLockReleaseAll(), the lock
> > is held. Until then neither the process holding the lock, nor any other
> > process, can finish vacuuming. We don't even have an assert against a
> > self-deadlock with an already held lock, oddly enough.
>
> I agree with this finding. Would you like to add the lwlock releases, or
> would you like me to?

Happy with either. I already have code and a testcase, so I guess it would
make sense for me to do it?

> The bug has been in all released versions for 2.5 years, yet it escaped
> notice. That tells us something. Bogus values have gotten rare? The
> affected session tends to get lucky and call LWLockReleaseAll() soon?

I am not sure either. I suspect that part of it is that people couldn't even
pinpoint the problem when it happened. Process exit calls LWLockReleaseAll(),
which I assume would avoid the problem in many cases.
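
To make the shape of the problem concrete, here is a minimal sketch. It is not
the actual patch and not PostgreSQL code: ToyLock, toy_acquire(),
toy_release(), toy_release_all(), truncate_buggy() and truncate_fixed() are
all hypothetical names, and the toy lock reports failure where a real lwlock
would block or self-deadlock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an lwlock: it only tracks whether it is held. */
typedef struct ToyLock
{
	bool		held;
} ToyLock;

static ToyLock truncate_lock = {false};

/*
 * Returns false if the lock is already held. A real lock would wait here
 * instead, which is what turns the leak into a hang.
 */
static bool
toy_acquire(ToyLock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void
toy_release(ToyLock *lock)
{
	lock->held = false;
}

/* Stand-in for a "release everything" cleanup such as runs at process exit. */
static void
toy_release_all(void)
{
	toy_release(&truncate_lock);
}

/* Buggy shape: the early-return path forgets to release the lock. */
static bool
truncate_buggy(bool bogus_value)
{
	if (!toy_acquire(&truncate_lock))
		return false;			/* could not even start */

	if (bogus_value)
		return true;			/* early return, lock is still held */

	/* ... the actual truncation work would happen here ... */
	toy_release(&truncate_lock);
	return true;
}

/* Fixed shape: every return path after the acquire releases the lock. */
static bool
truncate_fixed(bool bogus_value)
{
	if (!toy_acquire(&truncate_lock))
		return false;

	if (bogus_value)
	{
		toy_release(&truncate_lock);
		return true;
	}

	/* ... the actual truncation work would happen here ... */
	toy_release(&truncate_lock);
	return true;
}
```

In this sketch, once truncate_buggy(true) has run, every later call fails
until toy_release_all() is called, which mirrors "nobody can finish vacuuming
until something releases the lock".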

Greetings,

Andres Freund