Re: Assert while autovacuum was executing - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Assert while autovacuum was executing
Date
Msg-id CAA4eK1JQavtFgHtrS1UzuL4=OZdDZVBemjFCHEiShhVPaaA03w@mail.gmail.com
In response to Re: Assert while autovacuum was executing  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Assert while autovacuum was executing
Re: Assert while autovacuum was executing
List pgsql-hackers
On Thu, Jun 22, 2023 at 9:16 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 21, 2023 at 10:57 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > As far as I can tell 72e78d831a as-is is just bogus. Unfortunately that likely
> > also means 3ba59ccc89 is not right.
> >
>
> Indeed. I was thinking of a fix but couldn't find one yet. One idea I
> am considering is to allow catalog table locks after page lock but I
> think apart from hacky that also won't work because we still need to
> remove the check added for page locks in the deadlock code path in
> commit 3ba59ccc89 and may need to do something for group locking.
>

I have thought further about this part. I think that even if we remove
the changes in commit 72e78d831a (remove the assertion for page locks
in LockAcquireExtended()) and remove the check for page locks added to
FindLockCycleRecurseMember() by commit 3ba59ccc89, it is still okay
to keep the change related to "allow page lock to conflict among
parallel group members" in LockCheckConflicts(). This is because locks
on catalog tables don't conflict among group members, so we shouldn't
see a deadlock among parallel group members. Let me try to explain
this thought via an example:

BEGIN;
LOCK TABLE pg_enum IN ACCESS EXCLUSIVE MODE;
gin_clean_pending_list() -- assume this function is executed by both the
leader and a parallel worker; it also requires a lock on pg_enum, as
shown by Andres in email [1]

Say the parallel worker acquires the page lock first. It will also get
the lock on pg_enum because of group locking, so the leader backend will
wait for the parallel worker to release the page lock. Eventually, the
parallel worker will release the page lock and the leader backend can
then acquire it. So, we should still be okay with parallelism.
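
To make the scenario above concrete, here is a minimal toy model of the
conflict rule being discussed. It is only a sketch, not the lock manager
code: ToyLockManager, its methods, and the "gin_page" object name are
hypothetical, every lock is treated as exclusive, and "would wait" is
modeled as acquire() returning False. The only rule it encodes is the one
described above: relation locks don't conflict within a lock group, while
page locks do.

```python
from collections import defaultdict

PAGE = "page"
RELATION = "relation"


class ToyLockManager:
    """Toy model (hypothetical): all locks exclusive, backends belong to lock groups."""

    def __init__(self, group_of):
        self.group_of = group_of          # backend name -> lock group leader name
        self.holders = defaultdict(set)   # (lock type, object) -> holding backends

    def conflicts(self, backend, locktype, obj):
        for holder in self.holders[(locktype, obj)]:
            if holder == backend:
                continue                  # a backend never conflicts with itself
            same_group = self.group_of[holder] == self.group_of[backend]
            if same_group and locktype != PAGE:
                continue                  # group members share relation locks
            return True                   # page locks conflict even within a group
        return False

    def acquire(self, backend, locktype, obj):
        """Return True if granted, False if the caller would have to wait."""
        if self.conflicts(backend, locktype, obj):
            return False
        self.holders[(locktype, obj)].add(backend)
        return True

    def release(self, backend, locktype, obj):
        self.holders[(locktype, obj)].discard(backend)


# Leader and worker are in the same lock group.
mgr = ToyLockManager({"leader": "leader", "worker": "leader"})

mgr.acquire("leader", RELATION, "pg_enum")                # leader locks pg_enum
worker_page = mgr.acquire("worker", PAGE, "gin_page")     # worker gets the page lock first
worker_enum = mgr.acquire("worker", RELATION, "pg_enum")  # granted via group locking
leader_page = mgr.acquire("leader", PAGE, "gin_page")     # leader would have to wait
mgr.release("worker", PAGE, "gin_page")                   # worker finishes its cleanup
leader_page_retry = mgr.acquire("leader", PAGE, "gin_page")
```

In this model the worker is never blocked on pg_enum, so it can always
finish and release the page lock, which is why the leader's wait ends
rather than turning into a deadlock.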

OTOH, if the above theory is wrong or people are not convinced, I am
okay with removing all the changes in commits 72e78d831a and
3ba59ccc89.

[1] - https://www.postgresql.org/message-id/20230621052713.wc5377dyslxpckfj%40awork3.anarazel.de

--
With Regards,
Amit Kapila.


