Re: locking [user] catalog tables vs 2pc vs logical rep - Mailing list pgsql-hackers

From vignesh C
Subject Re: locking [user] catalog tables vs 2pc vs logical rep
Date
Msg-id CALDaNm0gs_8Bxr_6vKoJTd13EVDRkRy3-Hu7=MpuKpOGeNUueQ@mail.gmail.com
Whole thread Raw
In response to Re: locking [user] catalog tables vs 2pc vs logical rep  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Tue, May 25, 2021 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:
> > So, this appears to be an existing caveat of synchronous replication.
> > If that is the case, I am not sure if it is a good idea to just block
> > such ops for the prepared transaction. Also, what about other
> > operations which acquire an exclusive lock on [user]_catalog_tables
> > like:
> > cluster pg_trigger using pg_class_oid_index, similarly cluster on any
> > user_catalog_table, then the other problematic operation could
> > truncate of user_catalog_table as is discussed in another thread [1].
> > I think all such operations can block even with synchronous
> > replication. I am not sure if we can create examples for all cases
> > because for ex. we don't have use of user_catalog_tables in-core but
> > maybe for others, we can try to create examples and see what happens?
> >
> > If all such operations can block for synchronous replication and
> > prepared transactions replication then we might want to document them
> > as caveats at page:
> > https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
> > and then also give the reference for these caveats at prepared
> > transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html
> >
> > What do you think?
>
> It seems to me that the 2PC issues on catalog tables and the issues
> related to logical replication in synchonous mode are two distinct
> things that need to be fixed separately.
>
> The issue with LOCK taken on a catalog while a PREPARE TRANSACTION
> holds locks around is bad enough in itself as it could lock down a
> user from a cluster as long as the PREPARE TRANSACTION is not removed
> from WAL (say the relation is critical for the connection startup).
> This could be really disruptive for the user even if he tried to take
> a lock on an object he owns, and the way to recover is not easy here,
> and the way to recover involves either an old backup or worse,
> pg_resetwal.
>
> The second issue with logical replication is still disruptive, but it
> looks to me more like a don't-do-it issue, and documenting the caveats
> sounds fine enough.
>
> Looking at the patch from upthread..
>
> +           /*
> +            * Make note that we've locked a system table or an user catalog
> +            * table. This flag will be checked later during prepare transaction
> +            * to fail the prepare transaction.
> +            */
> +           if (lockstmt->mode >= ExclusiveLock &&
> +               (IsCatalogRelationOid(reloid) ||
> +                RelationIsUsedAsCatalogTable(rel)))
> +               MyXactFlags |= XACT_FLAGS_ACQUIREDEXCLUSIVELOCK_SYSREL;
> I think that I'd just use IsCatalogRelationOid() here, and I'd be more
> severe and restrict all attempts for any lock levels.  It seems to me
> that this needs to happen within RangeVarCallbackForLockTable().
> I would also rename the flag as just XACT_FLAGS_LOCKEDCATALOG.
>
> +                errmsg("cannot PREPARE a transaction that has an exclusive lock on user catalog/system
table(s)")));
> What about "cannot PREPARE a transaction that has locked a catalog
> relation"?

At this point it is not clear if we are planning to fix this issue by
throwing an error or document it. I will fix these comments once we
come to consensus.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Amit Kapila
Date:
Subject: Re: Skipping logical replication transactions on subscriber side