Re: logical replication worker accesses catalogs in error context callback - Mailing list pgsql-hackers

From Tom Lane
Subject Re: logical replication worker accesses catalogs in error context callback
Date
Msg-id 1270252.1625329862@sss.pgh.pa.us
Whole thread Raw
In response to Re: logical replication worker accesses catalogs in error context callback  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> Isn't it better to have the below comment before
> slot_store_error_callback, similar to what's there before
> conversion_error_callback in v7-0002.
>  * Note that this function mustn't do any catalog lookups, since we are in
>  * an already-failed transaction.

Not really, as there's not much temptation to do so in the current form
of that function.  I have no desire to go around and plaster every
errcontext callback with such comments.

> Maybe it's worth considering
> avoid_sys_cache_lookup_in_error_callback_v2.patch from [1] as another
> way to enforce the above statement.

That looks fundamentally broken from here.  Wouldn't it forbid
perfectly OK code like this randomly-chosen example from tablecmds.c?

        if (list_member_oid(inheritOids, parentOid))
            ereport(ERROR,
                    (errcode(ERRCODE_DUPLICATE_TABLE),
                     errmsg("relation \"%s\" would be inherited from more than once",
                            get_rel_name(parentOid))));

That is, it's a bit hard to say exactly where in the error processing
sequence we should start deeming it unsafe to do a catalog lookup;
but the mere presence of an error stack entry can't mean that.

In a lot of situations, there wouldn't be any need to consider the
transaction broken until we've done a longjmp, so that catalog
lookups in errcontext callbacks would be perfectly safe.  (Which
indeed is why we've not yet seen an actual problem in either of
the spots discussed in this thread.)  The reason for being paranoid
about what an errcontext callback can do is that the callback cannot
know what the current failure's cause is.  If we're trying to report
some internal error that means that the transaction really is pretty
broken, then it'd be problematic to initiate new catalog accesses.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Gilles Darold
Date:
Subject: Re: [PATCH] Hooks at XactCommand level
Next
From: Tom Lane
Date:
Subject: Re: logical replication worker accesses catalogs in error context callback