Re: logical replication worker accesses catalogs in error context callback - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: logical replication worker accesses catalogs in error context callback
Date
Msg-id CALNJ-vR3YPD+sh8x4kR8_i3BdHgqmJn5Sfpz2+7zqvHevk2KHA@mail.gmail.com
Whole thread Raw
In response to Re: logical replication worker accesses catalogs in error context callback  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: logical replication worker accesses catalogs in error context callback
List pgsql-hackers
For 0002, a few comments on the description:

bq. Avoid accessing system catalogues inside conversion_error_callback

catalogues -> catalog

bq. error context callback, because the the entire transaction might

There is redundant 'the'

bq. Since get_attname() and get_rel_name() could search the sys cache by
store the required information

store -> storing

Cheers

On Tue, Jan 26, 2021 at 5:17 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Jan 26, 2021 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > IMHO, adding an assertion in SearchCatCacheInternal(which is a most
> > generic code part within the server) with a few error context global
> > variables may not be always safe. Because we might miss using the
> > error context variables properly. Instead, we could add a comment in
> > ErrorContextCallback structure saying something like, "it's not
> > recommended to access any system catalogues within an error context
> > callback when the callback is expected to be called while processing
> > an error, because the transaction might have been broken in that
> > case." And let the future callback developers take care of it.
> >
> > Thoughts?
>
> That sounds good to me. But I still also see the value to add an
> assertion in SearchCatCacheInternal(). If we had it, we could find
> these two bugs earlier.
>
> Anyway, this seems to be unrelated to this bug fixing so we can start
> a new thread for that.

+1 to start a new thread for that.

> > As I said earlier [1], currently only two(there could be more) error
> > context callbacks access the sys catalogues, one is in
> > slot_store_error_callback which will be fixed with your patch. Another
> > is in conversion_error_callback, we can also fix this by storing the
> > relname, attname and other required info in ConversionLocation,
> > something like the attached. Please have a look.
> >
> > Thoughts?
>
> Thank you for the patch!
>
> Here are some comments:

Thanks for the review comments.

> +static void set_error_callback_info(ConversionLocation *errpos,
> +                                          Relation rel, int cur_attno,
> +                                          ForeignScanState *fsstate);
>
> I'm concerned a bit that the function name is generic. How about
> set_conversion_error_callback_info() or something to make the name
> clear?

Done.

> ---
> +static void
> +conversion_error_callback(void *arg)
> +{
> +       ConversionLocation *errpos = (ConversionLocation *) arg;
> +
> +       if (errpos->show_generic_message)
> +               errcontext("processing expression at position %d in
> select list",
> +                                       errpos->cur_attno);
> +
>
> I think we can set this generic message to the error context when
> errpos->relname is NULL instead of using show_generic_message.

Right. Done.

> ---
> +       /*
> +        * Set error context callback info, so that we could avoid accessing
> +        * the system catalogues while processing the error in
> +        * conversion_error_callback. System catalogue accesses are not safe in
> +        * error context callbacks because the transaction might have been
> +        * broken by then.
> +        */
> +       set_error_callback_info(&errpos, rel, i, fsstate);
>
> Looking at other code, we use "catalog" rather than "catalogue" in
> most places. Is it better to use "catalog" for consistency? FYI, the
> "catalogue" is used at only three places in the server code:

Changed it to "catalog".

> FYI I've added those bug fixes to the next Commitfest - https://commitfest.postgresql.org/32/2955/

Thanks. I'm attaching 2 patches to make it easy for reviewing and also
they will get a chance to be tested by cf bot.

0001 - for avoiding system catalog access in slot_store_error_callback.
0002 - for avoiding system catalog access in conversion_error_callback

Please review it further.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add SQL function for SHA1
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: archive status ".ready" files may be created too early