On Mon, Jan 23, 2023 at 1:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Another thing that has a bad smell about it is the fact that
> process_syncing_tables_for_sync uses two transactions in the first
> place. There's a comment there claiming that it's for crash safety,
> but I can't help suspecting it's really because this case becomes a
> hard deadlock without that mid-function commit.
>
> It's not great in any case that the apply worker can move on in
> the belief that the tablesync worker is done when in fact the latter
> still has catalog state updates to make. And I wonder what we're
> doing with having both of them calling replorigin_drop_by_name
> ... shouldn't that responsibility belong to just one of them?
>
Originally, the origin was dropped in only one place (by the tablesync
worker), but we found a race condition, described in the comments in
process_syncing_tables_for_sync() just before the start of the second
transaction, which led to this change. See the report and discussion
of that race condition in the email [1].

[1] - https://www.postgresql.org/message-id/CAD21AoAw0Oofi4kiDpJBOwpYyBBBkJj=sLUOn4Gd2GjUAKG-fw@mail.gmail.com
--
With Regards,
Amit Kapila.