Re: Tablesync early exit - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Tablesync early exit
Msg-id CAHut+Pu14kyNE7OHFQnq2gyUQAF2xR4Pra1frV=kHC6cwUqH-w@mail.gmail.com
In response to Re: Tablesync early exit  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Tablesync early exit  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Mar 16, 2022 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 8:50 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Patch v2 is the same; it only needed re-basing to the latest HEAD.
> >
>
> Why do you think it is correct to exit before trying to receive any
> message?

I think the STATE_CATCHUP state guarantees the apply worker must have
received (or tried to receive) a message. See the next answer.

> How will we ensure whether the apply worker has processed any
> message?

All the patch code does is call process_syncing_tables, which
delegates to process_syncing_tables_for_sync (because the caller is
a tablesync worker). That function does nothing unless the tablesync
worker is in SUBREL_STATE_CATCHUP state, and the worker cannot be in
that state unless the apply worker explicitly set it.

On the other side of the coin, the apply worker can only set
syncworker->relstate = SUBREL_STATE_CATCHUP from within the function
process_syncing_tables_for_apply, and AFAIK that function is only
called after the apply worker has either handled a message or the
walrcv_receive call in LogicalRepApplyLoop received nothing.

So I think the STATE_CATCHUP mechanism itself ensures the apply worker
*must* have already processed a message (or there was no message to
process).
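
To make the handshake described above concrete, here is a minimal C
sketch of the two sides. It is a simplified model, not the PostgreSQL
source: the names SyncState, SyncWorker, apply_side_process and
sync_side_process are hypothetical, and the LSN comparison on the
tablesync side is an assumption added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the PostgreSQL definitions. */
typedef uint64_t Lsn;

typedef enum
{
    SYNC_STATE_SYNCWAIT,    /* tablesync worker waits for the apply worker */
    SYNC_STATE_CATCHUP,     /* set only by the apply side */
    SYNC_STATE_SYNCDONE     /* tablesync worker may exit */
} SyncState;

typedef struct
{
    SyncState relstate;
    Lsn       relstate_lsn; /* LSN the tablesync worker must reach */
} SyncWorker;

/*
 * Apply side. In this model it is the only place that sets CATCHUP, and
 * it is assumed to run only after the apply loop has handled a message
 * or found nothing to receive.
 */
static void
apply_side_process(SyncWorker *w, Lsn apply_lsn)
{
    if (w->relstate == SYNC_STATE_SYNCWAIT)
    {
        w->relstate = SYNC_STATE_CATCHUP;
        w->relstate_lsn = apply_lsn;
    }
}

/*
 * Tablesync side. Does nothing unless the state is CATCHUP; returns true
 * when the worker may finish. The LSN check is an assumption of this
 * sketch.
 */
static bool
sync_side_process(SyncWorker *w, Lsn current_lsn)
{
    if (w->relstate == SYNC_STATE_CATCHUP && current_lsn >= w->relstate_lsn)
    {
        w->relstate = SYNC_STATE_SYNCDONE;
        return true;
    }
    return false;
}

/* Usage: an early call on the tablesync side is a no-op until the
 * apply side has moved the state to CATCHUP. */
static void
demo_handshake(void)
{
    SyncWorker w = { SYNC_STATE_SYNCWAIT, 0 };

    assert(!sync_side_process(&w, 100));   /* not yet in CATCHUP */
    apply_side_process(&w, 100);           /* apply side has run */
    assert(sync_side_process(&w, 100));    /* now the early exit is allowed */
}
```

In this model, calling the tablesync-side check before receiving
anything cannot cause a premature exit, because the check only
succeeds after the apply side has already run.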

> At the beginning of function LogicalRepApplyLoop(),
> last_received is the LSN where the copy has finished in the case of
> tablesync worker. I think we need to receive the message before trying
> to ensure whether we have synced with the apply worker or not.
>

I think the STATE_CATCHUP guarantees the apply worker must have
received (or tried to receive) a message. See the previous answer.

~~~

AFAIK this patch is OK, but since it is not particularly urgent I've
bumped this to the next CommitFest [1] instead of trying to jam it
into PG15 at the last minute.

BTW - There were some useful logfiles I captured a very long time ago
[2]. They show the behaviour without/with this patch.

------
[1] https://commitfest.postgresql.org/37/3062/
[2] https://www.postgresql.org/message-id/CAHut+Ptjk-Qgd3R1a1_tr62CmiswcYphuv0pLmVA-+2s8r0Bkw@mail.gmail.com

Kind Regards,
Peter Smith
Fujitsu Australia


