Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] logical decoding of two-phase transactions
Msg-id CAA4eK1L=dhuCRvyDvrXX5wZgc7s1hLRD29CKCK6oaHtVCPgiFA@mail.gmail.com
In response to Re: [HACKERS] logical decoding of two-phase transactions  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Thu, Feb 18, 2021 at 5:48 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Please find attached the new patch set v41*
>

I see one issue here. Currently, when we create a subscription, we
first launch the apply worker and create the main apply worker slot,
and then launch tablesync workers as required. Now, assume the apply
worker slot has been created and after that we launch a tablesync
worker, which initiates creation of its own slot (sync_slot). Suppose
that, on the publisher side, there is a transaction that was prepared
before the sync_slot reaches a consistent snapshot. This is exactly
the scenario in twophase_snapshot.spec, where we skip the prepared
xact for this reason.

Because the WALSender corresponding to the apply worker is already
running, it is in a consistent state, so it can decode such a prepared
xact and send it to the subscriber. On the subscriber side, the apply
worker will skip the data-modification operations because the
corresponding rel is not yet in a ready state (see
should_apply_changes_for_rel and its callers), simply because the
corresponding tablesync worker has not finished yet. But the prepare
itself will still occur, which leads to a prepared transaction on the
subscriber.

In this situation, the tablesync worker has skipped the prepare
because its snapshot was not consistent, and then it exited because
it was in sync with the apply worker. And the apply worker has skipped
the changes because tablesync was in progress. Later, when the commit
prepared arrives, the apply worker will simply commit the previously
prepared transaction, and we will never see the prepared transaction's
data.
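
To make the sequence concrete, here is a toy model of the race in
Python. It is only a sketch under my own assumptions: the class and
function names are hypothetical and nothing here is actual apply-worker
code. The table copy is modelled as seeing only committed rows, which
stands in for the tablesync worker skipping the prepare.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    # Hypothetical stand-in for the relation's sync state.
    table_ready: bool = False
    rows: list = field(default_factory=list)
    # gid -> changes that were actually applied under the PREPARE
    prepared: dict = field(default_factory=dict)


def apply_prepare(sub, gid, changes):
    # Apply worker: changes for a not-ready relation are skipped,
    # but the PREPARE itself still happens.
    sub.prepared[gid] = list(changes) if sub.table_ready else []


def tablesync(sub, committed_rows):
    # Tablesync: the initial copy sees committed rows only, so the
    # rows of a prepared-but-uncommitted xact are not included.
    sub.rows.extend(committed_rows)
    sub.table_ready = True


def apply_commit_prepared(sub, gid):
    # Apply worker: commits whatever was prepared earlier.
    sub.rows.extend(sub.prepared.pop(gid))


sub = Subscriber()
apply_prepare(sub, "gid1", ["row_from_prepared_xact"])
tablesync(sub, ["row_committed_earlier"])
apply_commit_prepared(sub, "gid1")
print(sub.rows)  # the prepared transaction's row is missing
```

Both workers skipped the same data for different reasons, so the
commit prepared has nothing left to make visible.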

So, the basic premise is that we can't allow tablesync workers to skip
prepared transactions (which can be processed by the apply worker) and
then process later commits.

I have one idea to address this. When we get the first begin_prepare
in the apply worker, we can check whether there are any relations in
the "not_ready" state and, if so, just wait until all the relations
are in sync with the apply worker. This avoids the case where one of
the tablesync workers skips a prepared xact and the apply worker
skips the same xact as well.
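
A rough sketch of that wait, again in Python and purely illustrative:
the function name, the polling approach, and the FakeCatalog class are
my assumptions, not anything in the patch. The idea is just that
begin_prepare handling blocks until no relation reports "not_ready".

```python
import time


def wait_until_relations_in_sync(get_states, poll_interval=0.01, timeout=5.0):
    """Block until no relation is in the "not_ready" state.

    get_states is a hypothetical callable returning {relname: state}.
    Returns the number of polls that were needed.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        pending = sorted(
            rel for rel, state in get_states().items() if state == "not_ready"
        )
        if not pending:
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError(f"relations still not ready: {pending}")
        time.sleep(poll_interval)


class FakeCatalog:
    """Hypothetical state source: t2 becomes ready on the Nth read."""

    def __init__(self, ready_after):
        self.ready_after = ready_after
        self.reads = 0

    def states(self):
        self.reads += 1
        t2 = "ready" if self.reads >= self.ready_after else "not_ready"
        return {"t1": "ready", "t2": t2}


catalog = FakeCatalog(ready_after=3)
# Called when the first begin_prepare is received.
polls = wait_until_relations_in_sync(catalog.states)
print(polls)
```

The wait only needs to happen before the first prepared transaction is
handled, so ordinary commits are not delayed by it.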

Now, it is possible that some tablesync worker has copied the data and
moved its sync position ahead of the apply worker's current position.
In such a case, we need to process transactions in the apply worker
so that we can process commits, if any, and write prepared
transactions to a file. For prepared transactions, we can decide what
to do only once the commit prepared for them has arrived.
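
The catch-up handling could look roughly like the following Python
sketch. Everything in it is an assumption for illustration: the
message layout, the function name, an in-memory dict standing in for
the file, and in particular the rule that a transaction is applied
only if its commit (or commit prepared) LSN is past the tablesync
position.

```python
def apply_stream(messages, sync_lsn):
    """Toy catch-up loop for an apply worker behind sync_lsn.

    sync_lsn is the hypothetical position up to which the table copy
    already reflects committed data.
    """
    applied = []
    spooled = {}  # gid -> changes; stands in for "write to file"
    for msg in messages:
        kind = msg["kind"]
        if kind == "commit":
            # Ordinary commits can be decided immediately.
            if msg["lsn"] > sync_lsn:
                applied.extend(msg["changes"])
        elif kind == "prepare":
            # Too early to decide: only spool the changes.
            spooled[msg["gid"]] = msg["changes"]
        elif kind == "commit_prepared":
            # The decision is taken here, using the commit prepared LSN.
            changes = spooled.pop(msg["gid"])
            if msg["lsn"] > sync_lsn:
                applied.extend(changes)
    return applied


messages = [
    {"kind": "commit", "lsn": 5, "changes": ["c1"]},
    {"kind": "prepare", "lsn": 8, "gid": "gid1", "changes": ["p1"]},
    {"kind": "commit", "lsn": 12, "changes": ["c2"]},
    {"kind": "commit_prepared", "lsn": 15, "gid": "gid1"},
]
result = apply_stream(messages, sync_lsn=10)
print(result)
```

Here the prepare at LSN 8 is before the sync position, but its commit
prepared at LSN 15 is after it, so the copy could not have included
its data and the spooled changes have to be applied.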

-- 
With Regards,
Amit Kapila.


