Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAA4eK1LnfTnnqLH4Fd0mTAKN2VtBN+sprQRSEE1R+jSsvv++ow@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Mon, Jun 28, 2021 at 10:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > Now, if this function is used by super
> > > > users then we can probably trust that they provide the XIDs that we
> > > > can trust to be skipped but OTOH making a restriction to allow these
> > > > functions to be used by superusers might restrict the usage of this
> > > > repair tool.
> > >
> > > If we specify the subscription id or name, maybe we can allow also the
> > > owner of subscription to do that operation?
> >
> > Ah, the owner of the subscription must be superuser.
>
> I've attached PoC patches.
>
> 0001 patch introduces the ability to skip transactions on the
> subscriber side. We can specify XID to the subscription by like ALTER
> SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The implementation
> seems straightforward except for setting origin state. After skipping
> the transaction we have to update the session origin state so that we
> can start streaming the transaction next to the one that we just
> skipped in case of the server crash or restarting the apply worker. We
> set origin state to the commit WAL record. However, since we skip all
> changes we don’t write any WAL even if we call CommitTransaction() at
> the end of the skipped transaction. So the patch sets the origin state
> to the transaction that updates the pg_subscription system catalog to
> reset the skip XID. I think we need a discussion of this part.
>

IIUC, for streaming transactions you are allowing stream file to be
created and then remove it at stream_commit/stream_abort time, is that
right? If so, in which cases are you imagining the files to be
created, is it in the case of relation message
(LOGICAL_REP_MSG_RELATION)? Assuming the previous two statements are
correct, this will skip the relation message as well as part of the
removal of stream files which might lead to a problem because the
publisher won't know that we have skipped the relation message and it
won't send it again. This can cause problems while processing the next
messages.

> With 0002 and 0003 patches, we report the error information in server
> logs and the stats view, respectively. 0002 patch adds errcontext for
> messages that happened during applying the changes:
>
> ERROR:  duplicate key value violates unique constraint "hoge_pkey"
> DETAIL:  Key (c)=(1) already exists.
> CONTEXT:  during apply of "INSERT" for relation "public.hoge" in
> transaction with xid 736 committs 2021-06-27 12:12:30.053887+09
>
> 0003 patch adds pg_stat_logical_replication_error statistics view
> discussed on another thread[1]. The apply worker sends the error
> information to the stats collector if an error happens during applying
> changes. We can check those errors as follow:
>
> postgres(1:25250)=# select * from pg_stat_logical_replication_error;
>  subname  | relid | action | xid |         last_failure
> ----------+-------+--------+-----+-------------------------------
>  test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
> (1 row)
>
> I added only columns required for the skipping transaction feature to
> the view for now.
>

Isn't it better to add an error message if possible?

> Please note that those patches are meant to evaluate the concept we've
> discussed so far. Those don't have the doc update yet.
>

I think your patch is on the lines of what we have discussed. It would
be good if you can update docs and add few tests.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: PITR: enhance getRecordTimestamp()
Next
From: David Rowley
Date:
Subject: Re: Use simplehash.h instead of dynahash in SMgr