Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAA4eK1JR2GR9jjaz9T1ZxzgLVS0h089EE8ZB=F2EsVHbM_5sfA@mail.gmail.com
Whole thread Raw
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Thu, Aug 18, 2022 at 3:40 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Aug 18, 2022 at 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > 47. src/include/replication/logicalproto.h
> > >
> > > @@ -32,12 +32,17 @@
> > >   *
> > >   * LOGICALREP_PROTO_TWOPHASE_VERSION_NUM is the minimum protocol version with
> > >   * support for two-phase commit decoding (at prepare time). Introduced in PG15.
> > > + *
> > > + * LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
> > > + * with support for streaming large transactions using apply background
> > > + * workers. Introduced in PG16.
> > >   */
> > >  #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > >  #define LOGICALREP_PROTO_VERSION_NUM 1
> > >  #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > >  #define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
> > > -#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_TWOPHASE_VERSION_NUM
> > > +#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
> > > +#define LOGICALREP_PROTO_MAX_VERSION_NUM
> > > LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
> > >
> > > 47a.
> > > I don't think that comment is strictly true. IIUC the new protocol
> > > version 4 is currently only affecting the *extra* STREAM_ABORT members
> > > – but in fact streaming=parallel is still functional without using
> > > those extra members, isn't it? So maybe this description needed to be
> > > modified a bit to be more accurate?
> > >
> >
> > The reason for sending this extra abort members is to ensure that
> > after aborting the transaction, if the subscriber/apply worker
> > restarts, it doesn't need to request the transaction again. Do you
> > have suggestions for improving this comment?
> >
>
> I gave three review comments for v21-0001 that were all related to
> this same point:
> i- #4b (commit message)
> ii- #7 (protocol pgdocs)
> iii- #47a (code comment)
>
> The point was:
> AFAIK protocol 4 is only to let the parallel streaming logic behave
> *better* in how it can handle restarts after aborts. But that does not
> mean that protocol 4 is a *pre-requisite* for "allowing"
> streaming=parallel to work in the first place. I thought that a PG15
> publisher and PG16 subscriber can still work using streaming=parallel
> even with protocol 3, but it just won't be quite as clever for
> handling restarts after abort as protocol 4 (PG16 -> PG16) would be.
>

It is not only that it makes it better but one can say that it is a
break of a replication protocol that after the client (subscriber) has
applied some transaction, it requests the same transaction again. So,
I think it is better to make the parallelism work only when the server
version is also 16.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Next
From: Anant ngo
Date:
Subject: Data caching