Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Support logical replication of DDLs
Date	May 10, 2022 09:27:28
Msg-id	CAA4eK1+3YkeSZeZAeB7no0fSQHSWmM_wZeJPZvbLCTE-mAUY=Q@mail.gmail.com
In response to	Re: Support logical replication of DDLs (Masahiko Sawada <sawada.mshk@gmail.com>)
List	pgsql-hackers

On Tue, May 10, 2022 at 12:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > > The *initial* DDL replication is a different problem than DDL replication. The
> > > > > former requires a snapshot to read the current catalog data and build a CREATE
> > > > > command as part of the subscription process. The subsequent DDLs in that object
> > > > > will be handled by a different approach that is being discussed here.
> > > > >
> > > >
> > > > I think they are not completely independent because of the current way
> > > > to do initial sync followed by replication. The initial sync and
> > > > replication need some mechanism to ensure that one of those doesn't
> > > > overwrite the work done by the other. Now, the initial idea and patch
> > > > can be developed separately but I think both the patches have some
> > > > dependency.
> > >
> > > I agree with the point that their design can not be completely
> > > independent.  They have some logical relationship of what schema will
> > > be copied by the initial sync and where is the exact boundary from
> > > > which we will start sending as replication.  And suppose we first
> > > > plan to implement only the replication part; then how will the user
> > > > know which schema objects they have to create and what will be
> > > > replicated using DDL replication?  Suppose the user takes a dump and
> > > > copies all the schema and then creates the subscription; then how are
> > > > we going to handle the DDL concurrent to the subscription command?
> > >
> >
> > Right, I also don't see how it can be done in the current
> > implementation. So, I think even if we want to develop these two as
> > separate patches they need to be integrated to make the solution
> > complete.
>
> It would be better to develop them separately in terms of development
> speed but, yes, we perhaps need to integrate them at some points.
>
> I think that the initial DDL replication can be done when the
> relation's state is SUBREL_STATE_INIT. That is, at the very beginning
> of the table synchronization, the syncworker copies the table schema
> somehow, then starts the initial data copy. After that, syncworker or
> applyworker applies DML/DDL changes while catching up and streaming
> changes, respectively. Probably we can make it optional whether to
> copy schema only, data only, or both.
>

This sounds okay for copying the table schema, but we can have other
objects like functions, procedures, views, etc. So, we may need an
altogether separate mechanism to copy all the published objects.
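
To make the ordering proposed in the quoted text concrete, here is a rough
sketch of the per-table steps when a relation starts in SUBREL_STATE_INIT.
It is only an illustration under assumptions: CopyMode and plan_table_sync
are hypothetical names, not existing PostgreSQL code or subscription
options, and the step names are placeholders. It covers tables only, so
functions, procedures, views, etc. are outside it.

```python
from enum import Enum

# State letters as stored in pg_subscription_rel.srsubstate.
SUBREL_STATE_INIT = "i"
SUBREL_STATE_READY = "r"


class CopyMode(Enum):
    """Hypothetical option: what the initial sync should copy."""
    SCHEMA_ONLY = "schema_only"
    DATA_ONLY = "data_only"
    BOTH = "both"


def plan_table_sync(state, mode):
    """Return the ordered steps for one table (illustrative names only)."""
    if state != SUBREL_STATE_INIT:
        # Initial sync already started or finished: only stream changes.
        return ["stream_changes"]
    steps = []
    if mode in (CopyMode.SCHEMA_ONLY, CopyMode.BOTH):
        # Assumed to run at the very beginning of table synchronization.
        steps.append("copy_table_schema")
    if mode in (CopyMode.DATA_ONLY, CopyMode.BOTH):
        steps.append("copy_table_data")
    # The sync worker catches up, then the apply worker streams DML/DDL.
    steps.append("catch_up")
    steps.append("stream_changes")
    return steps
```

With CopyMode.DATA_ONLY the plan skips the schema step, which corresponds
to the case where the user has already created the table on the subscriber.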

-- 
With Regards,
Amit Kapila.
