RE: bogus: logical replication rows/cols combinations - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: bogus: logical replication rows/cols combinations
Date
Msg-id OS0PR01MB5716A594C58DE4FFD1F8100B94C89@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: bogus: logical replication rows/cols combinations  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: bogus: logical replication rows/cols combinations
List pgsql-hackers
On Wednesday, May 11, 2022 11:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, May 11, 2022 at 12:35 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 5/9/22 05:45, Amit Kapila wrote:
> > > On Sun, May 8, 2022 at 11:41 PM Tomas Vondra
> > > <tomas.vondra@enterprisedb.com> wrote:
> > >>
> > >> On 5/7/22 07:36, Amit Kapila wrote:
> > >>> On Mon, May 2, 2022 at 6:11 PM Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
> > >>>>
> > >>>> On 2022-May-02, Amit Kapila wrote:
> > >>>>
> > >>>>
> > >>>>> I think it is possible to expose a list of publications for each
> > >>>>> walsender as it is stored in each walsenders
> > >>>>> LogicalDecodingContext->output_plugin_private. AFAIK, each
> > >>>>> LogicalDecodingContext->walsender
> > >>>>> can have one such LogicalDecodingContext and we can probably
> > >>>>> share it via shared memory?
> > >>>>
> > >>>> I guess we need to create a DSM each time a walsender opens a
> > >>>> connection, at START_REPLICATION time.  Then ALTER PUBLICATION
> > >>>> needs to connect to all DSMs of all running walsenders and see if
> > >>>> they are reading from it.  Is that what you have in mind?
> > >>>> Alternatively, we could have one DSM per publication with a PID
> > >>>> array of all walsenders that are sending it (each walsender needs to
> add its PID as it starts).
> > >>>> The latter might be better.
> > >>>>
> > >>>
> > >>> While thinking about using DSM here, I came across one of your
> > >>> commits
> > >>> f2f9fcb303 which seems to indicate that it is not a good idea to
> > >>> rely on it but I think you have changed dynamic shared memory to
> > >>> fixed shared memory usage because that was more suitable rather
> > >>> than DSM is not portable. Because I see a commit bcbd940806 where
> > >>> we have removed the 'none' option of dynamic_shared_memory_type.
> > >>> So, I think it should be okay to use DSM in this context. What do you
> think?
> > >>>
> > >>
> > >> Why would any of this be needed?
> > >>
> > >> ALTER PUBLICATION will invalidate the RelationSyncEntry entries in
> > >> all walsenders, no? So AFAICS it should be enough to enforce the
> > >> limitations in get_rel_sync_entry,
> > >>
> > >
> > > Yes, that should be sufficient to enforce limitations in
> > > get_rel_sync_entry() but it will lead to the following behavior:
> > > a. The Alter Publication command will be successful but later in the
> > > logs, the error will be logged and the user needs to check it and
> > > take appropriate action. Till that time the walsender will be in an
> > > error loop which means it will restart and again lead to the same
> > > error till the user takes some action.
> > > b. As we use historic snapshots, so even after the user takes action
> > > say by changing publication, it won't be reflected. So, the option
> > > for the user would be to drop their subscription.
> > >
> > > Am, I missing something? If not, then are we okay with such behavior?
> > > If yes, then I think it would be much easier implementation-wise and
> > > probably advisable at this point. We can document it so that users
> > > are careful and can take necessary action if they get into such a
> > > situation. Any way we can improve this in future as you also
> > > suggested earlier.
> > >
> > >> which is necessary anyway because the subscriber may not be
> > >> connected when ALTER PUBLICATION gets executed.
> > >>
> > >
> > > If we are not okay with the resultant behavior of detecting this in
> > > get_rel_sync_entry(), then we can solve this in some other way as
> > > Alvaro has indicated in one of his responses which is to detect that
> > > at start replication time probably in the subscriber-side.
> > >
> >
> > IMO that behavior is acceptable.
> >
> 
> Fair enough, then we should go with a simpler approach to detect it in
> pgoutput.c (get_rel_sync_entry).

OK, here is the patch that try to check column list in that way. The patch also
check the column list when CREATE SUBSCRIPTION and when starting initial copy.

Best regards,
Hou zj



Attachment

pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: gitmaster access
Next
From: Masahiko Sawada
Date:
Subject: Re: Support logical replication of DDLs