Re: bogus: logical replication rows/cols combinations - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: bogus: logical replication rows/cols combinations |
Date | |
Msg-id | CAA4eK1JC8v8ysNS9L1et0_yyZJOEE1wB+GrvOcar_f5dVadvBg@mail.gmail.com Whole thread Raw |
In response to | Re: bogus: logical replication rows/cols combinations (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
RE: bogus: logical replication rows/cols combinations
|
List | pgsql-hackers |
On Wed, May 11, 2022 at 12:35 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 5/9/22 05:45, Amit Kapila wrote: > > On Sun, May 8, 2022 at 11:41 PM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > >> > >> On 5/7/22 07:36, Amit Kapila wrote: > >>> On Mon, May 2, 2022 at 6:11 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > >>>> > >>>> On 2022-May-02, Amit Kapila wrote: > >>>> > >>>> > >>>>> I think it is possible to expose a list of publications for each > >>>>> walsender as it is stored in each walsenders > >>>>> LogicalDecodingContext->output_plugin_private. AFAIK, each walsender > >>>>> can have one such LogicalDecodingContext and we can probably share it > >>>>> via shared memory? > >>>> > >>>> I guess we need to create a DSM each time a walsender opens a > >>>> connection, at START_REPLICATION time. Then ALTER PUBLICATION needs to > >>>> connect to all DSMs of all running walsenders and see if they are > >>>> reading from it. Is that what you have in mind? Alternatively, we > >>>> could have one DSM per publication with a PID array of all walsenders > >>>> that are sending it (each walsender needs to add its PID as it starts). > >>>> The latter might be better. > >>>> > >>> > >>> While thinking about using DSM here, I came across one of your commits > >>> f2f9fcb303 which seems to indicate that it is not a good idea to rely > >>> on it but I think you have changed dynamic shared memory to fixed > >>> shared memory usage because that was more suitable rather than DSM is > >>> not portable. Because I see a commit bcbd940806 where we have removed > >>> the 'none' option of dynamic_shared_memory_type. So, I think it should > >>> be okay to use DSM in this context. What do you think? > >>> > >> > >> Why would any of this be needed? > >> > >> ALTER PUBLICATION will invalidate the RelationSyncEntry entries in all > >> walsenders, no? So AFAICS it should be enough to enforce the limitations > >> in get_rel_sync_entry, > >> > > > > Yes, that should be sufficient to enforce limitations in > > get_rel_sync_entry() but it will lead to the following behavior: > > a. The Alter Publication command will be successful but later in the > > logs, the error will be logged and the user needs to check it and take > > appropriate action. Till that time the walsender will be in an error > > loop which means it will restart and again lead to the same error till > > the user takes some action. > > b. As we use historic snapshots, so even after the user takes action > > say by changing publication, it won't be reflected. So, the option for > > the user would be to drop their subscription. > > > > Am, I missing something? If not, then are we okay with such behavior? > > If yes, then I think it would be much easier implementation-wise and > > probably advisable at this point. We can document it so that users are > > careful and can take necessary action if they get into such a > > situation. Any way we can improve this in future as you also suggested > > earlier. > > > >> which is necessary anyway because the subscriber > >> may not be connected when ALTER PUBLICATION gets executed. > >> > > > > If we are not okay with the resultant behavior of detecting this in > > get_rel_sync_entry(), then we can solve this in some other way as > > Alvaro has indicated in one of his responses which is to detect that > > at start replication time probably in the subscriber-side. > > > > IMO that behavior is acceptable. > Fair enough, then we should go with a simpler approach to detect it in pgoutput.c (get_rel_sync_entry). > We have to do that check anyway, and > the subscription may start failing after ALTER PUBLICATION for a number > of other reasons anyway so the user needs/should check the logs. > I think ALTER PUBLICATION won't ever lead to failure in walsender. Sure, users can do something due to which subscriber-side failures can happen due to constraint failures. Do you have some specific cases in mind? > And if needed, we can improve this and start doing the proactive-checks > during ALTER PUBLICATION too. > Agreed. -- With Regards, Amit Kapila.
pgsql-hackers by date: