Thread: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18644 Logged by: Maxim Boguk Email address: maxim.boguk@gmail.com PostgreSQL version: 15.8 Operating system: Ubuntu Description: Hi, When ALTER PUBLICATION ... SET (publish_via_partition_root) executed on the existing logical replication with data (following ALTER SUBSCRIPTION ... REFRESH PUBLICATION), the database start copy whole partitioned table from start (thus breaking existing logical replication). What's worse - I didn't found any way get out of such situation less than redo all multi-TB logical replication from blank db. Also ALTER SUBSCRIPTION ... REFRESH PUBLICATION (copy_data=false) - cannot be used as workaround because it lead to loose any changes in partitioned table between run ALTER PUBLICATION and ALTER SUBSCRIPTION. Afterthought this behavior not surprising at all but I think better to document it somewhere (or even better disable ALTER PUBLICATION ... SET (publish_via_partition_root) for any publication with existing subscriptions because it will break things for sure). After I look into pub/sub code - I feel it will be very complicated task to make it work correctly. -- Maxim
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Wed, Oct 2, 2024 at 12:14 AM PG Bug reporting form <noreply@postgresql.org> wrote: > > The following bug has been logged on the website: ... > > When ALTER PUBLICATION ... SET (publish_via_partition_root) executed on the > existing logical replication with data > (following ALTER SUBSCRIPTION ... REFRESH PUBLICATION), the database start > copy whole partitioned table from start (thus breaking existing logical > replication). > What's worse - I didn't found any way get out of such situation less than > redo all multi-TB logical replication from blank db. > > Also ALTER SUBSCRIPTION ... REFRESH PUBLICATION (copy_data=false) - cannot > be used as workaround because it lead to loose any changes in partitioned > table between run ALTER PUBLICATION and ALTER SUBSCRIPTION. > > Afterthought this behavior not surprising at all but I think better to > document it somewhere. > Can you tell us what additional information you want to document other than what is already documented in CREATE PUBLICATION [1] related to this parameter? It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same? [1] - https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Maxim Boguk
Date:
On Fri, Oct 4, 2024 at 7:30 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Maxim, > > Thanks for reporting the issue. Before discussing more about it, I want to confirm > the exact workload you did. Is below sequence same as you expected? Yes it is exactly the my case/sequence of actions (but with 10TB database (). Thank you for creating a reproducer (I was working on it but due time constraints I didn't finish it before you posted your test case). -- Maxim Boguk Senior Postgresql DBA Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Maxim Boguk
Date:
> Can you tell us what additional information you want to document other > than what is already documented in CREATE PUBLICATION [1] related to > this parameter? At least put a warning that alter publication set (publish_via_partition_root) will unrecoverable break any existing logical replication subscribers if there at least one partitioned table with data in replication set, so alter publication set (publish_via_partition_root) should be run only before any subscription activation. Ideally - make this use case actually working (but looking into the code - it would be very complicated I feel). >>It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same? Usecase - after initial load of logical replica I decided that on the replica I better split future data into weekly partitions due huge size (instead of monthly partitions on the master/publisher) exactly the case for "alter publication set (publish_via_partition_root)". My main issues with this case - there is no way to fix this problem if it happened less than reloading whole logical replication from blank. -- Maxim Boguk Senior Postgresql DBA Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Fri, Oct 4, 2024 at 7:38 PM Maxim Boguk <maxim.boguk@gmail.com> wrote: > > >>It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same? > Usecase - after initial load of logical replica I decided that on the > replica I better split future data into weekly partitions due huge > size (instead of monthly partitions on the master/publisher) > exactly the case for "alter publication set (publish_via_partition_root)". > > My main issues with this case - there is no way to fix this problem if > it happened less than reloading whole logical replication from blank. > You can prevent the problem by avoiding writes to the partitioned tables between the Alter Pub and Alter Sub steps. One idea could be that in a parallel session on publisher lock the parent table in Access Exclusive mode till the Alter Sub command (with copy_data=false) is finished. -- With Regards, Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Maxim Boguk
Date:
On Tue, Oct 8, 2024 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Oct 4, 2024 at 7:38 PM Maxim Boguk <maxim.boguk@gmail.com> wrote: > > > > >>It would be useful if you can create a small test case to show the exact problem and what is your usecase for the same? > > Usecase - after initial load of logical replica I decided that on the > > replica I better split future data into weekly partitions due huge > > size (instead of monthly partitions on the master/publisher) > > exactly the case for "alter publication set (publish_via_partition_root)". > > > > My main issues with this case - there is no way to fix this problem if > > it happened less than reloading whole logical replication from blank. > > > > You can prevent the problem by avoiding writes to the partitioned > tables between the Alter Pub and Alter Sub steps. One idea could be > that in a parallel session on publisher lock the parent table in > Access Exclusive mode till the Alter Sub command (with > copy_data=false) is finished. > > -- > With Regards, > Amit Kapila. Thank you, it should be work. Unfortunately my English writing now is not good enough to suggest correct and easy to understand warnings in the documentation about this issue. -- Maxim Boguk Senior Postgresql DBA Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
vignesh C
Date:
On Wed, 9 Oct 2024 at 11:01, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Maxim, Amit, > > > Thank you, it should be work. Unfortunately my English writing now is > > not good enough to suggest correct and easy to understand warnings in > > the documentation about this issue. > > I've written a doc patch to add a caution in ALTER PUBLICATION page. How do you feel? Few comments: 1) How about changing: + <para> + Altering the <literal>publish_via_partition_root</literal> during the + replication can lead to data loss or duplication. This is because the + publisher replicates changes as the partitioned table after the + altering, but it is not listed in + <link linkend="catalog-pg-subscription-rel"><literal>pg_subscription_rel</literal></link>. + </para> to: When the partition root table is specified as the replication target instead of its leaf tables, changing the publish_via_partition_root parameter during replication can cause data loss or duplication. This happens because the subscriber initially subscribed to the leaf tables. After running the ALTER PUBLICATION ... SET and ALTER SUBSCRIPTION ... REFRESH PUBLICATION command, the subscriber will subscribe to the partition root table. 2) Should we change: + To prevent the issue, you can avoid modifying the partitioned table + between the <command>ALTER PUBLICATION ... SET</command> and + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>. to: + To prevent the issue, you can avoid modifying the partitioned table + between the <command>ALTER PUBLICATION ... SET</command> and + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link> and refresh publication with copy_data as off. Regards, Vignesh
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Wed, Oct 9, 2024 at 6:04 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Thanks for giving comments! Since I'm not a native I followed your points. > I ran Grammaly just in case and it said OK. > > PSA new version. > + <caution> + <para> + When the partition root table is specified as the replication target + instead of its leaf tables, altering the + <literal>publish_via_partition_root</literal> parameter during + replication can cause data loss or duplication. This happens because + the subscriber initially subscribed to the leaf tables. This assumes that the user is always changing publish_via_partition_root from 'false' to 'true'. Can't she change from 'false' to 'true' as well? + <para> + To prevent the issue, you can avoid modifying the partitioned table + between the <command>ALTER PUBLICATION ... SET</command> and Can't the problem happen when any of the leaf tables are modified? If so, that is not clear from the above statement. -- With Regards, Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Maxim Boguk
Date:
> + <para> > + To prevent the issue, you can avoid modifying the partitioned table > + between the <command>ALTER PUBLICATION ... SET</command> and > > Can't the problem happen when any of the leaf tables are modified? If > so, that is not clear from the above statement. Problem also happens if publication is created via FOR ALL TABLES / FOR TABLES IN SCHEMA (without directly including partition head into publication). I have a suspicion that including partition head (directly or via FOR ALL TABLES) into the initial publication - really required to get a problem. What is expected behavior in case when only partitions are included into publication/subscription set, but partition head isn't included and publish_via_partition_root='true' executed? -- Maxim Boguk Senior Postgresql DBA Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678
RE: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Maxim, > Problem also happens if publication is created via FOR ALL TABLES / > FOR TABLES IN SCHEMA (without directly including partition head into > publication). Thanks for pointing out. I also confirmed this issue can happen for FOR ALL TABLES/FOR TABLES IN SCHEMA publications. I modified a documentation. > What is expected behavior in case when only partitions are included > into publication/subscription set, but partition head isn't included > and publish_via_partition_root='true' executed? Just to confirm - you meant like below definitions, right? ``` create table tab (a int) partition by range (a); create table tab_1 partition of tab for values from (-10) to (0); create table tab_2 partition of tab for values from (0) to (10); create publication pub for table tab_1, tab_2 with (publish_via_partition_root = true); ``` For now, changes are replicated as leaf table's one in above case. And I think it is the expected behavior because users expressly specifies them. Best regards, Hayato Kuroda FUJITSU LIMITED
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Wed, Oct 23, 2024 at 3:42 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > > Problem also happens if publication is created via FOR ALL TABLES / > > FOR TABLES IN SCHEMA (without directly including partition head into > > publication). > > Thanks for pointing out. I also confirmed this issue can happen for > FOR ALL TABLES/FOR TABLES IN SCHEMA publications. I modified a documentation. > I don't think we need to mention this in docs, the following change: "For now, changes are replicated as leaf table's one in above case. And I think it is the expected behavior because users expressly specifies them." added by you appears odd to me. -- With Regards, Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Peter Smith
Date:
Hi Kuroda-san, This is only a caution for ALTER of 'publish_via_partition_root', so that should be immediately clear up-front in the first few words of the text. Also, the current explanation appears to give more details about how and why it happens than I felt was necessary. e.g. this is for user documentation, not for a code comment. Basically, I'm wondering if the whole thing could be "dumbed down" a little bit, keeping only the information essential so the users can protect themselves. Maybe something a bit more like below? SUGGESTION <caution> <para> Altering the <literal>publish_via_partition_root</literal> parameter can lead to data loss or duplication at the subscriber because it changes the identity and schema of the published tables. </para> <para> This problem can be avoided by refraining from modifying partition leaf tables after the <command>ALTER PUBLICATION ... SET</command> until the <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link> is executed, and by only refreshing using the <literal>copy_data = off</literal> option. </para> </caution> ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Fri, Oct 25, 2024 at 6:12 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Kuroda-san, > > This is only a caution for ALTER of 'publish_via_partition_root', so > that should be immediately clear up-front in the first few words of > the text. > > Also, the current explanation appears to give more details about how > and why it happens than I felt was necessary. e.g. this is for user > documentation, not for a code comment. Basically, I'm wondering if the > whole thing could be "dumbed down" a little bit, keeping only the > information essential so the users can protect themselves. Maybe > something a bit more like below? > > SUGGESTION > <caution> > <para> > Altering the <literal>publish_via_partition_root</literal> parameter > can lead to data loss or duplication at the subscriber because it > changes the identity and schema of the published tables. > </para> This appears precise but lacks the key information that the problem can happen when a partitioned root table is specified as a replication target. So, how about one of the following: * Altering the <literal>publish_via_partition_root</literal> parameter when the partition root table is specified as the replication target can lead to data loss or duplication at the subscriber because it changes the identity and schema of the published tables. * Altering the <literal>publish_via_partition_root</literal> parameter can lead to data loss or duplication at the subscriber because it changes the identity and schema of the published tables. Note this happens only when the partition root table is specified as the replication target. > <para> > This problem can be avoided by refraining from modifying > partition leaf tables > after the <command>ALTER PUBLICATION ... SET</command> until the > <link linkend="sql-altersubscription"><command>ALTER > SUBSCRIPTION ... REFRESH PUBLICATION</command></link> > is executed, and by only refreshing using the <literal>copy_data > = off</literal> option. > </para> > </caution> > We can keep this part as you proposed. -- With Regards, Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Peter Smith
Date:
On Fri, Oct 25, 2024 at 2:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Oct 25, 2024 at 6:12 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Kuroda-san, > > > > This is only a caution for ALTER of 'publish_via_partition_root', so > > that should be immediately clear up-front in the first few words of > > the text. > > > > Also, the current explanation appears to give more details about how > > and why it happens than I felt was necessary. e.g. this is for user > > documentation, not for a code comment. Basically, I'm wondering if the > > whole thing could be "dumbed down" a little bit, keeping only the > > information essential so the users can protect themselves. Maybe > > something a bit more like below? > > > > SUGGESTION > > <caution> > > <para> > > Altering the <literal>publish_via_partition_root</literal> parameter > > can lead to data loss or duplication at the subscriber because it > > changes the identity and schema of the published tables. > > </para> > > This appears precise but lacks the key information that the problem > can happen when a partitioned root table is specified as a replication > target. So, how about one of the following: > > * Altering the <literal>publish_via_partition_root</literal> parameter > when the partition root table is specified as the replication target > can lead to data loss or duplication at the subscriber because it > changes the identity and schema of the published tables. > > * Altering the <literal>publish_via_partition_root</literal> parameter > can lead to data loss or duplication at the subscriber because it > changes the identity and schema of the published tables. Note this > happens only when the partition root table is specified as the > replication target. +1 for the second one. (but maybe say "a partition root table" instead of "the partition root table") > > > <para> > > This problem can be avoided by refraining from modifying > > partition leaf tables > > after the <command>ALTER PUBLICATION ... SET</command> until the > > <link linkend="sql-altersubscription"><command>ALTER > > SUBSCRIPTION ... REFRESH PUBLICATION</command></link> > > is executed, and by only refreshing using the <literal>copy_data > > = off</literal> option. > > </para> > > </caution> > > > > We can keep this part as you proposed. > > -- > With Regards, > Amit Kapila.
Re: BUG #18644: ALTER PUBLICATION ... SET (publish_via_partition_root) wrong/undocumented behavior.
From
Amit Kapila
Date:
On Fri, Oct 25, 2024 at 12:21 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Thanks for suggestions. I updated as you pointed out. I removed a comma from > "is executed, and..." because my Grammarly said like that. > PSA new version. > Pushed. -- With Regards, Amit Kapila.