Thread: Fix instability in subscription regression test
Hi, I noticed BF member wrasse failed in 028_row_filter.pl. # Failed test 'check publish_via_partition_root behavior' # at t/028_row_filter.pl line 669. # got: '' # expected: '1|100 # ... Log: 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH(copy_data = true) 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm database=postgreshost=[local] 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has started ... 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned ORDERBY 1, 2 ... 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has finished From the Log, I can see it query the target table before the table sync is over. So, I think the reason is that we didn't wait for table sync to finish after refreshing the publication. Sorry for not catching that ealier. Here is a patch to fix it. Best regards, Hou zj
Attachment
On Thu, Jun 23, 2022 at 8:28 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Hi, > > I noticed BF member wrasse failed in 028_row_filter.pl. > > # Failed test 'check publish_via_partition_root behavior' > # at t/028_row_filter.pl line 669. > # got: '' > # expected: '1|100 > # ... > > Log: > 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATIONWITH (copy_data = true) > 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm database=postgreshost=[local] > 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has started > ... > 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned ORDERBY 1, 2 > ... > 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has finished > > From the Log, I can see it query the target table before the table sync is > over. So, I think the reason is that we didn't wait for table sync to > finish after refreshing the publication. Sorry for not catching that > ealier. Here is a patch to fix it. +1 The patch looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On Thu, Jun 23, 2022 at 4:58 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Hi, > > I noticed BF member wrasse failed in 028_row_filter.pl. > > # Failed test 'check publish_via_partition_root behavior' > # at t/028_row_filter.pl line 669. > # got: '' > # expected: '1|100 > # ... > > Log: > 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATIONWITH (copy_data = true) > 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm database=postgreshost=[local] > 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has started > ... > 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned ORDERBY 1, 2 > ... > 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription "tap_sub",table "tab_rowfilter_partitioned" has finished > > From the Log, I can see it query the target table before the table sync is > over. So, I think the reason is that we didn't wait for table sync to > finish after refreshing the publication. Sorry for not catching that > ealier. Here is a patch to fix it. > +# wait for initial table synchronization to finish +$node_subscriber->poll_query_until('postgres', $synced_query) We could probably slightly change the comment to say: "wait for table sync to finish". Normally, we use initial table sync after CREATE SUBSCRIPTION. This is a minor nitpick and I can take care of it before committing unless you think otherwise. Your analysis and patch look good to me. -- With Regards, Amit Kapila.
> -----Original Message----- > From: Amit Kapila <amit.kapila16@gmail.com> > Sent: Friday, June 24, 2022 10:28 AM > To: Hou, Zhijie/侯 志杰 <houzj.fnst@fujitsu.com> > Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org> > Subject: Re: Fix instability in subscription regression test > > On Thu, Jun 23, 2022 at 4:58 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > Hi, > > > > I noticed BF member wrasse failed in 028_row_filter.pl. > > > > # Failed test 'check publish_via_partition_root behavior' > > # at t/028_row_filter.pl line 669. > > # got: '' > > # expected: '1|100 > > # ... > > > > Log: > > 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: > > statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH > > (copy_data = true) > > 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: > > disconnection: session time: 0:00:00.098 user=nm database=postgres > > host=[local] > > 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table > > synchronization worker for subscription "tap_sub", table > "tab_rowfilter_partitioned" has started ... > > 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: > > statement: SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2 ... > > 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table > > synchronization worker for subscription "tap_sub", table > > "tab_rowfilter_partitioned" has finished > > > > From the Log, I can see it query the target table before the table > > sync is over. So, I think the reason is that we didn't wait for table > > sync to finish after refreshing the publication. Sorry for not > > catching that ealier. Here is a patch to fix it. > > > > +# wait for initial table synchronization to finish > +$node_subscriber->poll_query_until('postgres', $synced_query) > > We could probably slightly change the comment to say: "wait for table sync to > finish". Normally, we use initial table sync after CREATE SUBSCRIPTION. This is a > minor nitpick and I can take care of it before committing unless you think > otherwise. Thanks for reviewing, the suggestion looks good to me. Best regards, Hou zj
On Fri, Jun 24, 2022 at 8:07 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > +# wait for initial table synchronization to finish > > +$node_subscriber->poll_query_until('postgres', $synced_query) > > > > We could probably slightly change the comment to say: "wait for table sync to > > finish". Normally, we use initial table sync after CREATE SUBSCRIPTION. This is a > > minor nitpick and I can take care of it before committing unless you think > > otherwise. > > Thanks for reviewing, the suggestion looks good to me. > Pushed! -- With Regards, Amit Kapila.