Thread: Fix instability in subscription regression test

Fix instability in subscription regression test

From
"houzj.fnst@fujitsu.com"
Date:
Hi,

I noticed BF member wrasse failed in 028_row_filter.pl.

#   Failed test 'check publish_via_partition_root behavior'
#   at t/028_row_filter.pl line 669.
#          got: ''
#     expected: '1|100
#              ...

Log:
2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION
WITH(copy_data = true) 
2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm
database=postgreshost=[local] 
2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has started 
...
2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned
ORDERBY 1, 2 
...
2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has finished  

From the Log, I can see it query the target table before the table sync is
over. So, I think the reason is that we didn't wait for table sync to
finish after refreshing the publication. Sorry for not catching that
ealier. Here is a patch to fix it.


Best regards,
Hou zj

Attachment

Re: Fix instability in subscription regression test

From
Masahiko Sawada
Date:
On Thu, Jun 23, 2022 at 8:28 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> I noticed BF member wrasse failed in 028_row_filter.pl.
>
> #   Failed test 'check publish_via_partition_root behavior'
> #   at t/028_row_filter.pl line 669.
> #          got: ''
> #     expected: '1|100
> #              ...
>
> Log:
> 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH
PUBLICATIONWITH (copy_data = true)
 
> 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm
database=postgreshost=[local]
 
> 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has started
 
> ...
> 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned
ORDERBY 1, 2
 
> ...
> 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has finished
 
>
> From the Log, I can see it query the target table before the table sync is
> over. So, I think the reason is that we didn't wait for table sync to
> finish after refreshing the publication. Sorry for not catching that
> ealier. Here is a patch to fix it.

+1

The patch looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Fix instability in subscription regression test

From
Amit Kapila
Date:
On Thu, Jun 23, 2022 at 4:58 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> I noticed BF member wrasse failed in 028_row_filter.pl.
>
> #   Failed test 'check publish_via_partition_root behavior'
> #   at t/028_row_filter.pl line 669.
> #          got: ''
> #     expected: '1|100
> #              ...
>
> Log:
> 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG: statement: ALTER SUBSCRIPTION tap_sub REFRESH
PUBLICATIONWITH (copy_data = true)
 
> 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG: disconnection: session time: 0:00:00.098 user=nm
database=postgreshost=[local]
 
> 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has started
 
> ...
> 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG: statement: SELECT a, b FROM tab_rowfilter_partitioned
ORDERBY 1, 2
 
> ...
> 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table synchronization worker for subscription
"tap_sub",table "tab_rowfilter_partitioned" has finished
 
>
> From the Log, I can see it query the target table before the table sync is
> over. So, I think the reason is that we didn't wait for table sync to
> finish after refreshing the publication. Sorry for not catching that
> ealier. Here is a patch to fix it.
>

+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)

We could probably slightly change the comment to say: "wait for table
sync to finish". Normally, we use initial table sync after CREATE
SUBSCRIPTION. This is a minor nitpick and I can take care of it before
committing unless you think otherwise.

Your analysis and patch look good to me.

-- 
With Regards,
Amit Kapila.



RE: Fix instability in subscription regression test

From
"houzj.fnst@fujitsu.com"
Date:

> -----Original Message-----
> From: Amit Kapila <amit.kapila16@gmail.com>
> Sent: Friday, June 24, 2022 10:28 AM
> To: Hou, Zhijie/侯 志杰 <houzj.fnst@fujitsu.com>
> Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
> Subject: Re: Fix instability in subscription regression test
> 
> On Thu, Jun 23, 2022 at 4:58 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Hi,
> >
> > I noticed BF member wrasse failed in 028_row_filter.pl.
> >
> > #   Failed test 'check publish_via_partition_root behavior'
> > #   at t/028_row_filter.pl line 669.
> > #          got: ''
> > #     expected: '1|100
> > #              ...
> >
> > Log:
> > 2022-06-23 11:27:42.387 CEST [20589:3] 028_row_filter.pl LOG:
> > statement: ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH
> > (copy_data = true)
> > 2022-06-23 11:27:42.470 CEST [20589:4] 028_row_filter.pl LOG:
> > disconnection: session time: 0:00:00.098 user=nm database=postgres
> > host=[local]
> > 2022-06-23 11:27:42.611 CEST [20593:1] LOG: logical replication table
> > synchronization worker for subscription "tap_sub", table
> "tab_rowfilter_partitioned" has started ...
> > 2022-06-23 11:27:43.197 CEST [20610:3] 028_row_filter.pl LOG:
> > statement: SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2 ...
> > 2022-06-23 11:27:43.689 CEST [20593:2] LOG: logical replication table
> > synchronization worker for subscription "tap_sub", table
> > "tab_rowfilter_partitioned" has finished
> >
> > From the Log, I can see it query the target table before the table
> > sync is over. So, I think the reason is that we didn't wait for table
> > sync to finish after refreshing the publication. Sorry for not
> > catching that ealier. Here is a patch to fix it.
> >
> 
> +# wait for initial table synchronization to finish
> +$node_subscriber->poll_query_until('postgres', $synced_query)
> 
> We could probably slightly change the comment to say: "wait for table sync to
> finish". Normally, we use initial table sync after CREATE SUBSCRIPTION. This is a
> minor nitpick and I can take care of it before committing unless you think
> otherwise.

Thanks for reviewing, the suggestion looks good to me.

Best regards,
Hou zj

Re: Fix instability in subscription regression test

From
Amit Kapila
Date:
On Fri, Jun 24, 2022 at 8:07 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> >
> > +# wait for initial table synchronization to finish
> > +$node_subscriber->poll_query_until('postgres', $synced_query)
> >
> > We could probably slightly change the comment to say: "wait for table sync to
> > finish". Normally, we use initial table sync after CREATE SUBSCRIPTION. This is a
> > minor nitpick and I can take care of it before committing unless you think
> > otherwise.
>
> Thanks for reviewing, the suggestion looks good to me.
>

Pushed!

-- 
With Regards,
Amit Kapila.