Thread: pgsql: Improve stability of TAP test for synchronous replication

From
Michael Paquier
Date:
Improve stability of TAP test for synchronous replication

Slow buildfarm machines have run into issues with this TAP test caused
by a race condition related to the startup of a set of standbys, where
it is possible to finish with an unexpected order in the WAL sender
array of the primary.

This closes the race condition by making sure that any standby started
is registered into the WAL sender array of the primary before starting
the next one based on lookups of pg_stat_replication.

Backpatch down to 9.6, where the test was introduced.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Noah Misch
Discussion: https://postgr.es/m/20190617055145.GB18917@paquier.xyz
Backpatch-through: 9.6

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/7d81bdc8c0ce838efa248928065e9b2da829f981

Modified Files
--------------
src/test/recovery/t/007_sync_rep.pl | 42 +++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 9 deletions(-)
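
The approach described in the commit message above (start a standby, then wait until the primary reports it before starting the next one) can be sketched roughly as follows. This is only an illustration in Python, not the code from 007_sync_rep.pl, which is a Perl TAP test. FakePrimary, FakeStandby, connect_delay and is_registered are hypothetical stand-ins invented for the sketch; is_registered plays the role of a lookup in pg_stat_replication. The helper reuses the name start_standby_and_wait from the perlcritic report later in this thread, but its body here is an assumption.

```python
import threading
import time


class FakePrimary:
    """Hypothetical stand-in for the primary server.

    wal_senders models the WAL sender array: entries appear in the
    order in which standbys manage to connect.
    """

    def __init__(self):
        self.wal_senders = []

    def is_registered(self, application_name):
        # Stands in for a lookup of pg_stat_replication by application_name.
        return application_name in self.wal_senders


class FakeStandby:
    """Hypothetical standby that connects after an arbitrary delay."""

    def __init__(self, name, primary, connect_delay):
        self.name = name
        self.primary = primary
        self.connect_delay = connect_delay

    def start(self):
        # Registration happens asynchronously, as on a slow machine.
        timer = threading.Timer(
            self.connect_delay, self.primary.wal_senders.append, args=[self.name]
        )
        timer.start()


def start_standby_and_wait(primary, standby, timeout=5.0, interval=0.01):
    """Start one standby and poll until the primary has registered it."""
    standby.start()
    deadline = time.monotonic() + timeout
    while not primary.is_registered(standby.name):
        if time.monotonic() > deadline:
            raise TimeoutError(f"{standby.name} never registered")
        time.sleep(interval)


primary = FakePrimary()
# standby1 is the slowest to connect; without the wait, standby2 would
# register first and the sender order would differ from the start order.
for name, delay in [("standby1", 0.05), ("standby2", 0.0), ("standby3", 0.02)]:
    start_standby_and_wait(primary, FakeStandby(name, primary, delay))

print(primary.wal_senders)
```

Because each call returns only after the previous standby is visible on the primary, the registration order matches the start order regardless of how slowly any individual standby connects.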


Re: pgsql: Improve stability of TAP test for synchronous replication

From
Andrew Dunstan
Date:
On 7/23/19 9:55 PM, Michael Paquier wrote:
> Improve stability of TAP test for synchronous replication
>
> Slow buildfarm machines have run into issues with this TAP test caused
> by a race condition related to the startup of a set of standbys, where
> it is possible to finish with an unexpected order in the WAL sender
> array of the primary.
>
> This closes the race condition by making sure that any standby started
> is registered into the WAL sender array of the primary before starting
> the next one based on lookups of pg_stat_replication.
>
> Backpatch down to 9.6 where the test has been introduced.
>
> Author: Michael Paquier
> Reviewed-by: Álvaro Herrera, Noah Misch
> Discussion: https://postgr.es/m/20190617055145.GB18917@paquier.xyz
> Backpatch-through: 9.6
>
> Branch
> ------
> master
>
> Details
> -------
> https://git.postgresql.org/pg/commitdiff/7d81bdc8c0ce838efa248928065e9b2da829f981
>
> Modified Files
> --------------
> src/test/recovery/t/007_sync_rep.pl | 42 +++++++++++++++++++++++++++++--------
> 1 file changed, 33 insertions(+), 9 deletions(-)
>


This broke our perl coding rules:


./src/test/recovery/t/007_sync_rep.pl: Subroutine "start_standby_and_wait" does not end with "return" at line 33,
column 1.  See page 197 of PBP.  (Severity: 5)
 

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Improve stability of TAP test for synchronous replication

From
Michael Paquier
Date:
On Wed, Jul 24, 2019 at 02:17:14PM -0400, Andrew Dunstan wrote:
> This broke our perl coding rules:
>
> ./src/test/recovery/t/007_sync_rep.pl: Subroutine
> "start_standby_and_wait" does not end with "return" at line 33,
> column 1.  See page 197 of PBP.  (Severity: 5)

Fixed, thanks.  Indeed I can see that pgperlcritic complains here, and
I have added a call in my pre-commit scripts for the future.
--
Michael
