On 26 June 2017 at 19:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> So this looks like a pretty obvious race condition in the postmaster,
>> which should be resolved by having it set a flag on receipt of
>> PMSIGNAL_START_WALRECEIVER that's cleared only when it does start a
>> new walreceiver.
>
> Concretely, I propose the attached patch. Together with reducing
> wal_retrieve_retry_interval to 500ms, which I propose having
> PostgresNode::init do in its standard postgresql.conf adjustments,
> this takes the runtime of the recovery TAP tests down from 2m50s
> (after the patches I posted yesterday) to 1m30s.

Patch looks good.

> I think there's still gold to be mined, because "top" is still
> showing pretty low CPU load over most of the run, but this is
> lots better than 4m30s.

Thanks for looking into this.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services