Thread: waiting for reload in tests

From: Andres Freund

Hi,

In a couple tests I (IIRC others as well) had the problem that a config reload
isn't actually synchronous. I.e. a sequence like

$node_primary->reload;
$node_primary->safe_psql('postgres',...)

isn't actually guaranteed to observe the reloaded config in the
safe_psql(). It *typically* will see the new config, but if the system is
busy and/or slow, the SIGHUP might not yet have been propagated by postmaster
and/or not yet received by the relevant process.

I don't really see a way to guarantee this with reasonable effort in the
back-branches. In HEAD we could (with some difficulties around postmaster and
UI) use a global barrier to wait for the reload to complete. For the
back-branches I guess we could hack something using retries and setting a
pseudo-GUC to check whether the reload has been processed - but that's not
bulletproof at all, since some process(es) could take longer to receive the signal.

Anybody got a better idea?

Greetings,

Andres Freund



Re: waiting for reload in tests

From: Tom Lane

Andres Freund <andres@anarazel.de> writes:
> In a couple tests I (IIRC others as well) had the problem that a config reload
> isn't actually synchronous. I.e. a sequence like

> $node_primary->reload;
> $node_primary->safe_psql('postgres',...)

> isn't actually guaranteed to observe the config as reloaded in the
> safe_psql().

Brute force way: s/reload/restart/

Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.

            regards, tom lane
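
For illustration, the "wait for SHOW to report the expected value" approach could look roughly like the sketch below. It is a minimal, self-contained polling helper, not code from PostgreSQL's test suite: the name wait_for_setting, the callback interface, and the default timeout are all assumptions made for this example.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Poll until the value returned by $show->($guc) equals $expected.
# $show is a caller-supplied code ref (hypothetical interface) that runs
# "SHOW $guc" in a fresh session and returns the result as a string.
# Returns 1 on success, 0 if $opt{timeout} seconds elapse first.
sub wait_for_setting
{
    my ($show, $guc, $expected, %opt) = @_;
    my $timeout  = $opt{timeout}  // 180;    # seconds; arbitrary default
    my $interval = $opt{interval} // 0.1;    # seconds between attempts
    my $deadline = time() + $timeout;

    while (1)
    {
        # A failing query counts as "not yet", so keep retrying.
        my $value = eval { $show->($guc) };
        return 1 if defined $value && $value eq $expected;
        return 0 if time() >= $deadline;
        sleep($interval);
    }
}

# In a TAP test this might be used as follows (assuming $node_primary
# exists; 'some_guc' and 'expected' are placeholders):
#   $node_primary->reload;
#   wait_for_setting(
#       sub { $node_primary->safe_psql('postgres', "SHOW $_[0]") },
#       'some_guc', 'expected')
#     or die "reload not observed";
```

PostgreSQL::Test::Cluster already offers poll_query_until() for this kind of retry loop, so in a real test something like $node_primary->poll_query_until('postgres', "SELECT current_setting('some_guc') = 'expected'") is probably the simpler choice. Either way, this only shows that a newly started backend sees the new value; it says nothing about other processes.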



Re: waiting for reload in tests

From: Michael Paquier

On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
> Brute force way: s/reload/restart/

That was my first thought, as it can be tricky to make sure that all
the processes got the update because we don't publish such a state.

One thing I was also thinking about would be to update
pg_stat_activity.state_change when a reload is processed on top of its
current updates, then wait for it to be effective in all the processes
reported.  The field remains NULL for most non-backend processes,
which would be a compatibility change.

> Less brute force: wait for "SHOW variable-you-changed" to report the
> value you expect.

This method may still be unreliable for some processes, like a logical
replication launcher/receiver or just autovacuum, no?
--
Michael


Re: waiting for reload in tests

From: Tom Lane

Michael Paquier <michael@paquier.xyz> writes:
> On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
>> Less brute force: wait for "SHOW variable-you-changed" to report the
>> value you expect.

> This method may still be unreliable for some processes, like a logical
> replication launcher/receiver or just autovacuum, no?

Yeah, if your test case requires knowing that some background process
has gotten the word, it's a *lot* harder.  I think we'd have to add a
last-config-update-time column in pg_stat_activity or something like that.

            regards, tom lane
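
To make the idea concrete: if each process did publish the time it last processed a reload, a test could wait until no process lags behind. The sketch below shows only that check, over plain Perl data. The last_reload field is purely hypothetical; pg_stat_activity has no such column, and the function name is invented for this example.

```perl
use strict;
use warnings;

# Given a reload time and a list of rows shaped like
#   { pid => ..., backend_type => ..., last_reload => epoch seconds or undef }
# return the rows that have not yet processed that reload.
# "last_reload" is a HYPOTHETICAL field standing in for the proposed
# last-config-update-time column; it does not exist in pg_stat_activity.
sub processes_pending_reload
{
    my ($reload_time, @rows) = @_;
    return grep {
        !defined $_->{last_reload} || $_->{last_reload} < $reload_time
    } @rows;
}
```

A test would then poll until this list is empty. Rows with no reported time are treated as pending, which matters because many non-backend processes might not report one at all.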



Re: waiting for reload in tests

From: Andres Freund

Hi,

On 2022-05-09 21:42:20 -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
> >> Less brute force: wait for "SHOW variable-you-changed" to report the
> >> value you expect.
> 
> > This method may still be unreliable for some processes, like a logical
> > replication launcher/receiver or just autovacuum, no?

Yep, that's the problem. In my case it's the startup process...


> Yeah, if your test case requires knowing that some background process
> has gotten the word, it's a *lot* harder.  I think we'd have to add a
> last-config-update-time column in pg_stat_activity or something like that.

That's basically what I was referencing with global barriers...

Greetings,

Andres Freund