Thread: waiting for reload in tests
Hi,

In a couple of tests I (and IIRC others as well) had the problem that a config reload isn't actually synchronous. I.e. a sequence like

  $node_primary->reload;
  $node_primary->safe_psql('postgres',...)

isn't actually guaranteed to observe the config as reloaded in the safe_psql(). It *typically* will see the new config, but if the system is busy and/or slow, the SIGHUP might not yet have been propagated by postmaster and/or not yet received by the relevant process.

I don't really see a way to guarantee this with reasonable effort in the back-branches. In HEAD we could (with some difficulties around postmaster and UI) use a global barrier to wait for the reload to complete.

For the back-branches I guess we could hack something using retries and setting a pseudo-GUC to check whether the reload has been processed - but that's not bulletproof at all, since some process(es) could take longer to receive the signal.

Anybody got a better idea?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> In a couple tests I (IIRC others as well) had the problem that a config reload
> isn't actually synchronous. I.e. a sequence like
> $node_primary->reload;
> $node_primary->safe_psql('postgres',...)
> isn't actually guaranteed to observe the config as reloaded in the
> safe_psql().

Brute force way: s/reload/restart/

Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.

regards, tom lane
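The "less brute force" suggestion amounts to a poll-until-expected loop. Below is a minimal, language-neutral sketch of that loop in Python. The names `poll_until` and `FakeSession` are hypothetical and exist only for illustration; `FakeSession.show` stands in for running "SHOW variable-you-changed" against the server, and the timeout and interval values are arbitrary. In the Perl TAP framework the existing `poll_query_until` method of PostgreSQL::Test::Cluster plays a similar role.

```python
import time
from typing import Callable


def poll_until(fetch: Callable[[], str], expected: str,
               timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Call fetch() repeatedly until it returns `expected` or `timeout` elapses.

    Returns True on success, False if the deadline passed first.
    fetch() is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeSession:
    """Hypothetical stand-in for a session running SHOW on the changed setting.

    It reports the old value for the first `stale_reads` calls, mimicking a
    process that has not yet handled the reload signal.
    """

    def __init__(self, old: str, new: str, stale_reads: int) -> None:
        self.old = old
        self.new = new
        self.stale_reads = stale_reads
        self.calls = 0

    def show(self) -> str:
        self.calls += 1
        return self.old if self.calls <= self.stale_reads else self.new


if __name__ == "__main__":
    session = FakeSession(old="off", new="on", stale_reads=3)
    ok = poll_until(session.show, "on", timeout=5.0, interval=0.01)
    print("reload observed" if ok else "timed out")
```

Note that a loop like this only shows that the session used for polling has seen the new value; it says nothing about other processes.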
On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
> Brute force way: s/reload/restart/

That was my first thought, as it can be tricky to make sure that all the processes got the update because we don't publish such a state.

One thing I was also thinking about would be to update pg_stat_activity.state_change when a reload is processed, on top of its current updates, then wait for it to be effective in all the processes reported. The field currently remains NULL for most non-backend processes, so this would be a compatibility change.

> Less brute force: wait for "SHOW variable-you-changed" to report the
> value you expect.

This method may still be unreliable in some processes like a logirep launcher/receiver or just autovacuum, no?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
>> Less brute force: wait for "SHOW variable-you-changed" to report the
>> value you expect.

> This method may still be unreliable in some processes like a logirep
> launcher/receiver or just autovacuum, no?

Yeah, if your test case requires knowing that some background process
has gotten the word, it's a *lot* harder. I think we'd have to add a
last-config-update-time column in pg_stat_activity or something like that.

regards, tom lane
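To make the per-process idea concrete, here is a rough Python sketch of the check a test could run if such a column existed. Everything in it is an assumption for illustration: the `last_config_update` field is the column proposed above and is not described in this thread as existing, and `ProcRow` and `stragglers` are hypothetical names. The rows would come from a query against pg_stat_activity; here they are built by hand.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional


class ProcRow(NamedTuple):
    pid: int
    backend_type: str
    # Hypothetical column: when this process last applied a config reload.
    # None means the process has not reported any reload yet.
    last_config_update: Optional[datetime]


def stragglers(rows: List[ProcRow], reload_requested_at: datetime) -> List[ProcRow]:
    """Return the processes that have not yet applied the requested reload."""
    return [
        row for row in rows
        if row.last_config_update is None
        or row.last_config_update < reload_requested_at
    ]


if __name__ == "__main__":
    requested = datetime(2022, 5, 9, 21, 0, 0)
    rows = [
        ProcRow(101, "client backend", requested + timedelta(seconds=1)),
        ProcRow(102, "startup", requested - timedelta(minutes=5)),
        ProcRow(103, "autovacuum launcher", None),
    ]
    for row in stragglers(rows, requested):
        print(f"still waiting for pid {row.pid} ({row.backend_type})")
```

A test would repeat this check until the list of stragglers is empty or a timeout expires.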
Hi,

On 2022-05-09 21:42:20 -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
> >> Less brute force: wait for "SHOW variable-you-changed" to report the
> >> value you expect.
>
> > This method may still be unreliable in some processes like a logirep
> > launcher/receiver or just autovacuum, no?

Yep, that's the problem. In my case it's the startup process...

> Yeah, if your test case requires knowing that some background process
> has gotten the word, it's a *lot* harder. I think we'd have to add a
> last-config-update-time column in pg_stat_activity or something like that.

That's basically what I was referencing with global barriers...

Greetings,

Andres Freund