Hi,
On 2025-03-05 16:19:04 -0800, Jacob Champion wrote:
> On Wed, Mar 5, 2025 at 9:28 AM Andres Freund <andres@anarazel.de> wrote:
> > Unrelated to the change in this patch, but tests really shouldn't use while(1)
> > loops without a termination condition. If something is wrong, the test will
> > hang indefinitely, instead of timing out. On the buildfarm that can take out
> > an animal if it hasn't configured a timeout (with autoconf at least, meson
> > terminates tests after a timeout).
>
> With the current patchset, if I pull the PG_TEST_TIMEOUT_DEFAULT down
> low, and modify the backend so that either one of the two conditions
> never completes, the tests still stop due to BackgroundPsql's session
> timeout. This is true for Meson and Autoconf. Is there a different
> situation where I can't rely on that?
Oh, I see. I missed that it's relying on the timeout and that the timeout
isn't reset in the loop. Sorry for the noise.
FWIW, I still don't like open-coded poll loops, as I'd really like our tests
to use some smarter retry/backoff logic than a single hardcoded
usleep(100_000).
The first few iterations that's too long, commonly the state isn't reached
in the first iteration, but will be within a millisecond or two. Waiting
100ms is obviously way too long.
Once we've slept for 10+ seconds without reaching the target, sleeping for
100us is way too short a sleep and just wastes CPU cycles. A decent portion
of the CPU time when running under valgrind is wasted just due to way too
tight retry loops.
That's harder to do if we have many places polling.
But anyway, I digress, that's really not related to your change.
Greetings,
Andres Freund