Hi,
On 2024-12-09 00:12:32 +0100, Tomas Vondra wrote:
> Hi, the TAP test 001_connection_limits.pl introduced by 6a1d0d470e84
> seems to have problems with valgrind :-( I reliably get this failure:
>
>
> t/001_connection_limits.pl .. 3/? # Tests were run but no plan was
> declared and done_testing() was not seen.
> # Looks like your test exited with 29 just after 4.
> t/001_connection_limits.pl .. Dubious, test returned 29 (wstat 7424, 0x1d00)
> All 4 subtests passed
>
>
> and tmp_check/log/regress_log_001_connection_limits says:
>
>
> [23:48:44.444](1.129s) ok 3 - reserved_connections limit
> [23:48:44.445](0.001s) ok 4 - reserved_connections limit: matches
> process ended prematurely at
> /home/user/work/postgres/src/test/postmaster/../../../src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
> line 154.
> # Postmaster PID for node "primary" is 198592
I just saw this failure on skink in the BF:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-04%2015%3A43%3A23
[17:05:56.438](0.247s) ok 3 - reserved_connections limit
[17:05:56.438](0.000s) ok 4 - reserved_connections limit: matches
process ended prematurely at /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
line160.
> That BackgroundPsql.pm line is this in wait_connect()
>
> $self->{run}->pump()
> until $self->{stdout} =~ /$banner/ || $self->{timeout}->is_expired;
A big part of the problem here imo is the exception behaviour that
IPC::Run::pump() has:
If pump() is called after all harnessed activities have completed, a "process
ended prematurely" exception to be thrown. This allows for simple scripting
of external applications without having to add lots of error handling code at
each step of the script:
Which is, uh, not very compatible with how we use IPC::Run (here and
elsewhere). Just ending the test because a connection failed is pretty awful.
This behaviour makes it really hard to debug problems. It'd have been a lot
easier to understand the problem if we'd seen psql's stderr before the test
died.
I guess that mean at the very least we'd need to put an eval {} around the
->pump() call., print $self->{stdout}, ->{stderr} and reraise an error?
Presumably not just in in wait_connect(), but also at least in pump_until()?
Will respond downthread to a potential workaround for the issue.
Greetings,
Andres Freund