Re: BackgroundPsql swallowing errors on windows - Mailing list pgsql-hackers

From Andres Freund
Subject Re: BackgroundPsql swallowing errors on windows
Msg-id s74k236btv44hiy5j5k4dgu3syqbir2unhwdry2orqobwedlfw@lwuxhvkwn5fs
In response to Re: BackgroundPsql swallowing errors on windows  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2025-02-16 18:18:44 -0500, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > From the slow proxy's perspective, it can't rule out the program under test
> > having done those two write() calls.  The proxy doesn't have enough
> > information to reconstruct the original four write() calls.  What prevents
> > that anomaly?
> 
> Yeah, I think it's hopeless to expect that we can disambiguate the
> order of writes to two different pipes.  For the problem at hand,
> though, it seems like we don't really need to do that.  Rather, the
> question is "when we detect that the program-under-test has exited,
> can we be sure we have collected all of its output?".

That's what my patch upthread tries to achieve by having a query separator
both on stdout and stderr and waiting for both.
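
To make the idea concrete, here is a minimal Python sketch of it. This is not the patch itself (that is Perl, on top of IPC::Run); the banner string, the run_query() helper and the stand-in child process are all made up for illustration. The driver only treats a query as finished once it has seen the newline-terminated separator on stdout *and* on stderr.

```python
import subprocess
import sys
import threading

BANNER = "QUERY_SEPARATOR"  # hypothetical marker, not the string used by the patch

# Hypothetical stand-in for psql: for each input line it writes a result to
# stdout and an error to stderr, then emits the separator on BOTH streams.
CHILD = r"""
import sys
banner = sys.argv[1]
while True:
    line = sys.stdin.readline()
    if not line:
        break
    query = line.strip()
    print("result of " + query)
    print("error from " + query, file=sys.stderr)
    print(banner)
    print(banner, file=sys.stderr)
    sys.stdout.flush()
    sys.stderr.flush()
"""


def read_until_banner(stream, sink):
    # Collect complete lines until the banner line (with its newline) shows up.
    for line in stream:
        if line.rstrip("\n") == BANNER:
            return
        sink.append(line)


def run_query(proc, query):
    out, err = [], []
    readers = [
        threading.Thread(target=read_until_banner, args=(proc.stdout, out)),
        threading.Thread(target=read_until_banner, args=(proc.stderr, err)),
    ]
    for t in readers:
        t.start()
    proc.stdin.write(query + "\n")
    proc.stdin.flush()
    # The query counts as done only when BOTH streams have delivered the banner.
    for t in readers:
        t.join()
    return "".join(out), "".join(err)


def demo():
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, BANNER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        return run_query(proc, "SELECT 1")
    finally:
        proc.stdin.close()
        proc.wait()
        proc.stdout.close()
        proc.stderr.close()
```

This doesn't try to reconstruct the relative order of writes across the two pipes; it only guarantees that everything written before the separator on each stream has been collected.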


> I think that IPC::Run may be screwing up here, because I have seen
> non-Windows CI failures that look like it didn't read all the stderr output.
> For example, this pgbench test failure on macOS from [1]:
> 
> # Running: pgbench -n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -Dmaxint=9223372036854775807 -Dminint=-9223372036854775808 -M prepared -f /Users/admin/pgsql/build/testrun/pgbench/001_pgbench_with_server/data/t_001_pgbench_with_server_main_data/001_pgbench_error_shell_bad_command
> [17:27:47.408](0.061s) ok 273 - pgbench script error: shell bad command status (got 2 vs expected 2)
> [17:27:47.409](0.000s) ok 274 - pgbench script error: shell bad command stdout /(?^:processed: 0/1)/
> [17:27:47.409](0.000s) not ok 275 - pgbench script error: shell bad command stderr /(?^:\(shell\) .* meta-command failed)/
> [17:27:47.409](0.000s) #   Failed test 'pgbench script error: shell bad command stderr /(?^:\(shell\) .* meta-command failed)/'
> #   at /Users/admin/pgsql/src/bin/pgbench/t/001_pgbench_with_server.pl line 1466.
> #                   ''
> #     doesn't match '(?^:\(shell\) .* meta-command failed)'
> 
> The program's exited with a failure code as expected, and we saw (some
> of?) the expected stdout output, but stderr output is reported to be
> empty.

It's possible this is caused by the same issue as on Windows. Or by one of the
other things fixed in the patch: a) there's afaict no guarantee that we'd read
from pipe A if we were waiting for A|B and B got ready, b) we weren't actually
waiting for quite all the output to be generated (missing the newline).  Or it
could be because psql doesn't actually flush stderr in all paths, from what I
can tell...
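
Both a) and b) boil down to treating "one pipe became ready" or "the child exited" as if it meant "all output has been collected". A rough Python sketch of the alternative, with a made-up child and a made-up run_and_collect() helper (this is not what IPC::Run does internally): drain each pipe to EOF on its own, and only look at the result once the child has exited and both drains have finished.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for the program under test: writes one line to each
# stream and exits with a failure status.
CHILD = 'import sys; print("out line"); print("err line", file=sys.stderr); sys.exit(2)'


def drain(stream, sink):
    # Keep reading until EOF, independent of what the other pipe is doing.
    for chunk in iter(lambda: stream.read(4096), ""):
        sink.append(chunk)


def run_and_collect(argv):
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out, err = [], []
    threads = [
        threading.Thread(target=drain, args=(proc.stdout, out)),
        threading.Thread(target=drain, args=(proc.stderr, err)),
    ]
    for t in threads:
        t.start()
    status = proc.wait()
    # The exit status alone doesn't mean the pipes are empty: wait for EOF on both.
    for t in threads:
        t.join()
    proc.stdout.close()
    proc.stderr.close()
    return status, "".join(out), "".join(err)
```

In Python, subprocess's communicate() already gives this behaviour for the one-shot case; the sketch spells it out only to show which condition is being waited for.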

I hope it'll be easier to debug with the patch in place if it doesn't turn out
to already be fixed.

Greetings,

Andres Freund


