On Tue, May 17, 2022 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2022-May-16, Thomas Munro wrote:
> >> 1. It seems a bit useless to have a load of "FATAL: the database
> >> system is in recovery mode" spam whenever the server crashes under
> >> src/test/regress. Any reason not to just turn that off, as we do for
> >> the TAP tests?
>
> > I don't know of any. Let's.
>
> Have you actually tested what happens? I fear this would just
> result in different spam.
I'd forgotten that we already do this on CI, via
src/tools/ci/pg_ci_base.conf, so we can compare. A CI postmaster.log
that ends with "shutting down because restart_after_crash is off":
https://api.cirrus-ci.com/v1/artifact/task/6537277877780480/log/src/test/regress/log/postmaster.log
The build farm version has ~350 lines of "FATAL: the database system
is in recovery mode" instead:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp&dt=2022-05-16%2023%3A17%3A49
Admittedly that is nothing compared to the huge amount of extra log
spam caused by regression.diffs filling up with these:
- ...the complete expected output of each test spanning many lines...
- ...
- ...
+psql: error: connection to server on socket
"/tmp/pg_regress-ZqXocK/.s.PGSQL.5678" failed: FATAL: the database
system is in recovery mode
In the CI version, that looks like:
- ...the complete expected output of each test spanning many lines...
- ...
- ...
+psql: error: connection to server on socket
"/tmp/pg_regress-T35Yzi/.s.PGSQL.51696" failed: No such file or
directory
+ Is the server running locally and accepting connections on that socket?
I wonder if there would be a good way to filter those "never managed
to connect" cases out... Exit code 2 (EXIT_BADCONN) is not the
answer, because you get that also for servers that go away due to a
crash where you do want to be able to see the diff, for information
about where it crashed.