Thread: Minor improvements to test log navigability
Hi, Speaking as someone who regularly trawls through megabytes of build farm output: 1. It seems a bit useless to have a load of "FATAL: the database system is in recovery mode" spam whenever the server crashes under src/test/regress. Any reason not to just turn that off, as we do for the TAP tests? 2. The TAP test logs are strangely named. Any reason not to call them 001_testname.log, instead of regress_log_001_testname, so they appear next to the corresponding 001_testname_{primary,standby,xxx}.log in directory listings (CI) and dumps (build farm, presumably), and have a traditional .log suffix?
Attachment
On 2022-May-16, Thomas Munro wrote: > 1. It seems a bit useless to have a load of "FATAL: the database > system is in recovery mode" spam whenever the server crashes under > src/test/regress. Any reason not to just turn that off, as we do for > the TAP tests? I don't know of any. Let's. > 2. The TAP test logs are strangely named. Any reason not to call > them 001_testname.log, instead of regress_log_001_testname, so they > appear next to the corresponding > 001_testname_{primary,standby,xxx}.log in directory listings (CI) and > dumps (build farm, presumably), and have a traditional .log suffix? +1. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Voy a acabar con todos los humanos / con los humanos yo acabaré voy a acabar con todos (bis) / con todos los humanos acabaré ¡acabaré! (Bender)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2022-May-16, Thomas Munro wrote: >> 1. It seems a bit useless to have a load of "FATAL: the database >> system is in recovery mode" spam whenever the server crashes under >> src/test/regress. Any reason not to just turn that off, as we do for >> the TAP tests? > I don't know of any. Let's. Have you actually tested what happens? I fear this would just result in different spam. >> 2. The TAP test logs are strangely named. Any reason not to call >> them 001_testname.log, instead of regress_log_001_testname, so they >> appear next to the corresponding >> 001_testname_{primary,standby,xxx}.log in directory listings (CI) and >> dumps (build farm, presumably), and have a traditional .log suffix? > +1. Andrew would have to weigh in on whether this'd break the buildfarm, but if it doesn't, +1. regards, tom lane
On Mon, May 16, 2022 at 12:18:24PM -0400, Tom Lane wrote: > Andrew would have to weigh in on whether this'd break the buildfarm, > but if it doesn't, +1. FWIW, the buildfarm client feeds from the contents of the directory tmp_check/log/, so a simple renaming of the main log file would have no impact. Here are the parts of the code doing that: PGBuild/Modules/TestMyTap.pm: my @logs = glob("$self->{where}/tmp_check/log/*"); run_build.pl: my @logs = glob("$dir/tmp_check/log/*"); -- Michael
Attachment
On Tue, May 17, 2022 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > On 2022-May-16, Thomas Munro wrote: > >> 1. It seems a bit useless to have a load of "FATAL: the database > >> system is in recovery mode" spam whenever the server crashes under > >> src/test/regress. Any reason not to just turn that off, as we do for > >> the TAP tests? > > > I don't know of any. Let's. > > Have you actually tested what happens? I fear this would just > result in different spam. I'd forgotten that we already do this on CI, via src/tools/ci/pg_ci_base.conf, so we can compare. A CI postmaster.log that ends with "shutting down because restart_after_crash is off": https://api.cirrus-ci.com/v1/artifact/task/6537277877780480/log/src/test/regress/log/postmaster.log The build farm version has ~350 lines of "FATAL: the database system is in recovery mode" instead: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp&dt=2022-05-16%2023%3A17%3A49 Admittedly that is nothing compared to the huge amount of extra log spam caused by regression.diffs filling up with these: - ...the complete expected output of each test spanning many lines... - ... - ... +psql: error: connection to server on socket "/tmp/pg_regress-ZqXocK/.s.PGSQL.5678" failed: FATAL: the database system is in recovery mode In the CI version, that looks like: - ...the complete expected output of each test spanning many lines... - ... - ... +psql: error: connection to server on socket "/tmp/pg_regress-T35Yzi/.s.PGSQL.51696" failed: No such file or directory + Is the server running locally and accepting connections on that socket? I wonder if there would be a good way to filter those "never managed to connect" cases out... Exit code 2 (EXIT_BADCONN) is not the answer, because you get that also for servers that go away due to a crash where you do want to be able to see the diff, for information about where it crashed.
Thomas Munro <thomas.munro@gmail.com> writes: > ... Admittedly that is nothing compared to the huge amount of extra log > spam caused by regression.diffs filling up with these: Yeah, that's really the main problem. > I wonder if there would be a good way to filter those "never managed > to connect" cases out... Exit code 2 (EXIT_BADCONN) is not the > answer, because you get that also for servers that go away due to a > crash where you do want to be able to see the diff, for information > about where it crashed. Maybe pg_regress could check that postmaster.pid is still there before launching each new test script? (Obviously this all applies only to "make check" not "make installcheck".) regards, tom lane
On 16.05.22 01:01, Thomas Munro wrote: > 2. The TAP test logs are strangely named. Any reason not to call > them 001_testname.log, instead of regress_log_001_testname, so they > appear next to the corresponding > 001_testname_{primary,standby,xxx}.log in directory listings (CI) and > dumps (build farm, presumably), and have a traditional .log suffix? I'm in favor of a saner name, but wouldn't something.log be confusable with a server log? Maybe something.out would be clearer. Or something_output.log, if the .log suffix is somehow desirable.