I've noticed a few buildfarm failures similar to [1]:
# diff -U3 /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out
# --- /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out 2026-04-21 04:22:01.030204342-0300
# +++ /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out 2026-04-21 04:29:54.795187855-0300
# @@ -155,7 +155,7 @@
# begin;
# set statement_timeout to 1000;
# select trap_timeout();
# -NOTICE: nyeah nyeah, can't stop me
# +NOTICE: caught others?
# ERROR: end of function
# CONTEXT: PL/pgSQL function trap_timeout() line 15 at RAISE
# rollback;
not ok 11 - plpgsql_trap 502 ms
which comes from unexpected behavior of this bit of plpgsql code:
  begin
    -- we assume this will take longer than 1 second:
    select count(*) into x from generate_series(1, 1_000_000_000_000);
  exception
    when others then
      raise notice 'caught others?';
    when query_canceled then
      raise notice 'nyeah nyeah, can''t stop me';
  end;
The light bulb went on when I noticed a nearby failure from the same
machine that was clearly traceable to out-of-disk-space. What
happened here, I have no doubt, was that the "from generate_series"
bit tried to make a large temporary file, ran out of space, and threw
an appropriate error, causing us to take the "wrong" exception
handler.
Proposal:
1. Replace that query with something not so resource-intensive.
I'm not really sure why we didn't just use "perform pg_sleep(10)".
Maybe it didn't exist or didn't reliably wait 10 seconds at the
time, but it does now.
2. Adjust the "when others" handler to report the actual error,
to make this sort of thing easier to debug next time.
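
A minimal sketch of what both changes might look like, not a finished
patch. The function name trap_timeout_sketch and the exact wording of the
"others" notice are placeholders invented for illustration; the wrapper
around the inner block is assumed from the "end of function" error in the
expected output above.

```sql
-- Hypothetical name, used so this does not pose as the real test function.
create function trap_timeout_sketch() returns void as $$
begin
  begin
    -- pg_sleep needs no temporary file space, so the only error we
    -- expect here is the cancel raised by statement_timeout.
    perform pg_sleep(10);
  exception
    when others then
      -- Report what was actually caught; sqlstate and sqlerrm are
      -- available inside a plpgsql exception handler.
      raise notice 'caught others? %: %', sqlstate, sqlerrm;
    when query_canceled then
      raise notice 'nyeah nyeah, can''t stop me';
  end;
  -- Assumed from the expected output: abort so the timeout setting
  -- does not leak into later statements.
  raise exception 'end of function';
end$$ language plpgsql;

begin;
set statement_timeout to 1000;
select trap_timeout_sketch();
rollback;
```

With a 1-second timeout against a 10-second sleep, this should take the
query_canceled branch. If some other error does show up, the notice now
says which one, though that text would not be stable enough to put in an
expected-output file.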
regards, tom lane
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2026-04-21%2007%3A21%3A57