Andrew Dunstan <andrew@dunslane.net> writes:
> There's something odd about the brin regression tests. They seem to
> generate intermittent failures, which suggests some sort of race
> condition or ordering failure.
> See for example
> <http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=fulmar&dt=2015-05-15%2001%3A02%3A28>
> and
> <http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sittella&dt=2015-05-15%2021%3A08%3A38>

I found the cause of this symptom today.  Alvaro said he'd added the
autovacuum_enabled=off option to the brintest table to prevent autovac
from screwing up this expected result ... but that only stops autovacuum
from summarizing the table; a manually issued vacuum still will.  Guess
what is in the concurrently-executed gist.sql test, at line 40.
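
To spell out the interleaving, here is a minimal sketch.  The names
brin_demo and brin_demo_idx are made up for illustration and are not the
objects in the real test, and the two "sessions" just stand in for the
brin and gist scripts running in the same parallel group.

```sql
-- Session A (stand-in for the brin test); hypothetical object names.
CREATE TABLE brin_demo (id int) WITH (autovacuum_enabled = off);
CREATE INDEX brin_demo_idx ON brin_demo USING brin (id)
    WITH (pages_per_range = 1);
-- Rows added after the index exists land in not-yet-summarized ranges.
INSERT INTO brin_demo SELECT generate_series(1, 10000);

-- Session B (stand-in for a concurrent test): an untargeted vacuum
-- processes every table in the database, brin_demo included, and
-- autovacuum_enabled = off does nothing to prevent that.
VACUUM ANALYZE;

-- Session A again: returns the number of page ranges newly summarized.
-- The count depends on whether session B's vacuum got there first,
-- so an expected-output file cannot pin it down reliably.
SELECT brin_summarize_new_values('brin_demo_idx'::regclass);
```
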
While we could and perhaps should change that command to a more narrowly
targeted "vacuum analyze gist_tbl;", this will not prevent someone from
reintroducing an untargeted vacuum command in one of the concurrent tests
later. I think a future-proof fix would require either making brintest
a temp table (losing all test coverage of WAL logging :-() or changing
the test so that it does not expect a specific result from
brin_summarize_new_values.
Or, maybe better, let's lose the brin_summarize_new_values call
altogether. What does it test that wouldn't be better done by
explicitly running "vacuum brintest;" ?
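
If we went that way, the change might look roughly like the following.
This is only a sketch, and it assumes the index is named brinidx; adjust
to whatever the test really uses.

```sql
-- Fragile: the expected output records an exact count, which changes
-- if some other session's vacuum summarized the ranges first.
SELECT brin_summarize_new_values('brinidx'::regclass);

-- Sketch of the alternative: vacuum the table explicitly.  VACUUM
-- prints no row count, so the expected output is the same no matter
-- which session ended up doing the summarization.
VACUUM brintest;
```

The summarization code still gets exercised this way; we just stop
depending on who did the work.
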

Also worth noting is that there's a completely different failure symptom
that's shown up a few times, e.g. here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2015-05-25%2009%3A56%3A55
This makes it look like brintest sometimes contains no rows at all,
which is difficult to explain ...

			regards, tom lane