Re: A test for replay of regression tests - Mailing list pgsql-hackers

From Andres Freund
Subject Re: A test for replay of regression tests
Date
Msg-id 20210423170452.hfyinzvaeef5y3dm@alap3.anarazel.de
Whole thread Raw
In response to Re: A test for replay of regression tests  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: A test for replay of regression tests  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2021-04-23 11:53:35 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Hm. I wonder if running with wal_consistency_checking=all doesn't also
> > reduce coverage of some aspects of recovery, by increasing record sizes
> > etc.
> 
> Yeah.  I found out earlier that wal_consistency_checking=all is an
> absolute PIG.  Running the regression tests that way requires tens of
> gigabytes of disk space, and a significant amount of time if your
> disk is not very speedy.  If we put this into the buildfarm at all,
> it would have to be opt-in, not run-by-default, because a lot of BF
> animals simply don't have the horsepower.  I think I'd vote against
> adding it to check-world, too; the cost/benefit ratio is not good
> unless you are specifically working on replay functions.

I think it'd be a huge improvement to test recovery during check-world
by default - it's a huge swath of crucial code that practically has no
test coverage.  I agree that testing by default with
wal_consistency_checking=all isn't feasible due to the time & space
overhead, so I think we should not do that.


> The two things I'd say about this are (1) Whether to use
> wal_consistency_checking, and with what value, needs to be
> easily adjustable. (2) People will want to run other test suites
> than the core regression tests, eg contrib modules.

I'm not really excited about tackling 2) initially. I think it's a
significant issue that we don't have test coverage for most redo
routines and that we should change that with some urgency - even if we
back out these WAL prefetch related changes, there've been sufficiently
many changes affecting WAL that I think it's worth doing the minimal
thing soon.

I don't think there's actually that much need to test contrib modules
through recovery - most of them don't seem like they'd add meaningful
coverage? The exception is contrib/bloom, but perhaps that'd be better
tackled with a dedicated test?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: TRUNCATE on foreign table
Next
From: Alvaro Herrera
Date:
Subject: Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY