Re: Fixing WAL instability in various TAP tests - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Fixing WAL instability in various TAP tests
Date
Msg-id 2B4C64CF-EE3F-474B-9685-6A927E5E49AE@enterprisedb.com
In response to Re: Fixing WAL instability in various TAP tests  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fixing WAL instability in various TAP tests
Re: Fixing WAL instability in various TAP tests
List pgsql-hackers

> On Sep 25, 2021, at 7:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Leaving the tests brittle wastes developer time.
>
> Trying to make them proof against all possible settings would waste
> a lot more time, though.

You may be right, but the conversation about "all possible settings" was started by Noah.  I was really just talking
about tests that depend on WAL files not being removed, yet take no action to guarantee that, merely trusting that
under default settings they won't be.  I can't square that design against other TAP tests that do take measures to
prevent WAL files being removed.  Why is the precaution taken in some tests but not others?  If this is intentional,
shouldn't some comment in the tests without such precautions explain that choice?  Are they intentionally testing that
the default GUC WAL size settings and WAL verbosity won't break the test?

This isn't a rhetorical question:

In src/test/recovery/t/015_promotion_pages.pl, the comments talk about how checkpoints impact what happens on the
standby. The test issues an explicit checkpoint on the primary, and again later on the standby, so it is unclear if
that's what the comments refer to, or if they also refer to implicit expectations about when/if other checkpoints will
happen. The test breaks when I change the GUC settings, but I can fix that breakage by adding a replication slot to the
test. Have I broken the purpose of the test by doing so, though?  Does using a replication slot to force the WAL to not
be removed early break what the test is designed to check?

The other tests raise similar questions.  Is the brittleness intentional?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





