Re: Changing the state of data checksums in a running cluster - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: Changing the state of data checksums in a running cluster |
| Date | |
| Msg-id | daa95f1b-4dbd-4e03-a436-3ddd48b0fe56@vondra.me Whole thread Raw |
| In response to | Re: Changing the state of data checksums in a running cluster (Daniel Gustafsson <daniel@yesql.se>) |
| Responses |
Re: Changing the state of data checksums in a running cluster
|
| List | pgsql-hackers |
On 3/12/26 00:56, Daniel Gustafsson wrote:
>> On 6 Feb 2026, at 18:15, Tomas Vondra <tomas@vondra.me> wrote:
>
>> I spent a bit more time looking at this today, and I figured out a
>> simpler way to "cause trouble" by using PITR. I don't know if this has
>> the same root cause as the failures in the 006 TAP test, but I find it
>> interesting and I think it highlights some issues with the new patch.
>
> I've encoded some of this into a new test suite in the attached patchset,
> having it in a repeatable test should make it easier to reproduce and reason
> about.
>
> Attached is a rebase of the patchset which further refines the switch to using
> XLOG_CHECKPOINT_* records which was proposed by Andres upstream. The errors I
> encountered could be traced to replay across multiple state transitions (with
> at least one case of ON->OFF), and re-introducing checkpoints at on and off
> state transitions helped removing those.
>
> * A lot of code has been moved out from xlog.c and into the file which contains
> the code for the datachecksumsworker. This file has in turn been renamed to
> signal what it's used for. It should probably be moved from backend/postmaster
> but it's not clear to me where it should go.
> * Improved signalhandling in the worker code.
> * A new testsuite for PITR is added, which is based on Tomas' findings. This
> can be further increased and refined but it's at least a start.
> * Another new testsuite which changes full_page_writes settings is also added.
> This one should perhaps be rolled into 001_basic, not sure.
> * Tests 006 and 007, the pgbench "torture test" suites, are now being run in a
> pared-down fashion when PG_TEST_EXTRA isn't defined, rather than being skipped
> entirely
> * The minimum max_wal_size in tests is now 32 instead of 64.
> * Lots of polish and small tweaking and new comments
>
> There are a few TODOs, the more interesting ones include:
>
> * The change to XLOG_CHECKPOINT_REDO to move the wal_level into a proper record
> structure should be pulled out as a 0001 patch as it's an cleanup that has
> value on its own.
Makes sense, but it's going to be harder because since d774072f0040 all
4 bits in XLR_INFO are used.
> * Try to move even more code awy from xlog.c
> * Address, and remove, all XXX comments (most of which stem from a much earlier
> version of the patchset).
> * The code for emitting, and waiting on, the procsignalbarrier exists in
> multiple copies which needs to be refactored into a single one.
> * Tests failed in Autoconf on CI, which I need to fix.
>
A couple minor issues I noticed:
1) Is this actually doing the expected thing?
INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
We're passing a regular pointer to the database list, so can the
injection point actually modify it? I suppose it happens to work because
dc_dblist() removes the last item, so the pointer to the list does not
change. But that's luck.
2) ProcessAllDatabases may be misusing processed_databases
The loop is using processed_databases to track number of databases
processed in each interation, but it's also being used for progress
reporting through pgstat_progress_update_param(). So if we have to go
through multiple iterations of the loop, it'll reset to 0 and then only
count newly processed ones. But the progress still shows the original
total number of databases. IMO it needs to track the cumulative number
of processed databases.
3) DATACHECKSUMSWORKER_MAX_DB_RETRIES / DATACHECKSUMSWORKER_FAILED
What happens if a database reaches the maximum number of retries? We
mark that entry as failed, but AFAIK we'll still try to process any
remaining databases. Isn't that already doomed and we won't be able to
enable checksums? So why not to simply abort the loop right away?
regards
--
Tomas Vondra
pgsql-hackers by date: