Home > mailing lists

Re: Online checksums patch - once again - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Online checksums patch - once again
Date	February 9, 2021 08:54:50
Msg-id	ef9bb1da-ba99-a59f-4102-56026813469f@iki.fi Whole thread Raw
In response to	Re: Online checksums patch - once again (Daniel Gustafsson <daniel@yesql.se>)
Responses	Re: Online checksums patch - once again Re: Online checksums patch - once again Re: Online checksums patch - once again
List	pgsql-hackers

Tree view

(I may have said this before, but) My overall high-level impression of 
this patch is that it's really cmmplex for a feature that you use maybe 
once in the lifetime of a cluster. I'm happy to review but I'm not 
planning to commit this myself. I don't object if some other committer 
picks this up (Magnus?).

Now to the latest patch version:

On 03/02/2021 18:15, Daniel Gustafsson wrote:
> The previous v35 had a tiny conflict in pg_class.h, the attached v36 (which is
> a squash of the 2 commits in v35) fixes that.  No other changes are introduced
> in this version.

>     /*
>      * Check to see if my copy of RedoRecPtr is out of date. If so, may have
>      * to go back and have the caller recompute everything. This can only
>      * happen just after a checkpoint, so it's better to be slow in this case
>      * and fast otherwise.
>      *
>      * Also check to see if fullPageWrites or forcePageWrites was just turned
>      * on, or if we are in the process of enabling checksums in the cluster;
>      * if we weren't already doing full-page writes then go back and recompute.
>      *
>      * If we aren't doing full-page writes then RedoRecPtr doesn't actually
>      * affect the contents of the XLOG record, so we'll update our local copy
>      * but not force a recomputation.  (If doPageWrites was just turned off,
>      * we could recompute the record without full pages, but we choose not to
>      * bother.)
>      */
>     if (RedoRecPtr != Insert->RedoRecPtr)
>     {
>         Assert(RedoRecPtr < Insert->RedoRecPtr);
>         RedoRecPtr = Insert->RedoRecPtr;
>     }
>     doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites || DataChecksumsOnInProgress());

Why does this use DataChecksumsOnInProgress() instead of 
DataChecksumsNeedWrite()? If checksums are enabled, you always need 
full-page writes, don't you? If not, then why is it needed in the 
inprogress-on state?

We also set doPageWrites in InitXLOGAccess(). That should match the 
condition above (although it doesn't matter for correctness).

> /*
>  * DataChecksumsNeedVerify
>  *        Returns whether data checksums must be verified or not
>  *
>  * Data checksums are only verified if they are fully enabled in the cluster.
>  * During the "inprogress-on" and "inprogress-off" states they are only
>  * updated, not verified (see datachecksumsworker.c for a longer discussion).
>  *
>  * This function is intended for callsites which have read data and are about
>  * to perform checksum validation based on the result of this. To avoid the
>  * the risk of the checksum state changing between reading and performing the
>  * validation (or not), interrupts must be held off. This implies that calling
>  * this function must be performed as close to the validation call as possible
>  * to keep the critical section short. This is in order to protect against
>  * time of check/time of use situations around data checksum validation.
>  */
> bool
> DataChecksumsNeedVerify(void)
> {
>     Assert(InterruptHoldoffCount > 0);
>     return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
> }

What exactly is the intended call pattern here? Something like this?

smgrread() a data page
HOLD_INTERRUPTS();
if (DataChecksumsNeedVerify())
{
   if (pg_checksum_page((char *) page, blkno) != expected)
     elog(ERROR, "bad checksum");
}
RESUME_INTERRUPTS();

That seems to be what the code currently does. What good does holding 
interrupts do here? If checksums were not fully enabled at the 
smgrread() call, the page might have incorrect checksums, and if the 
state transitions from inprogress-on to on between the smggread() call 
and the DataChecksumsNeedVerify() call, you'll get an error. I think you 
need to hold the interrupts *before* the smgrread() call.

> /*
>  * Set checksum for a page in private memory.
>  *
>  * This must only be used when we know that no other process can be modifying
>  * the page buffer.
>  */
> void
> PageSetChecksumInplace(Page page, BlockNumber blkno)
> {
>     HOLD_INTERRUPTS();
>     /* If we don't need a checksum, just return */
>     if (PageIsNew(page) || !DataChecksumsNeedWrite())
>     {
>         RESUME_INTERRUPTS();
>         return;
>     }
> 
>     ((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
>     RESUME_INTERRUPTS();
> }

The checksums state might change just after this call, before the caller 
has actually performed the smgrwrite() or smgrextend() call. The caller 
needs to hold interrupts across this function and the 
smgrwrite/smgrextend() call. It is a bad idea to HOLD_INTERRUPTS() here, 
because that just masks bugs where the caller isn't holding the 
interrupts. Same in PageSetChecksumCopy().

- Heikki

pgsql-hackers by date:

From: torikoshia
Date: 09 February 2021, 08:48:55
Subject: Re: adding wait_start column to pg_locks

From: Greg Nancarrow
Date: 09 February 2021, 08:57:06
Subject: Re: Libpq support to connect to standby server as priority

Re: Online checksums patch - once again - Mailing list pgsql-hackers

Previous

Next