Re: Online checksums verification in the backend - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Online checksums verification in the backend
Date
Msg-id 20200328121858.GA13605@nol
Whole thread Raw
In response to Re: Online checksums verification in the backend  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: Online checksums verification in the backend
List pgsql-hackers
On Sat, Mar 28, 2020 at 12:28:27PM +0900, Masahiko Sawada wrote:
> On Wed, 18 Mar 2020 at 19:11, Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > v5 attached
> 
> Thank you for updating the patch. I have some comments:

Thanks a lot for the review!

> 1.
> +       <entry>
> +        <literal><function>pg_check_relation(<parameter>relation</parameter>
> <type>oid</type>, <parameter>fork</parameter>
> <type>text</type>)</function></literal>
> +       </entry>
> 
> Looking at the declaration of pg_check_relation, 'relation' and 'fork'
> are optional arguments. So I think the above is not correct. But as I
> commented below, 'relation' should not be optional, so maybe the above
> line could be:
> 
> +        <literal><function>pg_check_relation(<parameter>relation</parameter>
> <type>oid</type>[, <parameter>fork</parameter>
> <type>text</type>])</function></literal>

Yes I missed that when making relation mandatory.  Fixed.

> 2.
> +   <indexterm>
> +    <primary>pg_check_relation</primary>
> +   </indexterm>
> +   <para>
> +    <function>pg_check_relation</function> iterates over all the blocks of all
> +    or the specified fork of a given relation and verify their checksum.  It
> +    returns the list of blocks for which the found checksum doesn't match the
> +    expected one.  You must be a member of the
> +    <literal>pg_read_all_stats</literal> role to use this function.  It can
> +    only be used if data checksums are enabled.  See <xref
> +    linkend="app-initdb-data-checksums"/> for more information.
> +   </para>
> 
> * I think we need a description about possible values for 'fork'
> (i.g., 'main', 'vm', 'fsm' and 'init'), and the behavior when 'fork'
> is omitted.

Done.

> * Do we need to explain about checksum cost-based delay here?

It's probably better in config.sgml, nearby vacuum cost-based delay, so done
this way with a link to reference that part.

> 3.
> +CREATE OR REPLACE FUNCTION pg_check_relation(
> +  IN relation regclass DEFAULT NULL::regclass, IN fork text DEFAULT NULL::text,
> +  OUT relid oid, OUT forknum integer, OUT failed_blocknum bigint,
> +  OUT expected_checksum integer, OUT found_checksum integer)
> +  RETURNS SETOF record VOLATILE LANGUAGE internal AS 'pg_check_relation'
> +  PARALLEL RESTRICTED;
> 
> Now that pg_check_relation doesn't accept NULL as 'relation', I think
> we need to make 'relation' a mandatory argument.

Correct, fixed.

> 4.
> +   /* Check if the relation (still) exists */
> +   if (SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
> +   {
> +       /*
> +        * We use a standard relation_open() to acquire the initial lock.  It
> +        * means that this will block until the lock is acquired, or will
> +        * raise an ERROR if lock_timeout has been set.  If caller wants to
> +        * check multiple tables while relying on a maximum wait time, it
> +        * should process tables one by one instead of relying on a global
> +        * processing with the main SRF.
> +        */
> +       relation = relation_open(relid, AccessShareLock);
> +   }
> 
> IIUC the above was necessary because we used to have
> check_all_relations() which iterates all relations on the database to
> do checksum checks. But now that we don't have that function and
> pg_check_relation processes one relation. Can we just do
> relation_open() here?

Ah yes I missed that comment.  I think only the comment needed to be updated to
remove any part related to NULL-relation call.  I ended up removign the whole
comment since locking and lock_timeout behavior is inherent to relation_open
and there's no need to document that any further now that we always only check
one relation at a time.

> 5.
> I think we need to check if the relation is a temp relation. I'm not
> sure it's worth to check checksums of temp relations but at least we
> need not to check other's temp relations.

Good point.  I think it's still worthwhile to check the backend's temp
relation, although typical usage should be a bgworker/cron job doing that check
so there shouldn't be any.

> 6.
> +/*
> + * Safely read the wanted buffer from disk, dealing with possible concurrency
> + * issue.  Note that if a buffer is found dirty in shared_buffers, no read will
> + * be performed and the caller will be informed that no check should be done.
> + * We can safely ignore such buffers as they'll be written before next
> + * checkpoint's completion..
> [...] 
> + */
> 
> I think the above comment also needs some "/*-------" at the beginning and end.

Fixed.

> 7.
> +static void
> +check_get_buffer(Relation relation, ForkNumber forknum,
> +                BlockNumber blkno, char *buffer, bool needlock, bool *checkit,
> +                bool *found_in_sb)
> +{
> 
> Maybe we can make check_get_buffer() return a bool indicating we found
> a buffer to check, instead of having '*checkit'. That way, we can
> simplify the following code:
> 
> +       check_get_buffer(relation, forknum, blkno, buffer, force_lock,
> +                        &checkit, &found_in_sb);
> +
> +       if (!checkit)
> +           continue;
> 
> to something like:
> 
> + if (!check_get_buffer(relation, forknum, blkno, buffer, force_lock,
> +                          &found_in_sb))
> +     continue;

Changed.

> 8.
> +       if (PageIsVerified(buffer, blkno))
> +       {
> +           /*
> +            * If the page is really new, there won't by any checksum to be
> +            * computed or expected.
> +            */
> +           *chk_expected = *chk_found = NoComputedChecksum;
> +           return true;
> +       }
> +       else
> +       {
> +           /*
> +            * There's a corruption, but since this affect PageIsNew, we
> +            * can't compute a checksum, so set NoComputedChecksum for the
> +            * expected checksum.
> +            */
> +           *chk_expected = NoComputedChecksum;
> +           *chk_found = hdr->pd_checksum;
> +       }
> +       return false;
> 
> * I think the 'else' is not necessary here.

AFAICT it's, see below.

> * Setting *chk_expected and *chk_found seems useless when we return
> true. The caller doesn't use them.

Indeed, fixed.

> * Should we forcibly overwrite ignore_checksum_failure to off in
> pg_check_relation()? Otherwise, this logic seems not work fine.
> 
> * I don't understand why we cannot compute a checksum in case where a
> page looks like a new page but is actually corrupted. Could you please
> elaborate on that?

PageIsVerified has a different behavior depending on whether the page looks new
or not.  If the page looks like new, it only checks that it's indeed a new
page, and otherwise try to verify the checksum.

Also, pg_check_page() has an assert to make sure that the page isn't (or don't
look like) new.

So it seems to me that the 'else' is required to properly detect a real or fake
PageIsNew, and try to compute checksums only when required.

> 8.
> +   {
> +       {"checksum_cost_page_hit", PGC_USERSET, RESOURCES_CHECKSUM_DELAY,
> +           gettext_noop("Checksum cost for a page found in the buffer cache."),
> +           NULL
> +       },
> +       &ChecksumCostPageHit,
> +       1, 0, 10000,
> +       NULL, NULL, NULL
> +   },
> 
> * There is no description about the newly added four GUC parameters in the doc.
> 
> * We need to put new GUC parameters into postgresql.conf.sample as well.

Fixed both.

> * The patch doesn't use checksum_cost_page_hit at all.

Indeed, I also realized that while working on previous issues.  I removed it
and renamed checksum_cost_page_miss to checksum_cost_page.
>
> 9.
> diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
> index eb19644419..37f63e747c 100644
> --- a/src/backend/utils/init/globals.c
> +++ b/src/backend/utils/init/globals.c
> @@ -134,6 +134,14 @@ int            max_worker_processes = 8;
>  int            max_parallel_workers = 8;
>  int            MaxBackends = 0;
> 
> +int            ChecksumCostPageHit = 1;    /* GUC parameters for
> checksum check */
> +int            ChecksumCostPageMiss = 10;
> +int            ChecksumCostLimit = 200;
> +double     ChecksumCostDelay = 0;
> +
> +int            ChecksumCostBalance = 0;    /* working state for
> checksums check */
> +bool       ChecksumCostActive = false;
> 
> Can we declare them in checksum.c since these parameters are used only
> in checksum.c and it does I/O my itself.

The GUC parameters would still need to be global, so for consistency I kept all
the variables in globals.c.
>
> 10.
> +       /* Report the failure to the stat collector and the logs. */
> +       pgstat_report_checksum_failure();
> +       ereport(WARNING,
> +               (errcode(ERRCODE_DATA_CORRUPTED),
> +                errmsg("invalid page in block %u of relation %s",
> +                       blkno,
> +                       relpath(relation->rd_smgr->smgr_rnode, forknum))));
> 
> I think we could do pgstat_report_checksum_failure() and emit WARNING
> twice for the same page since PageIsVerified() also does the same.

As mentioned before, in this patch I only calls PageIsVerified() if the buffer
looks like new, and in this case PageIsVerified() only verify that it's a true
all-zero-page, and won't try to verify the checksum, so there's no possibility
of duplicated report.  I modified the comments to document all the interactions
and expectations.

v6 attached.

Attachment

pgsql-hackers by date:

Previous
From: "Ivan N. Taranov"
Date:
Subject: Re: [PATCH] postgresql.conf.sample->postgresql.conf.sample.in
Next
From: Dean Rasheed
Date:
Subject: Re: PATCH: add support for IN and @> in functional-dependencystatistics use