Re: Online checksums verification in the backend - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: Online checksums verification in the backend |
Date | |
Msg-id | 20200328121858.GA13605@nol Whole thread Raw |
In response to | Re: Online checksums verification in the backend (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
Responses |
Re: Online checksums verification in the backend
|
List | pgsql-hackers |
On Sat, Mar 28, 2020 at 12:28:27PM +0900, Masahiko Sawada wrote: > On Wed, 18 Mar 2020 at 19:11, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > v5 attached > > Thank you for updating the patch. I have some comments: Thanks a lot for the review! > 1. > + <entry> > + <literal><function>pg_check_relation(<parameter>relation</parameter> > <type>oid</type>, <parameter>fork</parameter> > <type>text</type>)</function></literal> > + </entry> > > Looking at the declaration of pg_check_relation, 'relation' and 'fork' > are optional arguments. So I think the above is not correct. But as I > commented below, 'relation' should not be optional, so maybe the above > line could be: > > + <literal><function>pg_check_relation(<parameter>relation</parameter> > <type>oid</type>[, <parameter>fork</parameter> > <type>text</type>])</function></literal> Yes I missed that when making relation mandatory. Fixed. > 2. > + <indexterm> > + <primary>pg_check_relation</primary> > + </indexterm> > + <para> > + <function>pg_check_relation</function> iterates over all the blocks of all > + or the specified fork of a given relation and verify their checksum. It > + returns the list of blocks for which the found checksum doesn't match the > + expected one. You must be a member of the > + <literal>pg_read_all_stats</literal> role to use this function. It can > + only be used if data checksums are enabled. See <xref > + linkend="app-initdb-data-checksums"/> for more information. > + </para> > > * I think we need a description about possible values for 'fork' > (i.g., 'main', 'vm', 'fsm' and 'init'), and the behavior when 'fork' > is omitted. Done. > * Do we need to explain about checksum cost-based delay here? It's probably better in config.sgml, nearby vacuum cost-based delay, so done this way with a link to reference that part. > 3. > +CREATE OR REPLACE FUNCTION pg_check_relation( > + IN relation regclass DEFAULT NULL::regclass, IN fork text DEFAULT NULL::text, > + OUT relid oid, OUT forknum integer, OUT failed_blocknum bigint, > + OUT expected_checksum integer, OUT found_checksum integer) > + RETURNS SETOF record VOLATILE LANGUAGE internal AS 'pg_check_relation' > + PARALLEL RESTRICTED; > > Now that pg_check_relation doesn't accept NULL as 'relation', I think > we need to make 'relation' a mandatory argument. Correct, fixed. > 4. > + /* Check if the relation (still) exists */ > + if (SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid))) > + { > + /* > + * We use a standard relation_open() to acquire the initial lock. It > + * means that this will block until the lock is acquired, or will > + * raise an ERROR if lock_timeout has been set. If caller wants to > + * check multiple tables while relying on a maximum wait time, it > + * should process tables one by one instead of relying on a global > + * processing with the main SRF. > + */ > + relation = relation_open(relid, AccessShareLock); > + } > > IIUC the above was necessary because we used to have > check_all_relations() which iterates all relations on the database to > do checksum checks. But now that we don't have that function and > pg_check_relation processes one relation. Can we just do > relation_open() here? Ah yes I missed that comment. I think only the comment needed to be updated to remove any part related to NULL-relation call. I ended up removign the whole comment since locking and lock_timeout behavior is inherent to relation_open and there's no need to document that any further now that we always only check one relation at a time. > 5. > I think we need to check if the relation is a temp relation. I'm not > sure it's worth to check checksums of temp relations but at least we > need not to check other's temp relations. Good point. I think it's still worthwhile to check the backend's temp relation, although typical usage should be a bgworker/cron job doing that check so there shouldn't be any. > 6. > +/* > + * Safely read the wanted buffer from disk, dealing with possible concurrency > + * issue. Note that if a buffer is found dirty in shared_buffers, no read will > + * be performed and the caller will be informed that no check should be done. > + * We can safely ignore such buffers as they'll be written before next > + * checkpoint's completion.. > [...] > + */ > > I think the above comment also needs some "/*-------" at the beginning and end. Fixed. > 7. > +static void > +check_get_buffer(Relation relation, ForkNumber forknum, > + BlockNumber blkno, char *buffer, bool needlock, bool *checkit, > + bool *found_in_sb) > +{ > > Maybe we can make check_get_buffer() return a bool indicating we found > a buffer to check, instead of having '*checkit'. That way, we can > simplify the following code: > > + check_get_buffer(relation, forknum, blkno, buffer, force_lock, > + &checkit, &found_in_sb); > + > + if (!checkit) > + continue; > > to something like: > > + if (!check_get_buffer(relation, forknum, blkno, buffer, force_lock, > + &found_in_sb)) > + continue; Changed. > 8. > + if (PageIsVerified(buffer, blkno)) > + { > + /* > + * If the page is really new, there won't by any checksum to be > + * computed or expected. > + */ > + *chk_expected = *chk_found = NoComputedChecksum; > + return true; > + } > + else > + { > + /* > + * There's a corruption, but since this affect PageIsNew, we > + * can't compute a checksum, so set NoComputedChecksum for the > + * expected checksum. > + */ > + *chk_expected = NoComputedChecksum; > + *chk_found = hdr->pd_checksum; > + } > + return false; > > * I think the 'else' is not necessary here. AFAICT it's, see below. > * Setting *chk_expected and *chk_found seems useless when we return > true. The caller doesn't use them. Indeed, fixed. > * Should we forcibly overwrite ignore_checksum_failure to off in > pg_check_relation()? Otherwise, this logic seems not work fine. > > * I don't understand why we cannot compute a checksum in case where a > page looks like a new page but is actually corrupted. Could you please > elaborate on that? PageIsVerified has a different behavior depending on whether the page looks new or not. If the page looks like new, it only checks that it's indeed a new page, and otherwise try to verify the checksum. Also, pg_check_page() has an assert to make sure that the page isn't (or don't look like) new. So it seems to me that the 'else' is required to properly detect a real or fake PageIsNew, and try to compute checksums only when required. > 8. > + { > + {"checksum_cost_page_hit", PGC_USERSET, RESOURCES_CHECKSUM_DELAY, > + gettext_noop("Checksum cost for a page found in the buffer cache."), > + NULL > + }, > + &ChecksumCostPageHit, > + 1, 0, 10000, > + NULL, NULL, NULL > + }, > > * There is no description about the newly added four GUC parameters in the doc. > > * We need to put new GUC parameters into postgresql.conf.sample as well. Fixed both. > * The patch doesn't use checksum_cost_page_hit at all. Indeed, I also realized that while working on previous issues. I removed it and renamed checksum_cost_page_miss to checksum_cost_page. > > 9. > diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c > index eb19644419..37f63e747c 100644 > --- a/src/backend/utils/init/globals.c > +++ b/src/backend/utils/init/globals.c > @@ -134,6 +134,14 @@ int max_worker_processes = 8; > int max_parallel_workers = 8; > int MaxBackends = 0; > > +int ChecksumCostPageHit = 1; /* GUC parameters for > checksum check */ > +int ChecksumCostPageMiss = 10; > +int ChecksumCostLimit = 200; > +double ChecksumCostDelay = 0; > + > +int ChecksumCostBalance = 0; /* working state for > checksums check */ > +bool ChecksumCostActive = false; > > Can we declare them in checksum.c since these parameters are used only > in checksum.c and it does I/O my itself. The GUC parameters would still need to be global, so for consistency I kept all the variables in globals.c. > > 10. > + /* Report the failure to the stat collector and the logs. */ > + pgstat_report_checksum_failure(); > + ereport(WARNING, > + (errcode(ERRCODE_DATA_CORRUPTED), > + errmsg("invalid page in block %u of relation %s", > + blkno, > + relpath(relation->rd_smgr->smgr_rnode, forknum)))); > > I think we could do pgstat_report_checksum_failure() and emit WARNING > twice for the same page since PageIsVerified() also does the same. As mentioned before, in this patch I only calls PageIsVerified() if the buffer looks like new, and in this case PageIsVerified() only verify that it's a true all-zero-page, and won't try to verify the checksum, so there's no possibility of duplicated report. I modified the comments to document all the interactions and expectations. v6 attached.
Attachment
pgsql-hackers by date: