Re: Online checksums verification in the backend - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Online checksums verification in the backend
Date
Msg-id CA+fd4k5VS_QAkdGwU-mwsr1z78XM7egixZ2bDhfeSJm4HCrRGg@mail.gmail.com
Whole thread Raw
In response to Re: Online checksums verification in the backend  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Online checksums verification in the backend  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Sun, 5 Apr 2020 at 17:44, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sun, Apr 05, 2020 at 01:13:30PM +0900, Masahiko Sawada wrote:
> >
> > Thank you for updating the patch! The patch looks good to me. Here are
> > some random comments mostly about cosmetic changes.
> >
>
> Thanks a lot for the review!

Thank you for updating the patch.

>
> >
> > 1.
> > I think we can have two separate SQL functions:
> > pg_check_relation(regclass, text) and pg_check_relation(regclass),
> > instead of setting NULL by default to the second argument.
> >
>
> I'm fine with it, so implemented this way with the required documentation
> changes.

Why do we need two rows in the doc? For instance, replication slot
functions have some optional arguments but there is only one row in
the doc. So I think we don't need to change the doc from the previous
version patch.

And I think these are not necessary as we already defined in
include/catalog/pg_proc.dat:

+CREATE OR REPLACE FUNCTION pg_check_relation(
+  IN relation regclass,
+  OUT relid oid, OUT forknum integer, OUT failed_blocknum bigint,
+  OUT expected_checksum integer, OUT found_checksum integer)
+  RETURNS SETOF record STRICT VOLATILE LANGUAGE internal AS 'pg_check_relation'
+  PARALLEL RESTRICTED;
+
+CREATE OR REPLACE FUNCTION pg_check_relation(
+  IN relation regclass, IN fork text,
+  OUT relid oid, OUT forknum integer, OUT failed_blocknum bigint,
+  OUT expected_checksum integer, OUT found_checksum integer)
+  RETURNS SETOF record STRICT VOLATILE LANGUAGE internal
+  AS 'pg_check_relation_fork'
+  PARALLEL RESTRICTED;

>
> >
> > 2.
> > + * Check data sanity for a specific block in the given fork of the given
> > + * relation, always retrieved locally with smgrred even if a version exists in
> >
> > s/smgrred/smgrread/
>
> Fixed.
>
> >
> > 3.
> > +       /* The buffer will have to check checked. */
> > +       Assert(checkit);
> >
> > Should it be "The buffer will have to be checked"?
> >
>
> Oops indeed, fixed.
>
> >
> > 4.
> > +   if (!is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_SERVER_FILES))
> > +       ereport(ERROR,
> > +               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                errmsg("only superuser or a member of the
> > pg_read_server_files role may use this function")));
> >
> > Looking at the definition of pg_stat_read_server_files role, this role
> > seems to be for operations that could read non-database files such as
> > csv files. Therefore, currently this role is used by file_fdw and COPY
> > command. I personally think pg_stat_scan_tables would be more
> > appropriate for this function but I'm not sure.
> >
>
> That's a very good point, especially since the documentation of this default
> role is quite relevant for those functions:
>
> "Execute monitoring functions that may take ACCESS SHARE locks on tables,
> potentially for a long time."
>
> So changed!
>
> >
> > 5.
> > +   /* Set cost-based vacuum delay */
> > +   ChecksumCostActive = (ChecksumCostDelay > 0);
> > +   ChecksumCostBalance = 0;
> >
> > s/vacuum/checksum verification/
> >
>
> Fixed.
>
> >
> > 6.
> > +       ereport(WARNING,
> > +               (errcode(ERRCODE_DATA_CORRUPTED),
> > +                errmsg("invalid page in block %u of relation %s",
> > +                       blkno,
> > +                       relpath(relation->rd_smgr->smgr_rnode, forknum))));
> >
> > I think it's better to show the relation name instead of the relation path here.
> >
>
> I'm here using the same pattern as what ReadBuffer_common() would display if a
> corrupted block is read.  I think it's better to keep the format for both, so
> any existing log analyzer will keep working with those new functions.

Ok, I agree with you.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: Online checksums verification in the backend
Next
From: Julien Rouhaud
Date:
Subject: Re: Online checksums verification in the backend