Home > mailing lists

Re: Online checksums verification in the backend - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: Online checksums verification in the backend
Date	March 18, 2020 04:20:47
Msg-id	20200318042047.GH214947@paquier.xyz Whole thread Raw
In response to	Re: Online checksums verification in the backend (Julien Rouhaud <rjuju123@gmail.com>)
Responses	Re: Online checksums verification in the backend Re: Online checksums verification in the backend
List	pgsql-hackers

Tree view

On Mon, Mar 16, 2020 at 09:21:22AM +0100, Julien Rouhaud wrote:
> On Mon, Mar 16, 2020 at 12:29:28PM +0900, Michael Paquier wrote:
>> With a large amount of
>> shared buffer eviction you actually increase the risk of torn page
>> reads.  Instead of a logic relying on partition mapping locks, which
>> could be unwise on performance grounds, did you consider different
>> approaches?  For example a kind of pre-emptive lock on the page in
>> storage to prevent any shared buffer operation to happen while the
>> block is read from storage, that would act like a barrier.
>
> Even with a workload having a large shared_buffers eviction pattern, I don't
> think that there's a high probability of hitting a torn page.  Unless I'm
> mistaken it can only happen if all those steps happen concurrently to doing the
> block read just after releasing the LWLock:
>
> - postgres read the same block in shared_buffers (including all the locking)
> - dirties it
> - writes part of the page
>
> It's certainly possible, but it seems so unlikely that the optimistic lock-less
> approach seems like a very good tradeoff.

Having false reports in this area could be very confusing for the
user.  That's for example possible now with checksum verification and
base backups.

>> I guess that this leads to the fact that this function may be better as
>> a contrib module, with the addition of some better-suited APIs in core
>> (see paragraph above).
>
> Below?

Above.  This thought more precisely:
>> For example a kind of pre-emptive lock on the page in
>> storage to prevent any shared buffer operation to happen while the
>> block is read from storage, that would act like a barrier.

> For the record when I first tested that feature I did try to check dirty
> blocks, and it seemed that dirty blocks of shared relation were sometimes
> wrongly reported as corrupted.  I didn't try to investigate more though.

Hmm.  It would be good to look at that, correct verification of shared
relations matter.

>> + * - if a block is dirty in shared_buffers, it's ignored as it'll be flushed to
>> + *   disk either before the end of the next checkpoint or during recovery in
>> + *   case of unsafe shutdown
>> Not sure that the indentation is going to react well on that part of
>> the patch, perhaps it would be better to add some "/*-------" at the
>> beginning and end of the comment block to tell pgindent to ignore this
>> part?
>
> Ok.  Although I think only the beginning comment is needed?

From src/tools/pgindent/README:
"pgindent will reflow any comment block that's not at the left margin.
If this messes up manual formatting that ought to be preserved,
protect the comment block with some dashes:"

        /*----------
     * Text here will not be touched by pgindent.
         *----------
         */

>> Based on the feedback gathered on this thread, I guess that you should
>> have a SRF returning the list of broken blocks, as well as NOTICE
>> messages.
>
> The current patch has an SRF and a WARNING message, do you want an additional
> NOTICE message or downgrade the existing one?

Right, not sure which one is better, for zero_damaged_pages a WARNING
is used.
--
Michael

Attachment

signature.asc

pgsql-hackers by date:

From: David Rowley
Date: 18 March 2020, 04:12:56
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: "Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date: 18 March 2020, 04:28:34
Subject: RE: Multivariate MCV list vs. statistics target

Re: Online checksums verification in the backend - Mailing list pgsql-hackers

Attachment

Previous

Next