Re: Online checksums verification in the backend - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Online checksums verification in the backend
Date
Msg-id 20200318101055.GA36918@nol
Whole thread Raw
In response to Re: Online checksums verification in the backend  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Online checksums verification in the backend  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Wed, Mar 18, 2020 at 07:06:19AM +0100, Julien Rouhaud wrote:
> On Wed, Mar 18, 2020 at 01:20:47PM +0900, Michael Paquier wrote:
> > On Mon, Mar 16, 2020 at 09:21:22AM +0100, Julien Rouhaud wrote:
> > > On Mon, Mar 16, 2020 at 12:29:28PM +0900, Michael Paquier wrote:
> > >> With a large amount of
> > >> shared buffer eviction you actually increase the risk of torn page
> > >> reads.  Instead of a logic relying on partition mapping locks, which
> > >> could be unwise on performance grounds, did you consider different
> > >> approaches?  For example a kind of pre-emptive lock on the page in
> > >> storage to prevent any shared buffer operation to happen while the
> > >> block is read from storage, that would act like a barrier.
> > >
> > > Even with a workload having a large shared_buffers eviction pattern, I don't
> > > think that there's a high probability of hitting a torn page.  Unless I'm
> > > mistaken it can only happen if all those steps happen concurrently to doing the
> > > block read just after releasing the LWLock:
> > >
> > > - postgres read the same block in shared_buffers (including all the locking)
> > > - dirties it
> > > - writes part of the page
> > >
> > > It's certainly possible, but it seems so unlikely that the optimistic lock-less
> > > approach seems like a very good tradeoff.
> >
> > Having false reports in this area could be very confusing for the
> > user.  That's for example possible now with checksum verification and
> > base backups.
>
>
> I agree, however this shouldn't be the case here, as the block will be
> rechecked while holding proper lock the 2nd time in case of possible false
> positive before being reported as corrupted.  So the only downside is to check
> twice a corrupted block that's not found in shared buffers (or concurrently
> loaded/modified/half flushed).  As the number of corrupted or concurrently
> loaded/modified/half flushed blocks should usually be close to zero, it seems
> worthwhile to have a lockless check first for performance reason.


I just noticed some dumb mistakes while adding the new GUCs.  v5 attached to
fix that, no other changes.

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: RecoveryWalAll and RecoveryWalStream wait events
Next
From: Peter Eisentraut
Date:
Subject: Re: adding partitioned tables to publications