Re: Hot standby queries see transient all-zeros pages - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Hot standby queries see transient all-zeros pages
Date
Msg-id 20241214032703.b5.nmisch@google.com
Whole thread Raw
In response to Re: Hot standby queries see transient all-zeros pages  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Fri, Dec 13, 2024 at 09:34:21PM -0500, Andres Freund wrote:
> On 2024-12-13 16:38:05 -0800, Noah Misch wrote:
> > On Fri, Dec 13, 2024 at 05:41:15PM -0500, Andres Freund wrote:
> > > Hm. Leaving RBM_ZERO_AND_LOCK aside, is it actually always safe to do
> > > RestoreBlockImage() into a buffer that currently is pinned? Not sure if
> > > there's actually all that much guarantee what transient state one can read
> > > when reading a page concurrently to a memcpy(). I suspect it's practically
> > > rare to see a problem, but one could imagine an memcpy implementation that
> > > uses non-temporal writes, which afaict would leave you open to seeing quite
> > > random states when reading concurrently, as the cache coherence protocol
> > > doesn't protect anymore.
> > 
> > I wondered about that, too.  I didn't dig too deep.
> > https://pubs.opengroup.org/onlinepubs/9799919799/functions/memcpy.html and
> > https://en.cppreference.com/w/cpp/string/byte/memcpy were both silent about
> > the topic.
> 
> Hm. Perhaps it'd be worth having a small stress test in the tests that'd make
> problems like this more apparent. Even if it's not a problem current libc's,
> who knows what happens down the line.

Agreed.

> > > On 2024-05-12 10:16:58 -0700, Noah Misch wrote:
> > > > I suspect the fix is to add a ReadBufferMode specified as, "If the block is
> > > > already in shared_buffers, do RBM_NORMAL and exclusive-lock the buffer.
> > > > Otherwise, do RBM_ZERO_AND_LOCK."
> > > 
> > > I think that should work. At least in the current code it looks near trivial
> > > to implement, although the branch differences are going to be annoying.
> > > 
> > > As usual the hardest part would probably be the naming. Maybe
> > > RBM_ZERO_ON_MISS_LOCK? RBM_LOCK_ZERO_ON_MISS? RBM_DWIM?
> > 
> > It turned out RBM_ZERO_AND_LOCK long worked that way, and postgr.es/c/e656657
> > just had to restore that longstanding behavior.  The existing comment "Don't
> > read from disk, caller will initialize." does allude to this (but I didn't
> > originally catch the subtle point).
> > 
> > If RBM_ZERO_AND_LOCK hadn't existed so long, I'd rename it.  Perhaps it
> > deserves a rename anyway?  Of those, I'd pick RBM_ZERO_ON_MISS_LOCK.  I also
> > considered RBM_RECENT_OR_ZERO, borrowing a term from ReadRecentBuffer().
> 
> At least we could make the documentation for the mode in the enum clearer...

Agreed.  Maybe "No I/O; get existing buffer, else zero-fill; caller
initializes."



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Fix early elog(FATAL)
Next
From: Nathan Bossart
Date:
Subject: Re: Fix early elog(FATAL)