Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [HACKERS] WAL logging problem in 9.4.3?
Date
Msg-id 20190517065050.GA1298884@rfd.leadboat.com
Whole thread Raw
In response to Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Tue, May 14, 2019 at 01:59:10PM +0900, Kyotaro HORIGUCHI wrote:
> At Sun, 12 May 2019 17:37:05 -0700, Noah Misch <noah@leadboat.com> wrote in
<20190513003705.GA1202614@rfd.leadboat.com>
> > On Sun, Mar 31, 2019 at 03:31:58PM -0700, Noah Misch wrote:
> > > On Sun, Mar 10, 2019 at 07:27:08PM -0700, Noah Misch wrote:
> > > > I also liked the design in the https://postgr.es/m/559FA0BA.3080808@iki.fi
> > > > last paragraph, and I suspect it would have been no harder to back-patch.  I
> > > > wonder if it would have been simpler and better, but I'm not asking anyone to
> > > > investigate that.
> > > 
> > > Now I am asking for that.  Would anyone like to try implementing that other
> > > design, to see how much simpler it would be?
> 
> Yeah, I think it is a bit too-complex for the value. But I think
> it is the best way as far as we keep reusing a file on
> truncation of the whole file.

The design of v11-0006-Fix-WAL-skipping-feature.patch doesn't, in general,
work for WAL records touching more than one buffer.  For heapam, that patch
works around this problem by emitting XLOG_HEAP_INSERT or XLOG_HEAP_DELETE
when we'd normally emit XLOG_HEAP_UPDATE.  As a result, post-crash-recovery
heap page bits differ from the bits present when we don't crash.  Though I'm
85% confident this does not introduce a bug today, this is fragile.  That is
the main complexity I wish to avoid.

I suspect the design in the https://postgr.es/m/559FA0BA.3080808@iki.fi last
paragraph will be simpler, not more complex.  In the implementation I'm
envisioning, smgrDoPendingDeletes() would change name, perhaps to
AtEOXact_Storage().  For every relfilenode it does not delete, it would ensure
durability by syncing (for large nodes) or by WAL-logging each page (for small
nodes).  RelationNeedsWAL() would return false whenever the applicable
relfilenode appears in pendingDeletes.  Access methods would remove their
smgrimmedsync() calls, but they would otherwise not change.  Would anyone like
to try implementing that?



pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: shared-memory based stats collector
Next
From: Lætitia Avrot
Date:
Subject: Re: [Doc] pg_restore documentation didn't explain how to useconnection string