Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [HACKERS] WAL logging problem in 9.4.3?
Date
Msg-id 20200316034647.GA1121601@rfd.leadboat.com
Whole thread Raw
In response to Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: [HACKERS] WAL logging problem in 9.4.3?  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Wed, Mar 04, 2020 at 04:29:19PM +0900, Kyotaro Horiguchi wrote:
> The attached is back-patches from 9.5 through master.

Thanks.  I've made some edits.  I'll plan to push the attached patches on
Friday or Saturday.

> log_newpage_range has been introduced at PG12.  Fortunately the
> required infrastructure is introduced at PG9.5 so what I need to do
> for PG95-PG11 is back-patching the function and its counter part in
> xlog_redo. It doen't WAL format itself but XLOG_FPI gets to have 2 or
> more backup pages so the compatibility is forward only. That is, newer
> minor versions read WAL from older minor versions, but not vise
> versea.  I'm not sure it is back-patchable so in the attached the
> end-of-xact WAL feature is separated for PG9.5-PG11.
> (000x-Add-end-of-xact-WAL-feature-of-WAL-skipping.patch)

The main patch's introduction of XLOG_GIST_ASSIGN_LSN already creates a WAL
upgrade hazard.  Changing XLOG_FPI is riskier, because an old server will
apply the first FPI and ignore the rest.  For v11 and earlier, I decided to
introduce XLOG_FPI_MULTI.  It behaves exactly like XLOG_FPI, but this PANICs
if one reads post-update WAL with a pre-update server.  The main alternative
would be to issue one XLOG_FPI per page, but I was concerned that would cause
a notable performance loss.

> In the patchset for 12, I let the functions heap_sync,
> heapam_methods.finish_bulk_insert and table_finish_bulk_insert left
> as-is.  As the result heapam_finish_bulk_insert becomes no-op.

heapam_finish_bulk_insert() is a static function, so I deleted it.

> begin_heap_rewrite is a public function but the last parameter is
> useless and rather harmful as it looks as if it works. So I removed
> the parameter.

Agreed.  Also, pgxn contains no references to begin_heap_rewrite().

> For 9.6, mdexists() creates the specified file while bootstrap mode
> and that leads to assertion failure of smgrDoPendingSyncs. So I made
> CreateStorage not register pending sync while bootstrap mode.
> gistbuild generates the LSN for root page of a newly created index
> using gistGetFakeLSN(heap), which fires assertion failure in
> gistGetFakeLSN.  I think we should use index instead of heap there,
> but it doesn't matter if we don't have the new pending sync mechanism,
> so I didn't split it as a separate patch.

v11 and v10, too, had the gistGetFakeLSN(heap) problem.  I saw that and other
problems by running the following on each branch:

  make check-world
  printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
  make check-world TEMP_CONFIG=/tmp/minimal.conf
  make -C doc  # catch breakage when XML changes don't work in SGML

> For 9.5, pg_visibility does not exist so I dropped the test for the
> module.

The test would have required further changes to work in v11 or earlier, so I
deleted the test.  It was a low-importance test.

> It lacks a part of TAP infrastructure nowadays we have, but I
> want to have the test (and it actually found a bug I made during this
> work). So I added a patch to back-patch TestLib.pm, PostgresNode.pm
> and RecursiveCopy.pm along with 018_wal_optimize.pl.
> (0004-Add-TAP-test-for-WAL-skipping-feature.patch)

That is a good idea.  Rather than make it specific to this test, I would like
to back-patch all applicable test files from 9.6 src/test/recovery.  I'll plan
to push that one part on Thursday.


Other notable changes:

- Like you suggested earlier, I moved restoration of old*Subid from
  index_create() back to ATExecAddIndex().  I decided to do this when I
  observed that pg_idx_advisor calls index_create().  That's not a strong
  reason, but it was enough to change a decision that had been arbitrary.

- Updated the wal_skip_threshold GUC category to WAL_SETTINGS, for consistency
  with the documentation move.  Added the GUC to postgresql.conf.sample.

- In released branches, I moved the new public struct fields to the end.  This
  reduces the number of extensions requiring a recompile.  From a grep of
  pgxn, one extension ("citus") relies on sizeof(RelationData), and nothing
  relies on sizeof(IndexStmt).

- In 9.6, I rewrote the mdimmedsync() changes so the function never ignores
  FileSync() failure.

Other observations:

- The new test file takes ~62s longer on 9.6 and 9.5, mostly due to commit
  c61559ec3 first appearing in v10.  I am fine with this.

- This is the most-demanding back-branch fix I've ever attempted.  Hopefully
  I've been smarter than usual while reviewing it, but that is unlikely.

Thanks,
nm

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Online checksums verification in the backend
Next
From: Masahiko Sawada
Date:
Subject: Re: Berserk Autovacuum (let's save next Mandrill)