Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [HACKERS] WAL logging problem in 9.4.3?
Date
Msg-id 20191125030839.GA51906@gust.leadboat.com
Whole thread Raw
In response to Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Mon, Nov 25, 2019 at 11:08:54AM +0900, Kyotaro Horiguchi wrote:
> At Sat, 23 Nov 2019 16:21:36 -0500, Noah Misch <noah@leadboat.com> wrote in 
> > This benchmark procedure may help:
> > 
> > 1. Determine $DDL_COUNT, a number of DDL transactions that take about one
> >    minute when done via syncs.
> > 2. Start "pgbench -rP1 --progress-timestamp -T180 -c10 -j10".
> > 3. Wait 10s.
> > 4. Start one DDL backend that runs $DDL_COUNT transactions.
> > 5. Save DDL start timestamp, DDL end timestamp, and pgbench output.
> > 
> > I would compare pgbench tps and latency between the seconds when DDL is and is
> > not running.  As you did in earlier tests, I would repeat it using various
> > page counts, with and without sync.
> 
> I understood the "DDL" is not pure DDLs but a kind of
> define-then-load, like "CREATE TABLE AS" , "CREATE TABLE" then "COPY
> FROM".

When I wrote "DDL", I meant the four-command transaction that you already used
in benchmarks.

> > I noticed an additional defect:
> > 
> > BEGIN;
> > CREATE TABLE t (c) AS SELECT 1;
> > CHECKPOINT; -- write and fsync the table's one page
> > TRUNCATE t; -- no WAL
> > COMMIT; -- no FPI, just the commit record
> > 
> > If we crash after the COMMIT and before the next fsync or OS-elected sync of
> > the table's file, the table will stay on disk with its pre-TRUNCATE content.
> 
> The TRUNCATE replaces relfilenode in the catalog

No, it does not.  Since the relation is new in the transaction, the TRUNCATE
uses the heap_truncate_one_rel() strategy.

> Since the file has no pages, it's right that no FPI emitted.

Correct.

> If we don't rely on such
> behavior, we can make sure thhat by turning the zero-pages case from
> WAL into file sync. I'll do that in the next version.

The zero-pages case is not special.  Here's an example of the problem with a
nonzero size:

BEGIN;
CREATE TABLE t (c) AS SELECT * FROM generate_series(1,100000);
CHECKPOINT; -- write and fsync the table's many pages
TRUNCATE t; -- no WAL
INSERT INTO t VALUES (0); -- no WAL
COMMIT; -- FPI for one page; nothing removes the additional pages



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: Safeguards against incorrect fd flags for fsync()
Next
From: Michael Paquier
Date:
Subject: Re: Attempt to consolidate reading of XLOG page