Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] WAL logging problem in 9.4.3?
Date
Msg-id CAB7nPqTRyica1d-zU+YckveFC876=Sc847etmk7TRgAS2pA9CA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] WAL logging problem in 9.4.3?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] WAL logging problem in 9.4.3?
List pgsql-hackers
On Tue, Apr 11, 2017 at 5:38 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Sorry, what I have just sent was broken.

You can use PROVE_TESTS when running make check to select a subset of
tests you want to run. I use that all the time when working on patches
dedicated to certain code paths.

>> - Relation has new members no_pending_sync and pending_sync that
>>   works as instant cache of an entry in pendingSync hash.
>> - Commit-time synchronizing is restored as Michael's patch.
>> - If relfilenode is replaced, pending_sync for the old node is
>>   removed. Anyway this is ignored on abort and meaningless on
>>   commit.
>> - TAP test is renamed to 012 since some new files have been added.
>>
>> Accessing pending sync hash occurred on every calling of
>> HeapNeedsWAL() (per insertion/update/freeze of a tuple) if any of
>> accessing relations has pending sync.  Almost of them are
>> eliminated as the result.

Did you actually test this patch? One of the logs added makes the
tests a long time to run:
2017-04-13 12:11:27.065 JST [85441] t/102_vacuumdb_stages.pl
STATEMENT:  ANALYZE;
2017-04-13 12:12:25.766 JST [85492] LOG:  BufferNeedsWAL: pendingSyncs
= 0x0, no_pending_sync = 0

-       lsn = XLogInsert(RM_SMGR_ID,
-                        XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
+           rel->no_pending_sync= false;
+           rel->pending_sync = pending;
+       }
It seems to me that those flags and the pending_sync data should be
kept in the context of backend process and not be part of the Relation
data...

+void
+RecordPendingSync(Relation rel)
I don't think that I agree that this should be part of relcache.c. The
syncs are tracked should be tracked out of the relation context.

Seeing how invasive this change is, I would also advocate for this
patch as only being a HEAD-only change, not many people are
complaining about this optimization of TRUNCATE missing when wal_level
= minimal, and this needs a very careful review.

Should I code something? Or Horiguchi-san, would you take care of it?
The previous crash I saw has been taken care of, but it's been really
some time since I looked at this patch...
-- 
Michael



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...
Next
From: Noah Misch
Date:
Subject: Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?