Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] WAL logging problem in 9.4.3?
Date
Msg-id 20170413.184219.106482305.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] WAL logging problem in 9.4.3?  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] WAL logging problem in 9.4.3?  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-hackers
At Thu, 13 Apr 2017 13:52:40 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqTRyica1d-zU+YckveFC876=Sc847etmk7TRgAS2pA9CA@mail.gmail.com>
> On Tue, Apr 11, 2017 at 5:38 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Sorry, what I have just sent was broken.
> 
> You can use PROVE_TESTS when running make check to select a subset of
> tests you want to run. I use that all the time when working on patches
> dedicated to certain code paths.

Thank you for the information. Removing unwanted test scripts
from t/ directories was annoyance. This makes me happy.

> >> - Relation has new members no_pending_sync and pending_sync that
> >>   works as instant cache of an entry in pendingSync hash.
> >> - Commit-time synchronizing is restored as Michael's patch.
> >> - If relfilenode is replaced, pending_sync for the old node is
> >>   removed. Anyway this is ignored on abort and meaningless on
> >>   commit.
> >> - TAP test is renamed to 012 since some new files have been added.
> >>
> >> Accessing pending sync hash occurred on every calling of
> >> HeapNeedsWAL() (per insertion/update/freeze of a tuple) if any of
> >> accessing relations has pending sync.  Almost of them are
> >> eliminated as the result.
> 
> Did you actually test this patch? One of the logs added makes the
> tests a long time to run:

Maybe this patch requires make clean since it extends the
structure RelationData. (Perhaps I saw the same trouble.)

> 2017-04-13 12:11:27.065 JST [85441] t/102_vacuumdb_stages.pl
> STATEMENT:  ANALYZE;
> 2017-04-13 12:12:25.766 JST [85492] LOG:  BufferNeedsWAL: pendingSyncs
> = 0x0, no_pending_sync = 0
> 
> -       lsn = XLogInsert(RM_SMGR_ID,
> -                        XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
> +           rel->no_pending_sync= false;
> +           rel->pending_sync = pending;
> +       }
> 
> It seems to me that those flags and the pending_sync data should be
> kept in the context of backend process and not be part of the Relation
> data...

I understand that the context of "backend process" means
storage.c local. I don't mind the context on which the data is,
but I found only there that can get rid of frequent hash
searching. For pending deletions, just appending to a list is
enough and costs almost nothing, on the other hand pendig syncs
are required to be referenced, sometimes very frequently.

> +void
> +RecordPendingSync(Relation rel)
> I don't think that I agree that this should be part of relcache.c. The
> syncs are tracked should be tracked out of the relation context.

Yeah.. It's in storage.c in the latest patch. (Sorry for the
duplicate name). I think it is a kind of bond between smgr and
relation.

> Seeing how invasive this change is, I would also advocate for this
> patch as only being a HEAD-only change, not many people are
> complaining about this optimization of TRUNCATE missing when wal_level
> = minimal, and this needs a very careful review.

Agreed.

> Should I code something? Or Horiguchi-san, would you take care of it?
> The previous crash I saw has been taken care of, but it's been really
> some time since I looked at this patch...

My point is hash-search on every tuple insertion should be evaded
even if it happens rearely. Once it was a bit apart from your
original patch, but in the latest patch the significant part
(pending-sync hash) is revived from the original one.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Minor improvement for inval.c header comment
Next
From: Magnus Hagander
Date:
Subject: Re: [HACKERS] Function to control physical replication slot