Home > mailing lists

Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From	Kyotaro Horiguchi
Subject	Re: [Patch] Optimize dropping of relation buffers using dlist
Date	November 27, 2020 09:06:39
Msg-id	20201127.150639.586140002202092815.horikyota.ntt@gmail.com Whole thread Raw
In response to	RE: [Patch] Optimize dropping of relation buffers using dlist ("k.jamison@fujitsu.com" <k.jamison@fujitsu.com>)
Responses	Re: [Patch] Optimize dropping of relation buffers using dlist RE: [Patch] Optimize dropping of relation buffers using dlist Re: [Patch] Optimize dropping of relation buffers using dlist
List	pgsql-hackers

Tree view

At Fri, 27 Nov 2020 02:19:57 +0000, "k.jamison@fujitsu.com" <k.jamison@fujitsu.com> wrote in 
> > From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> > Hello, Kirk. Thank you for the new version.
> 
> Hi, Horiguchi-san. Thank you for your very helpful feedback.
> I'm updating the patches addressing those.
> 
> > +            if (!smgrexists(rels[i], j))
> > +                continue;
> > +
> > +            /* Get the number of blocks for a relation's fork */
> > +            blocks[i][numForks] = smgrnblocks(rels[i], j,
> > NULL);
> > 
> > If we see a fork which its size is not cached we must give up this optimization
> > for all target relations.
> 
> I did not use the "cached" flag in DropRelFileNodesAllBuffers and use InRecovery
> when deciding for optimization because of the following reasons:
> XLogReadBufferExtended() calls smgrnblocks() to apply changes to relation page
> contents. So in DropRelFileNodeBuffers(), XLogReadBufferExtended() is called
> during VACUUM replay because VACUUM changes the page content.
> OTOH, TRUNCATE doesn't change the relation content, it just truncates relation pages
> without changing the page contents. So XLogReadBufferExtended() is not called, and
> the "cached" flag will always return false. I tested with "cached" flags before, and this

A bit different from the point, but if some tuples have been inserted
to the truncated table, XLogReadBufferExtended() is called for the
table and the length is cached.

> always return false, at least in DropRelFileNodesAllBuffers. Due to this, we cannot use
> the cached flag in DropRelFileNodesAllBuffers(). However, I think we can still rely on
> smgrnblocks to get the file size as long as we're InRecovery. That cached nblocks is still
> guaranteed to be the maximum in the shared buffer.
> Thoughts?

That means that we always think as if smgrnblocks returns "cached" (or
"safe") value during recovery, which is out of our current
consensus. If we go on that side, we don't need to consult the
"cached" returned from smgrnblocks at all and it's enough to see only
InRecovery.

I got confused..

We are relying on the "fact" that the first lseek() call of a
(startup) process tells the truth.  We added an assertion so that we
make sure that the cached value won't be cleared during recovery.  A
possible remaining danger would be closing of an smgr object of a live
relation just after a file extension failure. I think we are thinking
that that doesn't happen during recovery.  Although it seems to me
true, I'm not confident.

If that's true, we don't even need to look at the "cached" flag at all
and always be able to rely on the returned value from msgrnblocks()
during recovery.  Otherwise, we need to avoid the danger situation.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

From: "tsunakawa.takay@fujitsu.com"
Date: 27 November 2020, 09:05:55
Subject: RE: POC: postgres_fdw insert batching

From: Masahiko Sawada
Date: 27 November 2020, 09:07:34
Subject: Re: Disable WAL logging to speed up data loading

Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

Previous

Next