
From: Amit Kapila
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date:
Msg-id: CAA4eK1KxUSrTYDFrB4KiCMMchH+Av4OqKwJC94y8M_1_tqLzhQ@mail.gmail.com
In response to: RE: [Patch] Optimize dropping of relation buffers using dlist ("k.jamison@fujitsu.com" <k.jamison@fujitsu.com>)
Responses: RE: [Patch] Optimize dropping of relation buffers using dlist ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
           RE: [Patch] Optimize dropping of relation buffers using dlist ("k.jamison@fujitsu.com" <k.jamison@fujitsu.com>)
List: pgsql-hackers
On Mon, Sep 7, 2020 at 1:33 PM k.jamison@fujitsu.com
<k.jamison@fujitsu.com> wrote:
>
> On Wednesday, September 2, 2020 5:49 PM, Amit Kapila wrote:
> > On Wed, Sep 2, 2020 at 9:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > Even if the relation is locked, background processes like
> > > > checkpointer can still touch the relation which might cause
> > > > problems. Consider a case where we extend the relation but didn't
> > > > flush the newly added pages. Now during truncate operation,
> > > > checkpointer can still flush those pages which can cause trouble for
> > > > truncate. But, I think in the recovery path such cases won't cause a
> > problem.
> > >
> > > I wouldn't count on that staying true ...
> > >
> > >
> > > https://www.postgresql.org/message-id/CA+hUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q@mail.gmail.com
> > >
> >
> > I don't think that proposal will matter after commit c5315f4f44 because we are
> > caching the size/blocks for recovery while doing extend (smgrextend). In the
> > above scenario, we would have cached the blocks, which will be used
> > at a later point in time.
> >
>
> I'm guessing we can still pursue this idea of improving the recovery path first.
>

I think so.
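
For context on the caching mentioned above, here is a rough sketch of
what commit c5315f4f44 does in smgrnblocks(); a paraphrase for
illustration, not the verbatim source:

BlockNumber
smgrnblocks(SMgrRelation reln, ForkNumber forknum)
{
    BlockNumber result;

    /*
     * In recovery, trust the size cached by a previous smgrextend() or
     * smgrnblocks() call and skip the lseek(SEEK_END) on the file.
     * Outside recovery the cache cannot be trusted, since there is no
     * shared invalidation mechanism for file-size changes.
     */
    if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
        return reln->smgr_cached_nblocks[forknum];

    result = smgrsw[reln->smgr_which].smgr_nblocks(reln, forknum);

    reln->smgr_cached_nblocks[forknum] = result;

    return result;
}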

> I'm working on an updated patch version, because the CFBot reports
> that the postgres build fails (one of the recovery TAP tests fails).
> I'm still refactoring my patch but have yet to find a proper solution,
> so I'm going to continue my investigation.
>
> Attached is an updated WIP patch.
> I'd appreciate if you could take a look at the patch as well.
>

So, I see the below log as one of the problems:
2020-09-07 06:20:33.918 UTC [10914] LOG:  redo starts at 0/15FFEC0
2020-09-07 06:20:33.919 UTC [10914] FATAL:  unexpected data beyond EOF in block 1 of relation base/13743/24581

This indicates that we missed invalidating a buffer that should have
been invalidated. If you are able to reproduce this locally, then I
suggest first writing a simple patch without the threshold check,
i.e., during recovery always try to use the new way to invalidate
buffers. That will narrow down the code that can be causing the
problem. Let us know whether the problem still exists and share the
logs. BTW, I think I see one problem in the code:

if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ bufHdr->tag.forkNum == forkNum[j] &&
+ bufHdr->tag.blockNum >= firstDelBlock[j])

Here, I think you need to use 'i', not 'j', for forkNum and
firstDelBlock, as those arrays are indexed per fork. That might fix
the problem, but I am not sure as I haven't tried to reproduce it.
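
To make that concrete, a minimal sketch of the corrected check, with
the surrounding loop guessed from the fragment above (variable names
hypothetical, following the patch excerpt):

/*
 * 'i' walks the per-fork arrays passed to the drop routine, so they
 * must be indexed with 'i'; 'j' belongs to the inner scan.
 */
for (i = 0; i < nforks; i++)
{
    /* ... locate a candidate buffer and take its header spinlock ... */

    if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
        bufHdr->tag.forkNum == forkNum[i] &&      /* was forkNum[j] */
        bufHdr->tag.blockNum >= firstDelBlock[i]) /* was firstDelBlock[j] */
        InvalidateBuffer(bufHdr);                 /* releases spinlock */
}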

-- 
With Regards,
Amit Kapila.


