Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id CAA4eK1KPhyewX5OsYBYKxvuGryhuqjGeA8BAA-KVpRrJ706EWw@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] Optimize dropping of relation buffers using dlist  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Nov 18, 2020 at 11:43 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-11-18 17:34:31 +0530, Amit Kapila wrote:
> > Yeah, that won't be a bad idea especially because the patch being
> > discussed in the thread you referred is still in an exploratory phase.
> > I haven't tested or done a detailed review but I feel there shouldn't
> > be many problems if we agree on the approach.
> >
> > Thomas/others, do you have objections to proceeding here? It shouldn't
> > be a big problem to change the code in this area even if we get the
> > shared relation size stuff in.
>
> I'm doubtful the patches as is are a good idea / address the correctness
> concerns to a sufficient degree.
>
> One important part of that is that the patch includes pretty much zero
> explanations about why it is safe what it is doing. Something having
> being discussed deep in this thread won't help us in a few months, not
> to speak of a few years.
>
>
> The commit message says:
> > While recovery, we can get a reliable cached value of nblocks for
> > supplied relation's fork, and it's safe because there are no other
> > processes but the startup process that changes the relation size
> > during recovery.
>
> and the code only applies the optimized scan only when cached:
> +       /*
> +        * Look up the buffers in the hashtable and drop them if the block size
> +        * is already cached and the total blocks to be invalidated is below the
> +        * full scan threshold.  Otherwise, give up the optimization.
> +        */
> +       if (cached && nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
>
>
> This seems quite narrow to me. There's plenty cases where there's no
> cached relation size in the startup process, restricting the
> availability of this optimization as written. Where do we even use
> DropRelFileNodeBuffers() in recovery?
>

This will be used in the recovery of truncate done by vacuum (via
replay of XLOG_SMGR_TRUNCATE->smgrtruncate->DropRelFileNodeBuffers).
And Kirk-San has done some testing [1][2] to show the performance
benefits of the same.

> The most common path is
> DropRelationFiles()->smgrdounlinkall()->DropRelFileNodesAllBuffers(),
> which 3/4 doesn't address and 4/4 doesn't mention.
>
> 4/4 seems to address DropRelationFiles(), but only talks about TRUNCATE?
>
> I'm also worried about the cases where this could cause buffers left in
> the buffer pool, without a crosscheck like Thomas' patch would allow to
> add. Obviously other processes can dirty buffers in hot_standby, so any
> leftover buffer could have bad consequences.
>

The problem can only arise if other processes extend the relation. The
idea was that in recovery it extends relation by one process which
helps to maintain the cache. Kirk seems to have done testing to
cross-verify it by using his first patch
(Prevent-invalidating-blocks-in-smgrextend-during). Which other
crosscheck you are referring here?

I agree that we can do a better job by expanding comments to clearly
state why it is safe.

[1] -
https://www.postgresql.org/message-id/OSBPR01MB23413F14ED6B2D0D007698F4EFED0%40OSBPR01MB2341.jpnprd01.prod.outlook.com
[2] -
https://www.postgresql.org/message-id/OSBPR01MB234176B1829AECFE9FDDFCC2EFE90%40OSBPR01MB2341.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Should we document IS [NOT] OF?
Next
From: Ajin Cherian
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions