RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From k.jamison@fujitsu.com
Subject RE: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id OSBPR01MB2341AB66B8B0982CB857963FEF2F0@OSBPR01MB2341.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [Patch] Optimize dropping of relation buffers using dlist  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Wednesday, September 2, 2020 10:31 AM, Kyotaro Horiguchi wrote:
> Hello.
>
> At Tue, 1 Sep 2020 13:02:28 +0000, "k.jamison@fujitsu.com"
> <k.jamison@fujitsu.com> wrote in
> > On Tuesday, August 18, 2020 3:05 PM (GMT+9), Amit Kapila wrote:
> > > Today, again thinking about this point it occurred to me that during
> > > recovery we can reliably find the relation size and after Thomas's
> > > recent commit
> > > c5315f4f44 (Cache smgrnblocks() results in recovery), we might not
> > > need to even incur the cost of lseek. Why don't we fix this first
> > > for 'recovery' (by following something on the lines of what Andres
> > > suggested) and then later once we have a generic mechanism for
> > > "caching the relation size" [1], we can do it for non-recovery paths.
> > > I think that will at least address the reported use case with some
> > > minimal changes.
> > >
> > > [1] -
> > >
> https://www.postgresql.org/message-id/CAEepm%3D3SSw-Ty1DFcK%3D1r
> > > U-K6GSzYzfdD4d%2BZwapdN7dTa6%3DnQ%40mail.gmail.com
>
> Isn't a relation always locked asscess-exclusively, at truncation time?  If so,
> isn't even the result of lseek reliable enough? And if we don't care the cost of
> lseek, we can do the same optimization also for non-recovery paths. Since
> anyway we perform actual file-truncation just after so I think the cost of lseek
> is negligible here.

The reason for that is when I read the comment in smgrnblocks in smgr.c
I thought that smgrnblocks can only be reliably used during recovery here
to ensure that we have the correct size.
Please correct me if my understanding is wrong, and I'll fix the patch.

     * For now, we only use cached values in recovery due to lack of a shared
     * invalidation mechanism for changes in file size.
     */
    if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
        return reln->smgr_cached_nblocks[forknum];

> > Attached is an updated V9 version with minimal code changes only and
> > avoids the previous overhead in the BufferAlloc. This time, I only
> > updated the recovery path as suggested by Amit, and followed Andres'
> > suggestion of referring to the cached blocks in smgrnblocks.
> > The layering is kinda tricky so the logic may be wrong. But as of now,
> > it passes the regression tests. I'll follow up with the performance results.
> > It seems there's regression for smaller shared_buffers. I'll update if I find
> bugs.
> > But I'd also appreciate your reviews in case I missed something.
>
> BUF_DROP_THRESHOLD seems to be misued. IIUC it defines the maximum
> number of file pages that we make relation-targetted search for buffers.
> Otherwise we scan through all buffers. On the other hand the latest patch just
> leaves all buffers for relation forks longer than the threshold.

Right, I missed the part or condition for that part. Fixed in the latest one.

> I think we should determine whether to do targetted-scan or full-scan based
> on the ratio of (expectedly maximum) total number of pages for all (specified)
> forks in a relation against total number of buffers.

> By the way
>
> > #define BUF_DROP_THRESHOLD        500    /* NBuffers divided
> by 2 */
>
> NBuffers is not a constant. Even if we wanted to set the macro as described
> in the comment, we should have used (NBuffers/2) instead of "500". But I
> suppose you might wanted to use (NBuffders / 500) as Tom suggested
> upthread.  And the name of the macro seems too generic. I think more
> specific names like BUF_DROP_FULLSCAN_THRESHOLD would be better.

Fixed.

Thank you for the review!
Attached is the v10 of the patch.

Best regards,
Kirk Jamison

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Next
From: Rahila Syed
Date:
Subject: Re: More tests with USING INDEX replident and dropped indexes