RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From k.jamison@fujitsu.com
Subject RE: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id OSBPR01MB2341EF13BC3FE9879F42B9FFEF320@OSBPR01MB2341.jpnprd01.prod.outlook.com
In response to Re: [Patch] Optimize dropping of relation buffers using dlist  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses RE: [Patch] Optimize dropping of relation buffers using dlist  ("k.jamison@fujitsu.com" <k.jamison@fujitsu.com>)
List pgsql-hackers
On Tuesday, September 29, 2020 10:35 AM, Horiguchi-san wrote:

> FWIW, I (and maybe Amit) am thinking that the property we need here is not
> whether it is cached or not but the accuracy of the returned file length, and
> that the "cached" property should be hidden behind the API.
>
> Another reason for not adding this function is that the cached value is not
> really reliable in a non-recovery environment.
>
> > So in the new function, it goes something like:
> >     if (InRecovery)
> >     {
> >         if (reln->smgr_cached_nblocks[forknum] !=
> InvalidBlockNumber)
> >             return reln->smgr_cached_nblocks[forknum];
> >         else
> >             return InvalidBlockNumber;
> >     }
>
> If we add the new function, it should return InvalidBlockNumber without
> consulting smgr_nblocks().

So here's how I revised it:

BlockNumber
smgrcachednblocks(SMgrRelation reln, ForkNumber forknum)
{
    if (InRecovery)
    {
        if (reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
            return reln->smgr_cached_nblocks[forknum];
    }
    return InvalidBlockNumber;
}
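To make the intended semantics concrete, here is a minimal stand-alone sketch of the same behavior (the types and globals below are simplified stand-ins for the real smgr structures, not the patch itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's types; not the real definitions. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

static bool InRecovery;
/* Stands in for reln->smgr_cached_nblocks[forknum]. */
static BlockNumber cached_nblocks = InvalidBlockNumber;

/* The cached size is trusted only during recovery; in normal operation
 * the function unconditionally reports "unknown". */
static BlockNumber
cachednblocks_sketch(void)
{
    if (InRecovery && cached_nblocks != InvalidBlockNumber)
        return cached_nblocks;
    return InvalidBlockNumber;
}
```

The point of the shape is that a caller never has to know whether a cache exists; it only sees either an accurate block count or InvalidBlockNumber.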


> Hmm. The current loop in DropRelFileNodeBuffers looks like this:
>
>     if (InRecovery)
>        for (for each forks)
>           if (the fork meets the criteria)
>              <optimized dropping>
>           else
>              <full scan>
>
> I think this is somewhat different from the current discussion. Whether we
> sum up the number of blocks for all forks or just use that of the main fork, we
> should take a full scan if we fail to know the accurate size of any one of the
> forks. (In other words, it would be wasteful to run a full scan more than once
> for a single drop.)
>
> Come to think of it, we can naturally sum up all forks' blocks, since we need
> to call smgrnblocks for all forks anyway to know whether the optimization is
> usable.

I understand. We really don't have to enter the optimization when we know the
file size is inaccurate. That also makes the patch simpler.

> So that block would be something like this:
>
>     for (forks of the rel)
>       /* the function returns InvalidBlockNumber if !InRecovery */
>       if (smgrnblocks returned InvalidBlockNumber)
>          total_blocks = InvalidBlockNumber;
>          break;
>       total_blocks += nblocks of this fork
>
>     /* <we could rely on the fact that InvalidBlockNumber is zero> */
>     if (total_blocks != InvalidBlockNumber && total_blocks < threshold)
>           for (forks of the rel)
>           for (blocks of the fork)
>              <try dropping the buffer for the block>
>     else
>        <full scan dropping>

I followed this logic in the attached patch.
Thank you very much for the thoughtful reviews.
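For illustration, the decision logic in the pseudocode above can be sketched as a stand-alone C program. Note that MAX_FORKNUM's value, the threshold constant, and the function names here are simplified assumptions for the sketch, not the patch's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's types; not the real definitions. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)
#define MAX_FORKNUM 3
#define NBUFFERS_THRESHOLD 32   /* assumed threshold, not the patch's value */

static bool InRecovery = true;

/* Per-fork cached sizes; InvalidBlockNumber means "not cached". */
static BlockNumber cached_nblocks[MAX_FORKNUM + 1];

/* Mimics the proposed cached-size lookup: trust the cache only during
 * recovery, otherwise report "unknown". */
static BlockNumber
smgrcachednblocks_sketch(int forknum)
{
    if (InRecovery && cached_nblocks[forknum] != InvalidBlockNumber)
        return cached_nblocks[forknum];
    return InvalidBlockNumber;
}

/* Returns true when the optimized (per-block) dropping path is usable:
 * every fork's size must be known and the total must be under threshold.
 * One unknown fork forces the full-scan path for the whole drop. */
static bool
can_use_optimized_drop(BlockNumber *total_out)
{
    BlockNumber total = 0;

    for (int fork = 0; fork <= MAX_FORKNUM; fork++)
    {
        BlockNumber n = smgrcachednblocks_sketch(fork);

        if (n == InvalidBlockNumber)
        {
            *total_out = InvalidBlockNumber;
            return false;
        }
        total += n;
    }
    *total_out = total;
    return total < NBUFFERS_THRESHOLD;
}
```

This mirrors the key property discussed above: the full scan is taken at most once per drop, and the optimization is entered only when every fork's size is accurately known.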

Performance measurements for large shared buffers will follow.

Best regards,
Kirk Jamison


