RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From k.jamison@fujitsu.com
Subject RE: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id OSBPR01MB2341D82B99B68BFB28C178C2EF0C0@OSBPR01MB2341.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [Patch] Optimize dropping of relation buffers using dlist  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [Patch] Optimize dropping of relation buffers using dlist
List pgsql-hackers
On Monday, October 5, 2020 3:30 PM, Amit Kapila wrote:

> + for (i = 0; i < nforks; i++)
> + {
> + /* Get the total nblocks for a relation's fork */ nForkBlocks =
> + smgrcachednblocks(smgr_reln, forkNum[i]);
> +
> + if (nForkBlocks == InvalidBlockNumber) { nTotalBlocks =
> + InvalidBlockNumber; break; } nTotalBlocks += nForkBlocks;
> + nBlocksToInvalidate = nTotalBlocks - firstDelBlock[i]; }
> +
> + /*
> + * Do explicit hashtable probe if the total of nblocks of relation's
> + forks
> + * is not invalid and the nblocks to be invalidated is less than the
> + * full-scan threshold of buffer pool.  Otherwise, full scan is executed.
> + */
> + if (nTotalBlocks != InvalidBlockNumber && nBlocksToInvalidate <
> + BUF_DROP_FULL_SCAN_THRESHOLD) { for (j = 0; j < nforks; j++) {
> + BlockNumber curBlock;
> +
> + nForkBlocks = smgrcachednblocks(smgr_reln, forkNum[j]);
> +
> + for (curBlock = firstDelBlock[j]; curBlock < nForkBlocks; curBlock++)
> 
> What if one or more of the forks doesn't have cached value? I think the patch
> will skip such forks and will invalidate/unpin buffers for others. 

Not having a cached value is equivalent to InvalidBlockNumber, right?
Maybe I'm missing something? But in the first loop we are already doing the
pre-check of whether or not one of the forks doesn't have cached value.
If it's not cached, then the nTotalBlocks is set to InvalidBlockNumber so we
won't need to enter the optimization loop and just execute the full scan buffer
invalidation process.

> You probably
> need a local array of nForkBlocks which will be formed first time and then
> used in the second loop. You also in some way need to handle the case where
> that local array doesn't have cached blocks.

Understood. that would be cleaner.
    BlockNumber    nForkBlocks[MAX_FORKNUM];

As for handling whether the local array is empty, I think the first loop would cover it,
and there's no need to pre-check if the array is empty again in the second loop.
for (i = 0; i < nforks; i++)
{
    nForkBlocks[i] = smgrcachednblocks(smgr_reln, forkNum[i]);

    if (nForkBlocks[i] == InvalidBlockNumber)
    {
        nTotalBlocks = InvalidBlockNumber;
        break;
    }
    nTotalBlocks += nForkBlocks[i];
    nBlocksToInvalidate = nTotalBlocks - firstDelBlock[i];
}

> 2. Also, the other thing is I have asked for some testing to avoid the small
> regression we have for a smaller number of shared buffers which I don't see
> the results nor any change in the code. I think it is better if you post the
> pending/open items each time you post a new version of the patch.

Ah. Apologies for forgetting to include updates about that, but since I keep on updating
the patch I've decided not to post results yet as performance may vary per patch-update
due to possible bugs.
But for the performance case of not using recovery check, I just removed it from below.
Does it meet the intention?

BlockNumber smgrcachednblocks(SMgrRelation reln, ForkNumber forknum) {
-       if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
+       if (reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
                return reln->smgr_cached_nblocks[forknum];

Regards,
Kirk Jamison

pgsql-hackers by date:

Previous
From: Yugo NAGATA
Date:
Subject: Re: Implementing Incremental View Maintenance
Next
From: John Naylor
Date:
Subject: Re: small cleanup: unify scanstr() functions