Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id 20201009.111201.608064254389053226.horikyota.ntt@gmail.com
Whole thread Raw
In response to RE: [Patch] Optimize dropping of relation buffers using dlist  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses Re: [Patch] Optimize dropping of relation buffers using dlist
RE: [Patch] Optimize dropping of relation buffers using dlist
List pgsql-hackers
At Fri, 9 Oct 2020 00:41:24 +0000, "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> wrote in 
> From: Jamison, Kirk/ジャミソン カーク <k.jamison@fujitsu.com>
> > > (6)
> > > +                    bufHdr->tag.blockNum >=
> > > firstDelBlock[j])
> > > +                    InvalidateBuffer(bufHdr);    /*
> > > releases spinlock */
> > >
> > > The right side of >= should be cur_block.
> > 
> > Fixed.
> 
> >= should be =, shouldn't it?
> 
> Please measure and let us see just the recovery performance again because the critical part of the patch was
modified. If the performance is good as the previous one, and there's no review interaction with others in progress,
I'llmark the patch as ready for committer in a few days.
 

The performance is expected to be kept since smgrnblocks() is called
in a non-hot code path and actually it is called at most four times
per a buffer drop in this patch. But it's better making it sure.

I have some comments on the latest patch.

@@ -445,6 +445,7 @@ BlockNumber
 visibilitymap_prepare_truncate(Relation rel, BlockNumber nheapblocks)
 {
     BlockNumber newnblocks;
+    bool    cached;

All the added variables added by 0002 is useless because all the
caller sites are not interested in the value. smgrnblocks should
accept NULL as isCached. (I'm agree with Tsunakawa-san that the
camel-case name is not common there.)

+        nForkBlocks[i] = smgrnblocks(smgr_reln, forkNum[i], &isCached);
+
+        if (!isCached)

"is cached" is not the property that code is interested in. No other callers to smgrnblocks are interested in that
property.The need for caching is purely internal of smgrnblocks().
 

On the other hand, we are going to utilize the property of "accuracy"
that is a biproduct of reducing fseek calls, and, again, not
interested in how it is achieved.

So I suggest that the name should be "accurite" or something that is
not suggest the mechanism used under the hood.

+    if (nTotalBlocks != InvalidBlockNumber &&
+        nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)

I don't think nTotalBlocks is useful. What we need here is only total
blocks for every forks (nForkBlocks[]) and the total number of buffers
to be invalidated for all forks (nBlocksToInvalidate).


> > > The right side of >= should be cur_block.
> > 
> > Fixed.
> 
> >= should be =, shouldn't it?

It's just from a paranoia. What we are going to invalidate is blocks
blockNum of which >= curBlock. Although actually there's no chance of
any other processes having replaced the buffer with another page (with
lower blockid) of the same relation after BugTableLookup(), that
condition makes it sure not to leave blocks to be invalidated left
alone.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Assertion failure with LEFT JOINs among >500 relations
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist