Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [Patch] Optimize dropping of relation buffers using dlist |
Date | |
Msg-id | CAA4eK1J5Zxnm7+dCJqqZy+z+SpV2HexAdfgkOJoFa9icj9R6Cg@mail.gmail.com Whole thread Raw |
In response to | RE: [Patch] Optimize dropping of relation buffers using dlist ("k.jamison@fujitsu.com" <k.jamison@fujitsu.com>) |
Responses |
Re: [Patch] Optimize dropping of relation buffers using dlist
|
List | pgsql-hackers |
On Mon, Dec 7, 2020 at 12:32 PM k.jamison@fujitsu.com <k.jamison@fujitsu.com> wrote: > > On Friday, December 4, 2020 8:27 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 27, 2020 at 11:36 AM Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote: > > > > > > At Fri, 27 Nov 2020 02:19:57 +0000, "k.jamison@fujitsu.com" > > > <k.jamison@fujitsu.com> wrote in > > > > > From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Hello, Kirk. > > > > > Thank you for the new version. > > > > > > > > Hi, Horiguchi-san. Thank you for your very helpful feedback. > > > > I'm updating the patches addressing those. > > > > > > > > > + if (!smgrexists(rels[i], j)) > > > > > + continue; > > > > > + > > > > > + /* Get the number of blocks for a relation's fork */ > > > > > + blocks[i][numForks] = smgrnblocks(rels[i], j, > > > > > NULL); > > > > > > > > > > If we see a fork which its size is not cached we must give up this > > > > > optimization for all target relations. > > > > > > > > I did not use the "cached" flag in DropRelFileNodesAllBuffers and > > > > use InRecovery when deciding for optimization because of the following > > reasons: > > > > XLogReadBufferExtended() calls smgrnblocks() to apply changes to > > > > relation page contents. So in DropRelFileNodeBuffers(), > > > > XLogReadBufferExtended() is called during VACUUM replay because > > VACUUM changes the page content. > > > > OTOH, TRUNCATE doesn't change the relation content, it just > > > > truncates relation pages without changing the page contents. So > > > > XLogReadBufferExtended() is not called, and the "cached" flag will > > > > always return false. I tested with "cached" flags before, and this > > > > > > A bit different from the point, but if some tuples have been inserted > > > to the truncated table, XLogReadBufferExtended() is called for the > > > table and the length is cached. > > > > > > > always return false, at least in DropRelFileNodesAllBuffers. Due to > > > > this, we cannot use the cached flag in DropRelFileNodesAllBuffers(). > > > > However, I think we can still rely on smgrnblocks to get the file > > > > size as long as we're InRecovery. That cached nblocks is still guaranteed > > to be the maximum in the shared buffer. > > > > Thoughts? > > > > > > That means that we always think as if smgrnblocks returns "cached" (or > > > "safe") value during recovery, which is out of our current consensus. > > > If we go on that side, we don't need to consult the "cached" returned > > > from smgrnblocks at all and it's enough to see only InRecovery. > > > > > > I got confused.. > > > > > > We are relying on the "fact" that the first lseek() call of a > > > (startup) process tells the truth. We added an assertion so that we > > > make sure that the cached value won't be cleared during recovery. A > > > possible remaining danger would be closing of an smgr object of a live > > > relation just after a file extension failure. I think we are thinking > > > that that doesn't happen during recovery. Although it seems to me > > > true, I'm not confident. > > > > > > > Yeah, I also think it might not be worth depending upon whether smgr close > > has been done before or not. I feel the current idea of using 'cached' > > parameter is relatively solid and we should rely on that. > > Also, which means that in DropRelFileNodesAllBuffers() we should rely on > > the same, I think doing things differently in this regard will lead to confusion. I > > agree in some cases we might not get benefits but it is more important to be > > correct and keep the code consistent to avoid introducing bugs now or in the > > future. > > > Hi, > I have reported before that it is not always the case that the "cached" flag of > srnblocks() return true. So when I checked the truncate test case used in my > patch, it does not enter the optimization path despite doing INSERT before > truncation of table. > The reason for that is because in TRUNCATE, a new RelFileNode is assigned > to the relation when creating a new file. In recovery, XLogReadBufferExtended() > always opens the RelFileNode and calls smgrnblocks() for that RelFileNode for the > first time. And for recovery processing, different RelFileNodes are used for the > INSERTs to the table and TRUNCATE to the same table. > Hmm, how is it possible if Insert is done before Truncate? The insert should happen in old RelFileNode only. I have verified by adding a break-in (while (1), so that it stops there) heap_xlog_insert and DropRelFileNodesAllBuffers(), and both get the same (old) RelFileNode. How have you verified what you are saying? -- With Regards, Amit Kapila.
pgsql-hackers by date: