RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers
From: k.jamison@fujitsu.com
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Msg-id: OSBPR01MB2341EB559E240818E2BFA79FEF200@OSBPR01MB2341.jpnprd01.prod.outlook.com
In response to: Re: [Patch] Optimize dropping of relation buffers using dlist (Amit Kapila <amit.kapila16@gmail.com>)
Responses: RE: [Patch] Optimize dropping of relation buffers using dlist
List: pgsql-hackers
Hello,

On Tuesday, September 8, 2020 1:02 PM, Amit Kapila wrote:

> On Mon, Sep 7, 2020 at 1:33 PM k.jamison@fujitsu.com
> <k.jamison@fujitsu.com> wrote:
> >
> > On Wednesday, September 2, 2020 5:49 PM, Amit Kapila wrote:
> > > On Wed, Sep 2, 2020 at 9:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > > Even if the relation is locked, background processes like
> > > > > checkpointer can still touch the relation which might cause
> > > > > problems. Consider a case where we extend the relation but
> > > > > didn't flush the newly added pages. Now during truncate
> > > > > operation, checkpointer can still flush those pages which can
> > > > > cause trouble for truncate. But, I think in the recovery path
> > > > > such cases won't cause a problem.
> > > >
> > > > I wouldn't count on that staying true ...
> > > >
> > > > https://www.postgresql.org/message-id/CA+hUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q@mail.gmail.com
> > >
> > > I don't think that proposal will matter after commit c5315f4f44
> > > because we are caching the size/blocks for recovery while doing
> > > extend (smgrextend). In the above scenario, we would have cached
> > > the blocks which will be used at a later point of time.
> >
> > I'm guessing we can still pursue this idea of improving the recovery
> > path first.
>
> I think so.

Alright, I've updated the patch so that it passes the regression and TAP tests. It compiles and builds as intended.

> > I'm working on an updated patch version, because the CFBot's telling
> > me that postgres fails to build (one of the recovery TAP tests
> > fails). I'm still working on refactoring my patch, but have yet to
> > find a proper solution at the moment, so I'm going to continue my
> > investigation.
> >
> > Attached is an updated WIP patch.
> > I'd appreciate it if you could take a look at the patch as well.
>
> So, I see the below log as one of the problems:
> 2020-09-07 06:20:33.918 UTC [10914] LOG: redo starts at 0/15FFEC0
> 2020-09-07 06:20:33.919 UTC [10914] FATAL: unexpected data beyond EOF
> in block 1 of relation base/13743/24581
>
> This indicates that we missed invalidating some buffer which should
> have been invalidated. If you are able to reproduce this locally then
> I suggest first writing a simple patch without the check of the
> threshold, basically in recovery always try to use the new way to
> invalidate the buffer. That will reduce the scope of the code that can
> create a problem. Let us know if the problem still exists and share
> the logs. BTW, I think I see one problem in the code:
>
> if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
> + bufHdr->tag.forkNum == forkNum[j] &&
> + bufHdr->tag.blockNum >= firstDelBlock[j])
>
> Here, I think you need to use 'i' not 'j' for forkNum and
> firstDelBlock as those are arrays w.r.t forks. That might fix the
> problem but I am not sure as I haven't tried to reproduce it.

Thanks for the advice. Right, that was the cause of the error, and using the fork index there fixed the case. I also followed Tsunakawa-san's advice and replaced the iterators "i" and "j" with more meaningful names for readability. In addition, I added a new function, DropRelFileNodeBuffersOfFork(), which is called when a relation fork is bigger than the threshold, i.e. if (nblocks > BUF_DROP_FULLSCAN_THRESHOLD). Perhaps there's a better name for that function.
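To make the fix concrete, here is a simplified sketch of the per-fork scan with the corrected subscripts. This is illustrative only, not the actual patch hunk: buffer-header locking, pin checks, and error handling are elided, and the iterator names are the post-rename ones.

/*
 * Illustrative sketch only -- not the actual patch.  forkNum[] and
 * firstDelBlock[] are per-fork arrays, so they must be subscripted
 * with the fork iterator (formerly 'i'), not the buffer iterator
 * (formerly 'j').  Buffer-header locking is elided for brevity.
 */
for (int fork_idx = 0; fork_idx < nforks; fork_idx++)
{
    for (int buf_id = 0; buf_id < NBuffers; buf_id++)
    {
        BufferDesc *bufHdr = GetBufferDescriptor(buf_id);

        if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
            bufHdr->tag.forkNum == forkNum[fork_idx] &&
            bufHdr->tag.blockNum >= firstDelBlock[fork_idx])
            InvalidateBuffer(bufHdr);
    }
}

Note that running the scan once per fork like this is also why the buffer pool ends up being scanned nfork times, as mentioned below.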
However, as expected from the previous discussions, this is a bit slower than the standard buffer invalidation process, because the whole shared buffer pool is scanned nfork times. Currently, I set the threshold to (NBuffers / 500).

Feedback on the patch/testing is very much welcome.

Best regards,
Kirk Jamison
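P.S. To spell out the threshold check described above, the dispatch is roughly as follows. BUF_DROP_FULLSCAN_THRESHOLD and DropRelFileNodeBuffersOfFork are as in the patch; smgr_reln and DropRelFileNodeBuffersByLookup are placeholder names used for this sketch only.

/* Rough sketch of the dispatch; not the actual patch code. */
#define BUF_DROP_FULLSCAN_THRESHOLD (NBuffers / 500)

for (int fork_idx = 0; fork_idx < nforks; fork_idx++)
{
    /* During recovery, smgrnblocks() can answer from the size cached
     * at smgrextend() time (commit c5315f4f44). */
    BlockNumber nblocks = smgrnblocks(smgr_reln, forkNum[fork_idx]);

    if (nblocks > BUF_DROP_FULLSCAN_THRESHOLD)
        /* Large fork: scan the whole buffer pool for this fork,
         * as sketched earlier. */
        DropRelFileNodeBuffersOfFork(rnode.node, forkNum[fork_idx],
                                     firstDelBlock[fork_idx]);
    else
        /* Small fork: invalidate the target blocks individually
         * instead of scanning shared_buffers (placeholder name
         * for the optimized path). */
        DropRelFileNodeBuffersByLookup(rnode.node, forkNum[fork_idx],
                                       firstDelBlock[fork_idx], nblocks);
}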
Attachment