Re: [PATCH] Speedup truncates of relation forks - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [PATCH] Speedup truncates of relation forks
Date
Msg-id CAD21AoBr-E5=iAgyWSasPKED-y_6fqQc9wmPF43yWSBJwXpHMA@mail.gmail.com
Whole thread Raw
In response to RE: [PATCH] Speedup truncates of relation forks  ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
Responses RE: [PATCH] Speedup truncates of relation forks  ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
List pgsql-hackers
On Mon, Jun 17, 2019 at 5:01 PM Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote:
>
> Hi all,
>
> Attached is the v2 of the patch. I added the optimization that Sawada-san
> suggested for DropRelFileNodeBuffers, although I did not acquire the lock
> when comparing the minBlock and target block.
>
> There's actually a comment written in the source code that we could
> pre-check buffer tag for forkNum and blockNum, but given that FSM and VM
> blocks are small compared to main fork's, the additional benefit of doing so
> would be small.
>
> >* We could check forkNum and blockNum as well as the rnode, but the
> >* incremental win from doing so seems small.
>
> I personally think it's alright not to include the suggested pre-checking.
> If that's the case, we can just follow the patch v1 version.
>
> Thoughts?
>
> Comments and reviews from other parts of the patch are also very much welcome.
>

Thank you for updating the patch. Here is the review comments for v2 patch.

---
- *     visibilitymap_truncate - truncate the visibility map
+ *     visibilitymap_mark_truncate - mark the about-to-be-truncated VM
+ *
+ * Formerly, this function truncates VM relation forks. Instead, this just
+ * marks the dirty buffers.
  *
  * The caller must hold AccessExclusiveLock on the relation, to ensure that
  * other backends receive the smgr invalidation event that this function sends
  * before they access the VM again.
  *

I don't think we should describe about the previous behavior here.
Rather we need to describe what visibilitymap_mark_truncate does and
what it returns to the caller.

I'm not sure that visibilitymap_mark_truncate function name is
appropriate here since it actually truncate map bits, not only
marking. Perhaps we can still use visibilitymap_truncate or change to
visibilitymap_truncate_prepare, or something? Anyway, this function
truncates only tail bits in the last remaining map page and we can
have a rule that the caller must call smgrtruncate() later to actually
truncate pages.

The comment of second paragraph is now out of date since this function
no longer sends smgr invalidation message.

Is it worth to leave the current visibilitymap_truncate function as a
shortcut function, instead of replacing? That way we don't need to
change pg_truncate_visibility_map function.

The same comments are true for MarkFreeSpaceMapTruncateRel.

---
+       ForkNumber      forks[MAX_FORKNUM];
+       BlockNumber     blocks[MAX_FORKNUM];
+       BlockNumber     new_nfsmblocks = InvalidBlockNumber;    /* FSM blocks */
+       BlockNumber     newnblocks = InvalidBlockNumber;        /* VM blocks */
+       int             nforks = 0;

I think that we can have new_nfsmblocks and new_nvmblocks and wipe out
the comments.

---
-       /* Truncate the FSM first if it exists */
+       /*
+        * We used to truncate FSM and VM forks here. Now we only mark the
+        * dirty buffers of all forks about-to-be-truncated if they exist.
+        */
+

Again, I think we need the description of current behavior rather than
the history, except the case where the history is important.

---
-               /*
-                * Make an XLOG entry reporting the file truncation.
-                */
+               /* Make an XLOG entry reporting the file truncation */

Unnecessary change.

---
+       /*
+        * We might as well update the local smgr_fsm_nblocks and
smgr_vm_nblocks
+        * setting. smgrtruncate sent an smgr cache inval message,
which will cause
+        * other backends to invalidate their copy of smgr_fsm_nblocks and
+        * smgr_vm_nblocks, and this one too at the next command
boundary. But this
+        * ensures it isn't outright wrong until then.
+        */
+       if (rel->rd_smgr)
+       {
+               rel->rd_smgr->smgr_fsm_nblocks = new_nfsmblocks;
+               rel->rd_smgr->smgr_vm_nblocks = newnblocks;
+       }

new_nfsmblocks and newnblocks could be InvalidBlockNumber when the
forks are already enough small. So the above code sets incorrect
values to smgr_{fsm,vm}_nblocks.

Also, I wonder if we can do the above code in smgrtruncate. Otherwise
we always need to set smgr_{fsm,vm}_nblocks after smgrtruncate, which
is inconvenient.

---
+               /* Update the local smgr_fsm_nblocks and
smgr_vm_nblocks setting */
+               if (rel->rd_smgr)
+               {
+                       rel->rd_smgr->smgr_fsm_nblocks = new_nfsmblocks;
+                       rel->rd_smgr->smgr_vm_nblocks = newnblocks;
+               }

The save as above. And we need to set smgr_{fsm,vm}_nblocks in spite
of freeing the fake relcache soon?

---
+       /* Get the lower bound of target block number we're interested in */
+       for (i = 0; i < nforks; i++)
+       {
+               if (!BlockNumberIsValid(minBlock) ||
+                       minBlock > firstDelBlock[i])
+                       minBlock = firstDelBlock[i];
+       }

Maybe we can declare 'i' in the for statement (i.e. for (int i = 0;
...)) at every outer loops in this functions. And in the inner loop we
can use 'j'.

---
-DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
-                                          BlockNumber firstDelBlock)
+DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
+                                          BlockNumber *firstDelBlock,
int nforks)

I think it's better to declare *forkNum and nforks side by side for
readability. That is, we can have it as follows.

DropRelFileNodeBuffers (RelFileNodeBackend rnode, ForkNumber *forkNum,
int nforks, BlockNumber *firstDelBlock)


---
-smgrdounlinkfork(SMgrRelation reln, ForkNumber forknum, bool isRedo)
+smgrdounlinkfork(SMgrRelation reln, ForkNumber *forknum, bool isRedo,
int nforks)

Same as above. The order of reln, *forknum, nforks, isRedo would be better.

---
@@ -383,6 +383,10 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
 {
        Oid                     relid = PG_GETARG_OID(0);
        Relation        rel;
+       ForkNumber      forks[MAX_FORKNUM];
+       BlockNumber     blocks[MAX_FORKNUM];
+       BlockNumber     newnblocks = InvalidBlockNumber;
+       int             nforks = 0;

Why do we need the array of forks and blocks here? I think it's enough
to have one fork and one block number.

---
The comment of smgrdounlinkfork function needs to be updated. We now
can remove multiple forks.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Refactoring base64 encoding and decoding into a safer interface
Next
From: Tomas Vondra
Date:
Subject: Re: Increasing default value for effective_io_concurrency?