RE: [PATCH] Speedup truncates of relation forks - Mailing list pgsql-hackers

From Jamison, Kirk
Subject RE: [PATCH] Speedup truncates of relation forks
Date
Msg-id D09B13F772D2274BB348A310EE3027C6509051@g01jpexmbkw24
Whole thread Raw
In response to Re: [PATCH] Speedup truncates of relation forks  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses RE: [PATCH] Speedup truncates of relation forks
List pgsql-hackers
On Tuesday, July 2, 2019 4:59 PM (GMT+9), Masahiko Sawada wrote:
> Thank you for updating the patch. Here is the review comments for v2 patch.

Thank you so much for review!
I indicated the addressed parts below and attached the updated patch.

---
visibilitymap.c: visibilitymap_truncate()

> I don't think we should describe about the previous behavior here.
> Rather we need to describe what visibilitymap_mark_truncate does and what
> it returns to the caller.
>
> I'm not sure that visibilitymap_mark_truncate function name is appropriate
> here since it actually truncate map bits, not only marking. Perhaps we can
> still use visibilitymap_truncate or change to
> visibilitymap_truncate_prepare, or something? Anyway, this function
> truncates only tail bits in the last remaining map page and we can have a
> rule that the caller must call smgrtruncate() later to actually truncate
> pages.
> 
> The comment of second paragraph is now out of date since this function no
> longer sends smgr invalidation message.

(1) I updated function name to "visibilitymap_truncate_prepare()" as suggested.
I think that parameter name fits, unless other reviewers suggest a better name.
I also updated its description more accurately: describing current behavior,
caller must eventually call smgrtruncate() to actually truncate vm pages,
and removed the outdated description.


> Is it worth to leave the current visibilitymap_truncate function as a shortcut
> function, instead of replacing? That way we don't need to change
> pg_truncate_visibility_map function.

(2) Yeah, it's kinda displeasing that I had to add lines in pg_truncate_visibility_map.
By any chance, re: shortcut function, you meant to retain the function
visibilitymap_truncate() and just add another visibilitymap_truncate_prepare(),
isn't it? I'm not sure if it's worth the additional lines of adding
another function in visibilitymap.c, that's why I just updated the function itself
which just adds 10 lines to pg_truncate_visibility_map anyway.
Hmm. If it's not wrong to do it this way, then I will retain this change.
OTOH, if pg_visibility.c *must* not be modified, then I'll follow your advice.


----
pg_visibility.c: pg_truncate_visibility_map()

> @@ -383,6 +383,10 @@ pg_truncate_visibility_map(PG_FUNCTION_ARGS)
>  {
>         Oid                     relid = PG_GETARG_OID(0);
>         Relation        rel;
> +       ForkNumber      forks[MAX_FORKNUM];
> +       BlockNumber     blocks[MAX_FORKNUM];
> +       BlockNumber     newnblocks = InvalidBlockNumber;
> +       int             nforks = 0;
> 
> Why do we need the array of forks and blocks here? I think it's enough to
> have one fork and one block number.

(3) Thanks for the catch. Updated.


----
freespace.c: MarkFreeSpaceMapTruncateRel()

> The same comments are true for MarkFreeSpaceMapTruncateRel.

> +       BlockNumber     new_nfsmblocks = InvalidBlockNumber;    /* FSM
> blocks */
> +       BlockNumber     newnblocks = InvalidBlockNumber;        /* VM
> blocks */
> +       int             nforks = 0;
> 
> I think that we can have new_nfsmblocks and new_nvmblocks and wipe out the
> comments.

(4) I updated the description accordingly, describing only the current behavior.
The caller must eventually call smgrtruncate() to actually truncate fsm pages.
I also removed the outdated description and irrelevant comments.


------
storage.c: RelationTruncate()

> +        * We might as well update the local smgr_fsm_nblocks and
> smgr_vm_nblocks
> +        * setting. smgrtruncate sent an smgr cache inval message,
> which will cause
> +        * other backends to invalidate their copy of smgr_fsm_nblocks and
> +        * smgr_vm_nblocks, and this one too at the next command
> boundary. But this
> +        * ensures it isn't outright wrong until then.
> +        */
> +       if (rel->rd_smgr)
> +       {
> +               rel->rd_smgr->smgr_fsm_nblocks = new_nfsmblocks;
> +               rel->rd_smgr->smgr_vm_nblocks = newnblocks;
> +       }
> 
> new_nfsmblocks and newnblocks could be InvalidBlockNumber when the forks are
> already enough small. So the above code sets incorrect values to
> smgr_{fsm,vm}_nblocks.
> 
> Also, I wonder if we can do the above code in smgrtruncate. Otherwise we always
> need to set smgr_{fsm,vm}_nblocks after smgrtruncate, which is inconvenient.

(5) 
In my patch, did you mean that there's a possibility that these values
were already set to InvalidBlockNumber even before I did the setting, is it? 
I'm not sure if IIUC, the point of the above code is to make sure that
smgr_{fsm,vm}_nblocks are not InvalidBlockNumber until the next command
boundary, and while we don't reach that boundary yet, we make sure
these values are valid within that window. Is my understanding correct?
Maybe following your advice that putting it inside the smgrtruncate loop
will make these values correct.
For example, below?

void smgrtruncate
{
    ...
    CacheInvalidateSmgr(reln->smgr_rnode);

    /* Do the truncation */
    for (i = 0; i < nforks; i++)
    {
        smgrsw[reln->smgr_which].smgr_truncate(reln, forknum[i], nblocks[i]);

        if (forknum[i] == FSM_FORKNUM)
            reln->smgr_fsm_nblocks = nblocks[i];
        if (forknum[i] == VISIBILITYMAP_FORKNUM)
            reln->smgr_vm_nblocks = nblocks[i];
    }

Another problem I have is where I should call FreeSpaceMapVacuumRange to 
account for truncation of fsm pages. I also realized the upper bound
new_nfsmblocks might be incorrect in this case.
This is the cause why regression test fails in my updated patch...
+     * Update upper-level FSM pages to account for the truncation.
+     * This is important because the just-truncated pages were likely
+     * marked as all-free, and would be preferentially selected.
+     */
+    FreeSpaceMapVacuumRange(rel->rd_smgr, new_nfsmblocks, InvalidBlockNumber);


-----------
storage.c: smgr_redo()

> +               /* Update the local smgr_fsm_nblocks and
> smgr_vm_nblocks setting */
> +               if (rel->rd_smgr)
> +               {
> +                       rel->rd_smgr->smgr_fsm_nblocks = new_nfsmblocks;
> +                       rel->rd_smgr->smgr_vm_nblocks = newnblocks;
> +               }
> 
> The save as above. And we need to set smgr_{fsm,vm}_nblocks in spite of freeing
> the fake relcache soon?

(6) You're right. It's unnecessary in this case.
If I also put the smgr_{fsm,vm}_nblocks setting inside the smgrtruncate
as you suggested above, it will still be set after truncation? Hmm.
Perhaps it's ok, because in the current source code it also does the setting
whenever we call visibilitymap_truncate, FreeSpaceMapTruncateRel during redo.


-----------
bufmgr.c: DropRelFileNodeBuffers()

> +       /* Get the lower bound of target block number we're interested in
> */
> +       for (i = 0; i < nforks; i++)
> +       {
> +               if (!BlockNumberIsValid(minBlock) ||
> +                       minBlock > firstDelBlock[i])
> +                       minBlock = firstDelBlock[i];
> +       }
> 
> Maybe we can declare 'i' in the for statement (i.e. for (int i = 0;
> ...)) at every outer loops in this functions. And in the inner loop we can
> use 'j'.

(7) Agree. Updated.

> I think it's better to declare *forkNum and nforks side by side for readability.
> That is, we can have it as follows.
> 
> DropRelFileNodeBuffers (RelFileNodeBackend rnode, ForkNumber *forkNum, int
> nforks, BlockNumber *firstDelBlock)

(8) Agree. I updated DropRelFileNodeBuffers, smgrtruncate and 
smgrdounlinkfork accordingly.

---------
smgr.c: smgrdounlinkfork()

> -smgrdounlinkfork(SMgrRelation reln, ForkNumber forknum, bool isRedo)
> +smgrdounlinkfork(SMgrRelation reln, ForkNumber *forknum, bool isRedo,
> int nforks)
> 
> Same as above. The order of reln, *forknum, nforks, isRedo would be better.
> 
> The comment of smgrdounlinkfork function needs to be updated. We now can
> remove multiple forks.

(9) Agree. Updated accordingly.


I updated the patch based from comments,
but it still fails the regression test as indicated in (5) above.
Kindly verify if I correctly addressed the other parts as what you intended.

Thanks again for the review!
I'll update the patch again after further comments.

Regards,
Kirk Jamison

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Index Skip Scan
Next
From: Peter Eisentraut
Date:
Subject: Re: Feature: Add Greek language fulltext search