Re: WIP: Avoid creation of the free space map for small tables - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: WIP: Avoid creation of the free space map for small tables
Date
Msg-id CAA4eK1J957pGLFVRzB1ao-rOwfY2j-AwMjMfNAzZrCaJn5-rmw@mail.gmail.com
Whole thread Raw
In response to Re: WIP: Avoid creation of the free space map for small tables  (John Naylor <jcnaylor@gmail.com>)
Responses Re: WIP: Avoid creation of the free space map for small tables  (John Naylor <jcnaylor@gmail.com>)
Re: WIP: Avoid creation of the free space map for small tables  (John Naylor <jcnaylor@gmail.com>)
List pgsql-hackers
On Sun, Oct 14, 2018 at 1:09 AM John Naylor <jcnaylor@gmail.com> wrote:
>
> On 10/13/18, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > I think you have found a good way to avoid creating FSM, but can't we
> > use some simpler technique like if the FSM fork for a relation doesn't
> > exist, then check the heapblk number for which we try to update the
> > FSM and if it is lesser than HEAP_FSM_EXTENSION_THRESHOLD, then avoid
> > creating the FSM.
>
> I think I see what you mean, but to avoid the vacuum problem you just
> mentioned, we'd need to check the relation size, too.
>

Sure, but vacuum already has relation size.  In general, I think we
should try to avoid adding more system calls in the common code path.
It can impact the performance.

Few comments on your latest patch:
-
+static bool
+allow_write_to_fsm(Relation rel, BlockNumber heapBlk)
+{
+ BlockNumber heap_nblocks;
+
+ if (heapBlk > HEAP_FSM_EXTENSION_THRESHOLD ||
+ rel->rd_rel->relkind != RELKIND_RELATION)
+ return true;
+
+ /* XXX is this value cached? */
+ heap_nblocks = RelationGetNumberOfBlocks(rel);
+
+ if (heap_nblocks > HEAP_FSM_EXTENSION_THRESHOLD)
+ return true;
+ else
+ {
+ RelationOpenSmgr(rel);
+ return smgrexists(rel->rd_smgr, FSM_FORKNUM);
+ }
+}

I think you can avoid calling RelationGetNumberOfBlocks, if you call
smgrexists before and for the purpose of vacuum, we can get that as an
input parameter.  I think one can argue for not changing the interface
functions like RecordPageWithFreeSpace to avoid calling
RelationGetNumberOfBlocks, but to me, it appears worth to save the
additional system call.

-
targetBlock = GetPageWithFreeSpace(relation, len + saveFreeSpace);
-
- /*
- * If the FSM knows nothing of the rel, try the last page before we
- * give up and extend.  This avoids one-tuple-per-page syndrome during
- * bootstrapping or in a recently-started system.
- */
  if (targetBlock == InvalidBlockNumber)
- {
- BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
-
- if (nblocks > 0)
- targetBlock = nblocks - 1;
- }
+ targetBlock = get_page_no_fsm(relation, InvalidBlockNumber,
+   &try_every_page);


Is it possible to hide the magic of trying each block within
GetPageWithFreeSpace?  It will simplify the code and in future, if
another storage API has a different function for
RelationGetBufferForTuple, it will work seamlessly, provided they are
using same FSM.  One such user is zheap.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: "SELECT ... FROM DUAL" is not quite as silly as it appears
Next
From: Andrey Borodin
Date:
Subject: Re: B-tree cache prefetches