Re: WIP: Avoid creation of the free space map for small tables - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: WIP: Avoid creation of the free space map for small tables
Date
Msg-id CAA4eK1Kr4gyt0LQPER-v2g_GNq_T4u3xgX=d_6PrZuV-U0ch-w@mail.gmail.com
In response to WIP: Avoid creation of the free space map for small tables  (John Naylor <jcnaylor@gmail.com>)
Responses Re: WIP: Avoid creation of the free space map for small tables  (John Naylor <jcnaylor@gmail.com>)
List pgsql-hackers
On Sat, Oct 6, 2018 at 12:17 AM John Naylor <jcnaylor@gmail.com> wrote:
>
> Hi all,
> A while back, Robert Haas noticed that the space taken up by very
> small tables is dominated by the FSM [1]. Tom suggested that we could
> prevent creation of the FSM until the heap has reached a certain
> threshold size [2]. Attached is a WIP patch to implement that. I've
> also attached a SQL script to demonstrate the change in behavior for
> various scenarios.
>
> The behavior that allows the simplest implementation I thought of is as follows:
>
> -The FSM isn't created if the heap has fewer than 10 blocks (or
> whatever). If the last known good block has insufficient space, try
> every block before extending the heap.
>
> -If a heap with a FSM is truncated back to below the threshold, the
> FSM stays around and can be used as usual.
>
> -If the heap tuples are all deleted, the FSM stays but has no leaf
> blocks (same as on master). Although it exists, it won't be
> re-extended until the heap re-passes the threshold.
>
> --
> Some notes:
>
> -For normal mode, I taught fsm_set_and_search() to switch to a
> non-extending buffer call, but the biggest missing piece is WAL
> replay.
>

fsm_set_and_search()
{
..
+ /*
+ * For heaps we prevent extension of the FSM unless the number of pages
+ * exceeds HEAP_FSM_EXTENSION_THRESHOLD. For tables that don't already
+ * have a FSM, this will save an inode and a few kB of space.
+ * For sane threshold values, the FSM address will be zero, so we
+ * don't bother dealing with anything else.
+ */
+ if (rel->rd_rel->relkind == RELKIND_RELATION
+ && addr.logpageno == 0)

I am not sure if this is a solid way to avoid creating FSM.  What if
fsm_set_and_search gets called for a level other than 0?  Also, once
the relation has more than HEAP_FSM_EXTENSION_THRESHOLD blocks, won't
the first vacuum that tries to record free space skip recording it
for the first HEAP_FSM_EXTENSION_THRESHOLD pages?
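
To make the first question concrete, here is a minimal sketch. It is not
code from the patch: SketchFSMAddress and both helper functions are
hypothetical stand-ins, and I am assuming an FSM address is a (level,
page number) pair. It only shows that a test on the page number alone
is also true for page 0 of any upper level.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an FSM address (assumed: level + page number). */
typedef struct SketchFSMAddress
{
    int level;      /* tree level; 0 is the level the question refers to */
    int logpageno;  /* page number within that level */
} SketchFSMAddress;

/* Mirrors the shape of the quoted check: only the page number is tested. */
static bool
sketch_matches_page_zero(SketchFSMAddress addr)
{
    return addr.logpageno == 0;
}

/* Stricter variant (illustration only): also require level 0. */
static bool
sketch_matches_level_zero_page_zero(SketchFSMAddress addr)
{
    return addr.level == 0 && addr.logpageno == 0;
}
```

With an address of level 2, page 0, the first helper returns true and
the second returns false.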

I think you have found a good way to avoid creating FSM, but can't we
use some simpler technique?  For example, if the FSM fork for a relation
doesn't exist, check the heap block number for which we are trying to
update the FSM, and if it is less than HEAP_FSM_EXTENSION_THRESHOLD,
avoid creating the FSM.

> I couldn't find a non-extending equivalent of
> XLogReadBufferExtended(), so I might have to create one.
>

I think it would be better if we can find a common way to avoid
creating FSM both during DO and REDO time.  It might be possible if
something like what I have said above is feasible.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

