
From: Amit Kapila
Subject: Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Msg-id: CAA4eK1JFD=H0iPbmjYXKiek0x1gJEh7E_4rqPLW8rqkUp4_=og@mail.gmail.com
In response to: Re: Unhappy about API changes in the no-fsm-for-small-rels patch (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Apr 24, 2019 at 9:49 PM Andres Freund <andres@anarazel.de> wrote:
> On 2019-04-24 11:28:32 +0530, Amit Kapila wrote:
> > On Tue, Apr 23, 2019 at 10:59 PM Andres Freund <andres@anarazel.de> wrote:
> > > On 2019-04-22 18:49:44 +0530, Amit Kapila wrote:
> > > > I think we should first try to see in this new scheme (a) when to set
> > > > the map, (b) when to clear it, and (c) how to use it.  I have tried to
> > > > summarize my thoughts about it below; let me know what you think.
> > > >
> > > > When to set the map:
> > > > At the beginning (the first time the relation is used in the backend),
> > > > the map will be clear.  The first time the backend finds that the FSM
> > > > doesn't exist and the number of blocks is less than
> > > > HEAP_FSM_CREATION_THRESHOLD, we set the map for the total blocks that
> > > > exist at that time and mark all or alternate blocks as available.
> > >
> > > I think the alternate blocks scheme has to go. It's not defensible.
> > >
> >
> > Fair enough, I have changed it in the attached patch.  However, I
> > think we should test it once the patch is ready as we have seen a
> > small performance regression due to that.
>
> Sure, but that was because you re-scanned from scratch after every
> insertion, no?
>

Possible.
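
To make sure we are talking about the same starting rule now that the
alternate-blocks scheme is gone, here is a rough standalone sketch of how I
picture the map being set the first time a backend touches the relation
(made-up names, not the patch code; the 4-page heap threshold is the one you
mention further down):

#include <stdint.h>
#include <stdbool.h>

#define HEAP_FSM_CREATION_THRESHOLD 4   /* 4-page threshold for heap */

typedef uint32_t BlockNumber;

static uint32_t local_map;      /* bit N set => block N may have free space */

/* Build the map only when no FSM fork exists and the relation is still
 * small; with the alternate-blocks scheme dropped, every existing block
 * is marked available. */
static bool
local_map_init(bool fsm_exists, BlockNumber nblocks)
{
    if (fsm_exists || nblocks >= HEAP_FSM_CREATION_THRESHOLD)
        return false;           /* fall back to the regular on-disk FSM */

    local_map = ((uint32_t) 1 << nblocks) - 1;  /* blocks 0 .. nblocks-1 */
    return true;
}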

>
> > > And sure, leaving that aside we could store one byte per block
> >
> > Hmm, I think you mean to say one-bit per block, right?
>
> No, I meant byte.
>

Upthread you have said: "Hm, given realistic
HEAP_FSM_CREATION_THRESHOLD, and the fact that we really only need one
bit per relation, it seems like map should really be just a uint32
with one bit per page.   It'd allow different AMs to have different
numbers of dont-create-fsm thresholds without needing additional
memory (up to 32 blocks)."

I can understand the advantage of the one-bit-per-page suggestion, but now
you are saying one byte per page.  I am confused between those two
options.  Am I missing something?
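
Just so I am sure I follow the one-bit-per-page idea, continuing the earlier
sketch (again, made-up names): the whole map is a single uint32, so up to 32
blocks can be tracked without any extra memory, and a per-AM threshold only
changes how many of those bits ever get set.

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t BlockNumber;

static uint32_t local_map;      /* one bit per page, as in the sketch above */

/* Is the block still worth trying for an insertion? */
static bool
local_map_is_avail(BlockNumber blkno)
{
    return (local_map & ((uint32_t) 1 << blkno)) != 0;
}

/* The caller found the block full, so stop trying it. */
static void
local_map_clear(BlockNumber blkno)
{
    local_map &= ~((uint32_t) 1 << blkno);
}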

> The normal FSM saves how much space there is for a
> block, but the current local fsm doesn't. That means pages are marked as
> unavailable even though other tuples would possibly fit.
>

Sure, in the regular FSM, vacuum can update the available space, but
we don't have such a provision for the local map unless we decide to keep
it in shared memory.
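
If I read the one-byte-per-page suggestion correctly, the point is that each
byte could hold a coarse free-space category, like the on-disk FSM does,
rather than a plain yes/no flag.  A rough standalone sketch of that reading
(the names and the category math are mine, not from the patch):

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCAL_MAP_PAGES 32      /* illustrative upper bound, not the real constant */
#define PAGE_SIZE       8192    /* usual heap page size */

typedef uint32_t BlockNumber;

/* One byte per block, holding a coarse free-space category (0..255),
 * in the same spirit as the on-disk FSM, instead of available/unavailable. */
static uint8_t local_map[LOCAL_MAP_PAGES];

static uint8_t
space_to_cat(size_t avail)
{
    return (uint8_t) ((avail * 255) / PAGE_SIZE);
}

/* After an insertion, remember roughly how much space is left on the page. */
static void
local_map_note_space(BlockNumber blkno, size_t avail)
{
    local_map[blkno] = space_to_cat(avail);
}

/* The page stays usable for smaller tuples even after a large one didn't fit. */
static bool
local_map_page_may_fit(BlockNumber blkno, size_t needed)
{
    return local_map[blkno] >= space_to_cat(needed);
}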

>
> > >  It's possible that'd come with
> > > some overhead - I've not thought sufficiently about it: I assume we'd
> > > still start out in each backend assuming each page is empty, and we'd
> > > then rely on RelationGetBufferForTuple() to update that. What I wonder
> > > is if we'd need to check if an on-disk FSM has been created every time
> > > the space on a page is reduced?  I think not, if we use invalidations to
> > > notify others of the existence of an on-disk FSM. There's a small race,
> > > but that seems ok.
>
> > Do you mean to say that vacuum or some backend should issue an
> > invalidation the first time it creates the FSM for a relation?
>
> Right.
>
>
> > I think it is quite possible that vacuum takes time to trigger
> > such an invalidation if there are few deletions.  And we won't be able
> > to use the newly added page(s).
>
> I'm not sure I understand what you mean by that? If the backend that
> ends up creating the FSM - because it extended the relation beyond 4
> pages - issues an invalidation, the time till other backends pick that
> up should be minimal?
>

Consider the case where backend-1 starts inserting into a relation that
has just two pages, so we create a local map covering those two pages.
Now backend-2 extends the relation by one page; how and when will
backend-1 come to know about that?  One possibility is that once all the
pages present in backend-1's local map become invalid (used up), we again
check the number of blocks and update the local map.
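
To make that concrete, a rough sketch (made-up names; get_current_nblocks()
is just a stand-in for however we would ask smgr for the relation length):
we only look at the relation size again once every block we knew about has
been tried and cleared, and rebuild the map if backend-2 has added pages in
the meantime.

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t BlockNumber;

static uint32_t local_map;          /* one bit per block, as sketched above */
static BlockNumber cached_nblocks;  /* relation size when the map was built */

/* Stand-in for asking smgr/relcache for the current relation length. */
extern BlockNumber get_current_nblocks(void);

/* Pick the next block to try; re-check the relation size only after every
 * block we knew about has been tried and cleared. */
static bool
local_map_next_candidate(BlockNumber *blkno)
{
    if (local_map == 0)
    {
        BlockNumber nblocks = get_current_nblocks();

        /* backend-2 may have extended the relation since we last looked */
        if (nblocks <= cached_nblocks)
            return false;       /* nothing new; caller extends or creates the FSM */

        cached_nblocks = nblocks;
        local_map = (nblocks >= 32) ? ~(uint32_t) 0 : ((uint32_t) 1 << nblocks) - 1;
    }

    for (BlockNumber b = 0; b < 32; b++)
    {
        if (local_map & ((uint32_t) 1 << b))
        {
            *blkno = b;
            return true;
        }
    }
    return false;               /* not reached: the map is non-empty here */
}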

>
> > IIUC, you are suggesting issuing invalidations when (a) vacuum
> > finds there is no FSM and a page has more space now, and (b) to
> > notify the existence of the FSM.  It seems to me that issuing more
> > invalidations for the purpose of the FSM can lead to an increased number
> > of relation cache rebuilds in the overall system.  I think this can have
> > an impact when there are many small relations in the system, which in
> > some scenarios might not be uncommon.
>
> If this becomes an issue I'd just create a separate type of
> invalidation, one that just signals that the FSM is being invalidated.
>

Oh, clever idea, but I guess that will be some work unless we already
do something similar elsewhere.  Anyway, we can look into it later.
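
Just to check that I have the shape of your idea right, an entirely
hypothetical sketch (none of these names exist today): the message would
carry only the relation's OID, and a receiving backend would merely stop
using its local map, with no relcache rebuild involved.

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t Oid;

/* Hypothetical lightweight invalidation payload: "relation X now has an
 * on-disk FSM"; it carries nothing but the relation's OID. */
typedef struct FsmExistsInvalMessage
{
    int8_t      id;             /* message type discriminator */
    Oid         relid;
} FsmExistsInvalMessage;

/* Per-backend state for the relation whose local map we keep (sketch). */
static Oid      local_map_relid;
static bool     local_map_in_use;

/* On receipt, just stop using the local map so the next insertion goes
 * through the regular on-disk FSM; no relation cache entry is rebuilt. */
static void
fsm_exists_inval_callback(const FsmExistsInvalMessage *msg)
{
    if (msg->relid == local_map_relid)
        local_map_in_use = false;
}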

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


