Re: non-HOT update not looking at FSM for large tuple update - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: non-HOT update not looking at FSM for large tuple update
Date
Msg-id CAEze2WhHcC3qusCZL_YUL5xUSA4Z5ztthvDxyVPEaSUJfGrcaw@mail.gmail.com
Whole thread Raw
In response to Re: non-HOT update not looking at FSM for large tuple update  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: non-HOT update not looking at FSM for large tuple update
List pgsql-hackers
On Wed, 17 Mar 2021 at 21:52, John Naylor <john.naylor@enterprisedb.com> wrote:
>
> On Fri, Mar 12, 2021 at 8:45 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> >
> > If this case isn't added, the lower else branch will fail to find
> > fitting pages for len > maxPaddedFsmRequest tuples; potentially
> > extending the relation when there is actually still enough space
> > available.
>
> Okay, with that it looks better to go back to using Max().
>
> Also in v4:
>
> - With the separate constant you suggested, I split up the comment into two parts.

> +     * The minimum space on a page for it to be considered "empty" regardless
> +     * of fillfactor. We base this on MaxHeapTupleSize, minus a small amount
> +     * of slack. We allow slack equal to 1/8 the maximum space that could be
> +     * taken by line pointers, which is somewhat arbitrary.

> +     * We want to allow inserting a large tuple into an empty page even if
> +     * that would violate the fillfactor. Otherwise, we would unnecessarily
> +     * extend the relation. Instead, ask the FSM for maxPaddedFsmRequest
> +     * bytes. This will allow it to return a page that is not quite empty
> +     * because of unused line pointers

How about

+    * Because pages that have no items left can still have space allocated
+    * to item pointers, we consider pages "empty" for FSM requests when they
+    * have at most 1/8 of their MaxHeapTuplesPerPage item pointers' space
+    * allocated. This is a somewhat arbitrary number, but should prevent
+    * most unnecessary relation extensions due to not finding "empty" pages
+    * while inserting combinations of large tuples with low fillfactors.

+    * When the free space to be requested from the FSM is greater than
+    * maxPaddedFsmRequest, this is considered equivalent to requesting
+    * insertion on an "empty" page, so instead of failing to find a page
+    * with more empty space than an "empty" page and extend the relation,
+    * we try to find a page which is considered "emtpy".

This is slightly more verbose, but I think this clarifies the
reasoning why we need this a bit better. Feel free to reject or adapt
as needed.

> - I've added a regression test to insert.sql similar to your earlier test, but we cannot use vacuum, since in
paralleltests there could still be tuples visible to other transactions. It's still possible to test almost-all-free by
insertinga small tuple. 
> - Draft commit message

Apart from these mainly readability changes in comments, I think this is ready.

> --
> John Naylor
> EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: GROUP BY DISTINCT
Next
From: "Euler Taveira"
Date:
Subject: Re: cleanup temporary files after crash