Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Deleting older versions in unique indexes to avoid page splits
Msg-id CAH2-Wzksdm-05bt8rxw0BiV2RJnAq+6RzQ7Bg6FUdNb9NUjbSA@mail.gmail.com
In response to Re: Deleting older versions in unique indexes to avoid page splits  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Fri, Oct 23, 2020 at 9:03 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> Please publish details of how long a pre-split cleaning operation
> takes and what that does to transaction latency. It *might* be true
> that the cost of avoiding splits is worth it in balance against the
> cost of splitting, but it might not.

I don't think that you understand how the patch works. I cannot very
well isolate that cost, because the patch is designed to pay it only
when there is a strong chance of getting a much bigger reward, and
when the only alternative is to split the page. When it fails, the
question naturally doesn't come up again for the two pages that
result from the page split. As far as I know, the only cases that are
regressed all involve small indexes with lots of contention, which is
not surprising -- and not necessarily due to the heap page accesses:
making indexes smaller sometimes has that effect by itself, even when
the indexes shrink for unrelated reasons, such as better page split
heuristics.

If anybody finds a serious problem with my patch then it'll be a
weakness or hole in the argument I just made -- it won't have much to
do with how expensive any of these operations are in isolation. It
usually isn't sensible to talk about page splits as isolated things.
Most of my work on B-Trees in the past couple of years built on the
observation that sometimes successive page splits are related to each
other in one way or another.

It is a fallacy of composition to think of the patch as a thing that
prevents some page splits. The patch is valuable because it more or
less eliminates *unnecessary* page splits (and makes it so that there
cannot be very many TIDs for each logical row in affected indexes).
The overall effect is not linear. If you added code to artificially
make the mechanism fail randomly 10% of the time (at the point where
it becomes clear that the current operation would otherwise be
successful) that wouldn't make the resulting code 90% as useful as the
original. It would actually make it approximately 0% as useful. On
human timescales the behavioral difference between this hobbled
version of my patch and the master branch would be almost
imperceptible.
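The nonlinearity claimed above can be illustrated with a toy model (this is not taken from the patch; the 0.9 success rate and the fill-event counts are assumptions made up for illustration). Suppose that each time a leaf page fills, the opportunistic delete either succeeds and the page lives to fill again, or fails once and the page splits for good. Then a page's chance of never splitting decays geometrically with the number of fill events, so a 90%-reliable mechanism saves almost no pages over time:

```python
# Toy model: a page survives only if the opportunistic delete succeeds
# at *every* fill event; a single failure causes a permanent split.

def survival_probability(success_rate: float, fill_events: int) -> float:
    """Probability that a page avoids splitting across all fill events."""
    return success_rate ** fill_events

for p in (1.0, 0.9):
    probs = [survival_probability(p, n) for n in (10, 100, 1000)]
    print(p, [f"{x:.2e}" for x in probs])
```

With a perfectly reliable mechanism (p = 1.0) every page survives indefinitely; with p = 0.9 the survival probability after 100 fill events is already about 2.7e-05, i.e. effectively every page splits anyway, matching the master branch on human timescales.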

It's obvious that a page split is more expensive than the delete
operation (when the delete works out). That doesn't need a
microbenchmark (and I really can't think of one that would make any
sense). Page splits typically have WAL records that are ~4KB in size,
whereas the opportunistic delete records are almost always less than
100 bytes, and typically close to 64 bytes -- about the same size as
most individual leaf page insert WAL records. Plus you have almost
double the full-page images (FPIs) going forward with the page split.
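As back-of-the-envelope arithmetic using the figures quoted above (approximate record sizes from this mail, not measured constants), the WAL-volume gap alone is roughly two orders of magnitude per avoided split:

```python
# Rough WAL arithmetic with the sizes quoted above (assumptions, not
# measurements): ~4KB per page split record vs. ~64 bytes per
# opportunistic delete record.

SPLIT_WAL_BYTES = 4 * 1024   # typical page split WAL record
DELETE_WAL_BYTES = 64        # typical opportunistic delete WAL record

ratio = SPLIT_WAL_BYTES / DELETE_WAL_BYTES
print(f"one split writes as much WAL as ~{ratio:.0f} opportunistic deletes")
```

And this understates the difference, since it ignores the extra full-page images that follow a split.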

> You've shown us a very nice paper analysing the page split waves, but
> we need a similar detailed analysis so we can understand if what you
> propose is better or not (and in what situations).

That paper was just referenced in passing. It isn't essential to the
main argument.

> The leaf page locks are held for longer, so we need to perform
> sensible tests that show if this has a catastrophic effect on related
> workloads, or not.
>
> The SELECT tests proposed need to be aimed at the same table, at the same time.

But that's exactly what I did the first time!

I had two SELECT statements against the same table. They use almost
the same distribution as the UPDATE, so that they'd hit the same part
of the key space without being exactly the same as the UPDATE from
the same xact in each case (I thought that if it was exactly the same
part of the table, that might unfairly favor my patch).

> > The strength of the design is in how clever it isn't.
>
> What it doesn't do could be good or bad so we need to review more
> details on behavior. Since the whole idea of the patch is to change
> behavior, that seems a reasonable ask. I don't have any doubt about
> the validity of the approach or coding.

I agree, but the patch isn't the sum of its parts. You need to talk
about a workload or a set of conditions, and how things develop over
time.

-- 
Peter Geoghegan


