Home > mailing lists

Re: Limit index pages visited in planner's get_actual_variable_range - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Limit index pages visited in planner's get_actual_variable_range
Date	May 2, 2024 15:05:25
Msg-id	CAH2-WzmP=UQdQ1VrcQu52GJeAG3UWEyYT8r=0Nj=Z2L7_zzuAg@mail.gmail.com Whole thread Raw
In response to	Limit index pages visited in planner's get_actual_variable_range (Rian McGuire <rian.mcguire@buildkite.com>)
List	pgsql-hackers

Tree view

On Thu, May 2, 2024 at 2:12 AM Rian McGuire <rian.mcguire@buildkite.com> wrote:
> The planner was burning a huge amount of CPU time looking through
> index pages for the first visible tuple. The problem eventually
> resolved when the affected index was vacuumed, but that took several
> hours to complete.

This is exactly the same problem recently that Mark Callaghan recently
encountered when benchmarking Postgres using a variant of his insert
benchmark:

https://smalldatum.blogspot.com/2024/01/updated-insert-benchmark-postgres-9x-to_10.html
https://smalldatum.blogspot.com/2024/01/updated-insert-benchmark-postgres-9x-to_27.html
https://smalldatum.blogspot.com/2024/03/trying-to-tune-postgres-for-insert.html

This is a pretty nasty sharp edge. I bet many users encounter this
problem without ever understanding it.

> The previous discussion [1] touched on the idea of also limiting the
> number of index page fetches, but there were doubts about the safety
> of back-patching and the ugliness of modifying the index AM API to
> support this.

Fundamentally, the problem is that the heuristics that we have don't
care about the cost of reading index leaf pages. All that it takes is
a workload where that becomes the dominant cost -- such a workload
won't be helped at all by the existing heap-focussed heuristic. This
seems obvious to me.

It seems natural to fix the problem by teaching the heuristics to give
at least some consideration to the cost of reading index leaf pages --
more than zero. The patch from Simon seems like the right general
approach to me, since it precisely targets the underlying problem.
What other approach is there, really?

> I would like to submit our experience as evidence that the lack of
> limit on index page fetches is a real problem. Even if a fix for this
> doesn't get back-patched, it would be nice to see it in a major
> version.

I find that very easy to believe.

--
Peter Geoghegan

pgsql-hackers by date:

From: Alexander Lakhin
Date: 02 May 2024, 15:00:00
Subject: Re: cataloguing NOT NULL constraints

From: Tomas Vondra
Date: 02 May 2024, 15:19:03
Subject: Parallel CREATE INDEX for GIN indexes

Re: Limit index pages visited in planner's get_actual_variable_range - Mailing list pgsql-hackers

Previous

Next