On Thu, May 2, 2024 at 2:12 AM Rian McGuire <rian.mcguire@buildkite.com> wrote:
> The planner was burning a huge amount of CPU time looking through
> index pages for the first visible tuple. The problem eventually
> resolved when the affected index was vacuumed, but that took several
> hours to complete.
This is exactly the same problem that Mark Callaghan recently
encountered when benchmarking Postgres using a variant of his insert
benchmark:
https://smalldatum.blogspot.com/2024/01/updated-insert-benchmark-postgres-9x-to_10.html
https://smalldatum.blogspot.com/2024/01/updated-insert-benchmark-postgres-9x-to_27.html
https://smalldatum.blogspot.com/2024/03/trying-to-tune-postgres-for-insert.html
This is a pretty nasty sharp edge. I bet many users encounter this
problem without ever understanding it.
> The previous discussion [1] touched on the idea of also limiting the
> number of index page fetches, but there were doubts about the safety
> of back-patching and the ugliness of modifying the index AM API to
> support this.
Fundamentally, the problem is that the heuristics we have don't
care about the cost of reading index leaf pages. All it takes is a
workload where reading leaf pages becomes the dominant cost -- such a
workload won't be helped at all by the existing heap-focussed
heuristic. This seems obvious to me.
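
To make that concrete, here is a toy model (my own standalone C, not
PostgreSQL code) of the reported workload: the leading leaf pages of
the index contain only entries already known dead, so each one is
skipped without a heap fetch. A limit on heap page fetches never
triggers, while the number of leaf pages read is unbounded:

#include <stdbool.h>
#include <stdio.h>

#define ENTRIES_PER_PAGE 400        /* arbitrary, for illustration */

/* Toy workload: the first dead_pages leaf pages hold only dead entries. */
static bool
entry_is_dead(long page, long dead_pages)
{
    return page < dead_pages;
}

int
main(void)
{
    long dead_pages = 1000000;      /* a million leaf pages of dead entries */
    long leaf_pages_read = 0;
    long heap_fetches = 0;

    for (long page = 0;; page++)
    {
        leaf_pages_read++;          /* unbounded, and invisible to the heuristic */
        for (int off = 0; off < ENTRIES_PER_PAGE; off++)
        {
            if (entry_is_dead(page, dead_pages))
                continue;           /* dead entry: skipped with no heap fetch */
            heap_fetches++;         /* first visible tuple: the lookup ends */
            printf("leaf pages read: %ld, heap fetches: %ld\n",
                   leaf_pages_read, heap_fetches);
            return 0;
        }
    }
}

This prints "leaf pages read: 1000001, heap fetches: 1" -- any limit
expressed in heap fetches can never kick in here.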
It seems natural to fix the problem by teaching the heuristics to give
at least some consideration to the cost of reading index leaf pages --
more than zero. The patch from Simon seems like the right general
approach to me, since it precisely targets the underlying problem.
What other approach is there, really?
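
For illustration, here is the same toy loop with a cap on leaf page
fetches (the name and value are my invention, not taken from Simon's
patch). Once the cap is hit, the planner would simply fall back to
the endpoint estimate already available from pg_statistic:

#include <stdbool.h>
#include <stdio.h>

#define ENTRIES_PER_PAGE 400
#define MAX_LEAF_PAGES   100        /* hypothetical cap, invented here */

static bool
entry_is_dead(long page, long dead_pages)
{
    return page < dead_pages;
}

int
main(void)
{
    long dead_pages = 1000000;
    long leaf_pages_read = 0;
    bool found = false;

    for (long page = 0; !found && leaf_pages_read < MAX_LEAF_PAGES; page++)
    {
        leaf_pages_read++;
        for (int off = 0; off < ENTRIES_PER_PAGE; off++)
        {
            if (entry_is_dead(page, dead_pages))
                continue;           /* dead entry: skip, no heap fetch */
            found = true;           /* first visible tuple */
            break;
        }
    }

    if (found)
        printf("actual endpoint found after %ld leaf pages\n",
               leaf_pages_read);
    else
        printf("gave up after %ld leaf pages; fall back to stats\n",
               leaf_pages_read);
    return 0;
}

Planning cost is now bounded at 100 leaf pages, and the degenerate
case degrades to a slightly staler estimate instead of hours of CPU
time.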
> I would like to submit our experience as evidence that the lack of
> limit on index page fetches is a real problem. Even if a fix for this
> doesn't get back-patched, it would be nice to see it in a major
> version.
I find that very easy to believe.
--
Peter Geoghegan