On 11/8/24 08:21, Ed Sabol wrote:
> On Nov 7, 2024, at 5:18 PM, David Rowley <dgrowleyml@gmail.com> wrote:
>> It's impossible to say with the given information. You didn't mention
>> which version you upgraded from to start with.
>
> Sorry, 15.6 to 15.7 to 15.8, but we weren't on 15.7 for very long before 15.8.
>
>> You can set random_page_cost for just the session you're connected to
>> and try it. SET random_page_cost = <old value>; before running
>> EXPLAIN (ANALYZE, BUFFERS).
>
> Oh, I didn't think of that. Duh. Here it is with random_page_cost = 4.0:
I see estimation errors in many places here. The second plan fares
better because it uses a Materialize node instead of re-executing the
inner subquery on every outer row.
Postgres didn't choose Materialize in the first plan because of the low
row estimate on its outer subquery. As far as I can see, by increasing
the *_page_cost value you added extra weight to the inner subquery and
tipped the decision toward materialisation. That looks like a game of
chance rather than evidence of an imbalance between the page-read cost
and the cost of other operations.
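If you want to check that the Materialize node, and not the cost
constants themselves, is what makes the difference, you can run a quick
session-local experiment (this is just a sketch; replace <your query>
with the real statement):

  BEGIN;
  SET LOCAL random_page_cost = 4.0;   -- the old value, as in your test above
  SET LOCAL enable_material = off;    -- forbid the Materialize node for this run
  EXPLAIN (ANALYZE, BUFFERS) <your query>;
  ROLLBACK;

If the plan falls back to the slow shape only when enable_material is
off, the win comes from materialisation itself, not from a better
page-cost balance.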
It is hard to say what you can do in general to avoid issues in queries
of this type, short of query-feedback Postgres extensions like AQO, but
for now you can try the following:
I see a huge underestimation in this simple scan:
Bitmap Heap Scan on metainfo b_1
(cost=23.96..35.77 rows=3 width=38)
(actual time=1.225..4.206 rows=1025 loops=1)
It may be caused by a functional dependency between the columns in its
filter:
((relation = 'located'::text) AND (type = 'document'::text))
You can create extended statistics on the columns 'relation' and 'type'.
These statistics can reduce estimation errors and enable the optimiser
to find a better plan without changing the cost balance.
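A minimal sketch (the statistics name is just an illustration;
'metainfo', 'relation' and 'type' are taken from your plan):

  CREATE STATISTICS metainfo_rel_type (dependencies)
    ON relation, type FROM metainfo;
  ANALYZE metainfo;

After the ANALYZE, re-run EXPLAIN (ANALYZE, BUFFERS) and check whether
the row estimate of that Bitmap Heap Scan gets closer to the actual
1025 rows.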
--
regards, Andrei Lepikhov