On Fri, 10 May 2024, Tom Lane wrote:
> Dimitrios Apostolou <jimis@gmx.net> writes:
>> Further digging into this simple query, if I force the non-parallel plan
>> by setting max_parallel_workers_per_gather TO 0, I see that the query
>> planner comes up with a cost much higher:
>
>> Limit  (cost=363.84..1134528847.47 rows=10 width=4)
>>   ->  Unique  (cost=363.84..22690570036.41 rows=200 width=4)
>>         ->  Append  (cost=363.84..22527480551.58 rows=65235793929 width=4)
>> ...
>
>> The total cost on the 1st line (cost=363.84..1134528847.47) has a much
>> higher upper limit than the total cost when
>> max_parallel_workers_per_gather is 4 (cost=853891608.79..853891608.99).
>> This explains the planner's choice. But I wonder why the cost estimation
>> is so far away from reality.
>
> I'd say the blame lies with that (probably-default) estimate of
> just 200 distinct rows. That means the planner expects to have
> to read about 5% (10/200) of the tables to get the result, and
> that's making fast-start plans look bad.
Indeed, that's an awful estimate; the table has more than 1M unique
values in that column. Looking into pg_stat_user_tables, I can't see
that the partitions have ever been vacuumed or analyzed. I think they
should have been auto-analyzed, since they get a ton of INSERTs
(no deletes/updates though) and I have the default autovacuum settings.
Could it be that autovacuum starts but never finishes? I can't find
anything in the logs.
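To be concrete, the checks I have in mind look roughly like this (the
name patterns below are placeholders, not the real identifiers):

    -- when were the partitions last (auto)vacuumed / (auto)analyzed?
    SELECT relname, last_vacuum, last_autovacuum,
           last_analyze, last_autoanalyze,
           n_live_tup, n_mod_since_analyze
      FROM pg_stat_user_tables
     WHERE relname LIKE 'my_partition_%';    -- placeholder pattern

    -- what the planner currently believes about the column's ndistinct
    SELECT tablename, attname, n_distinct
      FROM pg_stats
     WHERE tablename LIKE 'my_partition_%'   -- placeholder pattern
       AND attname = 'my_column';            -- placeholder column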
In any case, even after the planner decides to execute the terrible plan
with the parallel seqscans, why doesn't it finish as soon as it has
found 10 distinct values?
>
> Possibly an explicit ANALYZE on the partitioned table would help.
Thanks, I'll save the ANALYZE for the last step; I feel it's a good
opportunity to figure out more details about how postgres works. Plus I
expect ANALYZE to take a couple of days, so I should first find some
quiet time for that. :-)
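When I do get to it, I'm thinking of something along these lines (the
table and partition names are placeholders for the real ones):

    -- try one partition first, to get a feel for how long the full
    -- run will take (partition name is a placeholder)
    ANALYZE VERBOSE my_partition_001;

    -- then the partitioned parent; as far as I understand this also
    -- recurses into the individual partitions
    ANALYZE VERBOSE my_partitioned_table;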
Dimitris