Re: Potential bug introduced in PG17 with query parallelization - plan flip - Mailing list pgsql-hackers

From David Rowley
Subject Re: Potential bug introduced in PG17 with query parallelization - plan flip
Date
Msg-id CAApHDvqgT-xap-+GZAxBUmvJOoqGE4jqf1TibpSrew0zMBqXAQ@mail.gmail.com
Whole thread Raw
In response to Potential bug introduced in PG17 with query parallelization - plan flip  (Jon Jenkins <jjenkins@gitlab.com>)
List pgsql-hackers
On Fri, 31 Oct 2025 at 10:40, Jon Jenkins <jjenkins@gitlab.com> wrote:
>          ->  Index Scan using index_on_merge_requests_for_latest_diffs
> on public.merge_requests  (cost=0.57..1011.36 rows=773 width=4)
> (actual time=0.015..10.901 rows=9271 loops=1)
>                Index Cond: (merge_requests.target_project_id = 16567575)

> However, on 17, a "more expensive" plan is proposed that does a gather
> between workers, but largely performs the exact same:

>         ->  Index Scan using
> index_merge_requests_on_target_project_id_and_created_at_and_id
> (cost=0.57..19664.54 rows=8262 width=4) (actual time=0.021..7.890
> rows=4498 loops=2)

In the two fragments above, you'll notice the row estimates are quite
different. I suspect this is the reason v17 went with the parallel
plan. What could cause this would be just a difference in sampled rows
from ANALYZE. If you're still in a testing environment, do you see the
plan change back to the non-parallel version if you run ANALYZE on the
merge_requests table and then run the query again? Try it a few times.
You might find that the rows that are sampled sometimes leads to
better or worse estimates.  It's a bit of a fact of life if you have
large skewed data as it's not practical to remember how many of each
and every value exist in the statistics. You could increase the number
by increasing the statistics targets for that column, per [1]. That
might still not be enough, however. If you can't go high enough, you
might be able to do something by setting it to not gather stats for
this column and then tuning n_distinct for the column to some value
that suits. The drawback there is that doing that is fraught with the
possible difficulties of having the planner always thinking there's
the same number of rows for every target_project_id. You might end up
doing an Index Scan when a Seq Scan is better (which it could be if
you had a particular target_project_id that dominated the table).

> I don't know enough about planner internals to specifically track this
> code, but I did see that this patch has been added to 17 that deals
> with parallelizing:
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e08d74ca1

This isn't related to what you're seeing.

David

[1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-SET-STATISTICS



pgsql-hackers by date:

Previous
From: Arne Roland
Date:
Subject: Re: apply_scanjoin_target_to_paths and partitionwise join
Next
From: "Joel Jacobson"
Date:
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue