On Tue, 2 Sept 2025 at 00:41, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
> After disabling MIN/MAX optimization in the grouping_planner function:
> /*
> * Preprocess MIN/MAX aggregates, if any. Note: be careful about
> * adding logic between here and the query_planner() call.
> Anything
> * that is needed in MIN/MAX-optimizable cases will have to be
> * duplicated in planagg.c.
> */
> //if (parse->hasAggs)
> // preprocess_minmax_aggregates(root);
>
> I got a better query plan, but I’m still investigating what went wrong.
This is basically just the standard issue people have with how we cost LIMIT.
If you look at this part:
-> Limit (cost=0.43..7.91 rows=1 width=8)
Output: docum.dt
-> Index Scan using docum_dt_7ee1d676 on public.docum
(cost=0.43..420487.43 rows=56222 width=8)
you can see that the Limit total cost is startup (0.43) plus 420487.43
/ 56222 * 1.
The problem is that the planner assumes there's no correlation between
dt and dt_real. It thinks because a decent number of rows have dt_real
>= '2025-08-14 09:44:09.033592' that it'll find the LIMIT 1 row fairly
quickly by scanning the docum_dt_7ee1d676 index. Unfortunately, it's
not quick because the rows matching dt_real >= '2025-08-14
09:44:09.033592' are nearly at the end of the index.
In v16+, using something like MIN(docum.dt ORDER BY dt) AS "dt__min"
should force the planner into giving a better plan. ORDER BY /
DISTINCT aggregates are a bit less efficient in versions earlier than
that, so doing that would require a bit of additional work if using
v14. It might not be too bad, though.
David