On Mon, 2026-02-23 at 21:42 +0100, Attila Soki wrote:
> > > plan-ok:
> > > https://explain.depesz.com/s/hQvM
> > >
> > > plan-wrong:
> > > https://explain.depesz.com/s/uLvl
> >
> > Thanks.
> >
> > The difference in the plans is under the "Subquery Scan on odg", starting with
> > plan node 50 (everything under the "Sort"). I suspect that the mis-estimate
> > that is at the root of the problem is here:
> >
> > -> Index Scan using table_k_late_spec_dp_end_dat_key on schema1.table_k kal (... rows=196053 ...) (...
rows=471.00...)
> > Index Cond: (kal.dp_end_dat < ('now'::cstring)::date)
> > Index Searches: 1
> > Buffers: shared hit=230 read=49
> > I/O Timings: shared read=0.142
> >
> > PostgreSQL overestimates the row count by a factor of over 400.
> > Try to fix that estimate and see if that gets PostgreSQL to do the right thing.
> >
> > Perhaps a simple ANALYZE on the table can do the trick.
>
>
> In the examples I used table_k to flip the plan with
> vacuumed -Upostgres -vZ -t schema1.tbl_used_in_query db1
> in the explain output schema1.tbl_used_in_query is table_k
I cannot understand that.
> > The right side of the comparison looks awkward, as if you wrote 'now'::text::date
> > My experiments show that PostgreSQL v18 estimates well even with such a weird
> > condition, but perhaps if you write "current_date" instead, you'd get better results.
>
> I didn't realize that made a difference. I will replace all occurrences. It also looks more clean with current_date.
It *didn't* make a difference when I played with that...
> > I'd play just with a query like
> >
> > EXPLAIN (ANALYZE)
> > SELECT * FROM schema1.table_k AS kal
> > WHERE dp_end_dat < current_date;
> >
> > until I get a good estimate.
>
> I will try to set custom statistics for dp_end_dat and the fields used by the table_k_late_spec_dp_end_dat_key index.
> Let’s see if that helps.
For a simple condition like that, extended statistics won't help.
That's why I suggested a plain ANALYZE.
I am not sure why that is estimated so badly.
Yours,
Laurenz Albe