On Fri, Jan 2, 2026 at 8:45 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Do you have an example case of this happening? Ideally, the code that
> should disfavour Memoize for this case is estimate_num_groups() as
> called in cost_memoize_rescan() by returning that there's 1 group per
> input row. I guess that's not happening for this case? Why?
I have seen the issue pop up a few times when the unique constraint is
across multiple columns and the join is only on one of those columns
(e.g. https://www.postgresql.org/message-id/CAAiQw3yBPrCw6ZLeTwVS4QhKDWgJkmmp9LnGPdodxeQmn=kqVg@mail.gmail.com),
and there is a constant filter on the other column. I think what
happens is that this setup introduces room for error in the sample:
Postgres can come across more duplicates of the join key than
expected, reducing n_distinct, or it can come across more rows with
the constant filtered value, increasing that value's predicted
frequency (in the case I linked, the constant's frequency was stored
in the column's MCV list). Either way, the estimated number of calls
ends up greater than ndistinct, which yields a nonzero hit ratio.
However, I am not certain this is what's happening. There probably
wouldn't need to be much of an asymmetry to tip the planner into
memoization, given the high cost it assigns to the extra index scans,
but it still seems odd that stats could be off enough to enable this
through sampling alone. Maybe there is a statistics bug at play? I am
not certain.
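To make the shape concrete, here is roughly the kind of schema and
query I have in mind (table/column names and the constant are made
up, not taken from the linked report, and whether the planner
actually picks Memoize will depend on the data and statistics):

    CREATE TABLE outer_tbl (
        a int NOT NULL,
        b int NOT NULL,
        UNIQUE (a, b)          -- unique only across both columns
    );
    CREATE TABLE inner_tbl (a int NOT NULL);
    CREATE INDEX ON inner_tbl (a);

    -- With b pinned to a constant, the (a, b) unique constraint makes
    -- a effectively unique among the outer rows feeding the nested
    -- loop, so a Memoize node over the parameterized scan of inner_tbl
    -- should never expect repeat lookups; it only looks attractive if
    -- the estimated ndistinct of a falls below the estimated number of
    -- calls.
    EXPLAIN
    SELECT *
    FROM outer_tbl o
    JOIN inner_tbl i ON i.a = o.a
    WHERE o.b = 42;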
Jacob