Re: Checking join outer relation uniqueness to prevent unnecessary memoization - Mailing list pgsql-hackers

From Jacob Jackson
Subject Re: Checking join outer relation uniqueness to prevent unnecessary memoization
Msg-id CAAiQw3zHwyfD2PDi=Qu10WriT7zpoO8YDLUKTBWEGGUM52TWiQ@mail.gmail.com
In response to Re: Checking join outer relation uniqueness to prevent unnecessary memoization  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Checking join outer relation uniqueness to prevent unnecessary memoization
List pgsql-hackers
On Fri, Jan 2, 2026 at 8:45 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Do you have an example case of this happening? Ideally, the code that
> should disfavour Memoize for this case is estimate_num_groups() as
> called in cost_memoize_rescan() by returning that there's 1 group per
> input row. I guess that's not happening for this case? Why?

I have seen the issue come up a few times when the unique constraint
spans multiple columns, the join is on only one of those columns, and
a constant filter is on the other column (e.g.
https://www.postgresql.org/message-id/CAAiQw3yBPrCw6ZLeTwVS4QhKDWgJkmmp9LnGPdodxeQmn=kqVg@mail.gmail.com).
I think this gives sampling error two ways to matter: the sample can
contain more duplicates of the join key than expected, which lowers
n_distinct, or it can contain more rows with the filtered constant
than expected, which raises the constant's estimated frequency (in the
case I linked, the constant's frequency was stored in the column's MCV
list). Either way, the estimated number of calls (the outer row
estimate) ends up greater than ndistinct, so the predicted hit ratio
is nonzero. However, I am not certain this is what is happening. The
planner charges enough for the extra index scans that it probably
would not take much of an asymmetry to make Memoize win, but it still
seems odd that the stats could be off by enough from sampling alone.
Maybe there is a statistics bug at play?
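
To make the arithmetic concrete, here is a rough sketch of how I understand
the hit ratio estimate to behave. It is a simplification written from memory
of cost_memoize_rescan(), not a copy of the planner code; the function name,
the parameter names, and the clamping are my own assumptions.

```python
def memoize_hit_ratio(est_calls, ndistinct, est_cache_entries):
    """Simplified, assumed model of the Memoize cache hit ratio estimate.

    est_calls         -- estimated number of lookups (outer row estimate)
    ndistinct         -- estimated distinct join key values in the outer side
    est_cache_entries -- estimated number of entries that fit in the cache
    """
    if est_calls <= 0 or ndistinct <= 0:
        return 0.0
    # Assumption for this sketch: never more distinct keys than lookups.
    ndistinct = min(ndistinct, est_calls)
    # Fraction of lookups that repeat an already-seen key ...
    repeat_fraction = (est_calls - ndistinct) / est_calls
    # ... scaled by the fraction of distinct keys the cache can hold.
    cached_fraction = est_cache_entries / max(ndistinct, est_cache_entries)
    return repeat_fraction * cached_fraction


# Outer side estimated as unique on the join key: no hits are predicted.
print(memoize_hit_ratio(1000, 1000, 1000))
# n_distinct underestimated by 10%: a nonzero hit ratio appears.
print(memoize_hit_ratio(1000, 900, 1000))
```

In this model the ratio is exactly zero only when ndistinct is at least the
estimated call count, so any underestimate of n_distinct or overestimate of
the outer row count yields a positive hit ratio.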


Jacob


