Thread: Negative cache entries for memoize

Negative cache entries for memoize

From
Bruce Momjian
Date:
I wrote an optimizer talk that explains memoize, slides 24-25:

    https://momjian.us/main/writings/pgsql/beyond.pdf#page=25

During two presentations, I was asked if negative cache entries were
created for cases where inner-side lookups returned no rows.

It seems we don't do that.  Has this been considered or is it planned?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.



Re: Negative cache entries for memoize

From
David Rowley
Date:
On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce@momjian.us> wrote:
> During two presentations, I was asked if negative cache entries were
> created for cases where inner-side lookups returned no rows.
>
> It seems we don't do that.  Has this been considered or is it planned?

It does allow negative cache entries, so I'm curious about what you
did to test this.

A cache entry is always marked as complete (i.e valid to use for
lookups) when we execute the subnode to completion.  In some plan
shapes we might not execute the inner side until it returns NULL, for
example in Nested Loop Semi Joins we skip to the next outer row when
matching the first inner row. This could leave an incomplete cache
entry which Memoize can't be certain if it contains all rows from the
subnode or not.

For the negative entry case, which really there is no special code
for, there are simply just no matching rows so the cache entry will be
marked as complete always as the inner node will return NULL on the
first call. So negative entries will even work in the semi-join case.

Here's a demo of the negative entries working with normal joins:

create table t0 (a int);
insert into t0 select 0 from generate_Series(1,1000000);
create table t1 (a int primary key);
insert into t1 select x from generate_series(1,1000000)x;
vacuum analyze t0,t1;
explain (analyze, costs off, timing off, summary off)
select * from t0 inner join t1 on t0.a=t1.a;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop (actual rows=0 loops=1)
   ->  Seq Scan on t0 (actual rows=1000000 loops=1)
   ->  Memoize (actual rows=0 loops=1000000)
         Cache Key: t0.a
         Cache Mode: logical
         Hits: 999999  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
         ->  Index Only Scan using t1_pkey on t1 (actual rows=0 loops=1)
               Index Cond: (a = t0.a)
               Heap Fetches: 0

David



Re: Negative cache entries for memoize

From
Bruce Momjian
Date:
On Thu, Apr  6, 2023 at 09:23:31AM +1200, David Rowley wrote:
> On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce@momjian.us> wrote:
> > During two presentations, I was asked if negative cache entries were
> > created for cases where inner-side lookups returned no rows.
> >
> > It seems we don't do that.  Has this been considered or is it planned?
> 
> It does allow negative cache entries, so I'm curious about what you
> did to test this.

My mistake.  Someone asked in Los Angeles and Jan Wieck checked during
the talk and said he didn't see it, and when someone asked in Moscow, I
repeated that answer.  My mistake.  I have updated the slides with the
correct information.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.