Re: Negative cache entries for memoize - Mailing list pgsql-hackers

From David Rowley
Subject Re: Negative cache entries for memoize
Date
Msg-id CAApHDvptH3p=C4Mm0Pk-u8A=h58vi=m-Y9eBP528bmgBLX1b7w@mail.gmail.com
Whole thread Raw
In response to Negative cache entries for memoize  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Negative cache entries for memoize  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce@momjian.us> wrote:
> During two presentations, I was asked if negative cache entries were
> created for cases where inner-side lookups returned no rows.
>
> It seems we don't do that.  Has this been considered or is it planned?

It does allow negative cache entries, so I'm curious about what you
did to test this.

A cache entry is always marked as complete (i.e valid to use for
lookups) when we execute the subnode to completion.  In some plan
shapes we might not execute the inner side until it returns NULL, for
example in Nested Loop Semi Joins we skip to the next outer row when
matching the first inner row. This could leave an incomplete cache
entry which Memoize can't be certain if it contains all rows from the
subnode or not.

For the negative entry case, which really there is no special code
for, there are simply just no matching rows so the cache entry will be
marked as complete always as the inner node will return NULL on the
first call. So negative entries will even work in the semi-join case.

Here's a demo of the negative entries working with normal joins:

create table t0 (a int);
insert into t0 select 0 from generate_Series(1,1000000);
create table t1 (a int primary key);
insert into t1 select x from generate_series(1,1000000)x;
vacuum analyze t0,t1;
explain (analyze, costs off, timing off, summary off)
select * from t0 inner join t1 on t0.a=t1.a;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop (actual rows=0 loops=1)
   ->  Seq Scan on t0 (actual rows=1000000 loops=1)
   ->  Memoize (actual rows=0 loops=1000000)
         Cache Key: t0.a
         Cache Mode: logical
         Hits: 999999  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
         ->  Index Only Scan using t1_pkey on t1 (actual rows=0 loops=1)
               Index Cond: (a = t0.a)
               Heap Fetches: 0

David



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Next
From: Tomas Vondra
Date:
Subject: Re: logical decoding and replication of sequences, take 2