Re: What to call an executor node which lazily caches tuples in a hash table? - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: What to call an executor node which lazily caches tuples in a hash table?
Date
Msg-id CALNJ-vQu_3peshD2WuFX8o4SYjS2GVTVZ0Z5hiaH72JUyhL_3w@mail.gmail.com
In response to What to call an executor node which lazily caches tuples in a hash table?  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: What to call an executor node which lazily caches tuples in a hash table?  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
Hi,
I was reading this part of the description:

the Result Cache's
hash table is much smaller than the hash join's due to result cache only
caching useful values rather than all tuples from the inner side of the join.

I think the word 'Result' should be part of the cache name considering the above.

Cheers

On Tue, Mar 30, 2021 at 4:30 PM David Rowley <dgrowleyml@gmail.com> wrote:
Hackers,

Over on [1] I've been working on adding a new type of executor node
which caches tuples in a hash table, keyed by a given cache key.

The sole current use of this node type is to sit between a
parameterized nested loop and its inner node, caching the inner
results for previously seen sets of parameters so that we can skip
rescanning the inner side for parameter values that we've already
cached.  The node could also be used to cache results from correlated
subqueries, although that's not done yet.
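
To illustrate the idea described above, here is a minimal Python sketch
(not the patch's C code). The names nested_loop_with_cache, key_of and
inner_scan are hypothetical and chosen only for this example. The inner
side is scanned only the first time a parameter value is seen; later
outer rows with the same value are served from the hash table.

```python
def nested_loop_with_cache(outer_rows, key_of, inner_scan):
    """Join each outer row to its inner rows, caching inner results per key.

    outer_rows: iterable of outer tuples
    key_of:     function mapping an outer tuple to its cache key (the parameters)
    inner_scan: function running the parameterized inner scan for one key
    """
    cache = {}  # cache key -> list of inner tuples, filled lazily
    for outer in outer_rows:
        key = key_of(outer)
        if key not in cache:
            # Cache miss: run the inner scan once and remember its result.
            cache[key] = list(inner_scan(key))
        # Cache hit (or freshly filled entry): no rescan needed.
        for inner in cache[key]:
            yield outer, inner
```

With outer values "r", "i", "r", the inner scan would run twice rather
than three times, because the second "r" is answered from the cache.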

The cache limits itself to not use more than hash_mem by evicting the
least recently used entries whenever more space is needed for new
entries.
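
The eviction behaviour can be sketched roughly as follows. This is an
illustration under simplifying assumptions, not the patch's
implementation: the class name LRUTupleCache, the max_bytes budget
(standing in for the hash_mem limit) and the size_of estimator are all
hypothetical.

```python
from collections import OrderedDict
import sys


class LRUTupleCache:
    """Cache of key -> tuples that evicts least recently used entries
    whenever a new entry would exceed the memory budget."""

    def __init__(self, max_bytes, size_of=sys.getsizeof):
        self.max_bytes = max_bytes    # assumed stand-in for the hash_mem limit
        self.size_of = size_of        # assumed per-tuple size estimator
        self.used = 0                 # total estimated size of cached tuples
        self.entries = OrderedDict()  # oldest (least recently used) first

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None               # cache miss
        self.entries.move_to_end(key) # mark as most recently used
        return entry[0]

    def store(self, key, tuples):
        size = sum(self.size_of(t) for t in tuples)
        if key in self.entries:
            # Replacing an entry: release its old space first.
            self.used -= self.entries.pop(key)[1]
        if size > self.max_bytes:
            return False              # too large to cache at all
        while self.used + size > self.max_bytes:
            # Evict from the least recently used end until the entry fits.
            _, (_, old_size) = self.entries.popitem(last=False)
            self.used -= old_size
        self.entries[key] = (tuples, size)
        self.used += size
        return True
```

A lookup counts as a use, so an entry that keeps getting hit stays in
the cache while entries that are never probed again are the first to go.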

Currently, in the patch, the node is named "Result Cache".  That name
was not carefully thought out. I just needed to pick something when
writing the code.

Here's an EXPLAIN output with the current name:

postgres=# explain (costs off) select relkind,c from pg_class c1,
lateral (select count(*) c from pg_class c2 where c1.relkind =
c2.relkind) c2;
                     QUERY PLAN
----------------------------------------------------
 Nested Loop
   ->  Seq Scan on pg_class c1
   ->  Result Cache
         Cache Key: c1.relkind
         ->  Aggregate
               ->  Seq Scan on pg_class c2
                     Filter: (c1.relkind = relkind)
(7 rows)

I just got off a team call with Andres, Thomas and Melanie. During the
call I mentioned that I didn't like the name "Result Cache". Many name
suggestions followed. Here are a few that were mentioned:

Probe Cache
Tuple Cache
Keyed Materialize
Hash Materialize
Result Cache
Cache
Hash Cache
Lazy Hash
Reactive Hash
Parameterized Hash
Parameterized Cache
Keyed Inner Cache
MRU Cache
MRU Hash

I was hoping to commit the final patch pretty soon, but thought I'd
have another go at seeing if we can get some consensus on a name
before doing that. Otherwise, I'd sort of assumed that we'd just reach
some consensus once everyone had complained about the current name
after the feature was committed.

My personal preference is "Lazy Hash", but I feel it might be better
to use the word "Reactive" instead of "Lazy".

There was some previous discussion on the name in [2]. I suggested
some other names in [3]. Andy voted for "Tuple Cache" in [4].

Votes? Other suggestions?

(I've included all the people who have shown some previous interest in
naming this node.)

David

[1] https://www.postgresql.org/message-id/flat/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CA%2BTgmoZMxLeanqrS00_p3xNsU3g1v3EKjNZ4dM02ShRxxLiDBw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAApHDvoj_sH1H3JVXgHuwnxf1FQbjRVOqqgxzOgJX13NiA9-cg%40mail.gmail.com
[4] https://www.postgresql.org/message-id/CAKU4AWoshM0JoymwBK6PKOFDMKg-OO6qtSVU_Piqb0dynxeL5w%40mail.gmail.com

