Re: Hybrid Hash/Nested Loop joins and caching results from subplans - Mailing list pgsql-hackers

From David Rowley
Subject Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date
Msg-id CAApHDvq5_n8svmLHJJ0Vr_v2R3GTmsO9xuW-hU6RyYSVosyj0Q@mail.gmail.com
Whole thread Raw
In response to Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Andres Freund <andres@anarazel.de>)
Responses Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, 9 Jul 2020 at 04:53, Andres Freund <andres@anarazel.de> wrote:
>
> On 2020-05-20 23:44:27 +1200, David Rowley wrote:
> > I've attached a patch which implements this.  The new node type is
> > called "Result Cache".  I'm not particularly wedded to keeping that
> > name, but if I change it, I only want to do it once. I've got a few
> > other names I mind, but I don't feel strongly or confident enough in
> > them to go and do the renaming.
>
> I'm not convinced it's a good idea to introduce a separate executor node
> for this. There's a fair bit of overhead in them, and they will only be
> below certain types of nodes afaict. It seems like it'd be better to
> pull the required calls into the nodes that do parametrized scans of
> subsidiary nodes. Have you considered that?

I see 41 different node types mentioned in ExecReScan().  I don't
really think it would be reasonable to change all those.

Here are a couple of examples, one with a Limit below the Result Cache
and one with a GroupAggregate.

postgres=# explain (costs off) select * from pg_Class c1 where relname
= (select relname from pg_Class c2 where c1.relname = c2.relname
offset 1 limit 1);
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Seq Scan on pg_class c1
   Filter: (relname = (SubPlan 1))
   SubPlan 1
     ->  Result Cache
           Cache Key: c1.relname
           ->  Limit
                 ->  Index Only Scan using pg_class_relname_nsp_index
on pg_class c2
                       Index Cond: (relname = c1.relname)
(8 rows)


postgres=# explain (costs off) select * from pg_Class c1 where relname
= (select relname from pg_Class c2 where c1.relname = c2.relname group
by 1 having count(*) > 1);
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Seq Scan on pg_class c1
   Filter: (relname = (SubPlan 1))
   SubPlan 1
     ->  Result Cache
           Cache Key: c1.relname
           ->  GroupAggregate
                 Group Key: c2.relname
                 Filter: (count(*) > 1)
                 ->  Index Only Scan using pg_class_relname_nsp_index
on pg_class c2
                       Index Cond: (relname = c1.relname)
(10 rows)

As for putting the logic somewhere like ExecReScan() then the first
paragraph in [1] are my thoughts on that.

David

[1] https://www.postgresql.org/message-id/CAApHDvr-yx9DEJ1Lc9aAy8QZkgEZkTP=3hBRBe83Vwo=kAndcA@mail.gmail.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: jsonpath versus NaN
Next
From: Peter Geoghegan
Date:
Subject: Re: Index Skip Scan (new UniqueKeys)