Re: Hybrid Hash/Nested Loop joins and caching results from subplans - Mailing list pgsql-hackers

From: David Rowley
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date:
Msg-id: CAApHDvpDdQDFSM+u19ROinT0qw41OX=MW4-B2mO003v6-X0AjA@mail.gmail.com
In response to: Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (David Rowley <dgrowleyml@gmail.com>)
Responses: Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On Sat, 29 Aug 2020 at 02:54, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 26 Aug 2020 at 03:52, Andres Freund <andres@anarazel.de> wrote:
> > There'll be a significant reduction in increase in performance.
>
> So I did a very rough-cut change to the patch to have the caching be
> part of Nested Loop.  It can be applied on top of the other 3 v7
> patches.
>
> For the performance, the test I did results in the performance
> actually being reduced from having the Result Cache as a separate
> node.  The reason for this is mostly because Nested Loop projects.

I spoke to Andres off-list this morning in regards to what can be done
to remove this performance regression over the separate Result Cache
node version of the patch. He mentioned that I could create another
ProjectionInfo for when reading from the cache's slot and use that to
project with.

I've hacked this up in the attached. It looks like another version of
the joinqual would also need to be created so that the MinimalTuple
from the cache is properly deformed. I've not done this yet.
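For anyone following along, the cache behaviour that the plans below report (Cache Key, Hits, Misses) can be sketched in a few lines. This is a toy model with made-up names, not PostgreSQL code: the inner side's rows are kept in a hash table keyed by the outer join key, so repeated outer values hit the cache instead of rescanning the inner side.

```python
# Toy sketch of the Result Cache idea (hypothetical names, not
# PostgreSQL internals): cache the inner scan's rows per join key.
class ResultCache:
    def __init__(self, inner_scan):
        self.inner_scan = inner_scan  # callable: key -> list of rows
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = self.inner_scan(key)
        return self.cache[key]

# 100,000 outer rows that all carry the same join key value, as in the
# test queries below, produce exactly 1 miss and 99,999 hits.
lookup = {1: list(range(100))}  # stand-in for the inner relation
rc = ResultCache(lambda k: lookup.get(k, []))
for _ in range(100000):
    rows = rc.lookup(1)
print(rc.hits, rc.misses)  # 99999 1
```

This matches the "Hits: 99999  Misses: 1" counters reported in the plans below, since hk.one holds a single distinct value.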

The performance does improve this time. Using the same two test
queries from [1], I get:

v7 (Separate Result Cache node)

Query 1:
postgres=# explain (analyze, timing off) select count(l.a) from hundredk hk inner join lookup100 l on hk.one = l.a;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=152861.19..152861.20 rows=1 width=8) (actual rows=1 loops=1)
   ->  Nested Loop  (cost=0.45..127891.79 rows=9987763 width=4) (actual rows=10000000 loops=1)
         ->  Seq Scan on hundredk hk  (cost=0.00..1637.00 rows=100000 width=4) (actual rows=100000 loops=1)
         ->  Result Cache  (cost=0.45..2.53 rows=100 width=4) (actual rows=100 loops=100000)
               Cache Key: hk.one
               Hits: 99999  Misses: 1  Evictions: 0  Overflows: 0
               ->  Index Only Scan using lookup100_a_idx on lookup100 l  (cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
                     Index Cond: (a = hk.one)
                     Heap Fetches: 0
 Planning Time: 0.045 ms
 Execution Time: 894.003 ms
(11 rows)

Query 2:
postgres=# explain (analyze, timing off) select * from hundredk hk inner join lookup100 l on hk.one = l.a;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.45..127891.79 rows=9987763 width=28) (actual rows=10000000 loops=1)
   ->  Seq Scan on hundredk hk  (cost=0.00..1637.00 rows=100000 width=24) (actual rows=100000 loops=1)
   ->  Result Cache  (cost=0.45..2.53 rows=100 width=4) (actual rows=100 loops=100000)
         Cache Key: hk.one
         Hits: 99999  Misses: 1  Evictions: 0  Overflows: 0
         ->  Index Only Scan using lookup100_a_idx on lookup100 l  (cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
               Index Cond: (a = hk.one)
               Heap Fetches: 0
 Planning Time: 0.077 ms
 Execution Time: 854.950 ms
(10 rows)

v7 + hacks_V3 (caching done in Nested Loop)

Query 1:
explain (analyze, timing off) select count(l.a) from hundredk hk inner join lookup100 l on hk.one = l.a;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=378570.41..378570.42 rows=1 width=8) (actual rows=1 loops=1)
   ->  Nested Loop Cached  (cost=0.43..353601.00 rows=9987763 width=4) (actual rows=10000000 loops=1)
         Cache Key: $0
         Hits: 99999  Misses: 1  Evictions: 0  Overflows: 0
         ->  Seq Scan on hundredk hk  (cost=0.00..1637.00 rows=100000 width=4) (actual rows=100000 loops=1)
         ->  Index Only Scan using lookup100_a_idx on lookup100 l  (cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
               Index Cond: (a = hk.one)
               Heap Fetches: 0
 Planning Time: 0.103 ms
 Execution Time: 770.470 ms
(10 rows)

Query 2:
explain (analyze, timing off) select * from hundredk hk inner join lookup100 l on hk.one = l.a;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Nested Loop Cached  (cost=0.43..353601.00 rows=9987763 width=28) (actual rows=10000000 loops=1)
   Cache Key: $0
   Hits: 99999  Misses: 1  Evictions: 0  Overflows: 0
   ->  Seq Scan on hundredk hk  (cost=0.00..1637.00 rows=100000 width=24) (actual rows=100000 loops=1)
   ->  Index Only Scan using lookup100_a_idx on lookup100 l  (cost=0.43..2.52 rows=100 width=4) (actual rows=100 loops=1)
         Index Cond: (a = hk.one)
         Heap Fetches: 0
 Planning Time: 0.090 ms
 Execution Time: 779.181 ms
(9 rows)
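Working the reported execution times through, the caching-in-Nested-Loop version comes out ahead on both queries; a quick back-of-the-envelope check on the numbers above:

```python
# Percentage improvement in execution time, taken straight from the
# EXPLAIN ANALYZE output above (v7 separate Result Cache node vs.
# v7 + hacks_V3 caching inside Nested Loop), in milliseconds.
v7 = {"Query 1": 894.003, "Query 2": 854.950}
hacks_v3 = {"Query 1": 770.470, "Query 2": 779.181}

for q in v7:
    gain = 100.0 * (v7[q] - hacks_v3[q]) / v7[q]
    print(f"{q}: {gain:.1f}% faster")
# Query 1: 13.8% faster
# Query 2: 8.9% faster
```

So the extra ProjectionInfo recovers the regression and then some, at least on these two queries.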

Also, I'd just like to reiterate that the attached is a very rough-cut
implementation that I've put together purely for performance
comparison, to help move this conversation along. (I do know that I'm
breaking the const qualifier on PlanState's innerops.)

David

[1] https://www.postgresql.org/message-id/CAApHDvqt5U6VcKSm2G9Q1n4rsHejL-VX7QG9KToAQ0HyZymSzQ@mail.gmail.com

Attachment
