Re: Hybrid Hash/Nested Loop joins and caching results from subplans - Mailing list pgsql-hackers

From David Rowley
Subject Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date
Msg-id CAApHDvoccsfKtZ1JDnwDvZTFkGFWJcVW2eCbwY6vN+5oTL2xjA@mail.gmail.com
Whole thread Raw
In response to Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Mon, 9 Nov 2020 at 16:29, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> I think either version is OK for me and I like this patch overall.

That's good to know. Thanks.

>  However I believe v9
> should be no worse than v8 all the time,  Is there any theory to explain
> your result?

Nothing jumps out at me from looking at profiles.   The only thing I
noticed was the tuple deforming is more costly with v9. I'm not sure
why.

The other part of v9 that I don't have a good solution for yet is the
code around the swapping of the projection info for the Nested Loop.
The cache always uses a MinimalTupleSlot, but we may have a
VirtualSlot when we get a cache miss. If we do then we need to
initialise 2 different projection infos so when we project from the
cache that we have the step to deform the minimal tuple. That step is
not required when the inner slot is a virtual slot.

I did some further testing on performance. Basically, I increased the
size of the tests by 2 orders of magnitude. Instead of 100k rows, I
used 10million rows. (See attached
resultcache_master_vs_v8_vs_v9_big.csv)

Loading that in with:

# create table resultcache_bench2 (tbl text, target text, col text,
latency_master numeric(10,3), latency_v8 numeric(10,3), latency_v9
numeric(10,3));
# copy resultcache_bench2 from
'/path/to/resultcache_master_vs_v8_vs_v9_big.csv' with(format csv);

I see that v8 still wins.

postgres=# select round(avg(latency_v8/latency_master)*100,1) as
v8_vs_master, round(avg(latency_v9/latency_master)*100,1) as
v9_vs_master, round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9 from
resultcache_bench2;
 v8_vs_master | v9_vs_master | v8_vs_v9
--------------+--------------+----------
         56.7 |         58.8 |     97.3

Execution for all tests for v8 runs in 56.7% of master, but v9 runs in
58.8% of master's time. Full results in
resultcache_master_v8_vs_v9_big.txt.  v9 wins in 7 of 24 tests this
time. The best example test for v8 shows that v8 takes 90.6% of the
time of v9, but in the tests where v9 is faster, it only has a 4.3%
lead on v8 (95.7%). You can see that overall v8 is 2.7% faster than v9
for these tests.

David

Attachment

pgsql-hackers by date:

Previous
From: Dave Cramer
Date:
Subject: Re: Error on failed COMMIT
Next
From: Thomas Munro
Date:
Subject: Re: -Wformat-signedness