Re: parallel joins, and better parallel explain - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: parallel joins, and better parallel explain
Date
Msg-id CAFiTN-uAkSaSKLZrzk8_VfQ64+9dX8i-w4SSkQGHMKxtQYzTbw@mail.gmail.com
In response to Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Dec 24, 2015 at 4:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 23, 2015 at 2:34 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> Yeah right, after applying all three patches this problem is fixed; now
>> parallel hash join is faster than normal hash join.
>>
>> I have tested one more case which Amit mentioned, and I can see that in
>> that case the parallel plan (parallel degree >= 3) is still slow. In the
>> normal case it selects "Hash Join", but with parallel degree >= 3 it
>> selects a parallel "Nested Loop Join", which makes it costlier.
>
> While investigating this problem, I discovered that I can produce a
> regression even on unpatched master:

yeah, right..

> But this is not entirely the fault of the parallel query code.  If you
> force a seqscan-over-seqscan plan in the non-parallel case, it
> estimates the join cost as 287772.00, only slightly more than the
> 261522.02 cost units it thinks a non-parallel hash join will cost.  In
> fact, however, the non-parallel hash join runs in 1.2 seconds and the
> non-parallel nested loop takes 46 seconds.

right..

> Updated patch attached.

Thanks for the updated patch...

I have found a regression in one more scenario, in Hash Join.

Scenario:
--------------
I tried to create an inner table such that the inner table's data doesn't fit in RAM (I created a VM with 1GB RAM).
The purpose of this is to make the disk scan dominant: since the parallel join repeats the disk scan and the hash table build of the inner table in every participating process, there will be a lot of parallel I/O, and the plan has to pay that penalty.

I think that even though the inner table scan and hash table build happen in parallel, the large amount of parallel I/O will become the bottleneck.

Do we need to consider the cost of parallel I/O as well? I am not sure whether we can really do that... ?

Note: I have tested with a 1GB RAM machine and an 8GB RAM machine.
The regression on the 8GB RAM machine is smaller than on the 1GB RAM machine.
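To illustrate the concern, here is a rough back-of-envelope model (a sketch with illustrative unit costs, not the planner's actual formulas): in the design being tested, the leader and each of the W workers independently scans t2 and builds its own private hash table, so the inner-side I/O is done W+1 times, while only the outer scan of t1 is divided among the participants. The page counts below are approximations read off the seq scan costs in the plans that follow.

```python
# Back-of-envelope model of why the parallel hash join loses when the
# inner side is I/O-bound.  Unit costs and page counts are illustrative.

def hash_join_work(outer_pages, inner_pages, workers):
    """Total I/O work units summed across all processes.

    Assumes (as in the patch under discussion) that the leader and every
    worker each scan the inner table and build a private hash table,
    while the outer scan is split among them, so its total is unchanged.
    """
    participants = workers + 1                # the leader also executes the plan
    inner_work = inner_pages * participants   # inner scan repeated per participant
    outer_work = outer_pages                  # outer pages read once in total
    return inner_work + outer_work

# Roughly t1 ~ 1.7M pages (outer) and t2 ~ 260k pages (inner),
# per the seq scan costs in the plans below.
serial   = hash_join_work(1_700_000, 260_000, workers=0)
parallel = hash_join_work(1_700_000, 260_000, workers=6)
print(serial, parallel)  # the parallel plan does ~1.5M extra inner-side page reads
```

When those extra inner-side reads hit a disk that is already the bottleneck, the wall-clock time can grow even though the outer scan is divided 7 ways, which matches the 1GB numbers below.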


create table t1 (c1 int, c2 int, c3 text);
create table t2 (c1 int, c2 int, c3 text);

insert into t1 values(generate_series(1,100000000), generate_series(1,100000000), repeat('x', 100));
insert into t2 values(generate_series(1,48000000), generate_series(1,48000000), repeat('x', 5));

analyze t1;
analyze t2;

Test with 1GB RAM:
-----------------------------

postgres=# set max_parallel_degree=0;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12248485.55..12248485.56 rows=1 width=0) (actual time=147490.455..147490.455 rows=1 loops=1)
   ->  Hash Join  (cost=1526963.25..12208485.47 rows=16000033 width=0) (actual time=26652.871..143368.989 rows=47999950 loops=1)
         Hash Cond: (t1.c1 = t2.c1)
         Join Filter: ((t2.c2 + t1.c1) > 100)
         Rows Removed by Join Filter: 50
         ->  Seq Scan on t1  (cost=0.00..2742412.72 rows=100005072 width=4) (actual time=130.580..40127.004 rows=100000000 loops=1)
         ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=26500.439..26500.439 rows=48000000 loops=1)
               Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
               ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.039..11402.343 rows=48000000 loops=1)
 Planning time: 0.410 ms
 Execution time: 147490.553 ms
(11 rows)

postgres=# set max_parallel_degree=6;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4969933.98..4969933.99 rows=1 width=0) (actual time=386024.487..386024.488 rows=1 loops=1)
   ->  Gather  (cost=1527963.25..4929933.89 rows=16000033 width=0) (actual time=199190.138..379487.861 rows=47999950 loops=1)
         Number of Workers: 6
         ->  Hash Join  (cost=1526963.25..3328930.59 rows=16000033 width=0) (actual time=178885.161..320724.381 rows=6857136 loops=7)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 7
               ->  Parallel Seq Scan on t1  (cost=0.00..421909.65 rows=15385396 width=4) (actual time=106.403..11735.643 rows=14285714 loops=7)
               ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=177959.433..177959.433 rows=48000000 loops=7)
                     Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
                     ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.022..20778.693 rows=48000000 loops=7)
 Planning time: 0.372 ms
 Execution time: 386025.056 ms


Test with 8GB RAM:
---------------------------

postgres=# set max_parallel_degree=0;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12229853.83..12229853.84 rows=1 width=0) (actual time=111113.286..111113.286 rows=1 loops=1)
   ->  Hash Join  (cost=1526963.25..12189853.75 rows=16000033 width=0) (actual time=15830.319..108557.658 rows=47999950 loops=1)
         Hash Cond: (t1.c1 = t2.c1)
         Join Filter: ((t2.c2 + t1.c1) > 100)
         Rows Removed by Join Filter: 50
         ->  Seq Scan on t1  (cost=0.00..2724138.00 rows=100000000 width=4) (actual time=3.515..43207.798 rows=100000000 loops=1)
         ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=15436.088..15436.088 rows=48000000 loops=1)
               Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
               ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.677..6290.310 rows=48000000 loops=1)
 Planning time: 0.287 ms
 Execution time: 111113.358 ms
(11 rows)

postgres=# set max_parallel_degree=6;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6538149.22..6538149.23 rows=1 width=0) (actual time=172636.184..172636.184 rows=1 loops=1)
   ->  Gather  (cost=1527963.25..6498149.14 rows=16000033 width=0) (actual time=40952.576..168973.552 rows=47999950 loops=1)
         Number of Workers: 6
         ->  Hash Join  (cost=1526963.25..4897145.84 rows=16000033 width=0) (actual time=41109.818..151129.893 rows=6857136 loops=7)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 7
               ->  Parallel Seq Scan on t1  (cost=0.00..1890804.67 rows=16666667 width=4) (actual time=0.492..86241.998 rows=14285714 loops=7)
               ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=40936.920..40936.920 rows=48000000 loops=7)
                     Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
                     ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.024..22644.484 rows=48000000 loops=7)
 Planning time: 2.668 ms
 Execution time: 172636.647 ms
(13 rows)


-- 

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
