Re: parallel joins, and better parallel explain - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: parallel joins, and better parallel explain
Date
Msg-id CAFiTN-s1aBjTZdmNih_noiPm6TUQOV=soLgxcxKdGKjsq+-ZzQ@mail.gmail.com
In response to Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On Tue, Jan 5, 2016 at 1:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 4, 2016 at 4:50 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I tried to create an inner table such that the inner table's data doesn't
> fit in RAM (I created a VM with 1GB of RAM). The purpose of this is to
> make the disk scan dominant; since the parallel join repeats the disk scan
> and hash table build of the inner table, there will be a lot of parallel
> I/O and it has to pay that penalty.
>
> I think that even though the inner table scan and hash table build are
> parallel, the large amount of parallel I/O will become the bottleneck.

Hmm.  Because only 1/1024th of the hash table fits in work_mem, the
executor is going to have to write out all of the tuples that don't
belong to the first batch to a temporary file and then read them back
in.  So each backend is going to write essentially the entirety of t2
out to disk and then read it all back in again. The non-parallel case
will also write most of the table contents and then read them back in,
but at least it will only be doing that once rather than 7 times, so
it's not as bad.  Also, with fewer backends running, the non-parallel
case will have a bit more memory free for caching purposes.
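
To make that concrete with the numbers from the plans below: the Hash node reports "Batches: 1024" and "Memory Usage: 2856kB" per batch, so the full hash table is on the order of 2.8GB against a work_mem of only a few MB. As a minimal sketch of how work_mem drives the batch count (the work_mem value here is illustrative):

SET work_mem = '4MB';   -- ~2.8GB hash table / 4MB work_mem => ~1024 batches
EXPLAIN (ANALYZE)
SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
-- With "Batches: N" > 1 under the Hash node, each backend writes every
-- tuple that doesn't belong to the current batch to a temp file and later
-- reads it back, which is exactly the write-and-reread I/O described above.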

> Do we need to consider the cost of parallel I/O as well? I am not sure we
> can really do that...?

It seems to me that the problem here is that you've set
max_parallel_degree to an unrealistically high value.  The query
planner is entitled to assume that the user has set
max_parallel_degree to a value which is small enough that the workers
won't be fighting too viciously with each other over resources.  It
doesn't really matter whether those resources are CPU resources or I/O
resources.  I'm wondering if your 1GB VM really even has as many as 7
vCPUs, because that would seem to be something of an unusual
configuration - and if it doesn't, then setting max_parallel_degree to
a value that high is certainly user error. Even if it does, it's still
not right to set the value as high as six unless the system also has
enough I/O bandwidth to accommodate the amount of I/O that you expect
your queries to generate, and here it seems like it probably doesn't.

To put that another way, you can always make parallel query perform
badly by telling it to use too many workers relative to the size of
the machine you have. This is no different than getting bad query
plans by configuring work_mem or effective_cache_size or any other
query planner GUC to a value that doesn't reflect the actual execution
environment.  I would only consider this to be a problem with the
parallel join patch if the chosen plan is slower even on a machine
that's big enough to justify setting max_parallel_degree=6 in the
first place.
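
As a quick sanity check of that advice before benchmarking, the configured degree can be compared against the hardware; a minimal sketch (the shell commands are Linux-specific and purely illustrative):

SHOW max_parallel_degree;         -- workers added on top of the leader
-- Compare with the cores the VM actually provides, e.g. from a shell:
--   nproc                        (counts hyperthreads as CPUs)
--   lscpu | grep 'Core(s)'       (physical cores per socket)
-- Rule of thumb: keep leader + workers <= physical cores, and the disks
-- should sustain (workers + 1) concurrent sequential scans of the inner table.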


Yes, that's a valid point. I had configured 6 processors for the virtual machine, but that was with hyper-threading. So this time I configured 8 processors and measured performance again with a lower parallel degree.

Even with the lower parallel degree there is some regression, but as you mentioned there may be other resource limits at play; for instance, I configured a 50GB disk and filled 20GB of it with data.

I think you are right: before coming to any conclusion, we need to test on a really high-end machine that has no resource constraints of its own.

RAM: 1GB
8-processor VM (Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz) --> the host CPU is an i7, so I doubt the VM really provides 8 physical cores; hence I tested with a lower parallel degree.
SSD: 50GB
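
For reference, a hedged reconstruction of the test tables from the row counts and widths in the plans below (the actual DDL used earlier in this thread may differ):

-- t1: ~100M rows, one int column; t2: ~48M rows, two int columns
-- (inferred from rows= and width= in the EXPLAIN output; illustrative only)
CREATE TABLE t1 (c1 int);
CREATE TABLE t2 (c1 int, c2 int);
INSERT INTO t1 SELECT generate_series(1, 100000000);
INSERT INTO t2 SELECT i, i FROM generate_series(1, 48000000) AS i;
ANALYZE t1;
ANALYZE t2;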


postgres=# set max_parallel_degree=3;
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                   QUERY PLAN                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=7920946.47..7920946.48 rows=1 width=0) (actual time=162329.829..162329.829 rows=1 loops=1)
   ->  Gather  (cost=1527963.25..7880946.39 rows=16000033 width=0) (actual time=58233.106..159140.629 rows=47999950 loops=1)
         Number of Workers: 3
         ->  Hash Join  (cost=1526963.25..6279943.09 rows=16000033 width=0) (actual time=58346.087..144309.987 rows=11999988 loops=4)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 12
               ->  Parallel Seq Scan on t1  (cost=0.00..2064959.01 rows=32259701 width=4) (actual time=98.514..27003.566 rows=25000000 loops=4)
               ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=58012.228..58012.228 rows=48000000 loops=4)
                     Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
                     ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=3.524..9634.181 rows=48000000 loops=4)
 Planning time: 1.945 ms
 Execution time: 162330.657 ms

postgres=# set max_parallel_degree=2;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                   QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8744354.81..8744354.82 rows=1 width=0) (actual time=133715.245..133715.245 rows=1 loops=1)
   ->  Gather  (cost=1527963.25..8704354.73 rows=16000033 width=0) (actual time=49240.892..130699.685 rows=47999950 loops=1)
         Number of Workers: 2
         ->  Hash Join  (cost=1526963.25..7103351.43 rows=16000033 width=0) (actual time=48916.074..116934.088 rows=15999983 loops=3)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 17
               ->  Parallel Seq Scan on t1  (cost=0.00..2159049.80 rows=41668780 width=4) (actual time=106.882..22650.646 rows=33333333 loops=3)
               ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=48670.370..48670.370 rows=48000000 loops=3)
                     Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
                     ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.618..7908.589 rows=48000000 loops=3)
 Planning time: 0.380 ms
 Execution time: 133715.932 ms
(13 rows)

postgres=# set max_parallel_degree=0;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                             QUERY PLAN                                                            
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12248485.55..12248485.56 rows=1 width=0) (actual time=92297.234..92297.234 rows=1 loops=1)
   ->  Hash Join  (cost=1526963.25..12208485.47 rows=16000033 width=0) (actual time=15739.911..89627.652 rows=47999950 loops=1)
         Hash Cond: (t1.c1 = t2.c1)
         Join Filter: ((t2.c2 + t1.c1) > 100)
         Rows Removed by Join Filter: 50
         ->  Seq Scan on t1  (cost=0.00..2742412.72 rows=100005072 width=4) (actual time=127.260..24826.175 rows=100000000 loops=1)
         ->  Hash  (cost=739461.00..739461.00 rows=48000100 width=8) (actual time=15560.002..15560.002 rows=48000000 loops=1)
               Buckets: 131072  Batches: 1024  Memory Usage: 2856kB
               ->  Seq Scan on t2  (cost=0.00..739461.00 rows=48000100 width=8) (actual time=0.834..6199.727 rows=48000000 loops=1)
 Planning time: 0.244 ms
 Execution time: 92298.000 ms
(11 rows)

One strange behaviour: after increasing the number of processors for the VM, the run with max_parallel_degree=0 also performs better.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
