Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1+O2fZe88NpxmMne94bwBKUEcx-2yuAe-MTYpbuoJ0NCw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Parallel Seq Scan  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Mon, Dec 8, 2014 at 10:40 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Dec 6, 2014 at 5:37 PM, Stephen Frost <sfrost@snowman.net> wrote:
> >
>
> So to summarize my understanding, below are the set of things
> which I should work on and in the order they are listed.
>
> 1. Push down qualification
> 2. Performance Data
> 3. Improve the way to push down the information related to worker.
> 4. Dynamic allocation of work for workers.
>
>

I have worked on the patch to accomplish above mentioned points
1, 2 and partly 3 and would like to share the progress with community.
If the statement contain quals that don't have volatile functions, then
they will be pushed down and the parallel can will be considered for
cost evaluation.  I think eventually we might need some better way
to decide about which kind of functions are okay to be pushed.
I have also unified the way information is passed from master backend
to worker backends which is convert each node to string that has to be
passed and then later workers convert string to node, this has simplified
the related code.

I have taken performance data for different selectivity and complexity of
qual expressions, I understand that there will be other kind of scenario's
which we need to consider, however I think the current set of tests is good
place to start, please feel free to comment on kind of scenario's which you
want me to check

Performance Data
------------------------------
m/c details
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
non-default settings in postgresql.conf
max_connections=300
shared_buffers = 8GB
checkpoint_segments = 300
checkpoint_timeout = 30min
max_worker_processes=100

create table tbl_perf(c1 int, c2 char(1000));


30 million rows
------------------------
insert into tbl_perf values(generate_series(1,10000000),'aaaaa');
insert into tbl_perf values(generate_series(10000000,30000000),'aaaaa');
 
Function used in quals
-----------------------------------
A simple function which will perform some calculation and return
the value passed which can be used in qual condition.


create or replace function calc_factorial(a integer, fact_val integer) returns integer
as $$
begin
perform (fact_val)!;
        return a;
end;
$$ language plpgsql STABLE;    

In below data, 
num_workers - number of parallel workers configured using
parallel_seqscan_degree. 0, means it will execute sequence
scan and greater than 0 means parallel sequence scan.

exec_time - Execution Time given by Explain Analyze statement.

Tests having quals containing function evaluation in qual
expressions.

Test-1
Query - Explain analyze select c1 from tbl_perf where 
            c1 > calc_factorial(29700000,10) and c2 like '%aa%';
Selection_criteria – 1% of rows will be selected

num_workersexec_time (ms)
0229534
2121741
467051
835607
1624743


Test-2
Query - Explain analyze select c1 from tbl_perf where
 c1 > calc_factorial(27000000,10) and c2 like '%aa%';
Selection_criteria – 10% of rows will be selected

num_workersexec_time (ms)
0226671
2151587
493648
870540
1655466


Test-3
Query - Explain analyze select c1 from tbl_perf 
where c1 > calc_factorial(22500000,10) and c2 like '%aa%';
Selection_criteria – 25% of rows will be selected

num_workersexec_time (ms)
0232673
2197609
4142686
8111664
1698097


Tests having quals containing simple expressions in qual.

Test-4
Query - Explain analyze select c1 from tbl_perf
where c1 > 29700000 and c2 like '%aa%';
Selection_criteria – 1% of rows will be selected

num_workersexec_time (ms)
015505
29155
46030
84523
164459
328259
6413388

Test-5
Query - Explain analyze select c1 from tbl_perf
where c1 > 28500000 and c2 like '%aa%';
Selection_criteria – 5% of rows will be selected

num_workersexec_time (ms)
018906
213446
48970
87887
1610403

Test-6
Query - Explain analyze select c1 from tbl_perf
where c1 > 27000000 and c2 like '%aa%';
Selection_criteria – 10% of rows will be selected

num_workersexec_time (ms)
016132
223780
420275
811390
1611418


Conclusion
------------------
1. Parallel workers help a lot when there is an expensive qualification
to evaluated, the more expensive the qualification the more better are
results.
2. It works well for low selectivity quals and as the selectivity increases,
the benefit tends to go down due to additional tuple communication cost
between workers and master backend.
3. After certain point, increasing having more number of workers won't
help and rather have negative impact, refer Test-4.

I think as discussed previously we need to introduce 2 additional cost
variables (parallel_startup_cost, cpu_tuple_communication_cost) to
estimate the parallel seq scan cost so that when the tables are small
or selectivity is high, it should increase the cost of parallel plan.

Thoughts and feedback for the current state of patch is welcome. 


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: exitArchiveRecovery woes
Next
From: Heikki Linnakangas
Date:
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}