Re: Introducing coarse grain parallelism by postgres_fdw. - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Introducing coarse grain parallelism by postgres_fdw. |
Date | |
Msg-id | CAFjFpRe3DMxnqqmL09yeyuSmtMLnJ8saDkwVKQdB2TOwppqg0w@mail.gmail.com Whole thread Raw |
In response to | Re: Introducing coarse grain parallelism by postgres_fdw. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: Introducing coarse grain parallelism by postgres_fdw.
Re: Introducing coarse grain parallelism by postgres_fdw. |
List | pgsql-hackers |
Hi Kyotaro,
I looked at the patches and felt that the approach taken here is too intrusive, considering that the feature is only for foreign scans.On Fri, Aug 1, 2014 at 2:48 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,But now the effect of async execution of FETCH'es is omitted
> Hello, this is the new version which is complete to some extent
> of parallelism based on postgres_fdw.
>
> This compares the costs for parallel and non-parallel execution
> and choose parallel one if it is faster by some extent specified
> by GUCs. The attached files are,
>
> 0001_parallel_exec_planning_v0.patch:
> - PostgreSQL body stuff for parallel execution planning.
>
> 0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
> - postgres_fdw parallelization.
>
> 0003_file_fdw_changes_to_avoid_error.patch:
> - error avoidig stuff for file_fdw (not necessary for this patch)
>
> env.sql:
> - simple test script to try this patch.
>
> =====
>
> - planner stuff to handle cost of parallel execution. Including
> indication of parallel execution.
>
> - GUCs to control how easy to go parallel.
>
> parallel_cost_threshold is the threshold of path total cost
> where to enable parallel execution.
>
> prallel_ratio_threshond is the threshold of the ratio of
> parallel cost to non-parallel cost where to choose the
> parallel path.
>
> - postgres_fdw which can run in multiple sessions using snapshot
> export and fetches in parallel for foreign scans on dedicated
> connections.
during planning.
> foreign server has a new option 'max_aux_connections', which
> limits the number of connections for parallel execution per
> (server, user) pairs.
>
> - change file_fdw to follow the changes of planner stuff.
>
>
> Whth the patch attached, the attached sql script shows the
> following result (after some line breaks are added).
>
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1_2 b on (a.a = b.a);
> QUERY PLAN
> ----------------------------------------------------------------------------
> Hash Join (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
> (actual time=2213.400..2213.407 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a
> (cost=9573392.96..9573393.29 rows=10 width=8 parallel)
> (actual time=2199.992..2199.993 rows=10 loops=1)
> -> Hash (cost=9573393.29..9573393.29 rows=10 width=36)
> (actual time=13.388..13.388 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1_2 b
> (cost=9573392.96..9573393.29 rows=10 width=36 parallel)
> (actual time=13.376..13.379 rows=10 loops=1)
> Planning time: 4.761 ms
> Execution time: 2227.462 ms
> (8 rows)
> postgres=# SET parallel_ratio_threshold to 0.0;
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1 b on (a.a = b.a);
> QUERY PLAN
> ------------------------------------------------------------------------------
> Hash Join (cost=318084.32..318084.69 rows=1 width=40)
> (actual time=4302.913..4302.928 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a (cost=159041.93..159042.26 rows=10 width=8)
> (actual time=2122.989..2122.992 rows=10 loops=1)
> -> Hash (cost=159042.26..159042.26 rows=10 width=500)
> (actual time=2179.900..2179.900 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1 b
> (cost=159041.93..159042.26 rows=10 width=500)
> (actual time=2179.856..2179.864 rows=10 loops=1)
> Planning time: 5.085 ms
> Execution time: 4303.728 ms
> (8 rows)
>
> Where, "parallel" indicates that the node includes nodes run in
> parallel. The latter EXPLAIN shows the result when parallel
> execution is inhibited.
>
> Since the lack of time, sorry that the details for this patch is
> comming later.
>
> Is there any suggestions or opinions?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
pgsql-hackers by date: