Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Parallel Inserts in CREATE TABLE AS |
Date | |
Msg-id | CALj2ACXQ_wfQBzaVqrTa+Tg2PwVke78-Y_qQZYgVijtM1S_GEA@mail.gmail.com Whole thread Raw |
In response to | RE: Parallel Inserts in CREATE TABLE AS ("Tang, Haiying" <tanghy.fnst@cn.fujitsu.com>) |
Responses |
Re: Parallel Inserts in CREATE TABLE AS
|
List | pgsql-hackers |
On Wed, Jan 27, 2021 at 1:25 PM Tang, Haiying <tanghy.fnst@cn.fujitsu.com> wrote: > I choose 5 cases which pick parallel insert plan in CTAS to measure the patched performance. Each case run 30 times. > > Most of the tests execution become faster with this patch. > > However, Test NO 4(create table xxx as table xxx.) appears performance degradation. I tested various table size(2/10/20millions), they all have a 6%-10% declines. I think it may need some check at this problem. > > > > Below are my test results. 'Test NO' is corresponded to 'Test NO' in attached test_ctas.sql file. > > reg%=(patched-master)/master > > Test NO | Test Case |reg% | patched(ms) | master(ms) > > --------|--------------------------------|------|--------------|------------- > > 1 | CTAS select from table | -9% | 16709.50477 | 18370.76660 > > 2 | Append plan | -14% | 16542.97807 | 19305.86600 > > 3 | initial plan under Gather node| -5% | 13374.27187 | 14120.02633 > > 4 | CTAS table | 10% | 20835.48800 | 18986.40350 > > 5 | CTAS select from execute | -6% | 16973.73890 | 18008.59789 > > > > About Test NO 4: > > In master(HEAD), this test case picks serial seq scan. > > query plan likes: > > ---------------------------------------------------------------------------------------------------------------------------------------------------------- > > Seq Scan on public.tenk1 (cost=0.00..444828.12 rows=10000012 width=244) (actual time=0.005..1675.268 rows=10000000 loops=1) > > Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1,stringu2, string4 Planning Time: 0.053 ms Execution Time: 20165.023 ms > > > > With this patch, it will choose parallel seq scan and parallel insert. > > query plan likes: > > ---------------------------------------------------------------------------------------------------------------------------------------------------------- > > Gather (cost=1000.00..370828.03 rows=10000012 width=244) (actual time=20428.823..20437.143 rows=0 loops=1) > > Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1,stringu2, string4 > > Workers Planned: 4 > > Workers Launched: 4 > > -> Create test > > -> Parallel Seq Scan on public.tenk1 (cost=0.00..369828.03 rows=2500003 width=244) (actual time=0.021..411.094 rows=2000000loops=5) > > Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even,stringu1, stringu2, string4 > > Worker 0: actual time=0.023..390.856 rows=1858407 loops=1 > > Worker 1: actual time=0.024..468.587 rows=2264494 loops=1 > > Worker 2: actual time=0.023..473.170 rows=2286580 loops=1 > > Worker 3: actual time=0.027..373.727 rows=1853216 loops=1 Planning Time: 0.053 ms Execution Time: 20437.643ms > > > > test machine spec: > > CentOS 8.2, 128G RAM, 40 processors, disk SAS Thanks a lot for the performance tests and test cases. I will analyze why the performance is degrading one case and respond soon. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: