Home > mailing lists

RE: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From	Tang, Haiying
Subject	RE: Parallel Inserts in CREATE TABLE AS
Date	January 27, 2021 07:55:33
Msg-id	1f854b422c7c45b1993a8588c312a30d@G08CNEXMBPEKD05.g08.fujitsu.local Whole thread
In response to	RE: Parallel Inserts in CREATE TABLE AS ("Tang, Haiying" <tanghy.fnst@cn.fujitsu.com>)
Responses	Re: Parallel Inserts in CREATE TABLE AS
List	pgsql-hackers

Tree view

Hi Bharath,

I choose 5 cases which pick parallel insert plan in CTAS to measure the patched performance. Each case run 30 times.

Most of the tests execution become faster with this patch.

However, Test NO 4(create table xxx as table xxx.) appears performance degradation. I tested various table size(2/10/20 millions), they all have a 6%-10% declines. I think it may need some check at this problem.

Below are my test results. 'Test NO' is corresponded to 'Test NO' in attached test_ctas.sql file.

reg%=(patched-master)/master

Test NO | Test Case |reg% | patched(ms) | master(ms)

--------|--------------------------------|------|--------------|-------------

1 | CTAS select from table | -9% | 16709.50477 | 18370.76660

2 | Append plan | -14% | 16542.97807 | 19305.86600

3 | initial plan under Gather node| -5% | 13374.27187 | 14120.02633

4 | CTAS table | 10% | 20835.48800 | 18986.40350

5 | CTAS select from execute | -6% | 16973.73890 | 18008.59789

About Test NO 4:

In master(HEAD), this test case picks serial seq scan.

query plan likes:

----------------------------------------------------------------------------------------------------------------------------------------------------------

Seq Scan on public.tenk1 (cost=0.00..444828.12 rows=10000012 width=244) (actual time=0.005..1675.268 rows=10000000 loops=1)

Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4 Planning Time: 0.053 ms Execution Time: 20165.023 ms

With this patch, it will choose parallel seq scan and parallel insert.

query plan likes:

----------------------------------------------------------------------------------------------------------------------------------------------------------

Gather (cost=1000.00..370828.03 rows=10000012 width=244) (actual time=20428.823..20437.143 rows=0 loops=1)

Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4

Workers Planned: 4

Workers Launched: 4

-> Create test

-> Parallel Seq Scan on public.tenk1 (cost=0.00..369828.03 rows=2500003 width=244) (actual time=0.021..411.094 rows=2000000 loops=5)

Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4

Worker 0: actual time=0.023..390.856 rows=1858407 loops=1

Worker 1: actual time=0.024..468.587 rows=2264494 loops=1

Worker 2: actual time=0.023..473.170 rows=2286580 loops=1

Worker 3: actual time=0.027..373.727 rows=1853216 loops=1 Planning Time: 0.053 ms Execution Time: 20437.643 ms

test machine spec:

CentOS 8.2, 128G RAM, 40 processors, disk SAS

Regards,

Tang

Attachment

test_ctas.sql

pgsql-hackers by date:

From: Michael Paquier
Date: 27 January 2021, 07:39:23
Subject: Re: Support for NSS as a libpq TLS backend

From: Keisuke Kuroda
Date: 27 January 2021, 07:58:44
Subject: Re: simplifying foreign key/RI checks

RE: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

Attachment

Previous

Next