Re: Parallel copy - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Parallel copy
Date
Msg-id CALj2ACXQPiPhqShCgdPM3_PZqydxPHvefg3E1N5kNBh2jD08mQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel copy  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel copy  (Greg Nancarrow <gregn4422@gmail.com>)
Re: Parallel copy  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Thu, Sep 17, 2020 at 11:06 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Sep 16, 2020 at 1:20 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > Fortunately I have been given permission to share the exact table
> > definition and data I used, so you can check the behaviour and timings
> > on your own test machine.
> >
>
> Thanks Greg for the script. I ran your test case and I didn't observe
> any increase in exec time with 1 worker, indeed, we have benefitted a
> few seconds from 0 to 1 worker as expected.
>
> Execution time is in seconds. Each test case is executed 3 times on
> release build. Each time the data directory is recreated.
>
> Case 1: 1000000 rows, 2GB
> With Patch, default configuration, 0 worker: 88.933, 92.261, 88.423
> With Patch, default configuration, 1 worker: 73.825, 74.583, 72.678
>
> With Patch, custom configuration, 0 worker: 76.191, 78.160, 78.822
> With Patch, custom configuration, 1 worker: 61.289, 61.288, 60.573
>
> Case 2: 2550000 rows, 5GB
> With Patch, default configuration, 0 worker: 246.031, 188.323, 216.683
> With Patch, default configuration, 1 worker: 156.299, 153.293, 170.307
>
> With Patch, custom configuration, 0 worker: 197.234, 195.866, 196.049
> With Patch, custom configuration, 1 worker: 157.173, 158.287, 157.090
>

Hi Greg,

If you still observe the issue in your testing environment, I'm attaching a testing patch(that applies on top of the latest parallel copy patch set i.e. v5 1 to 6) to capture various timings such as total copy time in leader and worker, index and table insertion time, leader and worker waiting time. These logs are shown in the server log file.

Few things to follow before testing:
1. Is the table being dropped/truncated after the test with 0 workers and before running with 1 worker? If not, then the index insertion time would increase.[1](for me it is increasing by 10 sec). This is obvious because the 1st time index will be created from bottom up manner(from leaves to root), but for the 2nd time it has to search and insert at the proper leaves and inner B+Tree nodes.
2. If possible, can you also run with custom postgresql.conf settings[2] along with default? Just to ensure that other bg processes such as checkpointer, autovacuum, bgwriter etc. don't affect our testcase. For instance, with default postgresql.conf file, it looks like checkpointing[3] is happening frequently, could you please let us know if that happens at your end?
3. Could you please run the test case 3 times at least? Just to ensure the consistency of the issue.
4. I ran the tests in a performance test system where no other user processes(except system processes) are running. Is it possible for you to do the same?

Please capture and share the timing logs with us.

Here's a snapshot of how the added timings show up in the logs: ( I captured this with your test case case 1: 1000000 rows, 2GB, custom postgresql.conf file settings[2]).
with 0 workers:
2020-09-22 10:49:27.508 BST [163910] LOG:  totaltableinsertiontime = 24072.034 ms
2020-09-22 10:49:27.508 BST [163910] LOG:  totalindexinsertiontime = 60.682 ms
2020-09-22 10:49:27.508 BST [163910] LOG:  totalcopytime = 59664.594 ms

with 1 worker:
2020-09-22 10:53:58.409 BST [163947] LOG:  totalcopyworkerwaitingtime = 59.815 ms
2020-09-22 10:53:58.409 BST [163947] LOG:  totaltableinsertiontime = 23585.881 ms
2020-09-22 10:53:58.409 BST [163947] LOG:  totalindexinsertiontime = 30.946 ms
2020-09-22 10:53:58.409 BST [163947] LOG:  totalcopytimeworker = 47047.956 ms
2020-09-22 10:53:58.429 BST [163946] LOG:  totalcopyleaderwaitingtime = 26746.744 ms
2020-09-22 10:53:58.429 BST [163946] LOG:  totalcopytime = 47150.002 ms

[1]
0 worker:
LOG:  totaltableinsertiontime = 25491.881 ms
LOG:  totalindexinsertiontime = 14136.104 ms
LOG:  totalcopytime = 75606.858 ms
table is not dropped and so are indexes
1 worker:
LOG:  totalcopyworkerwaitingtime = 64.582 ms
LOG:  totaltableinsertiontime = 21360.875 ms
LOG:  totalindexinsertiontime = 24843.570 ms
LOG:  totalcopytimeworker = 69837.162 ms
LOG:  totalcopyleaderwaitingtime = 49548.441 ms
LOG:  totalcopytime = 69997.778 ms

[2]
custom postgresql.conf configuration:
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

[3]
LOG:  checkpoints are occurring too frequently (14 seconds apart)
HINT:  Consider increasing the configuration parameter "max_wal_size".

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: OpenSSL 3.0.0 compatibility
Next
From: Dilip Kumar
Date:
Subject: Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.