Re: Parallel copy - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Parallel copy |
Date | |
Msg-id | CALj2ACWeQVd-xoQZHGT01_33St4xPoZQibWz46o7jW1PE3XOqQ@mail.gmail.com |
In response to | Re: Parallel copy (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
I did performance testing on v7 patch set[1] with custom postgresql.conf[2].
The results are of the triplet form (exec time in sec, number of workers, gain).

Use case 1: 10 million rows, 5.2GB data, 2 indexes on integer columns, 1 index on text column, binary file
(1104.898, 0, 1X), (1112.221, 1, 1X), (640.236, 2, 1.72X), (335.090, 4, 3.3X),
(200.492, 8, 5.51X), (131.448, 16, 8.4X), (121.832, 20, 9.1X), (124.287, 30, 8.9X)

Use case 2: 10 million rows, 5.2GB data, 2 indexes on integer columns, 1 index on text column, copy from stdin, csv format
(1203.282, 0, 1X), (1135.517, 1, 1.06X), (655.140, 2, 1.84X), (343.688, 4, 3.5X),
(203.742, 8, 5.9X), (144.793, 16, 8.31X), (133.339, 20, 9.02X), (136.672, 30, 8.8X)

Use case 3: 10 million rows, 5.2GB data, 2 indexes on integer columns, 1 index on text column, text file
(1165.991, 0, 1X), (1128.599, 1, 1.03X), (644.793, 2, 1.81X), (342.813, 4, 3.4X),
(204.279, 8, 5.71X), (139.986, 16, 8.33X), (128.259, 20, 9.1X), (132.764, 30, 8.78X)

Above results are similar to the results with earlier versions of the patch set.

On Fri, Oct 9, 2020 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Sure, you need to change the code such that when force_parallel_mode =
> 'regress' is specified then it always uses one worker. This is
> primarily for testing purposes and will help during the development of
> this patch as it will make all existing Copy tests to use quite a good
> portion of the parallel infrastructure.
>

I performed force_parallel_mode = regress testing and found 2 issues; the fixes for the same are available in v7 patch set[1].

> >
> > Overall, we have below test cases to cover the code and for performance measurements. We plan to run these tests whenever a new set of patches is posted.
> >
> > 1. csv
> > 2. binary
>
> Don't we need the tests for plain text files as well?
>

I added a text use case, and the perf results mentioned above are on v7 patch set[1].

> >
> > 3. force parallel mode = regress
> > 4. toast data csv and binary
> > 5. foreign key check, before row, after row, before statement, after statement, instead of triggers
> > 6. partition case
> > 7. foreign partitions and partitions having trigger cases
> > 8. where clause having parallel unsafe and safe expression, default parallel unsafe and safe expression
> > 9. temp, global, local, unlogged, inherited tables cases, foreign tables
> >
>
> Sounds like good coverage. So, are you doing all this testing
> manually? How are you maintaining these tests?
>

All test cases listed above, except for the cases that are meant to measure perf gain with huge data, are present in v7-0005 patch in v7 patch set[1].

[1] https://www.postgresql.org/message-id/CALDaNm1n1xW43neXSGs%3Dc7zt-mj%2BJHHbubWBVDYT9NfCoF8TuQ%40mail.gmail.com

[2]
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
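
For readers who want to re-derive the gain column of the triplets above, here is a minimal sketch. It assumes gain means the 0-worker exec time divided by the exec time at a given worker count, which matches the reported figures to within rounding. The names `use_case_1`, `gains` and `best_workers` are made up for this illustration and are not part of the patch set; the timings are copied from use case 1 (binary file).

```python
# Exec times in seconds keyed by number of workers, copied from use case 1.
use_case_1 = {
    0: 1104.898,
    1: 1112.221,
    2: 640.236,
    4: 335.090,
    8: 200.492,
    16: 131.448,
    20: 121.832,
    30: 124.287,
}


def gains(times):
    """Return gain per worker count, assuming gain = time(0 workers) / time(n workers)."""
    base = times[0]
    return {workers: base / secs for workers, secs in times.items()}


def best_workers(times):
    """Return the worker count with the lowest exec time."""
    return min(times, key=times.get)


if __name__ == "__main__":
    for workers, gain in gains(use_case_1).items():
        print(f"{workers:>2} workers: {gain:.2f}X")
    print("fastest run used", best_workers(use_case_1), "workers")
```

The computed values may differ from the figures in the mail in the last digit, since the mail does not say how the gains were rounded. The same two helpers can be applied to the use case 2 and use case 3 timings to see where the gain stops improving as workers are added.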