Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From | Greg Nancarrow |
---|---|
Subject | Re: Parallel INSERT (INTO ... SELECT ...) |
Date | |
Msg-id | CAJcOf-dCT3QXGpmrWRM4P0y5Pt5PYGe6jmQ+ZLWteGvfdFLCoA@mail.gmail.com Whole thread Raw |
In response to | RE: Parallel INSERT (INTO ... SELECT ...) ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>) |
List | pgsql-hackers |
On Mon, Feb 8, 2021 at 8:13 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote: > > > > Did it actually use a parallel plan in your testing? > > > When I ran these tests with the Parallel INSERT patch applied, it did > > > not naturally choose a parallel plan for any of these cases. > > > > Yes, these cases pick parallel plan naturally on my test environment. > > > > postgres=# explain verbose insert into testscan select a from x where > > a<80000 or (a%2=0 and a>199900000); > > QUERY PLAN > > ---------------------------------------------------------------------- > > ----------------------------- > > Gather (cost=4346.89..1281204.64 rows=81372 width=0) > > Workers Planned: 4 > > -> Insert on public.testscan (cost=3346.89..1272067.44 rows=0 > > width=0) > > -> Parallel Bitmap Heap Scan on public.x1 > > (cost=3346.89..1272067.44 rows=20343 width=8) > > Output: x1.a, NULL::integer > > Recheck Cond: ((x1.a < 80000) OR (x1.a > 199900000)) > > Filter: ((x1.a < 80000) OR (((x1.a % 2) = 0) AND (x1.a > > > 199900000))) > > -> BitmapOr (cost=3346.89..3346.89 rows=178808 > > width=0) > > -> Bitmap Index Scan on x1_a_idx > > (cost=0.00..1495.19 rows=80883 width=0) > > Index Cond: (x1.a < 80000) > > -> Bitmap Index Scan on x1_a_idx > > (cost=0.00..1811.01 rows=97925 width=0) > > Index Cond: (x1.a > 199900000) > > > > PSA is my postgresql.conf file, maybe you can have a look. Besides, I didn't > > do any parameters tuning in my test session. > > I reproduced this on my machine. > > I think we'd better do "analyze" before insert which helps reproduce this easier. > Like: > > ----- > analyze; > explain analyze verbose insert into testscan select a from x where a<80000 or (a%2=0 and a>199900000); > ----- > Thanks, I tried test_bimap.sql in my own environment, and added "analyze", and I also found it naturally chose a parallel INSERT with parallel bitmap heap scan for each of these cases. However, I didn't see any performance degradation when compared against serial INSERT with bitmap heap scan. The parallel plan in these cases seems to run a bit faster. (Note that I'm using a release build of Postgres, and using default postgresql.conf) test=# set max_parallel_workers_per_gather=4; SET test=# explain analyze verbose insert into testscan select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Gather (cost=4193.29..1255440.94 rows=74267 width=0) (actual time=210.587..212.135 rows=0 loops=1) Workers Planned: 4 Workers Launched: 4 -> Insert on public.testscan (cost=3193.29..1247014.24 rows=0 width=0) (actual time=195.296..195.298 rows=0 loops=5) Worker 0: actual time=189.512..189.514 rows=0 loops=1 Worker 1: actual time=194.843..194.844 rows=0 loops=1 Worker 2: actual time=193.986..193.988 rows=0 loops=1 Worker 3: actual time=188.035..188.037 rows=0 loops=1 -> Parallel Bitmap Heap Scan on public.x (cost=3193.29..1247014.24 rows=18567 width=8) (actual time=7.992..25.837 row s=26000 loops=5) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 10000 Heap Blocks: exact=261 Worker 0: actual time=1.473..14.458 rows=22465 loops=1 Worker 1: actual time=7.370..31.359 rows=30525 loops=1 Worker 2: actual time=8.765..19.838 rows=18549 loops=1 Worker 3: actual time=0.279..17.269 rows=23864 loops=1 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=21.775..21.777 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=11.961..11.961 rows= 79999 loops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=9.809..9.809 rows=10 0000 loops=1) Index Cond: (x.a > 199900000) Planning Time: 0.276 ms Execution Time: 212.189 ms (25 rows) test=# truncate testscan; TRUNCATE TABLE test=# set max_parallel_workers_per_gather=0; SET test=# explain analyze verbose insert into testscan select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Insert on public.testscan (cost=3193.29..3625636.35 rows=0 width=0) (actual time=241.222..241.224 rows=0 loops=1) -> Bitmap Heap Scan on public.x (cost=3193.29..3625636.35 rows=74267 width=8) (actual time=16.945..92.392 rows=129999 loops =1) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 50000 Heap Blocks: exact=975 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=16.735..16.736 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=9.222..9.223 rows=79999 lo ops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=7.511..7.511 rows=100000 l oops=1) Index Cond: (x.a > 199900000) Planning Time: 0.205 ms Execution Time: 241.274 ms (14 rows) ============ test=# set max_parallel_workers_per_gather=4; SET test=# explain analyze verbose insert into testscan_pk select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Gather (cost=4193.29..1255440.94 rows=74267 width=0) (actual time=572.242..573.683 rows=0 loops=1) Workers Planned: 4 Workers Launched: 4 -> Insert on public.testscan_pk (cost=3193.29..1247014.24 rows=0 width=0) (actual time=566.303..566.308 rows=0 loops=5) Worker 0: actual time=566.756..566.757 rows=0 loops=1 Worker 1: actual time=564.778..564.779 rows=0 loops=1 Worker 2: actual time=564.402..564.419 rows=0 loops=1 Worker 3: actual time=563.748..563.749 rows=0 loops=1 -> Parallel Bitmap Heap Scan on public.x (cost=3193.29..1247014.24 rows=18567 width=8) (actual time=16.479..37.327 ro ws=26000 loops=5) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 10000 Heap Blocks: exact=204 Worker 0: actual time=17.358..36.895 rows=24233 loops=1 Worker 1: actual time=12.711..33.538 rows=25616 loops=1 Worker 2: actual time=15.671..35.701 rows=24831 loops=1 Worker 3: actual time=17.656..39.310 rows=26645 loops=1 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=18.541..18.542 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=8.549..8.549 rows=79 999 loops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=9.990..9.990 rows=10 0000 loops=1) Index Cond: (x.a > 199900000) Planning Time: 0.240 ms Execution Time: 573.733 ms (25 rows) test=# set max_parallel_workers_per_gather=0; SET test=# truncate testscan_pk; TRUNCATE TABLE test=# explain analyze verbose insert into testscan_pk select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Insert on public.testscan_pk (cost=3193.29..3625636.35 rows=0 width=0) (actual time=598.997..598.998 rows=0 loops=1) -> Bitmap Heap Scan on public.x (cost=3193.29..3625636.35 rows=74267 width=8) (actual time=20.153..96.858 rows=129999 loops =1) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 50000 Heap Blocks: exact=975 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=19.840..19.841 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=9.276..9.276 rows=79999 lo ops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=10.562..10.562 rows=100000 loops=1) Index Cond: (x.a > 199900000) Planning Time: 0.204 ms Execution Time: 599.098 ms (14 rows) ============ test=# set max_parallel_workers_per_gather=4; SET test=# explain analyze verbose insert into testscan_index select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Gather (cost=4193.29..1255440.94 rows=74267 width=0) (actual time=560.460..562.386 rows=0 loops=1) Workers Planned: 4 Workers Launched: 4 -> Insert on public.testscan_index (cost=3193.29..1247014.24 rows=0 width=0) (actual time=553.434..553.435 rows=0 loops=5) Worker 0: actual time=548.751..548.752 rows=0 loops=1 Worker 1: actual time=552.008..552.009 rows=0 loops=1 Worker 2: actual time=553.094..553.095 rows=0 loops=1 Worker 3: actual time=553.389..553.390 rows=0 loops=1 -> Parallel Bitmap Heap Scan on public.x (cost=3193.29..1247014.24 rows=18567 width=8) (actual time=13.759..34.487 ro ws=26000 loops=5) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 10000 Heap Blocks: exact=183 Worker 0: actual time=8.698..29.924 rows=26173 loops=1 Worker 1: actual time=12.865..33.889 rows=27421 loops=1 Worker 2: actual time=13.088..32.823 rows=24591 loops=1 Worker 3: actual time=14.075..36.349 rows=26571 loops=1 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=19.356..19.357 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=10.330..10.330 rows= 79999 loops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=9.024..9.024 rows=10 0000 loops=1) Index Cond: (x.a > 199900000) Planning Time: 0.219 ms Execution Time: 562.442 ms (25 rows) test=# set max_parallel_workers_per_gather=0; SET test=# truncate testscan_index; TRUNCATE TABLE test=# explain analyze verbose insert into testscan_index select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Insert on public.testscan_index (cost=3193.29..3625636.35 rows=0 width=0) (actual time=607.619..607.621 rows=0 loops=1) -> Bitmap Heap Scan on public.x (cost=3193.29..3625636.35 rows=74267 width=8) (actual time=21.001..96.283 rows=129999 loops =1) Output: x.a, NULL::integer Recheck Cond: ((x.a < 80000) OR (x.a > 199900000)) Filter: ((x.a < 80000) OR (((x.a % 2) = 0) AND (x.a > 199900000))) Rows Removed by Filter: 50000 Heap Blocks: exact=975 -> BitmapOr (cost=3193.29..3193.29 rows=170535 width=0) (actual time=20.690..20.691 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1365.94 rows=73783 width=0) (actual time=9.097..9.097 rows=79999 lo ops=1) Index Cond: (x.a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1790.21 rows=96752 width=0) (actual time=11.591..11.591 rows=100000 loops=1) Index Cond: (x.a > 199900000) Planning Time: 0.205 ms Execution Time: 607.734 ms (14 rows) Even when I changed the queries to return more rows from the scan, to the point where it chose not to use a parallel INSERT bitmap heap scan (in favour of parallel seq scan), and then forced it to by disabling seqscan, I found that it was still at least as fast as serial INSERT with bitmap heap scan. Regards, Greg Nancarrow Fujitsu Australia
pgsql-hackers by date: