RE: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From | Hou, Zhijie |
---|---|
Subject | RE: Parallel INSERT (INTO ... SELECT ...) |
Date | |
Msg-id | f47a6023dde14a4bb09a01bc17dfc41c@G08CNEXMBPEKD05.g08.fujitsu.local Whole thread Raw |
In response to | RE: Parallel INSERT (INTO ... SELECT ...) ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>) |
Responses |
RE: Parallel INSERT (INTO ... SELECT ...)
|
List | pgsql-hackers |
> > ---------------------- > > postgres=# select pg_size_pretty(pg_indexes_size('testscan_index')); > > pg_size_pretty > > ---------------- > > 4048 kB > > (1 row) > > > > postgres=# select pg_size_pretty(pg_relation_size('testscan_index')); > > pg_size_pretty > > ---------------- > > 4768 kB > > (1 row) > > Which of the above shows the table size? What does pg_indexes_size() > against an index (testscan_index) return? Sorry, Maybe the tablename is a little confused, but 'testscan_index' is actually a table's name. pg_indexes_size will return the index's size attatched to the table. pg_relation_size will return the table's size. Did I miss something ? > > IMO, due to the difference of inserts with parallel execution, the > > btree insert's cost is more than serial. > > > > At the same time, the parallel does not have a huge performance gain > > with bitmapscan, So the extra cost of btree index will result in > > performance degradation. > > How did you know that the parallelism didn't have a huge performance gain > with bitmap scan? > > [serial] > -> Bitmap Heap Scan on public.x (cost=3272.20..3652841.26 rows=79918 > width=8) (actual time=8.096..41.005 rows=129999 loops=1) > > [parallel] > -> Parallel Bitmap Heap Scan on public.x > (cost=3272.20..1260119.35 rows=19980 width=8) (actual time=5.832..14.787 > rows=26000 loops=5) I tested the case without insert(Just the query use bitmapscan): [serial]: postgres=# explain analyze select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on x (cost=3258.59..3647578.53 rows=81338 width=4) (actual time=8.091..34.222 rows=129999 loops=1) Recheck Cond: ((a < 80000) OR (a > 199900000)) Filter: ((a < 80000) OR (((a % 2) = 0) AND (a > 199900000))) Rows Removed by Filter: 50000 Heap Blocks: exact=975 -> BitmapOr (cost=3258.59..3258.59 rows=173971 width=0) (actual time=7.964..7.965 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1495.11 rows=80872 width=0) (actual time=3.451..3.451 rows=79999 loops=1) Index Cond: (a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1722.81 rows=93099 width=0) (actual time=4.513..4.513 rows=100000loops=1) Index Cond: (a > 199900000) Planning Time: 0.108 ms Execution Time: 38.136 ms [parallel] postgres=# explain analyze select a from x where a<80000 or (a%2=0 and a>199900000); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Gather (cost=4258.59..1266704.42 rows=81338 width=4) (actual time=9.177..22.592 rows=129999 loops=1) Workers Planned: 4 Workers Launched: 4 -> Parallel Bitmap Heap Scan on x (cost=3258.59..1257570.62 rows=20334 width=4) (actual time=6.402..12.882 rows=26000loops=5) Recheck Cond: ((a < 80000) OR (a > 199900000)) Filter: ((a < 80000) OR (((a % 2) = 0) AND (a > 199900000))) Rows Removed by Filter: 10000 Heap Blocks: exact=1 -> BitmapOr (cost=3258.59..3258.59 rows=173971 width=0) (actual time=8.785..8.786 rows=0 loops=1) -> Bitmap Index Scan on x_a_idx (cost=0.00..1495.11 rows=80872 width=0) (actual time=3.871..3.871 rows=79999loops=1) Index Cond: (a < 80000) -> Bitmap Index Scan on x_a_idx (cost=0.00..1722.81 rows=93099 width=0) (actual time=4.914..4.914 rows=100000loops=1) Index Cond: (a > 199900000) Planning Time: 0.158 ms Execution Time: 26.951 ms (15 rows) It did have performance gain, but I think it's not huge enough to ignore the extra's index cost. What do you think ? Best regards, houzj
pgsql-hackers by date: