RE: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Hou, Zhijie
Subject RE: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id b50e9012a81f410fa86f871f10551e56@G08CNEXMBPEKD05.g08.fujitsu.local
Whole thread Raw
In response to RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
> > > It did have performance gain, but I think it's not huge enough to
> > > ignore the extra's index cost.
> > > What do you think ?
> >
> > Yes... as you suspect, I'm afraid the benefit from parallel bitmap
> > scan may not compensate for the loss of the parallel insert operation.
> >
> > The loss is probably due to 1) more index page splits, 2) more buffer
> > writes (table and index), and 3) internal locks for things such as
> > relation extension and page content protection.  To investigate 3), we
> > should want something like [1], which tells us the wait event
> > statistics (wait count and time for each wait event) per session or
> > across the instance like Oracke, MySQL and EDB provides.  I want to
> continue this in the near future.
> 
> What would the result look like if you turn off
> parallel_leader_participation?  If the leader is freed from
> reading/writing the table and index, the index page splits and internal
> lock contention may decrease enough to recover part of the loss.
> 
> https://www.postgresql.org/docs/devel/parallel-plans.html
> 
> "In a parallel bitmap heap scan, one process is chosen as the leader. That
> process performs a scan of one or more indexes and builds a bitmap indicating
> which table blocks need to be visited. These blocks are then divided among
> the cooperating processes as in a parallel sequential scan. In other words,
> the heap scan is performed in parallel, but the underlying index scan is
> not."

If I disable parallel_leader_participation.

For max_parallel_workers_per_gather = 4, It still have performance degradation.

For max_parallel_workers_per_gather = 2, the performance degradation will not happen in most of the case.
There is sometimes a noise(performance degradation), but most of result(about 80%) is good.

Best regards,
houzj



Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Next
From: Masahiko Sawada
Date:
Subject: Re: pg_get_first_normal_oid()?