
Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel Inserts in CREATE TABLE AS
Date	May 21, 2021 13:16:30
Msg-id	CAA4eK1+7XQwwkJu8667ziSUhY+FzUWYfDhpztnvafxoiQi52Jw@mail.gmail.com
In response to	Re: Parallel Inserts in CREATE TABLE AS (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses	Re: Parallel Inserts in CREATE TABLE AS Re: Parallel Inserts in CREATE TABLE AS
List	pgsql-hackers


On Fri, Mar 19, 2021 at 11:02 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Jan 27, 2021 at 1:47 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
>
> I analyzed performance of parallel inserts in CTAS for different cases
> with tuple sizes of 32 bytes, 59 bytes, 241 bytes and 1064 bytes. We could
> gain if the tuple sizes are lower. But if the tuple size is larger,
> i.e. 1064 bytes, there's a regression with parallel inserts. Upon
> further analysis, it turned out that the parallel workers frequently
> require extra blocks to be added while concurrently extending
> the relation (in RelationAddExtraBlocks), and the majority of the time
> is spent flushing those new empty pages/blocks to
> disk.
>

How have you ensured that the cost is due to the flushing of pages?
AFAICS, we don't flush the pages; rather, we just write them and then
register them to be flushed by the checkpointer. Now, it is possible that
the checkpointer sync queue gets full and the backend has to write them
itself, but have we checked that? I think we can check via wait events:
if it is due to flushing, then we should see a lot of file sync
(WAIT_EVENT_DATA_FILE_SYNC) wait events. The other possibility could
be that the free pages added to the FSM by one worker are not being used
by another worker for some reason. Can we debug and check whether the
pages added by one worker are being used by another worker?
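
As a rough sketch of the wait-event check (not something from the patch or
the test setup in this thread): one could sample pg_stat_activity repeatedly
while the CTAS runs and look at how the samples are distributed. The query
text, the sampling approach, and the helper names below are assumptions for
illustration only. WAIT_EVENT_DATA_FILE_SYNC is reported in pg_stat_activity
as wait_event 'DataFileSync' with wait_event_type 'IO'.

```python
from collections import Counter

# Assumed sampling query: run it repeatedly (say, once a second) in a
# separate session while the CTAS is executing, and collect the rows.
SAMPLE_QUERY = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type IN ('client backend', 'parallel worker')
  AND wait_event IS NOT NULL
"""


def summarize_wait_events(samples):
    """Return [((type, event), share_of_samples), ...], most frequent first.

    `samples` is a list of (wait_event_type, wait_event) tuples gathered
    from repeated runs of SAMPLE_QUERY (hypothetical collection step).
    """
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(key, n / total) for key, n in counts.most_common()]


def sync_share(samples):
    """Fraction of sampled waits that were IO / DataFileSync."""
    for key, share in summarize_wait_events(samples):
        if key == ("IO", "DataFileSync"):
            return share
    return 0.0
```

If DataFileSync dominates the samples, that would support the flushing
theory; if most samples are instead on the relation extension lock
(wait_event_type 'Lock', wait_event 'extend'), contention on extension
itself would be the more likely suspect.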

-- 
With Regards,
Amit Kapila.
