Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAA4eK1KHJnGw4FvvxsmL7v2x-4TM7HTNgCn_4ZMG4B3yaHwbKg@mail.gmail.com
Whole thread Raw
In response to RE: Parallel INSERT (INTO ... SELECT ...)  ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
List pgsql-hackers
On Tue, Feb 9, 2021 at 5:49 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> > > > postgres=# explain verbose insert into testscan select a from x
> > > > where
> > > > a<80000 or (a%2=0 and a>199900000);
> > > >                                             QUERY PLAN
> > > >
> > --------------------------------------------------------------------
> > > > --
> > > > -----------------------------
> > > >  Gather  (cost=4346.89..1281204.64 rows=81372 width=0)
> > > >    Workers Planned: 4
> > > >    ->  Insert on public.testscan  (cost=3346.89..1272067.44 rows=0
> > > > width=0)
> > > >          ->  Parallel Bitmap Heap Scan on public.x1
> > > > (cost=3346.89..1272067.44 rows=20343 width=8)
> > > >                Output: x1.a, NULL::integer
> > > >                Recheck Cond: ((x1.a < 80000) OR (x1.a > 199900000))
> > > >                Filter: ((x1.a < 80000) OR (((x1.a % 2) = 0) AND
> > > > (x1.a >
> > > > 199900000)))
> > > >                ->  BitmapOr  (cost=3346.89..3346.89 rows=178808
> > > > width=0)
> > > >                      ->  Bitmap Index Scan on x1_a_idx
> > > > (cost=0.00..1495.19 rows=80883 width=0)
> > > >                            Index Cond: (x1.a < 80000)
> > > >                      ->  Bitmap Index Scan on x1_a_idx
> > > > (cost=0.00..1811.01 rows=97925 width=0)
> > > >                            Index Cond: (x1.a > 199900000)
> > > >
> > > > PSA is my postgresql.conf file, maybe you can have a look. Besides,
> > > > I didn't do any parameters tuning in my test session.
> > >
> > > I reproduced this on my machine.
> > >
> > > I think we'd better do "analyze" before insert which helps reproduce this
> > easier.
> > > Like:
> > >
> > > -----
> > > analyze;
> > > explain analyze verbose insert into testscan select a from x where
> > > a<80000 or (a%2=0 and a>199900000);
> > > -----
> >
> > OK then.
> > Can you check if just the underlying SELECTs are run (without INSERT), is
> > there any performance degradation when compared to a non-parallel scan?
>
> It seems there is no performance degradation without insert.
>
> Till now, what I found is that:
> With tang's conf, when doing parallel insert, the walrecord is more than serial insert
> (IMO, this is the main reason why it has performance degradation)
> See the attatchment for the plan info.
>
> I have tried alter the target table to unlogged and
> then the performance degradation will not happen any more.
>
> And the additional walrecord seems related to the index on the target table.
>

I think you might want to see which exact WAL records are extra by
using pg_waldump?

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: John Naylor
Date:
Subject: Re: Perform COPY FROM encoding conversions in larger chunks