RE: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Hou, Zhijie
Subject RE: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id 95066de635ce41eca7d06910c9f663e7@G08CNEXMBPEKD05.g08.fujitsu.local
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
> > > postgres=# explain verbose insert into testscan select a from x
> > > where
> > > a<80000 or (a%2=0 and a>199900000);
> > >                                             QUERY PLAN
> > >
> --------------------------------------------------------------------
> > > --
> > > -----------------------------
> > >  Gather  (cost=4346.89..1281204.64 rows=81372 width=0)
> > >    Workers Planned: 4
> > >    ->  Insert on public.testscan  (cost=3346.89..1272067.44 rows=0
> > > width=0)
> > >          ->  Parallel Bitmap Heap Scan on public.x1
> > > (cost=3346.89..1272067.44 rows=20343 width=8)
> > >                Output: x1.a, NULL::integer
> > >                Recheck Cond: ((x1.a < 80000) OR (x1.a > 199900000))
> > >                Filter: ((x1.a < 80000) OR (((x1.a % 2) = 0) AND
> > > (x1.a >
> > > 199900000)))
> > >                ->  BitmapOr  (cost=3346.89..3346.89 rows=178808
> > > width=0)
> > >                      ->  Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1495.19 rows=80883 width=0)
> > >                            Index Cond: (x1.a < 80000)
> > >                      ->  Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1811.01 rows=97925 width=0)
> > >                            Index Cond: (x1.a > 199900000)
> > >
> > > PSA is my postgresql.conf file, maybe you can have a look. Besides,
> > > I didn't do any parameters tuning in my test session.
> >
> > I reproduced this on my machine.
> >
> > I think we'd better do "analyze" before insert which helps reproduce this
> easier.
> > Like:
> >
> > -----
> > analyze;
> > explain analyze verbose insert into testscan select a from x where
> > a<80000 or (a%2=0 and a>199900000);
> > -----
> 
> OK then.
> Can you check if just the underlying SELECTs are run (without INSERT), is
> there any performance degradation when compared to a non-parallel scan?

It seems there is no performance degradation without insert.

Till now, what I found is that:
With tang's conf, when doing parallel insert, the walrecord is more than serial insert
(IMO, this is the main reason why it has performance degradation)
See the attatchment for the plan info.

I have tried alter the target table to unlogged and
then the performance degradation will not happen any more.

And the additional walrecord seems related to the index on the target table.
If the target table does not have any index, the wal record is the same between parallel plan and serial plan.
Also, it does not have performance degradation without index.

I am still looking at this problem, if someone think of something about it,
It's very grateful to share the knowledge with me.

Best regards,
houzj






Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: TRUNCATE on foreign table
Next
From: Bharath Rupireddy
Date:
Subject: Re: TRUNCATE on foreign table