Re: Batch insert in CTAS/MatView code - Mailing list pgsql-hackers
From | Paul Guo |
---|---|
Subject | Re: Batch insert in CTAS/MatView code |
Date | |
Msg-id | CAEET0ZED6Lbx1HxAPweAx+wQPZNQdD85u4LD35Qe91iZmshSAg@mail.gmail.com Whole thread Raw |
In response to | Re: Batch insert in CTAS/MatView code (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Batch insert in CTAS/MatView code
Re: Batch insert in CTAS/MatView code |
List | pgsql-hackers |
create table tt as select a,b,c from t11;
\timing
drop table tt;
create table t11 (a int, b int, c int, d int);
insert into t11 select s,s,s,s from generate_series(1, 10000000) s;
analyze t11;
-- tuples that are untoasted and tuple size is 1984 bytes.
create table t12 (a name, b name, c name, d name, e name, f name, g name, h name, i name, j name, k name, l name, m name, n name, o name, p name, q name, r name, s name, t name, u name, v name, w name, x name, y name, z name, a1 name, a2 name, a3 name, a4 name, a5 name);
insert into t12 select 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e' from generate_series(1, 500000);
analyze t12;
-- tuples that are untoasted and tuple size is 2112 bytes.
create table t13 (a name, b name, c name, d name, e name, f name, g name, h name, i name, j name, k name, l name, m name, n name, o name, p name, q name, r name, s name, t name, u name, v name, w name, x name, y name, z name, a1 name, a2 name, a3 name, a4 name, a5 name, a6 name, a7 name);
insert into t13 select 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g' from generate_series(1, 500000);
analyze t13;
-- tuples that are toastable and tuple compressed size is 1084.
create table t14 (a text, b text, c text, d text, e text, f text, g text, h text, i text, j text, k text, l text, m text, n text, o text, p text, q text, r text, s text, t text, u text, v text, w text, x text, y text, z text);
insert into t14 select i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i, i from (select repeat('123456789', 10000) from generate_series(1,5000)) i;
analyze t14;
baseline patch diff% SQL
10.1 5.57 44.85% create table tt as select * from t11;
10.7 5.52 48.41% create table tt as select a,b,c from t11;
9.57 10.2 -6.58% create table tt as select * from t12;
9.64 8.63 10.48% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,a1,a2,a3,a4 from t12;
14.2 14.46 -1.83% create table tt as select * from t13;
11.88 12.05 -1.43% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,a1,a2,a3,a4,a5,a6 from t13;
3.17 3.25 -2.52% create table tt as select * from t14;
2.93 3.12 -6.48% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y from t14;
scenario2: all related kernel caches are populated after previous testing.
baseline patch diff% SQL
9.6 4.97 48.23% create table tt as select * from t11;
10.41 5.32 48.90% create table tt as select a,b,c from t11;
9.12 9.52 -4.38% create table tt as select * from t12;
9.66 8.6 10.97% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,a1,a2,a3,a4 from t12;
13.56 13.6 -0.30% create table tt as select * from t13;
11.36 11.7 -2.99% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,a1,a2,a3,a4,a5,a6 from t13;
3.08 3.13 -1.62% create table tt as select * from t14;
2.95 3.03 -2.71% create table tt as select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y from t14;
On 06/03/2019 22:06, Paul Guo wrote:
> The patch also modifies heap_multi_insert() a bit to do a bit further
> code-level optimization by using static memory, instead of using memory
> context and dynamic allocation.
If toasting is required, heap_prepare_insert() creates a palloc'd tuple.
That is still leaked to the current memory context.
Leaking into the current memory context is not a bad thing, because
resetting a memory context is faster than doing a lot of pfree() calls.
The callers just need to be prepared for that, and use a short-lived
memory context.
> By the way, while looking at the code, I noticed that there are 9 local
> arrays with large length in toast_insert_or_update() which seems to be a
> risk of stack overflow. Maybe we should put it as static or global.
Hmm. We currently reserve 512 kB between the kernel's limit, and the
limit we check in check_stack_depth(). See STACK_DEPTH_SLOP. Those
arrays add up to 52800 bytes on a 64-bit maching, if I did my math
right. So there's still a lot of headroom. I agree that it nevertheless
seems a bit excessive, though.
> With the patch,
>
> Time: 4728.142 ms (00:04.728)
> Time: 14203.983 ms (00:14.204)
> Time: 1008.669 ms (00:01.009)
>
> Baseline,
> Time: 11096.146 ms (00:11.096)
> Time: 13106.741 ms (00:13.107)
> Time: 1100.174 ms (00:01.100)
Nice speedup!
> While for toast and large column size there is < 10% decrease but for
> small column size the improvement is super good. Actually if I hardcode
> the batch count as 4 all test cases are better but the improvement for
> small column size is smaller than that with current patch. Pretty much
> the number 4 is quite case specific so I can not hardcode that in the
> patch. Of course we could further tune that but the current value seems
> to be a good trade-off?
Have you done any profiling, on why the multi-insert is slower with
large tuples? In principle, I don't see why it should be slower.
- Heikki
Attachment
pgsql-hackers by date: