RE: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Tang, Haiying
Subject RE: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id 2cf243f1ad7f4b808574c78840790f9d@G08CNEXMBPEKD05.g08.fujitsu.local
Whole thread Raw
In response to RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
> From: Amit Kapila <amit.kapila16@gmail.com>
> > Can we test cases when we have few rows in the Select table (say 
> > 1000) and there 500 or 1000 partitions. In that case, we won't 
> > select parallelism but we have to pay the price of checking 
> > parallel-safety of all partitions. Can you check this with 100, 200, 
> > 500, 1000 partitions table?
> 
> I also wanted to see such an extreme(?) case.  The 1,000 rows is not 
> the count per partition but the total count of all partitions.e.g., 
> when # of partitions is 100, # of rows per partition is 10.

Below results are in serial plan which select table total rows are 1,000. The Excution Time + Planning Time is still
lessthan unpatched.
 
(does this patch make some optimizes in serial insert? I'm a little confused here, Because the patched execution time
isless than unpatched, but I didn't find information in commit messages about it. If I missed something, please kindly
letme know.)
 
    
           |                patched                |                   master               |         %reg
           |
 

-----------|------------------|--------------------|--------------------|-------------------|---------------------|-----------------|
partitions |Execution Time(ms)| Planning Time(ms)  | Execution Time(ms) | Planning Time(ms) | %reg(Excution Time) |
%reg(alltime)  |
 

-----------|------------------|--------------------|--------------------|-------------------|---------------------|-----------------|
100        | 5.294            |  1.581             |  6.951             |  0.037            |   -24%              | -2%
           |
 
200        | 9.666            |  3.068             |  13.681            |  0.043            |   -29%              | -7%
           |
 
500        | 22.742           |  12.061            |  35.928            |  0.125            |   -37%              | -3%
           |
 
1000       | 46.386           |  24.872            |  75.523            |  0.142            |   -39%              | -6%
           |
 

I did another test which made check overhead obvious. this case is not fitting for partition purpose, but I put it here
asan extreme case too.
 
Select table total rows are 1,000, # of partitions is 2000. So only the first 1000 partitions have 1 row per partition,
thelast 1000 partitions have no data inserted. 
 

           |                patched                |                   master               |         %reg
           |
 

-----------|------------------|--------------------|--------------------|-------------------|---------------------|-----------------|
partitions |Execution Time(ms)| Planning Time(ms)  | Execution Time(ms) | Planning Time(ms) | %reg(Excution Time) |
%reg(alltime)  |
 

-----------|------------------|--------------------|--------------------|-------------------|---------------------|-----------------|
2000       | 45.758           |  51.697            |  80.272            |  0.136            |   -43               | 21%
           |
 

Regards,
Tang



pgsql-hackers by date:

Previous
From: Tatsuro Yamada
Date:
Subject: Re: list of extended statistics on psql
Next
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Stronger safeguard for archive recovery not to miss data