Bulk Inserts - Mailing list pgsql-hackers

From Pierre Frédéric Caillaud
Subject Bulk Inserts
Date
Msg-id op.uz83q3tecke6l8@soyouz
Whole thread Raw
Responses Re: Bulk Inserts
Re: Bulk Inserts
List pgsql-hackers
I've done a little experiment with bulk inserts.

=> heap_bulk_insert()

Behaves like heap_insert except it takes an array of tuples (HeapTuple  
*tups, int ntups).

- Grabs a page (same as heap_insert)

- While holding exclusive lock, inserts as many tuples as it can on the  
page.- Either the page gets full- Or we run out of tuples.

- Generate xlog : choice between- Full Xlog mode :    - if we inserted more than 10 tuples (totaly bogus heuristic),
logthe  
 
entire page    - Else, log individual tuples as heap_insert does- Light log mode :    - if page was empty, only xlog a
"newempty page" record, not page  
 
contents    - else, log fully    - heap_sync() at the end

- Release the page
- If we still have tuples to insert, repeat.

Am I right in assuming that :

1)
- If the page was empty,
- and log archiving isn't used,
- and the table is heap_sync()'d at the end,
=> only a "new empty page" record needs to be created, then the page can  
be completely filled ?

2)
- If the page isn't empty
- or log archiving is used,
=> logging either the inserted tuples or the entire page is OK to  
guarantee persistence ?

(I used kill -9 to test it, recovery seems to work).

Test on a concurrent COPY, 4 threads, on a table with 8 INT columns.

* 8.5 HEAD : Total Time 44 s
* Bulk inserts, Full XLog : Total Time 24 s
* Bulk inserts, Light XLog : Total Time 10 s

Quite a bit faster... I presume with more CPUs it would scale.

I'm not posting the patch because it's quite ugly (especially the part to  
store tuples in copy.c and bulk-insert them, I should probably have used a  
tuplestore...)
I think the tuples need to be stored and then bulk-inserted because the  
exclusive lock on the buffer can't be held for a long time.


Lock stats (from the patch I just posted) :

* 8.5 HEAD : Total Time 44 s

-------- Lock stats for PID 28043    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  28043       7          0          0        0.00
    0.00     
 
2500002     804378       23.59 ( 53.11 %)        7.38 ( 16.61 %)  WALInsert  28043       8          0          0
0.00       0.00       
 
25775         32        2.91 (  6.54 %)        0.90 (  2.02 %)  WALWrite

-------- Lock stats for PID 28044    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  28044       7          0          0        0.00
    0.00     
 
2500002     802515       22.26 ( 50.11 %)        8.70 ( 19.59 %)  WALInsert  28044       8          0          0
0.00       0.00       
 
25620         42        4.00 (  9.01 %)        1.12 (  2.52 %)  WALWrite

-------- Lock stats for PID 28045    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  28045       7          0          0        0.00
    0.00     
 
2500002     799145       22.47 ( 50.32 %)        8.72 ( 19.52 %)  WALInsert  28045       8          0          0
0.00       0.00       
 
25725         38        4.08 (  9.14 %)        1.05 (  2.35 %)  WALWrite

-------- Lock stats for PID 28042    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  28042       7          0          0        0.00
    0.00     
 
2500002     809477       23.49 ( 52.44 %)        7.89 ( 17.62 %)  WALInsert  28042       8          0          0
0.00       0.00       
 
25601         37        3.27 (  7.31 %)        1.05 (  2.34 %)  WALWrite


* Bulk inserts, Full XLog : Total Time 24 s

-------- Lock stats for PID 32486    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  32486       7          0          0        0.00
    0.00       
 
23765       1128        9.22 ( 38.98 %)        4.05 ( 17.14 %)  WALInsert  32486       8          0          0
0.00       0.00       
 
21120         19        2.64 ( 11.17 %)        1.32 (  5.59 %)  WALWrite

-------- Lock stats for PID 32484    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  32484       7          0          0        0.00
    0.00       
 
23865       1083        9.87 ( 41.68 %)        2.87 ( 12.11 %)  WALInsert  32484       8          0          0
0.00       0.00       
 
21105         11        1.68 (  7.11 %)        1.09 (  4.62 %)  WALWrite  32484    8508          0          0
0.00       0.00           
 
1          1        0.19 (  0.81 %)        0.00 (  0.00 %)  32484   18846          0          0        0.00        0.00
         
 
1          1        0.25 (  1.05 %)        0.00 (  0.00 %)

-------- Lock stats for PID 32485    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  32485       7          0          0        0.00
    0.00       
 
23816       1107        8.94 ( 37.75 %)        4.05 ( 17.09 %)  WALInsert  32485       8          0          0
0.00       0.00       
 
21109         21        2.59 ( 10.93 %)        1.36 (  5.77 %)  WALWrite  32485   16618          0          0
0.00       0.00           
 
1          2        0.23 (  0.98 %)        0.00 (  0.00 %)

-------- Lock stats for PID 32482    PID    Lock      ShAcq     ShWait     ShWaitT     ShHeldT       
ExAcq     ExWait             ExWaitT                ExHeldT     Name  32482       7          0          0        0.00
    0.00       
 
23813       1053        9.70 ( 40.75 %)        3.41 ( 14.32 %)  WALInsert  32482       8          0          0
0.00       0.00       
 
21119         15        2.24 (  9.43 %)        1.06 (  4.44 %)  WALWrite  32482    6770          0          0
0.00       0.00           
 
3          1        0.17 (  0.70 %)        0.00 (  0.00 %)

* Bulk inserts, Light XLog : Total Time 10 s

No Lock stats to show (wait tims is < 0.01 s)...



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Streaming Replication patch for CommitFest 2009-09
Next
From: "Kevin Grittner"
Date:
Subject: Re: Disable and enable of table and column constraints