[PATCH] 2PC state files on shared memory - Mailing list pgsql-hackers

From Michael Paquier
Subject [PATCH] 2PC state files on shared memory
Date
Msg-id c64c5f8b0908062031k3ff48428j824a9a46f28180ac@mail.gmail.com
Responses Re: [PATCH] 2PC state files on shared memory  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

Hi all,
 
Based on an idea from Heikki Linnakangas, here is a patch that improves 2PC
by keeping the state files of prepared transactions in shared memory instead of writing them to disk.
The XLOG flush cannot be avoided, but reducing the amount of data sent to disk speeds up the 2PC process.

During a checkpoint, only the state files of transactions that are prepared but not yet committed are flushed from shared memory to disk.
The shared memory reserved for state files is sized by an additional parameter in postgresql.conf called max_state_file_space.
Of course, if there are too many transactions and not enough space in shared memory, state files are written to disk as in the original implementation.
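
The behaviour described above can be illustrated with a toy model. This is only a sketch in Python, not the patch's C code: the class and method names are hypothetical, the "shared memory" is a plain dictionary with a byte budget, and releasing the memory after a checkpoint flush is an assumption, since this message does not say what the patch does at that point.

```python
class StateFileStore:
    """Toy model of prepared-transaction state files with a memory budget."""

    def __init__(self, max_state_file_space):
        # Byte budget, standing in for the max_state_file_space parameter.
        self.capacity = max_state_file_space
        self.used = 0
        self.in_memory = {}  # gid -> state bytes held in the memory area
        self.on_disk = {}    # gid -> state bytes written to "disk"

    def prepare(self, gid, state):
        # Keep the state in memory if it fits, otherwise fall back to disk.
        if self.used + len(state) <= self.capacity:
            self.in_memory[gid] = state
            self.used += len(state)
            return "shmem"
        self.on_disk[gid] = state
        return "disk"

    def commit(self, gid):
        # Finishing the transaction discards its state wherever it lives.
        state = self.in_memory.pop(gid, None)
        if state is not None:
            self.used -= len(state)
        else:
            self.on_disk.pop(gid, None)

    def checkpoint(self):
        # Only transactions still prepared at checkpoint time reach the disk.
        # Assumption: the memory is released once the state is on disk.
        flushed = list(self.in_memory)
        for gid in flushed:
            state = self.in_memory.pop(gid)
            self.used -= len(state)
            self.on_disk[gid] = state
        return flushed


store = StateFileStore(max_state_file_space=1300)
first = store.prepare("tx1", b"x" * 600)   # fits in the memory area
second = store.prepare("tx2", b"x" * 712)  # 600 + 712 > 1300, goes to disk
store.commit("tx1")                        # committed before any checkpoint
third = store.prepare("tx3", b"x" * 600)
flushed = store.checkpoint()               # only tx3 is still in memory
```

With a budget of 0, every call to prepare() in this model takes the disk path, which corresponds to the unpatched behaviour.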
 
By default, the allocated space is 0, since max_prepared_transactions is zero in 8.4.
 
For more results, please refer to the wiki page I wrote about this 2PC improvement:
http://wiki.postgresql.org/wiki/2PC_improvement:_state_files_in_shared_memory
This page explains the simulation method used to analyze the patch and gathers the main results.
 
Here are some of the performance results obtained by testing the code on a disk array with a battery-backed cache and 8 disks in a RAID0 configuration.
The four tables below cover pgbench scale factors 1 and 100, each with normalized and non-normalized results.
Normalized results are unitless; pure results are in TX/s.
Tests were run via pgbench using transactions whose state file sizes are 600B and 712B.
As can be seen, the patch improves the transaction flow by up to 15-18%, which is not negligible.
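
The normalized figures in the tables below appear consistent with rescaling each row so that the 2PC-on-disk result maps to 0 and the no-2PC result maps to 1. That definition is inferred from the numbers and is not stated in this message; the function name is hypothetical. A minimal sketch in Python:

```python
def normalize(tps_shmem, tps_disk, tps_no_2pc):
    """Rescale so that 2PC with state file on disk is 0 and no 2PC is 1."""
    return (tps_shmem - tps_disk) / (tps_no_2pc - tps_disk)


# Scale factor 1, 2 connections, raw TX/s values taken from table 2.
norm_600 = normalize(1163, 1017, 2873)  # 600B state file
norm_712 = normalize(1134, 1033, 2301)  # 712B state file
print(round(norm_600, 9), round(norm_712, 6))
```

Read this way, a normalized value is the fraction of the gap between 2PC on disk and no 2PC that is recovered by keeping state files in shared memory.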
 
1) Case scale factor 1, normalized results

Tps1-2: 2PC, state file on shmem. Tps2-2: 2PC, state file on disk. Tps3-2: no 2PC.

Pgbench conf  State file size 600 B         State file size 712 B
Conn  Trans   Tps1-2       Tps2-2  Tps3-2   Tps1-2       Tps2-2  Tps3-2
   2  10000   0.078663793  0       1        0.079653     0       1
   5  10000   0.105263158  0       1        0.08438061   0       1
  10  10000   0.096105528  0       1        0.07166124   0       1
  25  10000   0.106321839  0       1        0.12846154   0       1
  35  10000   0.138996139  0       1        0.12106136   0       1
  50  10000   0.130278527  0       1        0.14072693   0       1
  60  10000   0.133937563  0       1        0.1517094    0       1
  70  10000   0.17218543   0       1        0.14913295   0       1
  80  10000   0.1775       0       1        0.17786561   0       1
  90  10000   0.179806362  0       1        0.15232722   0       1
 100  10000   0.182242991  0       1        0.15264798   0       1

2) Case scale factor 1, pure TX/s results
Tps1-2: 2PC, state file on shmem. Tps2-2: 2PC, state file on disk. Tps3-2: no 2PC.

Pgbench conf  State file size 600 B    State file size 712 B
Conn  Trans   Tps1-2  Tps2-2  Tps3-2   Tps1-2  Tps2-2  Tps3-2
   2  10000     1163    1017    2873     1134    1033    2301
   5  10000     1263    1077    2844     1213    1072    2743
  10  10000     1265    1112    2704     1175    1065    2600
  25  10000     1233    1085    2477     1205    1038    2338
  35  10000     1220    1040    2335     1169    1023    2229
  50  10000     1190    1045    2158     1143     992    2065
  60  10000     1151    1018    2011     1111     969    1905
  70  10000     1127     971    1877     1067     938    1803
  80  10000     1091     949    1749     1021     886    1645
  90  10000     1050     920    1643      939     831    1540
 100  10000     1012     895    1537      889     791    1433
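
One way to turn the raw figures of table 2 into a percentage improvement is to compare the shmem and disk throughput directly. Whether this is the measure behind the 15-18% quoted earlier is not stated in this message, so the Python sketch below is only a cross-check under that assumption; the function and variable names are hypothetical.

```python
def relative_gain(tps_shmem, tps_disk):
    """Throughput gain of shmem state files over on-disk ones, in percent."""
    return 100.0 * (tps_shmem - tps_disk) / tps_disk


# (connections, Tps1-2, Tps2-2) for scale factor 1, copied from table 2.
rows_600 = [(2, 1163, 1017), (5, 1263, 1077), (10, 1265, 1112),
            (25, 1233, 1085), (35, 1220, 1040), (50, 1190, 1045),
            (60, 1151, 1018), (70, 1127, 971), (80, 1091, 949),
            (90, 1050, 920), (100, 1012, 895)]
rows_712 = [(2, 1134, 1033), (5, 1213, 1072), (10, 1175, 1065),
            (25, 1205, 1038), (35, 1169, 1023), (50, 1143, 992),
            (60, 1111, 969), (70, 1067, 938), (80, 1021, 886),
            (90, 939, 831), (100, 889, 791)]

# Largest gain observed for each state file size.
best_600 = max(relative_gain(shmem, disk) for _, shmem, disk in rows_600)
best_712 = max(relative_gain(shmem, disk) for _, shmem, disk in rows_712)
print(f"best gain, 600B: {best_600:.1f}%  712B: {best_712:.1f}%")
```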
 
3) Case scale factor 100, normalized results
Tps1-2: 2PC, state file on shmem. Tps2-2: 2PC, state file on disk. Tps3-2: no 2PC.

Pgbench conf  State file size 600 B         State file size 712 B
Conn  Trans   Tps1-2       Tps2-2  Tps3-2   Tps1-2       Tps2-2  Tps3-2
   2  10000   0.031791908  0       1        0.00426621   0       1
   5  10000   0.018481848  0       1        0.03858731   0       1
  10  10000   0.049115914  0       1        0.07661017   0       1
  25  10000   0.06954612   0       1        0.06117247   0       1
  35  10000   0.077677841  0       1        0.05846422   0       1
  50  10000   0.059885932  0       1        0.08961303   0       1
  60  10000   0.071888412  0       1        0.06997743   0       1
  70  10000   0.094007051  0       1        0.03571429   0       1
  80  10000   0.078838174  0       1        0.05635838   0       1
 
4) Case scale factor 100, pure TX/s results
Tps1-2: 2PC, state file on shmem. Tps2-2: 2PC, state file on disk. Tps3-2: no 2PC.

Pgbench conf  State file size 600 B    State file size 712 B
Conn  Trans   Tps1-2  Tps2-2  Tps3-2   Tps1-2  Tps2-2  Tps3-2
   2  10000     1113    1058    2788     1147    1142    2314
   5  10000     1240    1212    2727     1184    1125    2654
  10  10000     1225    1150    2677     1203    1090    2565
  25  10000     1218    1123    2489     1176    1104    2281
  35  10000     1210    1115    2338     1151    1084    2230
  50  10000     1153    1090    2142     1127    1039    2021
  60  10000     1126    1059    1991     1083    1021    1907
  70  10000     1087    1007    1858     1014     986    1770
  80  10000     1046     989    1712      983     944    1636
 
Regards,

--
Michael Paquier

NTT OSSC
Attachment
