Re: Distributed transactions and asynchronous commit - Mailing list pgsql-performance

From Giuseppe Broccolo
Subject Re: Distributed transactions and asynchronous commit
Date
Msg-id 51E67CF6.1010409@2ndquadrant.it
Whole thread Raw
In response to Re: Distributed transactions and asynchronous commit  (Xenofon Papadopoulos <xpapad@gmail.com>)
List pgsql-performance
Il 17/07/2013 12:52, Xenofon Papadopoulos ha scritto:
Thank you for your replies so far.
The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs
- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64


shared_buffers could be set up to 20-30% of the available RAM: in your case, 16GB could be a reasonable value.

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

 autovacuum_analyze_scale_factor: 0.1 
 autovacuum_analyze_threshold: 50 
autovacuum_freeze_max_age: 200000000 
 autovacuum_max_workers: 12                 
 autovacuum_naptime: 10s                   
 autovacuum_vacuum_cost_delay: 20ms       
 autovacuum_vacuum_cost_limit: -1              
 autovacuum_vacuum_scale_factor: 0.2         
 autovacuum_vacuum_threshold: 50

This means that auto vacuum will be triggered after around 50 updates aech time, if your database is doing a lot of updates/inserts (as I understood) an unnecessary amount of vacuum statements can be reached, which will generate a lot of IO. If the inserts/updates are small, this value could be decreased.

Giuseppe.

          

At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:20:01     dev253-2   7711.62  24166.97  56657.95     10.48    735.28     95.09      0.11     86.11
13:30:01     dev253-2   5340.88  19465.30  39133.32     10.97    319.20     59.94      0.15     82.30
13:40:01     dev253-2   2791.02  13061.76  19330.40     11.61    349.95    125.38      0.33     90.73
13:50:01     dev253-2   3478.69  10503.84  25505.27     10.35    308.12     88.57      0.20     68.12
14:00:01     dev253-2   5269.12  33613.43  35830.13     13.18    232.48     44.09      0.19    100.05
14:10:01     dev253-2   4910.24  21767.22  33970.96     11.35    322.52     65.64      0.21    104.55
14:20:02     dev253-2   5358.95  40772.03  33682.46     13.89    721.81    134.32      0.20    104.92
14:30:01     dev253-2   4420.51  17256.16  33315.27     11.44    336.53     76.13      0.15     65.25
14:40:02     dev253-2   4884.13  28439.26  31604.76     12.29    265.32     54.26      0.20     97.51
14:50:01     dev253-2   3124.91   8077.46  22511.59      9.79     50.41     16.13      0.24     76.17

and for pgarchives:

00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:20:01     dev253-3   2802.25      0.69  22417.32      8.00    465.05    165.94      0.02      4.32
13:30:01     dev253-3   1559.87  11159.45  12120.99     14.92     64.17     41.11      0.08     12.02
13:40:01     dev253-3    922.62   8066.62   7129.15     16.47     19.75     21.40      0.08      6.99
13:50:01     dev253-3   1194.81    895.34   9524.53      8.72     28.40     23.76      0.01      1.69
14:00:01     dev253-3   1919.12      0.46  15352.49      8.00     51.75     26.95      0.01      1.61
14:10:01     dev253-3   1770.59   9286.61  13873.79     13.08    139.86     78.97      0.08     14.46
14:20:02     dev253-3   1595.04  11810.63  12389.08     15.17    109.17     68.42      0.15     24.71
14:30:01     dev253-3   1793.71  12173.88  13957.79     14.57    141.56     78.89      0.08     13.61
14:40:02     dev253-3   1751.62      0.43  14012.53      8.00     43.38     24.76      0.01      1.40
14:50:01     dev253-3   1351.72   3225.19  10707.29     10.31     31.91     23.59      0.02      2.93




On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:
Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion,  so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it



-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

pgsql-performance by date:

Previous
From: Xenofon Papadopoulos
Date:
Subject: Re: Distributed transactions and asynchronous commit
Next
From: Vasilis Ventirozos
Date:
Subject: Re: Distributed transactions and asynchronous commit