Home > mailing lists

Thread: Distributed transactions and asynchronous commit

Distributed transactions and asynchronous commit

From

Xenofon Papadopoulos

Date:

17 July 2013, 07:18:40

In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Thanks

Re: Distributed transactions and asynchronous commit

From

Vasilis Ventirozos

Date:

17 July 2013, 09:28:59

In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

Yep

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

could be, but you have to send us more info about your setup, your configuration , especially your io settings, output of vmstats would also be helpful

Thanks

Vasilis Ventirozos

Re: Distributed transactions and asynchronous commit

From

Giuseppe Broccolo

Date:

17 July 2013, 10:09:30

Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:

In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

Re: Distributed transactions and asynchronous commit

From

Xenofon Papadopoulos

Date:

17 July 2013, 10:52:26

Thank you for your replies so far.

The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)

60G RAM

RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions

- A few small ones, which are mostly SELECTs

- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G

work_mem: 12M

checkpoint_segments: 64

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.

Right now we use:

autovacuum_analyze_scale_factor: 0.1

autovacuum_analyze_threshold: 50

autovacuum_freeze_max_age: 200000000

autovacuum_max_workers: 12

autovacuum_naptime: 10s

autovacuum_vacuum_cost_delay: 20ms

autovacuum_vacuum_cost_limit: -1

autovacuum_vacuum_scale_factor: 0.2

autovacuum_vacuum_threshold: 50

At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util

13:20:01 dev253-2 7711.62 24166.97 56657.95 10.48 735.28 95.09 0.11 86.11

13:30:01 dev253-2 5340.88 19465.30 39133.32 10.97 319.20 59.94 0.15 82.30

13:40:01 dev253-2 2791.02 13061.76 19330.40 11.61 349.95 125.38 0.33 90.73

13:50:01 dev253-2 3478.69 10503.84 25505.27 10.35 308.12 88.57 0.20 68.12

14:00:01 dev253-2 5269.12 33613.43 35830.13 13.18 232.48 44.09 0.19 100.05

14:10:01 dev253-2 4910.24 21767.22 33970.96 11.35 322.52 65.64 0.21 104.55

14:20:02 dev253-2 5358.95 40772.03 33682.46 13.89 721.81 134.32 0.20 104.92

14:30:01 dev253-2 4420.51 17256.16 33315.27 11.44 336.53 76.13 0.15 65.25

14:40:02 dev253-2 4884.13 28439.26 31604.76 12.29 265.32 54.26 0.20 97.51

14:50:01 dev253-2 3124.91 8077.46 22511.59 9.79 50.41 16.13 0.24 76.17

and for pgarchives:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util

13:20:01 dev253-3 2802.25 0.69 22417.32 8.00 465.05 165.94 0.02 4.32

13:30:01 dev253-3 1559.87 11159.45 12120.99 14.92 64.17 41.11 0.08 12.02

13:40:01 dev253-3 922.62 8066.62 7129.15 16.47 19.75 21.40 0.08 6.99

13:50:01 dev253-3 1194.81 895.34 9524.53 8.72 28.40 23.76 0.01 1.69

14:00:01 dev253-3 1919.12 0.46 15352.49 8.00 51.75 26.95 0.01 1.61

14:10:01 dev253-3 1770.59 9286.61 13873.79 13.08 139.86 78.97 0.08 14.46

14:20:02 dev253-3 1595.04 11810.63 12389.08 15.17 109.17 68.42 0.15 24.71

14:30:01 dev253-3 1793.71 12173.88 13957.79 14.57 141.56 78.89 0.08 13.61

14:40:02 dev253-3 1751.62 0.43 14012.53 8.00 43.38 24.76 0.01 1.40

14:50:01 dev253-3 1351.72 3225.19 10707.29 10.31 31.91 23.59 0.02 2.93

On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:

Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- Giuseppe Broccolo - 2ndQuadrant Italy PostgreSQL Training, Services and Support giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

Re: Distributed transactions and asynchronous commit

From

Giuseppe Broccolo

Date:

17 July 2013, 11:14:23

Il 17/07/2013 12:52, Xenofon Papadopoulos ha scritto:

Thank you for your replies so far.
The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs
- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64

shared_buffers could be set up to 20-30% of the available RAM: in your case, 16GB could be a reasonable value.

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

autovacuum_analyze_scale_factor: 0.1
autovacuum_analyze_threshold: 50

autovacuum_freeze_max_age: 200000000

autovacuum_max_workers: 12
autovacuum_naptime: 10s
autovacuum_vacuum_cost_delay: 20ms
autovacuum_vacuum_cost_limit: -1
autovacuum_vacuum_scale_factor: 0.2
autovacuum_vacuum_threshold: 50

This means that auto vacuum will be triggered after around 50 updates aech time, if your database is doing a lot of updates/inserts (as I understood) an unnecessary amount of vacuum statements can be reached, which will generate a lot of IO. If the inserts/updates are small, this value could be decreased.

Giuseppe.

At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-2 7711.62 24166.97 56657.95 10.48 735.28 95.09 0.11 86.11
13:30:01 dev253-2 5340.88 19465.30 39133.32 10.97 319.20 59.94 0.15 82.30
13:40:01 dev253-2 2791.02 13061.76 19330.40 11.61 349.95 125.38 0.33 90.73
13:50:01 dev253-2 3478.69 10503.84 25505.27 10.35 308.12 88.57 0.20 68.12
14:00:01 dev253-2 5269.12 33613.43 35830.13 13.18 232.48 44.09 0.19 100.05
14:10:01 dev253-2 4910.24 21767.22 33970.96 11.35 322.52 65.64 0.21 104.55
14:20:02 dev253-2 5358.95 40772.03 33682.46 13.89 721.81 134.32 0.20 104.92
14:30:01 dev253-2 4420.51 17256.16 33315.27 11.44 336.53 76.13 0.15 65.25
14:40:02 dev253-2 4884.13 28439.26 31604.76 12.29 265.32 54.26 0.20 97.51
14:50:01 dev253-2 3124.91 8077.46 22511.59 9.79 50.41 16.13 0.24 76.17

and for pgarchives:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-3 2802.25 0.69 22417.32 8.00 465.05 165.94 0.02 4.32
13:30:01 dev253-3 1559.87 11159.45 12120.99 14.92 64.17 41.11 0.08 12.02
13:40:01 dev253-3 922.62 8066.62 7129.15 16.47 19.75 21.40 0.08 6.99
13:50:01 dev253-3 1194.81 895.34 9524.53 8.72 28.40 23.76 0.01 1.69
14:00:01 dev253-3 1919.12 0.46 15352.49 8.00 51.75 26.95 0.01 1.61
14:10:01 dev253-3 1770.59 9286.61 13873.79 13.08 139.86 78.97 0.08 14.46
14:20:02 dev253-3 1595.04 11810.63 12389.08 15.17 109.17 68.42 0.15 24.71
14:30:01 dev253-3 1793.71 12173.88 13957.79 14.57 141.56 78.89 0.08 13.61
14:40:02 dev253-3 1751.62 0.43 14012.53 8.00 43.38 24.76 0.01 1.40
14:50:01 dev253-3 1351.72 3225.19 10707.29 10.31 31.91 23.59 0.02 2.93
On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:
Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- Giuseppe Broccolo - 2ndQuadrant Italy PostgreSQL Training, Services and Support giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

Re: Distributed transactions and asynchronous commit

From

Vasilis Ventirozos

Date:

17 July 2013, 11:21:14

On Wed, Jul 17, 2013 at 11:52 AM, Xenofon Papadopoulos <xpapad@gmail.com> wrote:

Thank you for your replies so far.
The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs
- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

autovacuum_analyze_scale_factor: 0.1
autovacuum_analyze_threshold: 50
autovacuum_freeze_max_age: 200000000
autovacuum_max_workers: 12
autovacuum_naptime: 10s
autovacuum_vacuum_cost_delay: 20ms
autovacuum_vacuum_cost_limit: -1
autovacuum_vacuum_scale_factor: 0.2
autovacuum_vacuum_threshold: 50

settings look ok, except vacuum and analyze threshold that is in my opinion too agressive (500 would make more sense) and workers at 6 you haven't mentioned wal_buffers and effective_io_concurrency settings but i dont think that it would make much of a difference

At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-2 7711.62 24166.97 56657.95 10.48 735.28 95.09 0.11 86.11
13:30:01 dev253-2 5340.88 19465.30 39133.32 10.97 319.20 59.94 0.15 82.30
13:40:01 dev253-2 2791.02 13061.76 19330.40 11.61 349.95 125.38 0.33 90.73
13:50:01 dev253-2 3478.69 10503.84 25505.27 10.35 308.12 88.57 0.20 68.12
14:00:01 dev253-2 5269.12 33613.43 35830.13 13.18 232.48 44.09 0.19 100.05
14:10:01 dev253-2 4910.24 21767.22 33970.96 11.35 322.52 65.64 0.21 104.55
14:20:02 dev253-2 5358.95 40772.03 33682.46 13.89 721.81 134.32 0.20 104.92
14:30:01 dev253-2 4420.51 17256.16 33315.27 11.44 336.53 76.13 0.15 65.25
14:40:02 dev253-2 4884.13 28439.26 31604.76 12.29 265.32 54.26 0.20 97.51
14:50:01 dev253-2 3124.91 8077.46 22511.59 9.79 50.41 16.13 0.24 76.17

assuming that sector = 512 bytes, it means that your san makes 20mb/sec read which if its not totally random-reads is quite low,

i would start from there, make tests to see if everything works ok, (bonnie++, dd , etc) and if you are getting the numbers you are supposed to

and for pgarchives:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-3 2802.25 0.69 22417.32 8.00 465.05 165.94 0.02 4.32
13:30:01 dev253-3 1559.87 11159.45 12120.99 14.92 64.17 41.11 0.08 12.02
13:40:01 dev253-3 922.62 8066.62 7129.15 16.47 19.75 21.40 0.08 6.99
13:50:01 dev253-3 1194.81 895.34 9524.53 8.72 28.40 23.76 0.01 1.69
14:00:01 dev253-3 1919.12 0.46 15352.49 8.00 51.75 26.95 0.01 1.61
14:10:01 dev253-3 1770.59 9286.61 13873.79 13.08 139.86 78.97 0.08 14.46
14:20:02 dev253-3 1595.04 11810.63 12389.08 15.17 109.17 68.42 0.15 24.71
14:30:01 dev253-3 1793.71 12173.88 13957.79 14.57 141.56 78.89 0.08 13.61
14:40:02 dev253-3 1751.62 0.43 14012.53 8.00 43.38 24.76 0.01 1.40
14:50:01 dev253-3 1351.72 3225.19 10707.29 10.31 31.91 23.59 0.02 2.93
On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:
Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- Giuseppe Broccolo - 2ndQuadrant Italy PostgreSQL Training, Services and Support giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

Re: Distributed transactions and asynchronous commit

From

Vasilis Ventirozos

Date:

17 July 2013, 11:26:58

On Wed, Jul 17, 2013 at 12:21 PM, Vasilis Ventirozos <v.ventirozos@gmail.com> wrote:

On Wed, Jul 17, 2013 at 11:52 AM, Xenofon Papadopoulos <xpapad@gmail.com> wrote:
Thank you for your replies so far.
The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs
- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

autovacuum_analyze_scale_factor: 0.1
autovacuum_analyze_threshold: 50
autovacuum_freeze_max_age: 200000000
autovacuum_max_workers: 12
autovacuum_naptime: 10s
autovacuum_vacuum_cost_delay: 20ms
autovacuum_vacuum_cost_limit: -1
autovacuum_vacuum_scale_factor: 0.2
autovacuum_vacuum_threshold: 50

settings look ok, except vacuum and analyze threshold that is in my opinion too agressive (500 would make more sense) and workers at 6 you haven't mentioned wal_buffers and effective_io_concurrency settings but i dont think that it would make much of a difference


At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-2 7711.62 24166.97 56657.95 10.48 735.28 95.09 0.11 86.11
13:30:01 dev253-2 5340.88 19465.30 39133.32 10.97 319.20 59.94 0.15 82.30
13:40:01 dev253-2 2791.02 13061.76 19330.40 11.61 349.95 125.38 0.33 90.73
13:50:01 dev253-2 3478.69 10503.84 25505.27 10.35 308.12 88.57 0.20 68.12
14:00:01 dev253-2 5269.12 33613.43 35830.13 13.18 232.48 44.09 0.19 100.05
14:10:01 dev253-2 4910.24 21767.22 33970.96 11.35 322.52 65.64 0.21 104.55
14:20:02 dev253-2 5358.95 40772.03 33682.46 13.89 721.81 134.32 0.20 104.92
14:30:01 dev253-2 4420.51 17256.16 33315.27 11.44 336.53 76.13 0.15 65.25
14:40:02 dev253-2 4884.13 28439.26 31604.76 12.29 265.32 54.26 0.20 97.51
14:50:01 dev253-2 3124.91 8077.46 22511.59 9.79 50.41 16.13 0.24 76.17

assuming that sector = 512 bytes, it means that your san makes 20mb/sec read which if its not totally random-reads is quite low,
i would start from there, make tests to see if everything works ok, (bonnie++, dd , etc) and if you are getting the numbers you are supposed to

i would also check for index / table bloat, here's a script that it would do that for you
http://labs.omniti.com/labs/pgtreats/browser/trunk/tools/pg_bloat_report.pl

and for pgarchives:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-3 2802.25 0.69 22417.32 8.00 465.05 165.94 0.02 4.32
13:30:01 dev253-3 1559.87 11159.45 12120.99 14.92 64.17 41.11 0.08 12.02
13:40:01 dev253-3 922.62 8066.62 7129.15 16.47 19.75 21.40 0.08 6.99
13:50:01 dev253-3 1194.81 895.34 9524.53 8.72 28.40 23.76 0.01 1.69
14:00:01 dev253-3 1919.12 0.46 15352.49 8.00 51.75 26.95 0.01 1.61
14:10:01 dev253-3 1770.59 9286.61 13873.79 13.08 139.86 78.97 0.08 14.46
14:20:02 dev253-3 1595.04 11810.63 12389.08 15.17 109.17 68.42 0.15 24.71
14:30:01 dev253-3 1793.71 12173.88 13957.79 14.57 141.56 78.89 0.08 13.61
14:40:02 dev253-3 1751.62 0.43 14012.53 8.00 43.38 24.76 0.01 1.40
14:50:01 dev253-3 1351.72 3225.19 10707.29 10.31 31.91 23.59 0.02 2.93
On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:
Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- Giuseppe Broccolo - 2ndQuadrant Italy PostgreSQL Training, Services and Support giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

Re: Distributed transactions and asynchronous commit

From

Xenofon Papadopoulos

Date:

17 July 2013, 12:18:30

wal_buffers: 32M

effective_io_concurrency: 4

There is no bloat.

Note that we are using Postgres inside a VM, there is a VMFS layer on top of the LUNs which might affect the performance. That said, we're still wondering if this much I/O is normal and if we can somehow reduce it. Enabling async commits in a DB without distributed transactions resulted to a huge decrease in I/O, here there was almost no effect.

On Wed, Jul 17, 2013 at 2:21 PM, Vasilis Ventirozos <v.ventirozos@gmail.com> wrote:

On Wed, Jul 17, 2013 at 11:52 AM, Xenofon Papadopoulos <xpapad@gmail.com> wrote:
Thank you for your replies so far.
The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs
- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

autovacuum_analyze_scale_factor: 0.1
autovacuum_analyze_threshold: 50
autovacuum_freeze_max_age: 200000000
autovacuum_max_workers: 12
autovacuum_naptime: 10s
autovacuum_vacuum_cost_delay: 20ms
autovacuum_vacuum_cost_limit: -1
autovacuum_vacuum_scale_factor: 0.2
autovacuum_vacuum_threshold: 50

settings look ok, except vacuum and analyze threshold that is in my opinion too agressive (500 would make more sense) and workers at 6 you haven't mentioned wal_buffers and effective_io_concurrency settings but i dont think that it would make much of a difference


At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-2 7711.62 24166.97 56657.95 10.48 735.28 95.09 0.11 86.11
13:30:01 dev253-2 5340.88 19465.30 39133.32 10.97 319.20 59.94 0.15 82.30
13:40:01 dev253-2 2791.02 13061.76 19330.40 11.61 349.95 125.38 0.33 90.73
13:50:01 dev253-2 3478.69 10503.84 25505.27 10.35 308.12 88.57 0.20 68.12
14:00:01 dev253-2 5269.12 33613.43 35830.13 13.18 232.48 44.09 0.19 100.05
14:10:01 dev253-2 4910.24 21767.22 33970.96 11.35 322.52 65.64 0.21 104.55
14:20:02 dev253-2 5358.95 40772.03 33682.46 13.89 721.81 134.32 0.20 104.92
14:30:01 dev253-2 4420.51 17256.16 33315.27 11.44 336.53 76.13 0.15 65.25
14:40:02 dev253-2 4884.13 28439.26 31604.76 12.29 265.32 54.26 0.20 97.51
14:50:01 dev253-2 3124.91 8077.46 22511.59 9.79 50.41 16.13 0.24 76.17

assuming that sector = 512 bytes, it means that your san makes 20mb/sec read which if its not totally random-reads is quite low,
i would start from there, make tests to see if everything works ok, (bonnie++, dd , etc) and if you are getting the numbers you are supposed to
and for pgarchives:

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
13:20:01 dev253-3 2802.25 0.69 22417.32 8.00 465.05 165.94 0.02 4.32
13:30:01 dev253-3 1559.87 11159.45 12120.99 14.92 64.17 41.11 0.08 12.02
13:40:01 dev253-3 922.62 8066.62 7129.15 16.47 19.75 21.40 0.08 6.99
13:50:01 dev253-3 1194.81 895.34 9524.53 8.72 28.40 23.76 0.01 1.69
14:00:01 dev253-3 1919.12 0.46 15352.49 8.00 51.75 26.95 0.01 1.61
14:10:01 dev253-3 1770.59 9286.61 13873.79 13.08 139.86 78.97 0.08 14.46
14:20:02 dev253-3 1595.04 11810.63 12389.08 15.17 109.17 68.42 0.15 24.71
14:30:01 dev253-3 1793.71 12173.88 13957.79 14.57 141.56 78.89 0.08 13.61
14:40:02 dev253-3 1751.62 0.43 14012.53 8.00 43.38 24.76 0.01 1.40
14:50:01 dev253-3 1351.72 3225.19 10707.29 10.31 31.91 23.59 0.02 2.93
On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@2ndquadrant.it> wrote:
Hi,

Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:
In the asynchronous commit documentation, it says:

The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous

Does this mean that all queries that are part of a distributed transaction are synchronous?

In our databases we have extremely high disk I/O, I'm wondering if distributed transactions may be the reason behind it.

Distributed transactions are base on two-phase-commit (2PC) algorithms for ensuring correct transaction completion, so are synchronous.
However, I think this is not the main reason behind your extremely high disk I/O. You should check if your system is properly tuned to get the best performances.
First of all, you could take a look on your PostgreSQL configurations, and check if shared_memory is set properly taking into account your RAM availability. The conservative PostgreSQL default value is 24 MB, forcing system to exploit many disk I/O resources.
Aside from this, you could take a look if autovacuum is often triggered (generating a large amount of I/O) in case of large use of updates/inserts in your database.

Regards,

Giuseppe.

-- Giuseppe Broccolo - 2ndQuadrant Italy PostgreSQL Training, Services and Support giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it