Thread: replication timeout in pg_basebackup
Our environment: Postgres version 9.2.2 running on CentOS 6.4
Our backups using pg_basebackup are frequently failing with following error
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
Running strace on pg_basebackup process, we see that the fsync() call takes significant time and could be responsible for causing this timeout in postgres.
Has anybody else run into the same issue? Is there a way to run pg_basebackup without fsync() ?
Our backups using pg_basebackup are frequently failing with following error
"pg_basebackup: could not send feedback packet: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request."In postgres logs we see this log message "terminating walsender process due to replication timeout".
We are invoking pg_basebackup with these arguments : pg_basebackup -D backup_dir -X stream -l backup_dir
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
Running strace on pg_basebackup process, we see that the fsync() call takes significant time and could be responsible for causing this timeout in postgres.
Has anybody else run into the same issue? Is there a way to run pg_basebackup without fsync() ?
On Mon, Mar 10, 2014 at 12:52 PM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote:
Our environment: Postgres version 9.2.2 running on CentOS 6.4
Our backups using pg_basebackup are frequently failing with following error
"pg_basebackup: could not send feedback packet: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request."In postgres logs we see this log message "terminating walsender process due to replication timeout".
We are invoking pg_basebackup with these arguments : pg_basebackup -D backup_dir -X stream -l backup_dir
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
Running strace on pg_basebackup process, we see that the fsync() call takes significant time and could be responsible for causing this timeout in postgres.
Use the pg_test_fsync utility which is available in postgresql contrib module to test your system sync methods performance.
Has anybody else run into the same issue? Is there a way to run pg_basebackup without fsync() ?
As of now there is no such options available, I feel it is better to find why the sync is taking time?
Regards,
Hari Babu
Fujitsu Australia
Thanks Hari Babu.
I think what is happening is that my dirty cache builds up quickly for the volume where I am backing up. This would trigger flush of these dirty pages to the disk. While this flush is going on pg_basebackup tries to do fsync() on a received WAL file and gets blocked.
While in this state, i.e. when dirty page count is high, following are the results of pg_test_fsync
# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 16.854 ops/sec
fdatasync 15.242 ops/sec
fsync 0.187 ops/sec
fsync_writethrough n/a
open_sync 14.747 ops/sec
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 6.137 ops/sec
fdatasync 14.899 ops/sec
fsync 0.007 ops/sec
fsync_writethrough n/a
open_sync 1.450 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 13.486 ops/sec
2 * 8kB open_sync writes 6.006 ops/sec
4 * 4kB open_sync writes 3.446 ops/sec
8 * 2kB open_sync writes 1.400 ops/sec
16 * 1kB open_sync writes 0.859 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 0.009 ops/sec
write, close, fsync 0.008 ops/sec
Non-Sync'ed 8kB writes:
write 99415.368 ops/sec
However when backups are not going on and dirty pages count is low, below are the results of this test
# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 1974.243 ops/sec
fdatasync 1410.804 ops/sec
fsync 181.129 ops/sec
fsync_writethrough n/a
open_sync 547.389 ops/sec
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 290.109 ops/sec
fdatasync 962.378 ops/sec
fsync 158.987 ops/sec
fsync_writethrough n/a
open_sync 642.309 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 1014.456 ops/sec
2 * 8kB open_sync writes 627.964 ops/sec
4 * 4kB open_sync writes 340.313 ops/sec
8 * 2kB open_sync writes 173.581 ops/sec
16 * 1kB open_sync writes 103.236 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 244.670 ops/sec
write, close, fsync 207.248 ops/sec
Non-Sync'ed 8kB writes:
write 202216.900 ops/sec
I think what is happening is that my dirty cache builds up quickly for the volume where I am backing up. This would trigger flush of these dirty pages to the disk. While this flush is going on pg_basebackup tries to do fsync() on a received WAL file and gets blocked.
While in this state, i.e. when dirty page count is high, following are the results of pg_test_fsync
# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 16.854 ops/sec
fdatasync 15.242 ops/sec
fsync 0.187 ops/sec
fsync_writethrough n/a
open_sync 14.747 ops/sec
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 6.137 ops/sec
fdatasync 14.899 ops/sec
fsync 0.007 ops/sec
fsync_writethrough n/a
open_sync 1.450 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 13.486 ops/sec
2 * 8kB open_sync writes 6.006 ops/sec
4 * 4kB open_sync writes 3.446 ops/sec
8 * 2kB open_sync writes 1.400 ops/sec
16 * 1kB open_sync writes 0.859 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 0.009 ops/sec
write, close, fsync 0.008 ops/sec
Non-Sync'ed 8kB writes:
write 99415.368 ops/sec
However when backups are not going on and dirty pages count is low, below are the results of this test
# /usr/pgsql-9.2/bin/pg_test_fsync -f /backup/fsync_test
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 1974.243 ops/sec
fdatasync 1410.804 ops/sec
fsync 181.129 ops/sec
fsync_writethrough n/a
open_sync 547.389 ops/sec
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 290.109 ops/sec
fdatasync 962.378 ops/sec
fsync 158.987 ops/sec
fsync_writethrough n/a
open_sync 642.309 ops/sec
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 1014.456 ops/sec
2 * 8kB open_sync writes 627.964 ops/sec
4 * 4kB open_sync writes 340.313 ops/sec
8 * 2kB open_sync writes 173.581 ops/sec
16 * 1kB open_sync writes 103.236 ops/sec
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 244.670 ops/sec
write, close, fsync 207.248 ops/sec
Non-Sync'ed 8kB writes:
write 202216.900 ops/sec
From: Haribabu Kommi [kommi.haribabu@gmail.com]
Sent: Monday, March 10, 2014 1:42 AM
To: Aggarwal, Ajay
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup
Sent: Monday, March 10, 2014 1:42 AM
To: Aggarwal, Ajay
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup
On Mon, Mar 10, 2014 at 12:52 PM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote:
Our environment: Postgres version 9.2.2 running on CentOS 6.4
Our backups using pg_basebackup are frequently failing with following error
"pg_basebackup: could not send feedback packet: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request."In postgres logs we see this log message "terminating walsender process due to replication timeout".
We are invoking pg_basebackup with these arguments : pg_basebackup -D backup_dir -X stream -l backup_dir
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
Running strace on pg_basebackup process, we see that the fsync() call takes significant time and could be responsible for causing this timeout in postgres.
Use the pg_test_fsync utility which is available in postgresql contrib module to test your system sync methods performance.
Has anybody else run into the same issue? Is there a way to run pg_basebackup without fsync() ?
As of now there is no such options available, I feel it is better to find why the sync is taking time?
Regards,
Hari Babu
Fujitsu Australia
On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote: > Thanks Hari Babu. > > I think what is happening is that my dirty cache builds up quickly for the > volume where I am backing up. This would trigger flush of these dirty pages > to the disk. While this flush is going on pg_basebackup tries to do fsync() > on a received WAL file and gets blocked. But the sync is executed for every WAL file finish. Does your database is big in size? Does your setup is write-heavy operations? In Linux when it tries to write a bunch of buffers at once, the fysnc call might block for some time. In the following link there are some "Tuning Recommendations for write-heavy operations" which might be useful to you. http://www.westnet.com/~gsmith/content/linux-pdflush.htm Any other ideas to handle these kind of problems? Regards, Hari Babu Fujitsu Australia
I have already tried experimenting with linux dirty_ratio etc. You can only fine tune up to a limit. The backup process stillfills up the buffer cache very quickly. Yes, my database is about 5-6 GB in size and will grow bigger over time. If wish there was a way to slow down pg_basebackup or force it to use direct I/O. ________________________________________ From: Haribabu Kommi [kommi.haribabu@gmail.com] Sent: Monday, March 10, 2014 8:31 PM To: Aggarwal, Ajay Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] replication timeout in pg_basebackup On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote: > Thanks Hari Babu. > > I think what is happening is that my dirty cache builds up quickly for the > volume where I am backing up. This would trigger flush of these dirty pages > to the disk. While this flush is going on pg_basebackup tries to do fsync() > on a received WAL file and gets blocked. But the sync is executed for every WAL file finish. Does your database is big in size? Does your setup is write-heavy operations? In Linux when it tries to write a bunch of buffers at once, the fysnc call might block for some time. In the following link there are some "Tuning Recommendations for write-heavy operations" which might be useful to you. http://www.westnet.com/~gsmith/content/linux-pdflush.htm Any other ideas to handle these kind of problems? Regards, Hari Babu Fujitsu Australia
If I don't include WAL files as part of my backup, I do not run into this issue. But a backup without WAL files is not whatI want. ________________________________________ From: Aggarwal, Ajay Sent: Monday, March 10, 2014 9:46 PM To: Haribabu Kommi Cc: pgsql-general@postgresql.org Subject: RE: [GENERAL] replication timeout in pg_basebackup I have already tried experimenting with linux dirty_ratio etc. You can only fine tune up to a limit. The backup process stillfills up the buffer cache very quickly. Yes, my database is about 5-6 GB in size and will grow bigger over time. If wish there was a way to slow down pg_basebackup or force it to use direct I/O. ________________________________________ From: Haribabu Kommi [kommi.haribabu@gmail.com] Sent: Monday, March 10, 2014 8:31 PM To: Aggarwal, Ajay Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] replication timeout in pg_basebackup On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote: > Thanks Hari Babu. > > I think what is happening is that my dirty cache builds up quickly for the > volume where I am backing up. This would trigger flush of these dirty pages > to the disk. While this flush is going on pg_basebackup tries to do fsync() > on a received WAL file and gets blocked. But the sync is executed for every WAL file finish. Does your database is big in size? Does your setup is write-heavy operations? In Linux when it tries to write a bunch of buffers at once, the fysnc call might block for some time. In the following link there are some "Tuning Recommendations for write-heavy operations" which might be useful to you. http://www.westnet.com/~gsmith/content/linux-pdflush.htm Any other ideas to handle these kind of problems? Regards, Hari Babu Fujitsu Australia
On 3/9/2014 6:52 PM, Aggarwal, Ajay wrote:
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
so increase it to 300 seconds, or whatever. thats an upper limit, it needs to be big enough that you DONT get into problems when doing stuff like basebackups.
-- john r pierce 37N 122W somewhere on the middle of the left coast
Thats exactly what I was thinking after all other experiments. Couple of questions:
1) why did you say that 300 seconds is the upper limit? Is this enforced by Postgres? What if I want to set it to 10 minutes?
2) whats the downside of bigger replication timeout?
Thanks.
Ajay
so increase it to 300 seconds, or whatever. thats an upper limit, it needs to be big enough that you DONT get into problems when doing stuff like basebackups.
1) why did you say that 300 seconds is the upper limit? Is this enforced by Postgres? What if I want to set it to 10 minutes?
2) whats the downside of bigger replication timeout?
Thanks.
Ajay
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of John R Pierce [pierce@hogranch.com]
Sent: Monday, March 10, 2014 9:58 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup
Sent: Monday, March 10, 2014 9:58 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup
On 3/9/2014 6:52 PM, Aggarwal, Ajay wrote:
Our replication timeout is default 60 seconds. If we increase the replication time to say 180 seconds, we see better results but backups still fail occasionally.
so increase it to 300 seconds, or whatever. thats an upper limit, it needs to be big enough that you DONT get into problems when doing stuff like basebackups.
-- john r pierce 37N 122W somewhere on the middle of the left coast
On 3/11/2014 5:50 AM, Aggarwal, Ajay wrote: > Thats exactly what I was thinking after all other experiments. Couple > of questions: > 1) why did you say that 300 seconds is the upper limit? Is this > enforced by Postgres? What if I want to set it to 10 minutes? > 2) whats the downside of bigger replication timeout? I said, set it to 300 or whatever. An unfortunate extra comma confused my meaning. What I meant was, whatever you set it to, thats a upper limit. As I understand it, that timeout is how long the replication can lag the server before the server decides to stop replication. with it at 300, under heavy load, the replication could run as much as 5 minutes (300 seconds) behind before it errors. -- john r pierce 37N 122W somewhere on the middle of the left coast