Re: Streaming Replication patch for CommitFest 2009-09 - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Streaming Replication patch for CommitFest 2009-09
Date	September 17, 2009 08:33:16
Msg-id	4AB21E6B.1020602@enterprisedb.com
In response to	Streaming Replication patch for CommitFest 2009-09 (Fujii Masao <masao.fujii@gmail.com>)
Responses	Re: Streaming Replication patch for CommitFest 2009-09
List	pgsql-hackers

Some random comments:

I don't think we need the new PM_SHUTDOWN_3 postmaster state. We can
treat walsenders the same as the archive process, and kill and wait for
both of them to die in PM_SHUTDOWN_2 state.

I think there's something wrong with the napping in walsender. When I
call pg_switch_xlog(), it takes surprisingly long for the switch to
trickle to the standby. When I put a little proxy program between the
master and slave that delays all messages from the slave to the master
by one second, it got worse, even though I would expect the master to
keep sending WAL at full speed regardless. I get logs like this:

2009-09-17 14:13:16.876 EEST LOG:  xlog send request 0/38000000; send 0/3700006C; write 0/3700006C
2009-09-17 14:13:16.877 EEST LOG:  xlog read request 0/37010000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG:  xlog send request 0/38000000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG:  xlog read request 0/37020000; send 0/37020000; write 0/3700006C
2009-09-17 14:13:17.078 EEST LOG:  xlog read request 0/37030000; send 0/37030000; write 0/3700006C
2009-09-17 14:13:17.278 EEST LOG:  xlog send request 0/38000000; send 0/37030000; write 0/3700006C
2009-09-17 14:13:17.279 EEST LOG:  xlog read request 0/37040000; send 0/37040000; write 0/3700006C
...
2009-09-17 14:13:22.796 EEST LOG:  xlog read request 0/37FD0000; send 0/37FD0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG:  xlog send request 0/38000000; send 0/37FD0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG:  xlog read request 0/37FE0000; send 0/37FE0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG:  xlog read request 0/37FF0000; send 0/37FF0000; write 0/376D0000
2009-09-17 14:13:22.897 EEST LOG:  xlog read request 0/38000000; send 0/38000000; write 0/376D0000
2009-09-17 14:14:09.932 EEST LOG:  xlog send request 0/38000428; send 0/38000000; write 0/38000000
2009-09-17 14:14:09.932 EEST LOG:  xlog read request 0/38000428; send 0/38000428; write 0/38000000

It looks like walsender is napping for 100 or 200 ms between sends.
Also, I wouldn't expect to see so many "read request" acknowledgments
from the slave. The master doesn't really need to know how far the slave
has gotten, except in synchronous replication when it has asked the
slave to flush. The other reason the master needs to know is so that it
can recycle old log files, but for that we'd really only need an
acknowledgment once per WAL file, or even less often.
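
To make the nap and acknowledgment observations concrete, here is a
rough Python sketch (not part of the patch) that recomputes them from
the first few log entries quoted above, with the wrapped lines rejoined.
The helper names are made up for illustration, and the 16 MB figure
assumes the default WAL segment size.

```python
from datetime import datetime

# First six log entries from the message, wrapped lines rejoined.
LOG = """\
2009-09-17 14:13:16.876 EEST LOG:  xlog send request 0/38000000; send 0/3700006C; write 0/3700006C
2009-09-17 14:13:16.877 EEST LOG:  xlog read request 0/37010000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG:  xlog send request 0/38000000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG:  xlog read request 0/37020000; send 0/37020000; write 0/3700006C
2009-09-17 14:13:17.078 EEST LOG:  xlog read request 0/37030000; send 0/37030000; write 0/3700006C
2009-09-17 14:13:17.278 EEST LOG:  xlog send request 0/38000000; send 0/37030000; write 0/3700006C
"""

# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def lsn_to_int(lsn):
    """Turn 'hi/lo' hex notation into one integer (simplified view)."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def send_request_gaps_ms(log_text):
    """Milliseconds between consecutive 'xlog send request' entries."""
    times = []
    for line in log_text.splitlines():
        if "xlog send request" in line:
            stamp = " ".join(line.split()[:2])  # date and time fields
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return [round((b - a).total_seconds() * 1000)
            for a, b in zip(times, times[1:])]


def read_ack_positions(log_text):
    """WAL positions reported by 'xlog read request' entries."""
    positions = []
    for line in log_text.splitlines():
        if "xlog read request" in line:
            lsn = line.split("read request ")[1].split(";")[0]
            positions.append(lsn_to_int(lsn))
    return positions


gaps = send_request_gaps_ms(LOG)
acks = read_ack_positions(LOG)
spacing = [b - a for a, b in zip(acks, acks[1:])]
acks_per_segment = WAL_SEGMENT_BYTES // spacing[0]

print("ms between send requests:", gaps)
print("bytes between acknowledgments:", spacing)
print("acknowledgments per 16 MB segment:", acks_per_segment)
```

In this excerpt the send requests are about 200 ms apart, and the slave
acknowledges every 64 kB, which works out to 256 acknowledgments per
segment where one would do for recycling purposes.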

Why does XLogSend() care about page boundaries? Perhaps it's a leftover
from the old approach that read from wal_buffers?

Do we really need the support for asynchronous backend libpq commands?
Could walsender just keep blasting WAL to the slave, and only try to
read an acknowledgment after it has requested one by setting the
XLOGSTREAM_FLUSH flag? Or maybe we should be putting the socket into
non-blocking mode.
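
A minimal sketch of the non-blocking idea, in Python rather than the
backend's C and not taken from the patch. drain_acks, master, and
standby are hypothetical names, and a local socketpair stands in for the
replication connection. The point is only that the sender can check for
acknowledgments after every send without ever waiting for one.

```python
import select
import socket


def drain_acks(sock):
    """Collect acknowledgments that are already buffered, without waiting."""
    acks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            break  # nothing pending: go straight back to sending
        if not data:
            break  # peer closed the connection
        acks.append(data)
    return acks


# Hypothetical stand-ins for the two ends of the replication connection.
master, standby = socket.socketpair()
master.setblocking(False)

sent = 0
early_acks = []
for chunk in (b"wal-chunk-1", b"wal-chunk-2", b"wal-chunk-3"):
    sent += master.send(chunk)
    # Returns immediately even though the standby has not replied yet.
    early_acks += drain_acks(master)

# The standby catches up and acknowledges once, well after the sends.
standby.recv(4096)
standby.sendall(b"ack")

# Demo only: wait briefly so the reply is certain to have arrived.
select.select([master], [], [], 1.0)
late_acks = drain_acks(master)

print(sent, early_acks, late_acks)

master.close()
standby.close()
```

A real sender would also have to cope with send() accepting only part of
a buffer once the socket's send buffer fills up, which this sketch skips
because the chunks are tiny.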

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com
