Re: Streaming Replication patch for CommitFest 2009-09 - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Streaming Replication patch for CommitFest 2009-09 |
Date | |
Msg-id | 4AB21E6B.1020602@enterprisedb.com |
In response to | Streaming Replication patch for CommitFest 2009-09 (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: Streaming Replication patch for CommitFest 2009-09 |
List | pgsql-hackers |
Some random comments:

I don't think we need the new PM_SHUTDOWN_3 postmaster state. We can treat walsenders the same as the archive process, and kill and wait for both of them to die in PM_SHUTDOWN_2 state.

I think there's something wrong with the napping in walsender. When I perform pg_switch_xlog(), it takes surprisingly long for it to trickle to the standby. When I put a little proxy program in between the master and slave that delays all messages from the slave to the master by one second, it got worse, even though I would expect the master to still keep sending WAL at full speed. I get logs like this:

2009-09-17 14:13:16.876 EEST LOG: xlog send request 0/38000000; send 0/3700006C; write 0/3700006C
2009-09-17 14:13:16.877 EEST LOG: xlog read request 0/37010000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG: xlog send request 0/38000000; send 0/37010000; write 0/3700006C
2009-09-17 14:13:17.077 EEST LOG: xlog read request 0/37020000; send 0/37020000; write 0/3700006C
2009-09-17 14:13:17.078 EEST LOG: xlog read request 0/37030000; send 0/37030000; write 0/3700006C
2009-09-17 14:13:17.278 EEST LOG: xlog send request 0/38000000; send 0/37030000; write 0/3700006C
2009-09-17 14:13:17.279 EEST LOG: xlog read request 0/37040000; send 0/37040000; write 0/3700006C
...
2009-09-17 14:13:22.796 EEST LOG: xlog read request 0/37FD0000; send 0/37FD0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG: xlog send request 0/38000000; send 0/37FD0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG: xlog read request 0/37FE0000; send 0/37FE0000; write 0/376D0000
2009-09-17 14:13:22.896 EEST LOG: xlog read request 0/37FF0000; send 0/37FF0000; write 0/376D0000
2009-09-17 14:13:22.897 EEST LOG: xlog read request 0/38000000; send 0/38000000; write 0/376D0000
2009-09-17 14:14:09.932 EEST LOG: xlog send request 0/38000428; send 0/38000000; write 0/38000000
2009-09-17 14:14:09.932 EEST LOG: xlog read request 0/38000428; send 0/38000428; write 0/38000000

It looks like it's having 100 or 200 ms naps in between. Also, I wouldn't expect to see so many "read request" acknowledgments from the slave. The master doesn't really need to know how far the slave has got, except in synchronous replication when it has requested a flush to the slave. Another reason the master needs to know is so that it can recycle old log files, but for that we'd really only need an acknowledgment once per WAL file or even less.

Why does XLogSend() care about page boundaries? Perhaps it's a leftover from the old approach that read from wal_buffers?

Do we really need the support for asynchronous backend libpq commands? Could walsender just keep blasting WAL to the slave, and only try to read an acknowledgment after it has requested one by setting the XLOGSTREAM_FLUSH flag? Or maybe we should be putting the socket into non-blocking mode.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
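To make the non-blocking suggestion concrete, here is a minimal sketch in plain POSIX C. It is not code from the patch: the helper names set_nonblocking() and try_read_ack() are hypothetical, and the acknowledgment is treated as an opaque byte stream. The idea is only that the sender can check for a pending acknowledgment and go straight back to sending WAL when none has arrived.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Put a descriptor into non-blocking mode.
 * Hypothetical helper; returns 0 on success, -1 on error.
 */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/*
 * Try to read acknowledgment bytes without waiting for them.
 * Hypothetical helper. Returns:
 *   > 0  number of bytes read into buf
 *     0  nothing available right now; caller can keep sending
 *    -1  error, or the peer closed the connection
 */
static ssize_t
try_read_ack(int fd, char *buf, size_t len)
{
    ssize_t n;

    /* Retry only if a signal interrupted the call. */
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n > 0)
        return n;
    if (n == 0)
        return -1;              /* end of file: peer went away */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;               /* no data yet, and we did not block */
    return -1;
}
```

A send loop built on this would call try_read_ack() after each batch of WAL, treat a return of 0 as "carry on sending", and wait on the socket only after it has explicitly requested a flush.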