Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby |
Date | |
Msg-id | 3f0b79eb0907090016t38841368v45b916c9e57b1fe7@mail.gmail.com |
In response to | Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-hackers |
Hi,

On Tue, Jul 7, 2009 at 8:51 PM, Fujii Masao<masao.fujii@gmail.com> wrote:
> http://archives.postgresql.org/message-id/4951108A.5040608@enterprisedb.com
>> I don't think we need or should
>> allow running regular queries before entering "replication mode". the
>> backend should become a walsender process directly after authentication.
>
> I changed the protocol according to your suggestion.
> Here is the current protocol:

Just for the record, I'd like to explain the correspondence between
Heikki's protocol and mine.

> ReplicationStart (B)
>     Byte1('l'): Identifies the message as a replication-start indicator.
>     Int32(17): Length of message contents in bytes, including self.
>     Int32: The timeline ID
>     Int32: The start log file of replication
>     Int32: The start byte offset of replication

This corresponds to "StartReplication <begin>". However, this message is
sent from the primary to the standby, whereas "StartReplication" is sent in
the opposite direction. So, in the current design, the primary determines
the WAL streaming start position, which is the head of the XLOG file
following the one that walsender switched.

> XLogData (B)
>     Byte1('w'): Identifies the message as XLOG records.
>     Int32: Length of message contents in bytes, including self.
>     Int8: Flag bits indicating how the records should be treated.
>     Int32: The log file number of the records.
>     Int32: The byte offset of the records.
>     Byte n: The XLOG records.

This corresponds to "WALRange <begin> <end> <data>". However, XLogData
omits <begin> to reduce wire traffic, since it can be calculated from
<end> and the length of the records.

> XLogResponse (F)
>     Byte1('r'): Identifies the message as ACK for XLOG records.
>     Int32: Length of message contents in bytes, including self.
>     Int8: Flag bits indicating how the records were treated.
>     Int32: The log file number of the records.
>     Int32: The byte offset of the records.

This corresponds to "ReplicatedUpTo <end>". They are almost the same.
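For readers who want to see the XLogData and XLogResponse layouts concretely, here is a minimal Python sketch of the framing. It is not taken from the patch. The helper names are hypothetical, and it assumes network byte order, that the length field counts itself but not the type byte (the usual frontend/backend protocol convention), and that the position carried in XLogData is the <end> of the records. The <begin> calculation only handles records that stay within one log file.

```python
import struct

# Type byte, Int32 length, Int8 flags, Int32 log file number, Int32 byte offset.
# "!" selects network byte order with no padding (an assumption of this sketch).
HEADER = struct.Struct("!cIBII")


def pack_xlogdata(flags, logid, end_offset, records):
    """Build an XLogData ('w') message; names are hypothetical."""
    # Length counts itself and the fields after it, but not the type byte.
    length = 4 + 1 + 4 + 4 + len(records)
    return HEADER.pack(b"w", length, flags, logid, end_offset) + records


def unpack_xlogdata(msg):
    """Split an XLogData message into (flags, logid, end_offset, records)."""
    mtype, length, flags, logid, end_offset = HEADER.unpack_from(msg)
    if mtype != b"w":
        raise ValueError("not an XLogData message")
    # The message body ends 'length' bytes after the type byte.
    records = msg[HEADER.size:1 + length]
    return flags, logid, end_offset, records


def begin_position(logid, end_offset, records):
    """Recover <begin> from <end> and the record length.

    Simplification: records that cross a log file boundary are rejected.
    """
    if len(records) > end_offset:
        raise ValueError("records cross a log file boundary; not handled here")
    return logid, end_offset - len(records)


def pack_xlogresponse(flags, logid, offset):
    """Build an XLogResponse ('r') message, which has the same header shape."""
    return HEADER.pack(b"r", 4 + 1 + 4 + 4, flags, logid, offset)
```

Under these assumptions the standby would call `unpack_xlogdata`, derive the write position with `begin_position`, and acknowledge with `pack_xlogresponse` using the same log file number and byte offset.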
> If there is a missing XLOG file which is required for recovery, the
> startup process connects to the primary as a normal client, and
> receives the binary contents of the file by using the following SQL.
> This has nothing to do with the above protocol. So, the transfer of
> missing file and synchronous XLOG streaming are performed
> concurrently.
>
>     COPY (SELECT pg_read_xlogfilie('filename', true)) TO STDOUT WITH BINARY

This corresponds to "RequestWAL <begin> <end>". Since the XLOG file written
to the standby has to be recoverable, I use the filename instead of an
XLogRecPtr here, and make the primary send the whole file. Also, the
filename can refer not only to an XLOG file but also to a history file.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
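As an illustration of requesting a file by name rather than by XLogRecPtr, the following Python sketch classifies a requested filename and builds the COPY statement quoted above. The helper names are hypothetical. It assumes the usual PostgreSQL naming conventions (24 hexadecimal digits for an XLOG segment, 8 hexadecimal digits followed by ".history" for a timeline history file), and the SQL function name is copied verbatim from the quoted message as part of the proposal, not as an existing PostgreSQL function.

```python
import re

# Assumed naming conventions: TTTTTTTTXXXXXXXXYYYYYYYY for XLOG segments,
# TTTTTTTT.history for timeline history files.
XLOG_NAME = re.compile(r"[0-9A-F]{24}")
HISTORY_NAME = re.compile(r"[0-9A-F]{8}\.history")


def classify(filename):
    """Return "xlog" or "history"; reject anything else."""
    if XLOG_NAME.fullmatch(filename):
        return "xlog"
    if HISTORY_NAME.fullmatch(filename):
        return "history"
    raise ValueError("not an XLOG or history file name: %r" % filename)


def parse_xlog_name(filename):
    """Split an XLOG segment name into (timeline, log file, segment)."""
    if classify(filename) != "xlog":
        raise ValueError("not an XLOG segment name")
    return tuple(int(filename[i:i + 8], 16) for i in (0, 8, 16))


def copy_statement(filename):
    """Build the request from the quoted message for a validated filename."""
    classify(filename)  # only hex digits and ".history" pass, so quoting is safe
    return ("COPY (SELECT pg_read_xlogfilie('%s', true)) "
            "TO STDOUT WITH BINARY" % filename)
```

Validating the name before substituting it into the statement keeps the sketch from building SQL out of arbitrary input.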