Thread: pg_receivexlog --status-interval add fsync feedback
Hi all, This patch is to add setting to send status packets after fsync to --status-interval of pg_receivexlog. If -1 is specified to --status-interval, status packets is sent as soon as after fsync. Others are the same as when 0 is specified to --status-interval. When requested by the server, send the status packet. To use replication slot, and specify -1 to --fsync-interval. As a result, simple WAL synchronous replication can be performed. In simple WAL synchronization, it is possible to test the replication. If there was F/O in synchronous replication, it is available in the substitute until standby is restored. Regards, -- Furuya Osamu
Attachment
On Tue, Aug 12, 2014 at 6:19 PM, <furuyao@pm.nttdata.co.jp> wrote: > Hi all, > > This patch is to add setting to send status packets after fsync to --status-interval of pg_receivexlog. > > If -1 is specified to --status-interval, status packets is sent as soon as after fsync. I don't think that it's good idea to control that behavior by using --status-interval. I'm sure that there are some users who both want that behavior and want set the maximum interval between a feedback is sent back to the server because these settings are available in walreceiver. But your version of --status-interval doesn't allow those settings at all. That is, if set to -1, the maximum interval between each feedback cannot be set. OTOH, if set to the positive number, feedback-as-soon-as-fsync behavior cannot be enabled. Therefore, I'm thinking that it's better to introduce new separate option for that behavior. Regards, -- Fujii Masao
> I don't think that it's good idea to control that behavior by using > --status-interval. I'm sure that there are some users who both want that > behavior and want set the maximum interval between a feedback is sent > back to the server because these settings are available in walreceiver. > But your version of --status-interval doesn't allow those settings at > all. That is, if set to -1, the maximum interval between each feedback > cannot be set. OTOH, if set to the positive number, > feedback-as-soon-as-fsync behavior cannot be enabled. Therefore, I'm > thinking that it's better to introduce new separate option for that > behavior. Thanks for the review! This patch was split option as you pointed out. If -r option is specified, status packets sent to server as soon as after fsync. Otherwise to send server status packet in the spacing of the --status-interval. Regards, -- Furuya Osamu
Attachment
On Wed, Aug 13, 2014 at 5:55 PM, <furuyao@pm.nttdata.co.jp> wrote: >> I don't think that it's good idea to control that behavior by using >> --status-interval. I'm sure that there are some users who both want that >> behavior and want set the maximum interval between a feedback is sent >> back to the server because these settings are available in walreceiver. >> But your version of --status-interval doesn't allow those settings at >> all. That is, if set to -1, the maximum interval between each feedback >> cannot be set. OTOH, if set to the positive number, >> feedback-as-soon-as-fsync behavior cannot be enabled. Therefore, I'm >> thinking that it's better to introduce new separate option for that >> behavior. > > Thanks for the review! > This patch was split option as you pointed out. > > If -r option is specified, status packets sent to server as soon as after fsync. > Otherwise to send server status packet in the spacing of the --status-interval. > Hi, I took a look at this patch. I applied patch to master successfully, and did not get error with compiling. Also it works fine. One question is why reply_fsync is defined as volatile variable? Sorry I could not understand reason of that. Currently patch modifies argument of some function (e.g., Handle CopyStream, Process LogDate Msg), and add the similar code to each function. I don't think it is good approach. For example, I think that we should gather these code into one function. Thought? Regards, ------- Sawada Masahiko
Thanks for the review! > One question is why reply_fsync is defined as volatile variable? > Sorry I could not understand reason of that. It was affected to time_to_abort -- since it is unnecessary, it deletes. > Currently patch modifies argument of some function (e.g., Handle > CopyStream, Process LogDate Msg), and add the similar code to each > function. > I don't think it is good approach. > For example, I think that we should gather these code into one function. Feedback was judged immediately after each fsync until now. I revised it in reference to walreceiver. Feedback of fsync is judged together with the judgment of --status-interval. Thereby, the specification to an argument became minimum. Regards, -- Furuya Osamu
Attachment
On Mon, Aug 18, 2014 at 7:55 PM, <furuyao@pm.nttdata.co.jp> wrote: > Thanks for the review! > >> One question is why reply_fsync is defined as volatile variable? >> Sorry I could not understand reason of that. > > It was affected to time_to_abort -- since it is unnecessary, it deletes. > >> Currently patch modifies argument of some function (e.g., Handle >> CopyStream, Process LogDate Msg), and add the similar code to each >> function. >> I don't think it is good approach. >> For example, I think that we should gather these code into one function. > > Feedback was judged immediately after each fsync until now. > I revised it in reference to walreceiver. > Feedback of fsync is judged together with the judgment of --status-interval. > Thereby, the specification to an argument became minimum. Thank you for updating the patch. I did not get error with applying, and compiling. It works fine. I think this function code has no problem. Could you please submit patch to commit fest app? Regards, ------- Sawada Masahiko
> Thank you for updating the patch. > > I did not get error with applying, and compiling. > It works fine. I think this function code has no problem. > Could you please submit patch to commit fest app? Thanks for the review! As you pointed out, submitted patch to commit fest app. Regards, -- Furuya Osamu
On Tue, Aug 19, 2014 at 9:52 AM, <furuyao@pm.nttdata.co.jp> wrote: >> Thank you for updating the patch. >> >> I did not get error with applying, and compiling. >> It works fine. I think this function code has no problem. >> Could you please submit patch to commit fest app? > > Thanks for the review! > > As you pointed out, submitted patch to commit fest app. When replication slot is not specified in pg_receivexlog, the flush location in the feedback message always indicates invalid. So there seems to be no need to send the feedback as soon as fsync is issued, in that case. How should this option work when replication slot is not specified? Regards, -- Fujii Masao
> When replication slot is not specified in pg_receivexlog, the flush > location in the feedback message always indicates invalid. So there seems > to be no need to send the feedback as soon as fsync is issued, in that > case. > How should this option work when replication slot is not specified? Thanks for the review! The present is not checking the existence of specification of -S. The use case when replication slot is not specified. Because it does fsync, it isn't an original intention. remote_write is set in synchronous_commit. To call attention to the user, append following documents. "If you want to report the flush position to the server, should use -S option." Regards, -- Furuya Osamu
Attachment
On Thu, Aug 21, 2014 at 2:54 PM, <furuyao@pm.nttdata.co.jp> wrote: >> When replication slot is not specified in pg_receivexlog, the flush >> location in the feedback message always indicates invalid. So there seems >> to be no need to send the feedback as soon as fsync is issued, in that >> case. >> How should this option work when replication slot is not specified? > > Thanks for the review! > > The present is not checking the existence of specification of -S. > > The use case when replication slot is not specified. > > Because it does fsync, it isn't an original intention. > remote_write is set in synchronous_commit. > > To call attention to the user, append following documents. > "If you want to report the flush position to the server, should use -S option." > Thank you for updating the patch. I reviewed the patch. First of all, I think that we should not append the above message to section of '-r' option. (Or these message might not be needed at all) Whether flush location in feedback message is valid, is not depend on '-r' option. If we use '-r' option and 'S' option (i.g., replication slot) then pg_receivexlog informs valid flush location to primary server at the same time as doing fsync. But, if we don't specify replication slot then the flush location in feedback message always invalid. So I think Fujii-san pointed out that sending of invalid flush location is not needed if pg_receivexlog does not use replication slot. Regards, ------- Sawada Masahiko
> Thank you for updating the patch. > I reviewed the patch. > > First of all, I think that we should not append the above message to > section of '-r' option. > (Or these message might not be needed at all) Whether flush location in > feedback message is valid, is not depend on '-r' option. > > If we use '-r' option and 'S' option (i.g., replication slot) then > pg_receivexlog informs valid flush location to primary server at the same > time as doing fsync. > But, if we don't specify replication slot then the flush location in > feedback message always invalid. > So I think Fujii-san pointed out that sending of invalid flush location > is not needed if pg_receivexlog does not use replication slot. Thanks for the review! I understand the attention message wasn't appropriate. To report the write location, even If you do not specify a replication slot. So the fix only appended messages. There was a description of the flush location section of '-S' option, but I intended to catch eye more and added a message. Is it better to make specification of the -S option indispensable? Regards, -- Furuya Osamu
On Fri, Aug 22, 2014 at 1:35 PM, <furuyao@pm.nttdata.co.jp> wrote: >> Thank you for updating the patch. >> I reviewed the patch. >> >> First of all, I think that we should not append the above message to >> section of '-r' option. >> (Or these message might not be needed at all) Whether flush location in >> feedback message is valid, is not depend on '-r' option. >> >> If we use '-r' option and 'S' option (i.g., replication slot) then >> pg_receivexlog informs valid flush location to primary server at the same >> time as doing fsync. >> But, if we don't specify replication slot then the flush location in >> feedback message always invalid. >> So I think Fujii-san pointed out that sending of invalid flush location >> is not needed if pg_receivexlog does not use replication slot. > > Thanks for the review! > > I understand the attention message wasn't appropriate. > > To report the write location, even If you do not specify a replication slot. > So the fix only appended messages. > > There was a description of the flush location section of '-S' option, > but I intended to catch eye more and added a message. > > Is it better to make specification of the -S option indispensable? The patch cannot be applied to HEAD cleanly. Could you update the patch? Regards, -- Fujii Masao
> > Thanks for the review! > > > > I understand the attention message wasn't appropriate. > > > > To report the write location, even If you do not specify a replication > slot. > > So the fix only appended messages. > > > > There was a description of the flush location section of '-S' option, > > but I intended to catch eye more and added a message. > > > > Is it better to make specification of the -S option indispensable? > > The patch cannot be applied to HEAD cleanly. Could you update the patch? Thank you for pointing out. Updated the patch. Regards, -- Furuya Osamu
Attachment
On 09/05/2014 08:51 AM, furuyao@pm.nttdata.co.jp wrote: >>> Thanks for the review! >>> >>> I understand the attention message wasn't appropriate. >>> >>> To report the write location, even If you do not specify a replication >> slot. >>> So the fix only appended messages. >>> >>> There was a description of the flush location section of '-S' option, >>> but I intended to catch eye more and added a message. >>> >>> Is it better to make specification of the -S option indispensable? >> >> The patch cannot be applied to HEAD cleanly. Could you update the patch? > > Thank you for pointing out. > Updated the patch. I don't understand what this patch does. When would you want to use the new --reply-fsync option? Is there any reason *not* to use it? In other words, do we need an option for this, couldn't you just always send the feedback message after fsync? - Heikki
> On 09/05/2014 08:51 AM, furuyao@pm.nttdata.co.jp wrote: > >>> Thanks for the review! > >>> > >>> I understand the attention message wasn't appropriate. > >>> > >>> To report the write location, even If you do not specify a > >>> replication > >> slot. > >>> So the fix only appended messages. > >>> > >>> There was a description of the flush location section of '-S' > >>> option, but I intended to catch eye more and added a message. > >>> > >>> Is it better to make specification of the -S option indispensable? > >> > >> The patch cannot be applied to HEAD cleanly. Could you update the > patch? > > > > Thank you for pointing out. > > Updated the patch. > > I don't understand what this patch does. When would you want to use the > new --reply-fsync option? Is there any reason *not* to use it? In other > words, do we need an option for this, couldn't you just always send the > feedback message after fsync? Thanks for the comment. --reply-fsync option is intended for use in synchronous mode. By specifying -F option and --slot option, process calls fsync() when it received the WAL, and flush location would be setin feedback message. Interval of sending feedback message depends on -s option in this state, so in the case of synchronous mode, waiting forfeedback message would occure. therefore, --reply-fsync option is necessary. because it can send the feedback message after fsync without waiting for theinterval of -s option. The reason for not sending the feedback message after fsync without waiting for the interval of -s option always, is to answerthe needs who want to use fsync only (NOT using synchronous mode). Regards, -- Furuya Osamu
On 09/29/2014 01:13 PM, furuyao@pm.nttdata.co.jp wrote: >> I don't understand what this patch does. When would you want to use >> the new --reply-fsync option? Is there any reason *not* to use it? >> In other words, do we need an option for this, couldn't you just >> always send the feedback message after fsync? > > Thanks for the comment. > > --reply-fsync option is intended for use in synchronous mode. > > By specifying -F option and --slot option, process calls fsync() when > it received the WAL, and flush location would be set in feedback > message. > > Interval of sending feedback message depends on -s option in this > state, so in the case of synchronous mode, waiting for feedback > message would occure. > > therefore, --reply-fsync option is necessary. because it can send the > feedback message after fsync without waiting for the interval of -s > option. > > The reason for not sending the feedback message after fsync without > waiting for the interval of -s option always, is to answer the needs > who want to use fsync only (NOT using synchronous mode). I still don't get it. AFAICS there are two ways to use pg_receivexlog. Either you use it as a synchronous standby, or not. What set of options would you pass if you want to use it as a synchronous standby? And if you don't? Could we just have a single "--synchronous" flag, instead of -F and --reply-fsync? - Heikki
> On 09/29/2014 01:13 PM, furuyao@pm.nttdata.co.jp wrote: > >> I don't understand what this patch does. When would you want to use > >> the new --reply-fsync option? Is there any reason *not* to use it? > >> In other words, do we need an option for this, couldn't you just > >> always send the feedback message after fsync? > > > > Thanks for the comment. > > > > --reply-fsync option is intended for use in synchronous mode. > > > > By specifying -F option and --slot option, process calls fsync() when > > it received the WAL, and flush location would be set in feedback > > message. > > > > Interval of sending feedback message depends on -s option in this > > state, so in the case of synchronous mode, waiting for feedback > > message would occure. > > > > therefore, --reply-fsync option is necessary. because it can send the > > feedback message after fsync without waiting for the interval of -s > > option. > > > > The reason for not sending the feedback message after fsync without > > waiting for the interval of -s option always, is to answer the needs > > who want to use fsync only (NOT using synchronous mode). > > I still don't get it. AFAICS there are two ways to use pg_receivexlog. > Either you use it as a synchronous standby, or not. > > What set of options would you pass if you want to use it as a synchronous > standby? And if you don't? Could we just have a single "--synchronous" > flag, instead of -F and --reply-fsync? Thanks for comment. If you set "synchronous_commit" as "remote_write", the options would be different . The set of options in each case, see the following. Synchronous standby(synchronous_commit=on)--fsync-interval=-1--reply-fsync--slot=slotname Synchronous standby(synchronous_commit=remote_write)--fsync-interval=-1--reply-fsync AsynchronousThere are no relative options. Well, if the response time delay(value of "--status-interval=interval") is acceptable, "--reply-fsync" is unnecessary. Instead of "--reply-fsync", using "--synchronous"(which summarizes the "--reply-fsync" and "fsync-interval = -1") might beeasy to understand. Although, in that case, "--fsync-interval=interval" would be fixed value. Isn't there any problem? Regards, -- Furuya Osamu
On 10/08/2014 07:23 AM, furuyao@pm.nttdata.co.jp wrote: >> What set of options would you pass if you want to use it as a synchronous >> standby? And if you don't? Could we just have a single "--synchronous" >> flag, instead of -F and --reply-fsync? > > If you set "synchronous_commit" as "remote_write", the options would be different . > The set of options in each case, see the following. > > > Synchronous standby(synchronous_commit=on) > --fsync-interval=-1 > --reply-fsync > --slot=slotname > > Synchronous standby(synchronous_commit=remote_write) > --fsync-interval=-1 > --reply-fsync > > Asynchronous > There are no relative options. > > > Well, if the response time delay(value of "--status-interval=interval") is acceptable, "--reply-fsync" is unnecessary. > Instead of "--reply-fsync", using "--synchronous"(which summarizes the "--reply-fsync" and "fsync-interval = -1") mightbe easy to understand. Although, in that case, "--fsync-interval=interval" would be fixed value. Isn't there any problem? I think we should remove --fsync-interval and --reply-fsync, and just have a --synchronous option, which would imply the same behavior you get with --fsync-interval=-1 --reply--fsync. That leaves the question of whether pg_receivexlog should do fsyncs when it's not acting as a synchronous standby. There isn't any real need to do so. In asynchronous mode, there are no guarantees anyway, and even without an fsync, the OS will eventually flush the data to disk. But we could do even better than that. It would be best if you didn't need even the --synchronous flag. The server knows whether the client is a synchronous standby or not. It could tell the client. - Heikki
> On 10/08/2014 07:23 AM, furuyao@pm.nttdata.co.jp wrote: > >> What set of options would you pass if you want to use it as a > >> synchronous standby? And if you don't? Could we just have a single > "--synchronous" > >> flag, instead of -F and --reply-fsync? > > > > If you set "synchronous_commit" as "remote_write", the options would > be different . > > The set of options in each case, see the following. > > > > > > Synchronous standby(synchronous_commit=on) > > --fsync-interval=-1 > > --reply-fsync > > --slot=slotname > > > > Synchronous standby(synchronous_commit=remote_write) > > --fsync-interval=-1 > > --reply-fsync > > > > Asynchronous > > There are no relative options. > > > > > > Well, if the response time delay(value of > "--status-interval=interval") is acceptable, "--reply-fsync" is > unnecessary. > > Instead of "--reply-fsync", using "--synchronous"(which summarizes > the "--reply-fsync" and "fsync-interval = -1") might be easy to > understand. Although, in that case, "--fsync-interval=interval" would > be fixed value. Isn't there any problem ? > > I think we should remove --fsync-interval and --reply-fsync, and just > have a --synchronous option, which would imply the same behavior you get > with --fsync-interval=-1 --reply--fsync. > > That leaves the question of whether pg_receivexlog should do fsyncs when > it's not acting as a synchronous standby. There isn't any real need to > do so. In asynchronous mode, there are no guarantees anyway, and even > without an fsync, the OS will eventually flush the data to disk. > > But we could do even better than that. It would be best if you didn't > need even the --synchronous flag. The server knows whether the client > is a synchronous standby or not. It could tell the client. Thanks for the reply. If we remove --fsync-interval, resoponse time to user will not be delay. Although, fsync will be executed multiple times in a short period. And there is no way to solve the problem without --fsync-interval, what should we do about it? In asynchronous mode, I think there is no problem since the specification is same with release version. > The server knows whether the client is a synchronous standby or not. > It could tell the client. When notifying the synchronous/asynchronous mode to the client from the server, do we need to change the format of the Message ? Regards, -- Furuya Osamu
On 10/08/2014 11:47 AM, furuyao@pm.nttdata.co.jp wrote: >> On 10/08/2014 07:23 AM, furuyao@pm.nttdata.co.jp wrote: >>>> What set of options would you pass if you want to use it as a >>>> synchronous standby? And if you don't? Could we just have a single >> "--synchronous" >>>> flag, instead of -F and --reply-fsync? >>> >>> If you set "synchronous_commit" as "remote_write", the options would >> be different . >>> The set of options in each case, see the following. >>> >>> >>> Synchronous standby(synchronous_commit=on) >>> --fsync-interval=-1 >>> --reply-fsync >>> --slot=slotname >>> >>> Synchronous standby(synchronous_commit=remote_write) >>> --fsync-interval=-1 >>> --reply-fsync >>> >>> Asynchronous >>> There are no relative options. >>> >>> >>> Well, if the response time delay(value of >> "--status-interval=interval") is acceptable, "--reply-fsync" is >> unnecessary. >>> Instead of "--reply-fsync", using "--synchronous"(which summarizes >> the "--reply-fsync" and "fsync-interval = -1") might be easy to >> understand. Although, in that case, "--fsync-interval=interval" would >> be fixed value. Isn't there any problem ? >> >> I think we should remove --fsync-interval and --reply-fsync, and just >> have a --synchronous option, which would imply the same behavior you get >> with --fsync-interval=-1 --reply--fsync. >> >> That leaves the question of whether pg_receivexlog should do fsyncs when >> it's not acting as a synchronous standby. There isn't any real need to >> do so. In asynchronous mode, there are no guarantees anyway, and even >> without an fsync, the OS will eventually flush the data to disk. >> >> But we could do even better than that. It would be best if you didn't >> need even the --synchronous flag. The server knows whether the client >> is a synchronous standby or not. It could tell the client. > > If we remove --fsync-interval, resoponse time to user will not be delay. > Although, fsync will be executed multiple times in a short period. > And there is no way to solve the problem without --fsync-interval, what should we do about it? I'm sorry, I didn't understand that. > In asynchronous mode, I think there is no problem since the specification is same with release version. > >> The server knows whether the client is a synchronous standby or not. >> It could tell the client. > > When notifying the synchronous/asynchronous mode to the client from the server, > do we need to change the format of the Message ? Yeah. Or rather, add a new message type, to indicate the synchronous/asynchronous status. - Heikki
> >>>> What set of options would you pass if you want to use it as a > >>>> synchronous standby? And if you don't? Could we just have a single > >> "--synchronous" > >>>> flag, instead of -F and --reply-fsync? > >>> > >>> If you set "synchronous_commit" as "remote_write", the options would > >> be different . > >>> The set of options in each case, see the following. > >>> > >>> > >>> Synchronous standby(synchronous_commit=on) > >>> --fsync-interval=-1 > >>> --reply-fsync > >>> --slot=slotname > >>> > >>> Synchronous standby(synchronous_commit=remote_write) > >>> --fsync-interval=-1 > >>> --reply-fsync > >>> > >>> Asynchronous > >>> There are no relative options. > >>> > >>> > >>> Well, if the response time delay(value of > >> "--status-interval=interval") is acceptable, "--reply-fsync" is > >> unnecessary. > >>> Instead of "--reply-fsync", using "--synchronous"(which summarizes > >> the "--reply-fsync" and "fsync-interval = -1") might be easy to > >> understand. Although, in that case, "--fsync-interval=interval" > >> would be fixed value. Isn't there any problem ? > >> > >> I think we should remove --fsync-interval and --reply-fsync, and just > >> have a --synchronous option, which would imply the same behavior you > >> get with --fsync-interval=-1 --reply--fsync. > >> > >> That leaves the question of whether pg_receivexlog should do fsyncs > >> when it's not acting as a synchronous standby. There isn't any real > >> need to do so. In asynchronous mode, there are no guarantees anyway, > >> and even without an fsync, the OS will eventually flush the data to > disk. > >> > >> But we could do even better than that. It would be best if you didn't > >> need even the --synchronous flag. The server knows whether the client > >> is a synchronous standby or not. It could tell the client. > > > > If we remove --fsync-interval, resoponse time to user will not be delay. > > Although, fsync will be executed multiple times in a short period. > > And there is no way to solve the problem without --fsync-interval, what > should we do about it? > > I'm sorry, I didn't understand that. > > > In asynchronous mode, I think there is no problem since the > specification is same with release version. > > > >> The server knows whether the client is a synchronous standby or not. > >> It could tell the client. > > > > When notifying the synchronous/asynchronous mode to the client from > > the server, do we need to change the format of the Message ? > > Yeah. Or rather, add a new message type, to indicate the > synchronous/asynchronous status. Thanks for the reply. > > If we remove --fsync-interval, resoponse time to user will not be delay. > > Although, fsync will be executed multiple times in a short period. > > And there is no way to solve the problem without --fsync-interval, what > should we do about it? > > I'm sorry, I didn't understand that. Here is an example. When WAL is sent at 100ms intervals, fsync() is executed 10 times per second. If --fsync-interval is set by 1 second, we have to wait SQL responce(kind of making WAL record) for 1 second, though, fsync()won't be executed several times in 1 second. I think --fsync-interval meets the demands of people who wants to restrain fsync() happens for several time in short period,but what do you think? Is it ok to delete --fsync-interval ? > >> The server knows whether the client is a synchronous standby or not. > >> It could tell the client. > > > > When notifying the synchronous/asynchronous mode to the client from > > the server, do we need to change the format of the Message ? > > Yeah. Or rather, add a new message type, to indicate the > synchronous/asynchronous status. What kind of 'message type' we have to add ? Do we need to separate 'w' into two types ? synchronous and asynchronous ? OR Add a new message type, kind of 'notify synchronous', and inform pg_receivexlog of synchronous status when it connect to the server. Regards, -- Furuya Osamu
On 10/09/2014 07:47 AM, furuyao@pm.nttdata.co.jp wrote: >>> If we remove --fsync-interval, resoponse time to user will not be delay. >>> Although, fsync will be executed multiple times in a short period. >>> And there is no way to solve the problem without --fsync-interval, what >> should we do about it? >> >> I'm sorry, I didn't understand that. > > Here is an example. > When WAL is sent at 100ms intervals, fsync() is executed 10 times per second. > If --fsync-interval is set by 1 second, we have to wait SQL responce(kind of making WAL record) for 1 second, though, fsync()won't be executed several times in 1 second. > I think --fsync-interval meets the demands of people who wants to restrain fsync() happens for several time in short period,but what do you think? > Is it ok to delete --fsync-interval ? I still don't see the problem. In synchronous mode, pg_receivexlog should have similar logic as walreceiver does. It should read as much WAL from the socket as it can without blocking, and fsync() and send reply after that. And also fsync whenever switching to new segment. If the master sends WAL every 100ms, then pg_recevexlog will indeed fsync() 10 times per second. There's nothing wrong with that. In asynchronous mode, only fsync whenever switching to new segment. >> Yeah. Or rather, add a new message type, to indicate the >> synchronous/asynchronous status. > > What kind of 'message type' we have to add ? > > Do we need to separate 'w' into two types ? synchronous and asynchronous ? > > OR > > Add a new message type, kind of 'notify synchronous', > and inform pg_receivexlog of synchronous status when it connect to the server. Better to add a new "notify" message type. And pg_recevexlog should be prepared to receive it at any time. The status might change on the fly, if the server's configuration is reloaded. - Heikki
> >>> If we remove --fsync-interval, resoponse time to user will not be > delay. > >>> Although, fsync will be executed multiple times in a short period. > >>> And there is no way to solve the problem without --fsync-interval, > >>> what > >> should we do about it? > >> > >> I'm sorry, I didn't understand that. > > > > Here is an example. > > When WAL is sent at 100ms intervals, fsync() is executed 10 times per > second. > > If --fsync-interval is set by 1 second, we have to wait SQL > responce(kind of making WAL record) for 1 second, though, fsync() won't > be executed several times in 1 second. > > I think --fsync-interval meets the demands of people who wants to > restrain fsync() happens for several time in short period, but what do > you think? > > Is it ok to delete --fsync-interval ? > > I still don't see the problem. > > In synchronous mode, pg_receivexlog should have similar logic as > walreceiver does. It should read as much WAL from the socket as it can > without blocking, and fsync() and send reply after that. And also fsync > whenever switching to new segment. If the master sends WAL every 100ms, > then pg_recevexlog will indeed fsync() 10 times per second. There's > nothing wrong with that. > > In asynchronous mode, only fsync whenever switching to new segment. > > >> Yeah. Or rather, add a new message type, to indicate the > >> synchronous/asynchronous status. > > > > What kind of 'message type' we have to add ? > > > > Do we need to separate 'w' into two types ? synchronous and > asynchronous ? > > > > OR > > > > Add a new message type, kind of 'notify synchronous', and inform > > pg_receivexlog of synchronous status when it connect to the server. > > Better to add a new "notify" message type. And pg_recevexlog should be > prepared to receive it at any time. The status might change on the fly, > if the server's configuration is reloaded. Thanks for the reply. > In synchronous mode, pg_receivexlog should have similar logic as walreceiver does. OK. I understand that removing --fsync-interval has no problem. > Better to add a new "notify" message type. And pg_recevexlog should be prepared to receive it at any time. The status mightchange on the fly, if the server's configuration is reloaded. OK. I'll consider it. Regards, -- Furuya Osamu
On Thu, Oct 9, 2014 at 6:42 PM, <furuyao@pm.nttdata.co.jp> wrote: >> >>> If we remove --fsync-interval, resoponse time to user will not be >> delay. >> >>> Although, fsync will be executed multiple times in a short period. >> >>> And there is no way to solve the problem without --fsync-interval, >> >>> what >> >> should we do about it? >> >> >> >> I'm sorry, I didn't understand that. >> > >> > Here is an example. >> > When WAL is sent at 100ms intervals, fsync() is executed 10 times per >> second. >> > If --fsync-interval is set by 1 second, we have to wait SQL >> responce(kind of making WAL record) for 1 second, though, fsync() won't >> be executed several times in 1 second. >> > I think --fsync-interval meets the demands of people who wants to >> restrain fsync() happens for several time in short period, but what do >> you think? >> > Is it ok to delete --fsync-interval ? >> >> I still don't see the problem. >> >> In synchronous mode, pg_receivexlog should have similar logic as >> walreceiver does. It should read as much WAL from the socket as it can >> without blocking, and fsync() and send reply after that. And also fsync >> whenever switching to new segment. If the master sends WAL every 100ms, >> then pg_recevexlog will indeed fsync() 10 times per second. There's >> nothing wrong with that. >> >> In asynchronous mode, only fsync whenever switching to new segment. >> >> >> Yeah. Or rather, add a new message type, to indicate the >> >> synchronous/asynchronous status. >> > >> > What kind of 'message type' we have to add ? >> > >> > Do we need to separate 'w' into two types ? synchronous and >> asynchronous ? >> > >> > OR >> > >> > Add a new message type, kind of 'notify synchronous', and inform >> > pg_receivexlog of synchronous status when it connect to the server. >> >> Better to add a new "notify" message type. And pg_recevexlog should be >> prepared to receive it at any time. The status might change on the fly, >> if the server's configuration is reloaded. > > Thanks for the reply. > >> In synchronous mode, pg_receivexlog should have similar logic as walreceiver does. > > OK. I understand that removing --fsync-interval has no problem. +1 for adding something like --synchronous option. To me, it sounds walreceiver-compatible mode rather than synchronous. >> Better to add a new "notify" message type. And pg_recevexlog should be prepared to receive it at any time. The statusmight change on the fly, if the server's configuration is reloaded. > > OK. I'll consider it. I don't think that's good idea because it prevents us from using pg_receivexlog as async walreceiver (i.e., received WAL data is fsynced and feedback is sent back to the server soon, but transaction commit in the server doesn't wait for the feedback). Regards, -- Fujii Masao
> >> >>> If we remove --fsync-interval, resoponse time to user will not > be > >> delay. > >> >>> Although, fsync will be executed multiple times in a short period. > >> >>> And there is no way to solve the problem without > >> >>> --fsync-interval, what > >> >> should we do about it? > >> >> > >> >> I'm sorry, I didn't understand that. > >> > > >> > Here is an example. > >> > When WAL is sent at 100ms intervals, fsync() is executed 10 times > >> > per > >> second. > >> > If --fsync-interval is set by 1 second, we have to wait SQL > >> responce(kind of making WAL record) for 1 second, though, fsync() > >> won't be executed several times in 1 second. > >> > I think --fsync-interval meets the demands of people who wants to > >> restrain fsync() happens for several time in short period, but what > >> do you think? > >> > Is it ok to delete --fsync-interval ? > >> > >> I still don't see the problem. > >> > >> In synchronous mode, pg_receivexlog should have similar logic as > >> walreceiver does. It should read as much WAL from the socket as it > >> can without blocking, and fsync() and send reply after that. And also > >> fsync whenever switching to new segment. If the master sends WAL > >> every 100ms, then pg_recevexlog will indeed fsync() 10 times per > >> second. There's nothing wrong with that. > >> > >> In asynchronous mode, only fsync whenever switching to new segment. > >> > >> >> Yeah. Or rather, add a new message type, to indicate the > >> >> synchronous/asynchronous status. > >> > > >> > What kind of 'message type' we have to add ? > >> > > >> > Do we need to separate 'w' into two types ? synchronous and > >> asynchronous ? > >> > > >> > OR > >> > > >> > Add a new message type, kind of 'notify synchronous', and inform > >> > pg_receivexlog of synchronous status when it connect to the server. > >> > >> Better to add a new "notify" message type. And pg_recevexlog should > >> be prepared to receive it at any time. The status might change on the > >> fly, if the server's configuration is reloaded. > > > > Thanks for the reply. > > > >> In synchronous mode, pg_receivexlog should have similar logic as > walreceiver does. > > > > OK. I understand that removing --fsync-interval has no problem. > > +1 for adding something like --synchronous option. To me, > it sounds walreceiver-compatible mode rather than synchronous. > > >> Better to add a new "notify" message type. And pg_recevexlog should > be prepared to receive it at any time. The status might change on the > fly, if the server's configuration is reloaded. > > > > OK. I'll consider it. > > I don't think that's good idea because it prevents us from using > pg_receivexlog as async walreceiver (i.e., received WAL data is fsynced > and feedback is sent back to the server soon, but transaction commit in > the server doesn't wait for the feedback). Thanks for the comment. > it prevents us from using pg_receivexlog as async walreceiver. Yes. If sync or async is switched by message, in case pg_receivexlog is async, it won't work like a walreceiver. User don't have to set options if synchronous status can be distinguished by message, though they cannot specify the action. > adding something like --synchronous option Integrate required options for sync like above, is more appropriate ? Regards, -- Furuya Osamu
On 10 October 2014 09:28, Fujii Masao <masao.fujii@gmail.com> wrote: >>> In synchronous mode, pg_receivexlog should have similar logic as walreceiver does. >> >> OK. I understand that removing --fsync-interval has no problem. > > +1 for adding something like --synchronous option. To me, > it sounds walreceiver-compatible mode rather than synchronous. > >>> Better to add a new "notify" message type. And pg_recevexlog should be prepared to receive it at any time. The statusmight change on the fly, if the server's configuration is reloaded. >> >> OK. I'll consider it. > > I don't think that's good idea because it prevents us from using > pg_receivexlog as async walreceiver (i.e., received WAL data is > fsynced and feedback is sent back to the server soon, > but transaction commit in the server doesn't wait for the feedback). Sync rep works by setting parameters on the master. Standby servers send replies by default, though you can turn replies off. pg_receivexlog should work the same, but can't do this because it doesn't know the fsync position unless it fsyncs. So its not appropriate to have an option called "--synchronous" in the same way that there is no parameter called "synchronous" on the standby, for good reason. A new parameter to send feedback should be called --feedback A second parameter to decide whether to fsync should be called --fsync if (feedback && fsync) send fsynced LSN else if (feedback) send received LSN ; /* else send no feedback */ -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> >>> In synchronous mode, pg_receivexlog should have similar logic as > walreceiver does. > >> > >> OK. I understand that removing --fsync-interval has no problem. > > > > +1 for adding something like --synchronous option. To me, > > it sounds walreceiver-compatible mode rather than synchronous. > > > >>> Better to add a new "notify" message type. And pg_recevexlog should > be prepared to receive it at any time. The status might change on the > fly, if the server's configuration is reloaded. > >> > >> OK. I'll consider it. > > > > I don't think that's good idea because it prevents us from using > > pg_receivexlog as async walreceiver (i.e., received WAL data is > > fsynced and feedback is sent back to the server soon, but transaction > > commit in the server doesn't wait for the feedback). > > Sync rep works by setting parameters on the master. Standby servers send > replies by default, though you can turn replies off. > > pg_receivexlog should work the same, but can't do this because it doesn't > know the fsync position unless it fsyncs. > > So its not appropriate to have an option called "--synchronous" in the > same way that there is no parameter called "synchronous" on the standby, > for good reason. > > A new parameter to send feedback should be called --feedback A second > parameter to decide whether to fsync should be called --fsync > > if (feedback && fsync) > send fsynced LSN > else if (feedback) > send received LSN > ; /* else send no feedback */ Thanks for the comment. The patch cannot be applied to HEAD cleanly so I updated. >So its not appropriate to have an option called "--synchronous" in the same way that there is no >parameter called "synchronous"on the standby, for good reason. In case of gathering options to one option, change the name "--synchronous" to other name solves the problem ? >A new parameter to send feedback should be called --feedback >A second parameter to decide whether to fsync should be called --fsync I think keep using "--reply-fsync" and "--fsync-interval" is better than make new options. Thought? Regards, -- Furuya Osamu
Attachment
On 17 October 2014 09:55, <furuyao@pm.nttdata.co.jp> wrote: >>A new parameter to send feedback should be called --feedback >>A second parameter to decide whether to fsync should be called --fsync > > I think keep using "--reply-fsync" and "--fsync-interval" is better than make new options. > Thought? We already have hot_standby_feedback, so using the name feedback is best idea. I am suggesting that we send feedback even if we do not fsync, to allow the master to track our progress. Hence the name of the second parameter was just fsync. So both names were suggested because of links to those terms already being used for similar reasons elsewhere in Postgres. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 10/17/2014 01:59 PM, Simon Riggs wrote: > On 17 October 2014 09:55, <furuyao@pm.nttdata.co.jp> wrote: > >>> A new parameter to send feedback should be called --feedback > >>> A second parameter to decide whether to fsync should be called --fsync >> >> I think keep using "--reply-fsync" and "--fsync-interval" is better than make new options. >> Thought? > > We already have hot_standby_feedback, so using the name feedback is best idea. > > I am suggesting that we send feedback even if we do not fsync, to > allow the master to track our progress. Hence the name of the second > parameter was just fsync. > > So both names were suggested because of links to those terms already > being used for similar reasons elsewhere in Postgres. We seem to be going in circles. You suggested having two options, --feedback, and --fsync, which is almost exactly what Furuya posted originally. I objected to that, because I think that user interface is too complicated. Instead, I suggested having just a single option called --synchronous, or even better, have no option at all and have the server tell the client if it's participating in synchronous replication, and have pg_receivexlog automatically fsync when it is, and not otherwise [1]. That way you don't need to expose any new options to the user. What did you think of that idea? [1] http://www.postgresql.org/message-id/5434E0EF.9050304@vmware.com - Heikki
On 22 October 2014 14:26, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > We seem to be going in circles. You suggested having two options, > --feedback, and --fsync, which is almost exactly what Furuya posted > originally. I objected to that, because I think that user interface is too > complicated. Instead, I suggested having just a single option called > --synchronous, or even better, have no option at all and have the server > tell the client if it's participating in synchronous replication, and have > pg_receivexlog automatically fsync when it is, and not otherwise [1]. That > way you don't need to expose any new options to the user. What did you think > of that idea? Sorry, if we're going in circles. Yes, I like the idea. The master setting of synchronous_standby_names defines which standbys/rep clients will have their feedback used to release waiting COMMITs. That uses application_name, which is set at the replication client connection. (That has a default value, but lets assume the user sets this). So when a replication client connects, the WALSender knows its priority, as defined in sync_standby_names. I guess we can have WALSender send a protocol message to the client every time the sync priority changes (as a result of a changed value of sync_standby_names). Which is the function SyncRepInitConfig() -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Oct 22, 2014 at 10:47 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 22 October 2014 14:26, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > >> We seem to be going in circles. You suggested having two options, >> --feedback, and --fsync, which is almost exactly what Furuya posted >> originally. I objected to that, because I think that user interface is too >> complicated. Instead, I suggested having just a single option called >> --synchronous I'm OK with this. But the option name "synchronous" seems confusing because pg_receivexlog with sync option can work as async standby. So the better name is needed. > or even better, have no option at all and have the server >> tell the client if it's participating in synchronous replication, and have >> pg_receivexlog automatically fsync when it is, and not otherwise [1]. That >> way you don't need to expose any new options to the user. What did you think >> of that idea? > > Sorry, if we're going in circles. > > Yes, I like the idea. > > The master setting of synchronous_standby_names defines which > standbys/rep clients will have their feedback used to release waiting > COMMITs. That uses application_name, which is set at the replication > client connection. (That has a default value, but lets assume the user > sets this). So when a replication client connects, the WALSender knows > its priority, as defined in sync_standby_names. > > I guess we can have WALSender send a protocol message to the client > every time the sync priority changes (as a result of a changed value > of sync_standby_names). Which is the function SyncRepInitConfig() Sorry, I'm going around in the circle. But I'd like to say again, I don't think this is good idea. It prevents asynchronous pg_receivexlog from fsyncing WAL data and sending feedbacks more frequently at all. They are useful, for example, when we want to monitor the write location of asynchronous pg_receivexlog in almost realtime. But if we adopt the idea, since feedback cannot be sent soon in async mode, pg_stat_replication always returns the not-up-to-date location. Regards, -- Fujii Masao
On 23 October 2014 15:39, Fujii Masao <masao.fujii@gmail.com> wrote: > Sorry, I'm going around in the circle. But I'd like to say again, I don't think > this is good idea. It prevents asynchronous pg_receivexlog from fsyncing > WAL data and sending feedbacks more frequently at all. They are useful, > for example, when we want to monitor the write location of asynchronous > pg_receivexlog in almost realtime. But if we adopt the idea, since feedback > cannot be sent soon in async mode, pg_stat_replication always returns > the not-up-to-date location. Why not send a message every 10 seconds when its not sync rep? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 10/23/2014 06:01 PM, Simon Riggs wrote: > On 23 October 2014 15:39, Fujii Masao <masao.fujii@gmail.com> wrote: > >> Sorry, I'm going around in the circle. But I'd like to say again, I don't think >> this is good idea. It prevents asynchronous pg_receivexlog from fsyncing >> WAL data and sending feedbacks more frequently at all. They are useful, >> for example, when we want to monitor the write location of asynchronous >> pg_receivexlog in almost realtime. But if we adopt the idea, since feedback >> cannot be sent soon in async mode, pg_stat_replication always returns >> the not-up-to-date location. > > Why not send a message every 10 seconds when its not sync rep? Or even after every write(). It's a tiny amount of network traffic anyway. - Heikki
> >> Sorry, I'm going around in the circle. But I'd like to say again, I > >> don't think this is good idea. It prevents asynchronous > >> pg_receivexlog from fsyncing WAL data and sending feedbacks more > >> frequently at all. They are useful, for example, when we want to > >> monitor the write location of asynchronous pg_receivexlog in almost > >> realtime. But if we adopt the idea, since feedback cannot be sent > >> soon in async mode, pg_stat_replication always returns the > not-up-to-date location. > > > > Why not send a message every 10 seconds when its not sync rep? > > Or even after every write(). It's a tiny amount of network traffic anyway. I understand that send feedback message frequently will keep pg_stat_replication up-to-date state. Are there really no needs who wants to fsync even in async mode ? I think the people who dislike Data lost will like that idea. Thought? Nevertheless in sync or async, returning feedback and executing fsync() same as like walreceiver is such a problem? -- Furuya Osamu
On 10/24/2014 01:24 PM, furuyao@pm.nttdata.co.jp wrote: >>>> Sorry, I'm going around in the circle. But I'd like to say again, I >>>> don't think this is good idea. It prevents asynchronous >>>> pg_receivexlog from fsyncing WAL data and sending feedbacks more >>>> frequently at all. They are useful, for example, when we want to >>>> monitor the write location of asynchronous pg_receivexlog in almost >>>> realtime. But if we adopt the idea, since feedback cannot be sent >>>> soon in async mode, pg_stat_replication always returns the >> not-up-to-date location. >>> >>> Why not send a message every 10 seconds when its not sync rep? >> >> Or even after every write(). It's a tiny amount of network traffic anyway. > > I understand that send feedback message frequently will keep pg_stat_replication up-to-date state. > > Are there really no needs who wants to fsync even in async mode ? > I think the people who dislike Data lost will like that idea. The OS will not sit on the written but not fsync'd data forever, it will get flushed to disk in a few seconds even without the fsync. It's just that there is no guarantee on when it will hit the disk, but there are no strong guarantees in asynchronous replication anyway. > Nevertheless in sync or async, returning feedback and executing > fsync() same as like walreceiver is such a problem? Probably would be OK. It would increase the I/O a lot, thanks to a lot of small writes and fsyncs, which might be surprising for a tool like pg_receivexlog. One idea is to change the default behavior to be like walreceiver, and fsync() after every write. But provide an option to get the old behavior, "--no-fsync". - Heikki
On Fri, Oct 24, 2014 at 11:21 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 10/24/2014 01:24 PM, furuyao@pm.nttdata.co.jp wrote: >>>>> >>>>> Sorry, I'm going around in the circle. But I'd like to say again, I >>>>> don't think this is good idea. It prevents asynchronous >>>>> pg_receivexlog from fsyncing WAL data and sending feedbacks more >>>>> frequently at all. They are useful, for example, when we want to >>>>> monitor the write location of asynchronous pg_receivexlog in almost >>>>> realtime. But if we adopt the idea, since feedback cannot be sent >>>>> soon in async mode, pg_stat_replication always returns the >>> >>> not-up-to-date location. >>>> >>>> >>>> Why not send a message every 10 seconds when its not sync rep? >>> >>> >>> Or even after every write(). It's a tiny amount of network traffic >>> anyway. >> >> >> I understand that send feedback message frequently will keep >> pg_stat_replication up-to-date state. >> >> Are there really no needs who wants to fsync even in async mode ? >> I think the people who dislike Data lost will like that idea. > > > The OS will not sit on the written but not fsync'd data forever, it will get > flushed to disk in a few seconds even without the fsync. It's just that > there is no guarantee on when it will hit the disk, but there are no strong > guarantees in asynchronous replication anyway. This makes me come up with the idea about adding new GUC parameter like wal_receiver_fsync so that a user can disable walreceiver to fsync WAL after every write(). IOW, it can allow walreceiver to work like current pg_receivexlog. One problem of the above idea is that the startup process can replay WAL which has not been fsync'd yet. This violates the WAL rule. To resolve this, the startup process would need to ensure that WAL to replay has already been fsync'd, and otherwise issue the fsync. This basically makes the WAL replay longer, but we can reduce the amount of I/O by walreceiver. This seems also useful even with synchronous replication if synchronous_commit is set to remote_write. Since walreceiver doesn't issue the fsync and can focus on receiving and writing WAL, the performance of replication would be improved. >> Nevertheless in sync or async, returning feedback and executing >> fsync() same as like walreceiver is such a problem? > > > Probably would be OK. It would increase the I/O a lot, thanks to a lot of > small writes and fsyncs, which might be surprising for a tool like > pg_receivexlog. > > One idea is to change the default behavior to be like walreceiver, and > fsync() after every write. But provide an option to get the old behavior, > "--no-fsync". I'm OK with this. That is, by default, pg_receivexlog issues the fsync and sends the feedback message back after every write(). But something like --no-fsync is specified, WAL file is fsync'd only when it's closed and a feedback message is sent back only when --status-interval time is elapsed. Regards, -- Fujii Masao
On Wed, Oct 22, 2014 at 9:26 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > We seem to be going in circles. You suggested having two options, > --feedback, and --fsync, which is almost exactly what Furuya posted > originally. I objected to that, because I think that user interface is too > complicated. Instead, I suggested having just a single option called > --synchronous, or even better, have no option at all and have the server > tell the client if it's participating in synchronous replication, and have > pg_receivexlog automatically fsync when it is, and not otherwise [1]. That > way you don't need to expose any new options to the user. What did you think > of that idea? I think it's pretty weird to make the fsync behavior of the client is controlled by the server. I also don't think that fsync() on the client side is useless in asynchronous replication. Yeah, it's true that there are no *guarantees* with asynchronous replication, but the bound on how long the data can take to get out to disk is a heck of a lot shorter if you fsync frequently than if you don't. And on the flip side, that has a performance impact. So I actually think the design you proposed is not as good as what was proposed by Furuya and Simon. But I don't feel incredibly strongly about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> > We seem to be going in circles. You suggested having two options, > > --feedback, and --fsync, which is almost exactly what Furuya posted > > originally. I objected to that, because I think that user interface > is > > too complicated. Instead, I suggested having just a single option > > called --synchronous, or even better, have no option at all and have > > the server tell the client if it's participating in synchronous > > replication, and have pg_receivexlog automatically fsync when it is, > > and not otherwise [1]. That way you don't need to expose any new > > options to the user. What did you think of that idea? > > I think it's pretty weird to make the fsync behavior of the client is > controlled by the server. > > I also don't think that fsync() on the client side is useless in > asynchronous replication. Yeah, it's true that there are no > *guarantees* with asynchronous replication, but the bound on how long > the data can take to get out to disk is a heck of a lot shorter if you > fsync frequently than if you don't. And on the flip side, that has a > performance impact. > > So I actually think the design you proposed is not as good as what was > proposed by Furuya and Simon. But I don't feel incredibly strongly about > it. Thanks for lots of comments!! I fixed the patch. As a default, it behave like a walreceiver. Same as walreceiver, it fsync and send a feedback after fsync. When --no-fsync is specified, it will fsync only wal has switched. feedback message is specified by --status-interval. Regards, -- Furuya Osamu
Attachment
On Fri, Oct 31, 2014 at 5:46 PM, <furuyao@pm.nttdata.co.jp> wrote: >> > We seem to be going in circles. You suggested having two options, >> > --feedback, and --fsync, which is almost exactly what Furuya posted >> > originally. I objected to that, because I think that user interface >> is >> > too complicated. Instead, I suggested having just a single option >> > called --synchronous, or even better, have no option at all and have >> > the server tell the client if it's participating in synchronous >> > replication, and have pg_receivexlog automatically fsync when it is, >> > and not otherwise [1]. That way you don't need to expose any new >> > options to the user. What did you think of that idea? >> >> I think it's pretty weird to make the fsync behavior of the client is >> controlled by the server. >> >> I also don't think that fsync() on the client side is useless in >> asynchronous replication. Yeah, it's true that there are no >> *guarantees* with asynchronous replication, but the bound on how long >> the data can take to get out to disk is a heck of a lot shorter if you >> fsync frequently than if you don't. And on the flip side, that has a >> performance impact. >> >> So I actually think the design you proposed is not as good as what was >> proposed by Furuya and Simon. But I don't feel incredibly strongly about >> it. > > Thanks for lots of comments!! > > I fixed the patch. > As a default, it behave like a walreceiver. > Same as walreceiver, it fsync and send a feedback after fsync. On second thought, flipping the default behavior seems not worthwhile here. Which might surprise the existing users and cause some troubles to them. I'd like to back to the Heikki's original suggestion like just adding --synchronous option. That is, only when --synchronous is specified, WAL is flushed and feedback is sent back as soon as WAL is received. I changed the patch heavily in that way. Please find the attached patch. By default, pg_receivexlog flushes WAL data only when WAL file is closed. If --synchronous is specified, like the standby's WAL receiver, sync commands are issued as soon as there is WAL data which has not been flushed yet. Also status packets are sent back to the server just after WAL data is flushed whatever --status-interval is set to. I added the description of this behavior to the doc. Thought? Regards, -- Fujii Masao
Attachment
> On Fri, Oct 31, 2014 at 5:46 PM, <furuyao@pm.nttdata.co.jp> wrote: > >> > We seem to be going in circles. You suggested having two options, > >> > --feedback, and --fsync, which is almost exactly what Furuya posted > >> > originally. I objected to that, because I think that user interface > >> is > >> > too complicated. Instead, I suggested having just a single option > >> > called --synchronous, or even better, have no option at all and > >> > have the server tell the client if it's participating in > >> > synchronous replication, and have pg_receivexlog automatically > >> > fsync when it is, and not otherwise [1]. That way you don't need > to > >> > expose any new options to the user. What did you think of that idea? > >> > >> I think it's pretty weird to make the fsync behavior of the client > is > >> controlled by the server. > >> > >> I also don't think that fsync() on the client side is useless in > >> asynchronous replication. Yeah, it's true that there are no > >> *guarantees* with asynchronous replication, but the bound on how long > >> the data can take to get out to disk is a heck of a lot shorter if > >> you fsync frequently than if you don't. And on the flip side, that > >> has a performance impact. > >> > >> So I actually think the design you proposed is not as good as what > >> was proposed by Furuya and Simon. But I don't feel incredibly > >> strongly about it. > > > > Thanks for lots of comments!! > > > > I fixed the patch. > > As a default, it behave like a walreceiver. > > Same as walreceiver, it fsync and send a feedback after fsync. > > On second thought, flipping the default behavior seems not worthwhile > here. > Which might surprise the existing users and cause some troubles to them. > I'd like to back to the Heikki's original suggestion like just adding > --synchronous option. That is, only when --synchronous is specified, WAL > is flushed and feedback is sent back as soon as WAL is received. > > I changed the patch heavily in that way. Please find the attached patch. > By default, pg_receivexlog flushes WAL data only when WAL file is closed. > If --synchronous is specified, like the standby's WAL receiver, sync > commands are issued as soon as there is WAL data which has not been flushed > yet. Also status packets are sent back to the server just after WAL data > is flushed whatever --status-interval is set to. I added the description > of this behavior to the doc. > > Thought? I think it's as you pointed out. Thank you for the patch. I did a review of the patch. There was no problem. I confirmed the following. 1. applied cleanly and compilation was without warnings and errors 2. all regress tests was passed ok 3. when --synchronous is not specified, do not fsync except wal has switched 4. it get normal backup when pg_basebackup has executed with '-X s' 5. when --synchronous is specified, it fsync every time when it receive WAL data 6. when --synchronous is specified, in spite of -s option, feedback are sent back when it fsync 7. when --slog is not specified, it wouldn't feedback fsync location 8. no problem in document Regards, -- Furuya Osamu
On Mon, Nov 10, 2014 at 7:19 PM, <furuyao@pm.nttdata.co.jp> wrote: >> On Fri, Oct 31, 2014 at 5:46 PM, <furuyao@pm.nttdata.co.jp> wrote: >> >> > We seem to be going in circles. You suggested having two options, >> >> > --feedback, and --fsync, which is almost exactly what Furuya posted >> >> > originally. I objected to that, because I think that user interface >> >> is >> >> > too complicated. Instead, I suggested having just a single option >> >> > called --synchronous, or even better, have no option at all and >> >> > have the server tell the client if it's participating in >> >> > synchronous replication, and have pg_receivexlog automatically >> >> > fsync when it is, and not otherwise [1]. That way you don't need >> to >> >> > expose any new options to the user. What did you think of that idea? >> >> >> >> I think it's pretty weird to make the fsync behavior of the client >> is >> >> controlled by the server. >> >> >> >> I also don't think that fsync() on the client side is useless in >> >> asynchronous replication. Yeah, it's true that there are no >> >> *guarantees* with asynchronous replication, but the bound on how long >> >> the data can take to get out to disk is a heck of a lot shorter if >> >> you fsync frequently than if you don't. And on the flip side, that >> >> has a performance impact. >> >> >> >> So I actually think the design you proposed is not as good as what >> >> was proposed by Furuya and Simon. But I don't feel incredibly >> >> strongly about it. >> > >> > Thanks for lots of comments!! >> > >> > I fixed the patch. >> > As a default, it behave like a walreceiver. >> > Same as walreceiver, it fsync and send a feedback after fsync. >> >> On second thought, flipping the default behavior seems not worthwhile >> here. >> Which might surprise the existing users and cause some troubles to them. >> I'd like to back to the Heikki's original suggestion like just adding >> --synchronous option. That is, only when --synchronous is specified, WAL >> is flushed and feedback is sent back as soon as WAL is received. >> >> I changed the patch heavily in that way. Please find the attached patch. >> By default, pg_receivexlog flushes WAL data only when WAL file is closed. >> If --synchronous is specified, like the standby's WAL receiver, sync >> commands are issued as soon as there is WAL data which has not been flushed >> yet. Also status packets are sent back to the server just after WAL data >> is flushed whatever --status-interval is set to. I added the description >> of this behavior to the doc. >> >> Thought? > > I think it's as you pointed out. > Thank you for the patch. > I did a review of the patch. > There was no problem. So I'm thinking to push the patch. Does anyone object to that? Regards, -- Fujii Masao
Fujii Masao wrote: > --- 127,152 ---- > When this option is used, <application>pg_receivexlog</> will report > a flush position to the server, indicating when each segment has been > synchronized to disk so that the server can remove that segment if it > ! is not otherwise needed. <literal>--synchronous</literal> option must > ! be specified when making <application>pg_receivexlog</> run as > ! synchronous standby by using replication slot. Otherwise WAL data > ! cannot be flushed frequently enough for this to work correctly. > </para> > </listitem> > </varlistentry> Whitespace damage here. > + printf(_(" --synchronous flush transaction log in real time\n")); "in real time" sounds odd. How about "flush transaction log immediately after writing", or maybe "have transaction log writes be synchronous". > --- 781,791 ---- > now = feGetCurrentTimestamp(); > > /* > ! * Issue sync command as soon as there are WAL data which > ! * has not been flushed yet if synchronous option is true. > */ > if (lastFlushPosition < blockpos && > ! walfile != -1 && synchronous) I'd put the "synchronous" condition first in the if(), and start the comment with it rather than putting it at the end. Both seem weird. > --- 793,807 ---- > progname, current_walfile_name, strerror(errno)); > goto error; > } > lastFlushPosition = blockpos; > ! > ! /* > ! * Send feedback so that the server sees the latest WAL locations > ! * immediately if synchronous option is true. > ! */ > ! if (!sendFeedback(conn, blockpos, now, false)) > ! goto error; > ! last_status = now; I'm not clear about this comment .. why does it say "if synchronous option is true" when it's not checking the condition? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Thanks for reviewing the patch! On Thu, Nov 13, 2014 at 4:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Fujii Masao wrote: > >> --- 127,152 ---- >> When this option is used, <application>pg_receivexlog</> will report >> a flush position to the server, indicating when each segment has been >> synchronized to disk so that the server can remove that segment if it >> ! is not otherwise needed. <literal>--synchronous</literal> option must >> ! be specified when making <application>pg_receivexlog</> run as >> ! synchronous standby by using replication slot. Otherwise WAL data >> ! cannot be flushed frequently enough for this to work correctly. >> </para> >> </listitem> >> </varlistentry> > > Whitespace damage here. Fixed. >> + printf(_(" --synchronous flush transaction log in real time\n")); > > "in real time" sounds odd. How about "flush transaction log > immediately after writing", or maybe "have transaction log writes be > synchronous". The former sounds better to me. So I chose it. >> --- 781,791 ---- >> now = feGetCurrentTimestamp(); >> >> /* >> ! * Issue sync command as soon as there are WAL data which >> ! * has not been flushed yet if synchronous option is true. >> */ >> if (lastFlushPosition < blockpos && >> ! walfile != -1 && synchronous) > > I'd put the "synchronous" condition first in the if(), and start the > comment with it rather than putting it at the end. Both seem weird. Fixed, i.e., moved the "synchronous" condition first in the if()'s test and also moved the comment "If synchronous option is true" also first in the comment. >> progname, current_walfile_name, strerror(errno)); >> goto error; >> } >> lastFlushPosition = blockpos; >> ! >> ! /* >> ! * Send feedback so that the server sees the latest WAL locations >> ! * immediately if synchronous option is true. >> ! */ >> ! if (!sendFeedback(conn, blockpos, now, false)) >> ! goto error; >> ! last_status = now; > > I'm not clear about this comment .. why does it say "if synchronous > option is true" when it's not checking the condition? I added that comment because the code exists with the if() block checking "synchronous" condition. But it seems confusing. Just removed that part from the comment. Attached is the updated version of the patch. Regards, -- Fujii Masao
Attachment
On Thu, Nov 13, 2014 at 12:38 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > Thanks for reviewing the patch! > > On Thu, Nov 13, 2014 at 4:05 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Fujii Masao wrote: >> >>> --- 127,152 ---- >>> When this option is used, <application>pg_receivexlog</> will report >>> a flush position to the server, indicating when each segment has been >>> synchronized to disk so that the server can remove that segment if it >>> ! is not otherwise needed. <literal>--synchronous</literal> option must >>> ! be specified when making <application>pg_receivexlog</> run as >>> ! synchronous standby by using replication slot. Otherwise WAL data >>> ! cannot be flushed frequently enough for this to work correctly. >>> </para> >>> </listitem> >>> </varlistentry> >> >> Whitespace damage here. > > Fixed. > >>> + printf(_(" --synchronous flush transaction log in real time\n")); >> >> "in real time" sounds odd. How about "flush transaction log >> immediately after writing", or maybe "have transaction log writes be >> synchronous". > > The former sounds better to me. So I chose it. > >>> --- 781,791 ---- >>> now = feGetCurrentTimestamp(); >>> >>> /* >>> ! * Issue sync command as soon as there are WAL data which >>> ! * has not been flushed yet if synchronous option is true. >>> */ >>> if (lastFlushPosition < blockpos && >>> ! walfile != -1 && synchronous) >> >> I'd put the "synchronous" condition first in the if(), and start the >> comment with it rather than putting it at the end. Both seem weird. > > Fixed, i.e., moved the "synchronous" condition first in the if()'s test > and also moved the comment "If synchronous option is true" also first > in the comment. > >>> progname, current_walfile_name, strerror(errno)); >>> goto error; >>> } >>> lastFlushPosition = blockpos; >>> ! >>> ! /* >>> ! * Send feedback so that the server sees the latest WAL locations >>> ! * immediately if synchronous option is true. >>> ! */ >>> ! if (!sendFeedback(conn, blockpos, now, false)) >>> ! goto error; >>> ! last_status = now; >> >> I'm not clear about this comment .. why does it say "if synchronous >> option is true" when it's not checking the condition? > > I added that comment because the code exists with the if() block > checking "synchronous" condition. But it seems confusing. Just removed > that part from the comment. > > Attached is the updated version of the patch. I've just pushed this. Regards, -- Fujii Masao
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Tue, Nov 18, 2014 at 2:36 AM, Fujii Masao<span dir="ltr"><<a href="mailto:masao.fujii@gmail.com" target="_blank">masao.fujii@gmail.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><divclass="h5">On Thu, Nov 13, 2014 at 12:38 PM, Fujii Masao <<a href="mailto:masao.fujii@gmail.com">masao.fujii@gmail.com</a>>wrote:<br /> > Thanks for reviewing the patch!<br />><br /> > On Thu, Nov 13, 2014 at 4:05 AM, Alvaro Herrera<br /> > <<a href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>>wrote:<br /> >> Fujii Masao wrote:<br /> >><br/> >>> --- 127,152 ----<br /> >>> When this option is used, <application>pg_receivexlog</>will report<br /> >>> a flush position to the server, indicatingwhen each segment has been<br /> >>> synchronized to disk so that the server can remove thatsegment if it<br /> >>> ! is not otherwise needed. <literal>--synchronous</literal> optionmust<br /> >>> ! be specified when making <application>pg_receivexlog</> run as<br />>>> ! synchronous standby by using replication slot. Otherwise WAL data<br /> >>> ! cannotbe flushed frequently enough for this to work correctly.<br /> >>> </para><br /> >>> </listitem><br /> >>> </varlistentry><br /> >><br /> >> Whitespacedamage here.<br /> ><br /> > Fixed.<br /> ><br /> >>> + printf(_(" --synchronous flush transaction log in real time\n"));<br /> >><br /> >> "in real time" sounds odd. How about "flush transactionlog<br /> >> immediately after writing", or maybe "have transaction log writes be<br /> >> synchronous".<br/> ><br /> > The former sounds better to me. So I chose it.<br /> ><br /> >>> --- 781,791----<br /> >>> now = feGetCurrentTimestamp();<br /> >>><br /> >>> /*<br /> >>> ! * Issue sync command as soon as there are WAL data which<br /> >>>! * has not been flushed yet if synchronous option is true.<br /> >>> */<br/> >>> if (lastFlushPosition < blockpos &&<br /> >>> ! walfile != -1 && synchronous)<br /> >><br /> >> I'd put the "synchronous" condition first in the if(),and start the<br /> >> comment with it rather than putting it at the end. Both seem weird.<br /> ><br /> >Fixed, i.e., moved the "synchronous" condition first in the if()'s test<br /> > and also moved the comment "If synchronousoption is true" also first<br /> > in the comment.<br /> ><br /> >>> progname, current_walfile_name, strerror(errno));<br /> >>> gotoerror;<br /> >>> }<br /> >>> lastFlushPosition = blockpos;<br/> >>> !<br /> >>> ! /*<br /> >>> ! * Sendfeedback so that the server sees the latest WAL locations<br /> >>> ! * immediately ifsynchronous option is true.<br /> >>> ! */<br /> >>> ! if (!sendFeedback(conn,blockpos, now, false))<br /> >>> ! goto error;<br /> >>>! last_status = now;<br /> >><br /> >> I'm not clear about this comment .. whydoes it say "if synchronous<br /> >> option is true" when it's not checking the condition?<br /> ><br /> >I added that comment because the code exists with the if() block<br /> > checking "synchronous" condition. But itseems confusing. Just removed<br /> > that part from the comment.<br /> ><br /> > Attached is the updated versionof the patch.<br /><br /></div></div>I've just pushed this.<br /></blockquote></div>Marked this patch as committedon the commit fest app.<br />-- <br /><div class="gmail_signature">Michael<br /></div></div></div>
On Wed, Nov 19, 2014 at 10:20 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > > > On Tue, Nov 18, 2014 at 2:36 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> >> On Thu, Nov 13, 2014 at 12:38 PM, Fujii Masao <masao.fujii@gmail.com> >> wrote: >> > Thanks for reviewing the patch! >> > >> > On Thu, Nov 13, 2014 at 4:05 AM, Alvaro Herrera >> > <alvherre@2ndquadrant.com> wrote: >> >> Fujii Masao wrote: >> >> >> >>> --- 127,152 ---- >> >>> When this option is used, <application>pg_receivexlog</> >> >>> will report >> >>> a flush position to the server, indicating when each >> >>> segment has been >> >>> synchronized to disk so that the server can remove that >> >>> segment if it >> >>> ! is not otherwise needed. <literal>--synchronous</literal> >> >>> option must >> >>> ! be specified when making <application>pg_receivexlog</> run >> >>> as >> >>> ! synchronous standby by using replication slot. Otherwise WAL >> >>> data >> >>> ! cannot be flushed frequently enough for this to work >> >>> correctly. >> >>> </para> >> >>> </listitem> >> >>> </varlistentry> >> >> >> >> Whitespace damage here. >> > >> > Fixed. >> > >> >>> + printf(_(" --synchronous flush transaction log in real >> >>> time\n")); >> >> >> >> "in real time" sounds odd. How about "flush transaction log >> >> immediately after writing", or maybe "have transaction log writes be >> >> synchronous". >> > >> > The former sounds better to me. So I chose it. >> > >> >>> --- 781,791 ---- >> >>> now = feGetCurrentTimestamp(); >> >>> >> >>> /* >> >>> ! * Issue sync command as soon as there are WAL data >> >>> which >> >>> ! * has not been flushed yet if synchronous option is >> >>> true. >> >>> */ >> >>> if (lastFlushPosition < blockpos && >> >>> ! walfile != -1 && synchronous) >> >> >> >> I'd put the "synchronous" condition first in the if(), and start the >> >> comment with it rather than putting it at the end. Both seem weird. >> > >> > Fixed, i.e., moved the "synchronous" condition first in the if()'s test >> > and also moved the comment "If synchronous option is true" also first >> > in the comment. >> > >> >>> progname, >> >>> current_walfile_name, strerror(errno)); >> >>> goto error; >> >>> } >> >>> lastFlushPosition = blockpos; >> >>> ! >> >>> ! /* >> >>> ! * Send feedback so that the server sees the >> >>> latest WAL locations >> >>> ! * immediately if synchronous option is true. >> >>> ! */ >> >>> ! if (!sendFeedback(conn, blockpos, now, false)) >> >>> ! goto error; >> >>> ! last_status = now; >> >> >> >> I'm not clear about this comment .. why does it say "if synchronous >> >> option is true" when it's not checking the condition? >> > >> > I added that comment because the code exists with the if() block >> > checking "synchronous" condition. But it seems confusing. Just removed >> > that part from the comment. >> > >> > Attached is the updated version of the patch. >> >> I've just pushed this. > > Marked this patch as committed on the commit fest app. Thanks! Regards, -- Fujii Masao