Thread: Synchronous replication patch v2
Hi, Attached is the latest version of Synch Rep patch. Sorry for my late posting. I fixed some bugs in v1patch and integrated walreceiver into core. Attached contain some patches: * synch_rep_v2.patch This is the whole patch of Synch Rep against head. This also contain "Infrastructure changes for recover" patch by Simon because I'd like to make walreceiver work in consistent recovery mode. To be reviewed easily, I split synch_rep_v2.patch into 8 pieces. * 01_signal_handling_v2.patch This is the patch of signal handling changes for Synch Rep. I already posted this. http://archives.postgresql.org/message-id/3f0b79eb0811040404r799d3170v5bb9f201000f1771@mail.gmail.com * 02_pqcomm_v1.patch This is the patch of non-blocking pqcomm. Since walsender sends xlog records and receive the reply from walreceiver concurrently, this feature is needed. * 03_libpq_v1.patch This is the patch of new libpq API for Synch Rep. This is mainly used by walreceiver. I didn't want to introduce a new communication layer, I modified libpq for Synch Rep. * 04_recovery_conf_v1.patch This is the patch for postmaster to read recovery.conf like GUC. Thereby, processes other than startup process can also get parameters in recovery.conf. Walreceiver gets a host / port of a primary server from recovery.conf using this feature. * 05_split_xlogwrt_v1.patch This is the patch of splitting XLogWrite into 3 pieces. Two of them are used for walreceiver to write xlog records on the standby server. * 06_walsender_v2.patch This is the patch for sending xlog records. This contains - walsender itself - communication between walsender and backends (other than signal handling changes) - management of xlog positions * 07_walreceiver_v1.patch This is the patch of walreceiver which receives and writes xlog records. Walreceiver works in consistent recovery mode. * 08_arch_in_recovery_v1.patch This is the patch for archiver to work in consistent recovery mode. In the standby server, walreceiver writes xlog records to pg_xlog, and pg_standby reads them from archive location. So, someone has to archive them. I make walreceiver start archiver for archiving them. If you want to apply 8 patches one by one, please apply "Infrastructure changes for recover" patch and create the following empty files first. - src/backend/postmaster/walsender.c - src/backend/postmaster/walreceiver.c - src/include/postmaster/walsender.h - src/include/postmaster/walreceiver.h Since source code comments and documents are insufficient, I think I will mainly enrich them next week. And, if there is a patch which is hard to be reviewed, I will split it further. Any comments welcome! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
Fujii Masao wrote: > Attached is the latest version of Synch Rep patch. Why do we need a separate XLogsndRqst variable in shared memory? Don't we always want to send the WAL up to the same point as we flush it? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-11-14 at 19:00 +0200, Heikki Linnakangas wrote: > Fujii Masao wrote: > > Attached is the latest version of Synch Rep patch. > > Why do we need a separate XLogsndRqst variable in shared memory? Don't > we always want to send the WAL up to the same point as we flush it? If we're doing synch rep and we're committing. What happens when we're doing async rep or running something like a large load. I wouldn't want to presume that the network packet size and the disk write size are always identical. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Simon Riggs wrote: > On Fri, 2008-11-14 at 19:00 +0200, Heikki Linnakangas wrote: >> Why do we need a separate XLogsndRqst variable in shared memory? Don't >> we always want to send the WAL up to the same point as we flush it? > > If we're doing synch rep and we're committing. You flush and send the WAL, up to the same point? > What happens when we're > doing async rep or running something like a large load. You don't flush, and you don't request the WAL to be sent? The background writer and WAL sender can still wake up periodically, and write and send the WAL as they find convenient. > I wouldn't want > to presume that the network packet size and the disk write size are > always identical. Huh? No-one's presuming that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-11-14 at 19:23 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > On Fri, 2008-11-14 at 19:00 +0200, Heikki Linnakangas wrote: > >> Why do we need a separate XLogsndRqst variable in shared memory? Don't > >> we always want to send the WAL up to the same point as we flush it? > > > > If we're doing synch rep and we're committing. > > You flush and send the WAL, up to the same point? Yes, but you may make progress towards it in different size steps. > > What happens when we're > > doing async rep or running something like a large load. > > You don't flush, and you don't request the WAL to be sent? The > background writer and WAL sender can still wake up periodically, and > write and send the WAL as they find convenient. With WAL writes we write and flush at the same time. With WAL sending that doesn't sound such a good plan. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
> This is the whole patch of Synch Rep against head. This also contain > "Infrastructure changes for recover" patch by Simon because I'd like to > make walreceiver work in consistent recovery mode. Given that both this patch and Simon's hot standby patches need this infrastructure, it seems like it would be awfully good if we could get that piece into final form and committed. Tom Lane and Heikki both reviewed this for the September commitfest and even into October, but then, at least as I understand it, the reviewing train kind of ran out of steam. I'm not sure what the best way forward is. Do we need to assign a new reviewer, or do either of Tom and Heikki feel like picking it up again? ...Robert
On Sat, Nov 15, 2008 at 2:00 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Fujii Masao wrote: >> >> Attached is the latest version of Synch Rep patch. > > Why do we need a separate XLogsndRqst variable in shared memory? Don't we > always want to send the WAL up to the same point as we flush it? It's because the backends flush and send the WAL concurrently. I mean that the backend might request for the WAL to be sent up to the further point while another backend is writing it. If that variable is not separated, the backend cannot request until the former flushing finishes. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Robert Haas wrote: >> This is the whole patch of Synch Rep against head. This also contain >> "Infrastructure changes for recover" patch by Simon because I'd like to >> make walreceiver work in consistent recovery mode. > > Given that both this patch and Simon's hot standby patches need this > infrastructure, it seems like it would be awfully good if we could get > that piece into final form and committed. Tom Lane and Heikki both > reviewed this for the September commitfest and even into October, but > then, at least as I understand it, the reviewing train kind of ran out > of steam. I'm not sure what the best way forward is. Do we need to > assign a new reviewer, or do either of Tom and Heikki feel like > picking it up again? I'll take a look at it, but the code is tricky enough that it would be a good idea if others, including Tom, review it too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, 2008-11-17 at 15:44 +0200, Heikki Linnakangas wrote: > Robert Haas wrote: > >> This is the whole patch of Synch Rep against head. This also contain > >> "Infrastructure changes for recover" patch by Simon because I'd like to > >> make walreceiver work in consistent recovery mode. > > > > Given that both this patch and Simon's hot standby patches need this > > infrastructure, it seems like it would be awfully good if we could get > > that piece into final form and committed. Tom Lane and Heikki both > > reviewed this for the September commitfest and even into October, but > > then, at least as I understand it, the reviewing train kind of ran out > > of steam. I'm not sure what the best way forward is. Do we need to > > assign a new reviewer, or do either of Tom and Heikki feel like > > picking it up again? > > I'll take a look at it, but the code is tricky enough that it would be a > good idea if others, including Tom, review it too. Yes please to multiple reviewers. The Hot Standby patch can be considered v9 of the infrastructure patch, as mentioned on the wiki. I'll look to separate them so we review the correct code. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Simon Riggs wrote: > The Hot Standby patch can be considered v9 of the infrastructure patch, > as mentioned on the wiki. > > I'll look to separate them so we review the correct code. Ok, I started reviewing the other v9 (http://archives.postgresql.org/message-id/1223472898.4747.310.camel@ebony.2ndQuadrant). I'll look at other patches meanwhile you separate them again. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, 2008-11-17 at 18:56 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > The Hot Standby patch can be considered v9 of the infrastructure patch, > > as mentioned on the wiki. > > > > I'll look to separate them so we review the correct code. > > Ok, I started reviewing the other v9 > (http://archives.postgresql.org/message-id/1223472898.4747.310.camel@ebony.2ndQuadrant). > I'll look at other patches meanwhile you separate them again. Arghhh! Sorry, slip of finger earlier. Latest Hot Standby is v10. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Hi, On Fri, Nov 14, 2008 at 7:15 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > Hi, > > Attached is the latest version of Synch Rep patch. Sorry for my late posting. > I fixed some bugs in v1patch and integrated walreceiver into core. Attached > contain some patches: Attached is v3 of Synch Rep patch. I changed the signal handling, as pointed out in another thread. http://archives.postgresql.org/pgsql-hackers/2008-11/msg01041.php Other changes are; * In v2, whenever XLogSend sent the WAL, it was initializing StringInfo, but since it's useless, I changed it so that one StringInfo might be used. * In v2, since the WAL files were not recycled in the standby, fsynch by walreceiver when initializing the WAL files (XLogFileInit) was one of the bottleneck. So, I made the bgwriter recycle the old WAL files when creating the restart point, and the startup process recycle the restored file (RECOVERYXLOG) when trying to remove it. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
"Fujii Masao" <masao.fujii@gmail.com> wrote: > On Fri, Nov 14, 2008 at 7:15 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > > Attached is the latest version of Synch Rep patch. Sorry for my late posting. > > I fixed some bugs in v1patch and integrated walreceiver into core. I have some comments about v3 patch. [1] Should walsender database be real or not? [2] User-configurable replication_timeout is dangerous [3] How to configure wal_buffers and wal_sender_delay? [4] XLogArchivingActive on replication mode [5] Usage of XLog Send-Flush-Wait [6] Bit-OR or Logical-OR ---- [1] Should walsender database be real or not? Index: bin/initdb/initdb.c + "CREATE DATABASE walsender;\n", You create walsender as a *real* database, but is it really needed? Can we make it a virtual database only for walreceiver? And backend process crashes when I login the walsender database: $ psql walsender psql: server closed the connection unexpectedly This probably means the server terminated abnormally before orwhile processing the request. Even if we need to have the database in real, I'd like to use another name for it. The name 'walsender' seems to be an internal module name but it should be a feature name (ex. 'replication'). ---- [2] User-configurable replication_timeout is dangerous Index: backend/utils/misc/guc.c + {"replication_timeout", PGC_USERSET, WAL_REPLICATION, You export replication_timeout as a PGC_USERSET variable, but it is dangerous. It allows non-superusers to kill servers easily by setting it too low value. Walsender dies with FATAL on timeout. BTW, how about adding an option to choose FATAL or ERROR on timeout? I think FATAL is too strong for some cases. ---- [3] How to configure wal_buffers and wal_sender_delay? The parameter wal_buffers might be more important in synch rep, but we don't have a mean to know whether wal_buffers is enough or not. Should we have a new system view 'pg_stat_xlog' to find shortage of wal_buffers and wal_sender_delay? ---- [4] XLogArchivingActive on replication mode XLogArchivingActive is not modified in the patch, but is it safe? For example, we might need to disable COPY-without-wal optimization when replication is enabled. ---- [5] Usage of XLog Send-Flush-Wait There are many following sequences: RequestXLogSend(WriteRqstPtr, true); XLogFlush(WriteRqstPtr); WaitXLogSend(WriteRqstPtr,false, true); It introduces additional complexities to use XLogFlush. Is it possible to push RequestXLogSend and WaitXLogSend into XLogFlush? It might require a new flag argument in XLogFlush. ---- [6] Bit-OR or Logical-ORWaitXLogSend(XactLastRecEnd, false, forceSyncCommit | haveNonTemp); should beWaitXLogSend(XactLastRecEnd, false, forceSyncCommit || haveNonTemp); | is bit OR and || is logical OR. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
On Tue, Nov 25, 2008 at 11:42 AM, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote: > "Fujii Masao" <masao.fujii@gmail.com> wrote: >> On Fri, Nov 14, 2008 at 7:15 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> > Attached is the latest version of Synch Rep patch. Sorry for my late posting. >> > I fixed some bugs in v1patch and integrated walreceiver into core. > > I have some comments about v3 patch. Hi, Itagaki-san. Thanks for taking time to review the patch. > > [1] Should walsender database be real or not? > [2] User-configurable replication_timeout is dangerous > [3] How to configure wal_buffers and wal_sender_delay? > [4] XLogArchivingActive on replication mode > [5] Usage of XLog Send-Flush-Wait > [6] Bit-OR or Logical-OR > > ---- > [1] Should walsender database be real or not? > Index: bin/initdb/initdb.c > + "CREATE DATABASE walsender;\n", > > You create walsender as a *real* database, but is it really needed? > Can we make it a virtual database only for walreceiver? I think that the real database is better, because it's simpler and the user can re-create it easily even if it is lost accidentally. Of course, if we introduce the new API to handle a virtual database, we can virtualize it. Is it worth introducing it? > > And backend process crashes when I login the walsender database: > $ psql walsender > psql: server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. Obviously undesirable behavior :( I will fix it. > > Even if we need to have the database in real, I'd like to use another > name for it. The name 'walsender' seems to be an internal module name > but it should be a feature name (ex. 'replication'). Agreed. The name 'replication' is more suitable, I also think. Any other ideas? > > ---- > [2] User-configurable replication_timeout is dangerous > Index: backend/utils/misc/guc.c > + {"replication_timeout", PGC_USERSET, WAL_REPLICATION, > > You export replication_timeout as a PGC_USERSET variable, but it is > dangerous. It allows non-superusers to kill servers easily by setting it > too low value. Walsender dies with FATAL on timeout. > > BTW, how about adding an option to choose FATAL or ERROR on timeout? > I think FATAL is too strong for some cases. OK, I will add the new GUC option for specifying the behavior when the timeout occurs. I think the valid values are ERROR, FATAL and PANIC. * ERROR only cancels that the backend waits for replication. The backends (including the canceled one) and walsender continueprocessing. But, synchronous replication might be broken. * FATAL terminates walsender. The backends continue processing without replication. * PANIC terminates all processes. In previous discussion, someone wanted this behavior. Or, should I define replication_timeout as PGC_SUSET? > > ---- > [3] How to configure wal_buffers and wal_sender_delay? > The parameter wal_buffers might be more important in synch rep, > but we don't have a mean to know whether wal_buffers is enough or not. > Should we have a new system view 'pg_stat_xlog' to find shortage > of wal_buffers and wal_sender_delay? Couting the number of times which the buffer page is replaced in AdvanceXLInsertBuffer might be help to configure them. Of course, this kind of feature is very useful. But, is it essential for Synch Rep? If not, I'd like to postpone it to 8.5. > > ---- > [4] XLogArchivingActive on replication mode > XLogArchivingActive is not modified in the patch, but is it safe? > For example, we might need to disable COPY-without-wal optimization > when replication is enabled. You are right! I will fix it. > > ---- > [5] Usage of XLog Send-Flush-Wait > There are many following sequences: > RequestXLogSend(WriteRqstPtr, true); > XLogFlush(WriteRqstPtr); > WaitXLogSend(WriteRqstPtr, false, true); > > It introduces additional complexities to use XLogFlush. > Is it possible to push RequestXLogSend and WaitXLogSend into XLogFlush? > It might require a new flag argument in XLogFlush. OK, I will push them into XLogFlush. > > ---- > [6] Bit-OR or Logical-OR > WaitXLogSend(XactLastRecEnd, false, forceSyncCommit | haveNonTemp); > should be > WaitXLogSend(XactLastRecEnd, false, forceSyncCommit || haveNonTemp); > | is bit OR and || is logical OR. Oops! I will fix it. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao escreveu: > (...) >> Even if we need to have the database in real, I'd like to use another >> name for it. The name 'walsender' seems to be an internal module name >> but it should be a feature name (ex. 'replication'). >> > > Agreed. The name 'replication' is more suitable, I also think. > Any other ideas? > 'walsender' should be a schema in the 'replication' database. Other modules, in replication feature, could be placed there too. []s -- Dickson S. Guedes Administrador de Banco de Dados Confesol - Projeto Colmeia Florianopolis, SC, Brasil (48) 3322-1185, ramal: 26
Dickson S. Guedes escribió: > Fujii Masao escreveu: >> (...) >>> Even if we need to have the database in real, I'd like to use another >>> name for it. The name 'walsender' seems to be an internal module name >>> but it should be a feature name (ex. 'replication'). >>> >> >> Agreed. The name 'replication' is more suitable, I also think. >> Any other ideas? > > 'walsender' should be a schema in the 'replication' database. Other > modules, in replication feature, could be placed there too. Hmm, what is this database there for? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Nov 25, 2008 at 10:57 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Dickson S. Guedes escribió: >> Fujii Masao escreveu: >>> (...) >>>> Even if we need to have the database in real, I'd like to use another >>>> name for it. The name 'walsender' seems to be an internal module name >>>> but it should be a feature name (ex. 'replication'). >>>> >>> >>> Agreed. The name 'replication' is more suitable, I also think. >>> Any other ideas? >> >> 'walsender' should be a schema in the 'replication' database. Other >> modules, in replication feature, could be placed there too. > > Hmm, what is this database there for? It's for authentication for replication. This was discussed before. Please see the following thread and feel free to comment. http://archives.postgresql.org/pgsql-hackers/2008-11/msg00187.php Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao escribió: > On Tue, Nov 25, 2008 at 10:57 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: > > Dickson S. Guedes escribió: > >> Fujii Masao escreveu: > >>> (...) > >>>> Even if we need to have the database in real, I'd like to use another > >>>> name for it. The name 'walsender' seems to be an internal module name > >>>> but it should be a feature name (ex. 'replication'). > >>>> > >>> > >>> Agreed. The name 'replication' is more suitable, I also think. > >>> Any other ideas? > >> > >> 'walsender' should be a schema in the 'replication' database. Other > >> modules, in replication feature, could be placed there too. > > > > Hmm, what is this database there for? > > It's for authentication for replication. This was discussed before. > Please see the following thread and feel free to comment. > http://archives.postgresql.org/pgsql-hackers/2008-11/msg00187.php Hmm ... I think this means that the suggestion by Dickson does not make much sense, right? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Nov 26, 2008 at 12:23 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Fujii Masao escribió: >> On Tue, Nov 25, 2008 at 10:57 PM, Alvaro Herrera >> <alvherre@commandprompt.com> wrote: >> > Dickson S. Guedes escribió: >> >> Fujii Masao escreveu: >> >>> (...) >> >>>> Even if we need to have the database in real, I'd like to use another >> >>>> name for it. The name 'walsender' seems to be an internal module name >> >>>> but it should be a feature name (ex. 'replication'). >> >>>> >> >>> >> >>> Agreed. The name 'replication' is more suitable, I also think. >> >>> Any other ideas? >> >> >> >> 'walsender' should be a schema in the 'replication' database. Other >> >> modules, in replication feature, could be placed there too. >> > >> > Hmm, what is this database there for? >> >> It's for authentication for replication. This was discussed before. >> Please see the following thread and feel free to comment. >> http://archives.postgresql.org/pgsql-hackers/2008-11/msg00187.php > > Hmm ... I think this means that the suggestion by Dickson does not make > much sense, right? Oh, I'm sorry! -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Alvaro Herrera wrote: > Fujii Masao escribió: >> On Tue, Nov 25, 2008 at 10:57 PM, Alvaro Herrera >> <alvherre@commandprompt.com> wrote: >>> Dickson S. Guedes escribió: >>>> Fujii Masao escreveu: >>>>> (...) >>>>>> Even if we need to have the database in real, I'd like to use another >>>>>> name for it. The name 'walsender' seems to be an internal module name >>>>>> but it should be a feature name (ex. 'replication'). >>>>>> >>>>> Agreed. The name 'replication' is more suitable, I also think. >>>>> Any other ideas? >>>> 'walsender' should be a schema in the 'replication' database. Other >>>> modules, in replication feature, could be placed there too. >>> Hmm, what is this database there for? >> It's for authentication for replication. This was discussed before. >> Please see the following thread and feel free to comment. >> http://archives.postgresql.org/pgsql-hackers/2008-11/msg00187.php > > Hmm ... I think this means that the suggestion by Dickson does not make > much sense, right? Right. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Tue, Nov 25, 2008 at 12:23:45PM -0300, Alvaro Herrera wrote: > Fujii Masao escribió: > > On Tue, Nov 25, 2008 at 10:57 PM, Alvaro Herrera > > <alvherre@commandprompt.com> wrote: > > > Dickson S. Guedes escribió: > > >> Fujii Masao escreveu: > > >>> (...) > > >>>> Even if we need to have the database in real, I'd like to use > > >>>> another name for it. The name 'walsender' seems to be an > > >>>> internal module name but it should be a feature name (ex. > > >>>> 'replication'). > > >>>> > > >>> > > >>> Agreed. The name 'replication' is more suitable, I also think. > > >>> Any other ideas? > > >> > > >> 'walsender' should be a schema in the 'replication' database. > > >> Other modules, in replication feature, could be placed there > > >> too. > > > > > > Hmm, what is this database there for? > > > > It's for authentication for replication. This was discussed > > before. Please see the following thread and feel free to comment. > > http://archives.postgresql.org/pgsql-hackers/2008-11/msg00187.php > > Hmm ... I think this means that the suggestion by Dickson does not > make much sense, right? It sounds to me like this should use SQL/MED connections, if it's holding auth information :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote: > It sounds to me like this should use SQL/MED connections, if it's > holding auth information :) No, the SQL/MED stuff holds authentication information to authenticate to other data sources. This is about authentication of *incoming* connections. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, Nov 26, 2008 at 09:15:49PM +0200, Heikki Linnakangas wrote: > David Fetter wrote: >> It sounds to me like this should use SQL/MED connections, if it's >> holding auth information :) > > No, the SQL/MED stuff holds authentication information to authenticate > to other data sources. This is about authentication of *incoming* > connections. Thanks for clearing that up :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hello, On Tue, Nov 25, 2008 at 6:03 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> [2] User-configurable replication_timeout is dangerous >> Index: backend/utils/misc/guc.c >> + {"replication_timeout", PGC_USERSET, WAL_REPLICATION, >> >> You export replication_timeout as a PGC_USERSET variable, but it is >> dangerous. It allows non-superusers to kill servers easily by setting it >> too low value. Walsender dies with FATAL on timeout. Unlike other background processes, FATAL by walsender doesn't kill the whole server. In FATAL case, walsender is treated like the normal backend, and only walsender dies. Please see reaper() in postmaster.c. Just to be safe, I re-export the parameter as PGC_SUSET in the latest patch. Is still this parameter dangerous? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
"Fujii Masao" <masao.fujii@gmail.com> writes: >>> You export replication_timeout as a PGC_USERSET variable, but it is >>> dangerous. It allows non-superusers to kill servers easily by setting it >>> too low value. Walsender dies with FATAL on timeout. > Unlike other background processes, FATAL by walsender doesn't kill the > whole server. In FATAL case, walsender is treated like the normal backend, > and only walsender dies. Please see reaper() in postmaster.c. > Just to be safe, I re-export the parameter as PGC_SUSET in the latest > patch. Is still this parameter dangerous? If this parameter is only used by a background process, then both of those are wrong. It should be marked SIGHUP to reflect the fact that the only effective way of modifying it is via postgresql.conf. regards, tom lane
Hi, thanks for the comment! On Sat, Nov 29, 2008 at 1:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Fujii Masao" <masao.fujii@gmail.com> writes: >>>> You export replication_timeout as a PGC_USERSET variable, but it is >>>> dangerous. It allows non-superusers to kill servers easily by setting it >>>> too low value. Walsender dies with FATAL on timeout. > >> Unlike other background processes, FATAL by walsender doesn't kill the >> whole server. In FATAL case, walsender is treated like the normal backend, >> and only walsender dies. Please see reaper() in postmaster.c. > >> Just to be safe, I re-export the parameter as PGC_SUSET in the latest >> patch. Is still this parameter dangerous? > > If this parameter is only used by a background process, then both of > those are wrong. It should be marked SIGHUP to reflect the fact that > the only effective way of modifying it is via postgresql.conf. This parameter is used by the backend to wait for replication to complete for the specified time at most. If the timeout has come, the backend send the timtout-interrupt to walsender. Then walsender invokes FATAL error, which doesn't kill the whole server. So, I choosed PGC_USERSET at first. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center