Thread: Re: pgsql: walreceiver uses a temporary replication slot by default
Hi Peter, (Adding Andres and Sergei in CC.) On Tue, Jan 14, 2020 at 01:57:34PM +0000, Peter Eisentraut wrote: > walreceiver uses a temporary replication slot by default > > If no permanent replication slot is configured using > primary_slot_name, the walreceiver now creates and uses a temporary > replication slot. A new setting wal_receiver_create_temp_slot can be > used to disable this behavior, for example, if the remote instance is > out of replication slots. A recent message from Seigei Kornilov has attracted my attention to this commit: https://www.postgresql.org/message-id/370331579618998@vla3-6a5326aeb4ee.qloud-c.yandex.net In the thread about switching primary_conninfo to be reloadable, we have argued at great lengths that we should never have the WAL receiver fetch by itself the GUC parameters used for the connection with its primary. Here is the main area of the discussion: https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de The previous thread was long enough so it can easily be missed. However, it seems to me that we may need to revisit a couple of things for this commit? In short, the following things: - wal_receiver_create_temp_slot should be made PGC_POSTMASTER, similarly to primary_slot_name and primary_conninfo. - WalReceiverMain() should not load the parameter from the GUC context by itself. - RequestXLogStreaming(), called by the startup process, should be in charge of defining if a temp slot should be used or not. -- Michael
Attachment
Hello > In short, the following things: > - wal_receiver_create_temp_slot should be made PGC_POSTMASTER, > similarly to primary_slot_name and primary_conninfo. > - WalReceiverMain() should not load the parameter from the GUC context > by itself. > - RequestXLogStreaming(), called by the startup process, should be in > charge of defining if a temp slot should be used or not. I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo" thread. This thread is more appropriate for discussion about wal_receiver_create_temp_slot. PS: I posted this patch in both threads mostly to make cfbot happy. regards, Sergei
Attachment
On 2020-01-22 06:55, Michael Paquier wrote: > In the thread about switching primary_conninfo to be reloadable, we > have argued at great lengths that we should never have the WAL > receiver fetch by itself the GUC parameters used for the connection > with its primary. Here is the main area of the discussion: > https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de The way I understood that discussion was that the issue is having both the startup process and the WAL receiver having possibly inconsistent knowledge about the current configuration. That doesn't apply in this case, because the setting is only used by the WAL receiver. Maybe I misunderstood. > The previous thread was long enough so it can easily be missed. > However, it seems to me that we may need to revisit a couple of things > for this commit? In short, the following things: > - wal_receiver_create_temp_slot should be made PGC_POSTMASTER, > similarly to primary_slot_name and primary_conninfo. > - WalReceiverMain() should not load the parameter from the GUC context > by itself. > - RequestXLogStreaming(), called by the startup process, should be in > charge of defining if a temp slot should be used or not. That would be a reasonable fix if we think the above is really an issue. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2020-02-10 16:46:04 +0100, Peter Eisentraut wrote: > On 2020-01-22 06:55, Michael Paquier wrote: > > In the thread about switching primary_conninfo to be reloadable, we > > have argued at great lengths that we should never have the WAL > > receiver fetch by itself the GUC parameters used for the connection > > with its primary. Here is the main area of the discussion: > > https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de > > The way I understood that discussion was that the issue is having both the > startup process and the WAL receiver having possibly inconsistent knowledge > about the current configuration. That doesn't apply in this case, because > the setting is only used by the WAL receiver. Maybe I misunderstood. Yes, that was my concern there. I do agree there's much less of an issue here. I still architecturally don't find it attractive that the active configuration between walreceiver and startup process can diverge though. Imagine if we e.g. added the ability to receive WAL over multiple connections from one host, or from multiple hosts (e.g. to be able to get the bulk of the WAL from a cascading node, but also to provide syncrep acknowledgements directly to the primary), or to allow for logical replication without needing all WAL locally on a standby doing decoding. It seems not great if there's potentially diverging configuration (hot standby feedback, temporary slots, ... ) between those walreceivers, just depending on when they started. Here the model e.g. paralell workers use, which explicitly ensure that the GUC state is the same in workers and the leader, is considerably better, imo. So I think adding more of these parameters affecting walreceivers without coordination is not going quite in the right direction. Greetings, Andres Freund
On Mon, Feb 10, 2020 at 01:46:04PM -0800, Andres Freund wrote: > I still architecturally don't find it attractive that the active > configuration between walreceiver and startup process can diverge > though. Imagine if we e.g. added the ability to receive WAL over > multiple connections from one host, or from multiple hosts (e.g. to be > able to get the bulk of the WAL from a cascading node, but also to > provide syncrep acknowledgements directly to the primary), or to allow > for logical replication without needing all WAL locally on a standby > doing decoding. It seems not great if there's potentially diverging > configuration (hot standby feedback, temporary slots, ... ) between > those walreceivers, just depending on when they started. Here the model > e.g. parallel workers use, which explicitly ensure that the GUC state is > the same in workers and the leader, is considerably better, imo. Yes, I still think that we should fix that inconsistency, mark the new GUC wal_receiver_create_temp_slot as PGC_POSTMASTER, and add a note at the top of RequestXLogStreaming() and walreceiver.c about the assumptions we'd prefer rely to for the GUCs starting a WAL receiver. > So I think adding more of these parameters affecting walreceivers > without coordination is not going quite in the right direction. Indeed. Adding more comments would be one way to prevent the situation to happen here, I fear that others may forget this stuff in the future. -- Michael
Attachment
On Wed, Jan 22, 2020 at 06:58:46PM +0300, Sergei Kornilov wrote: > I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo" thread. > This thread is more appropriate for discussion about wal_receiver_create_temp_slot. > > PS: I posted this patch in both threads mostly to make cfbot happy. Thanks for posting this patch, Sergei. Here is a review to make things move on. - * Create temporary replication slot if no slot name is configured or - * the slot from the previous run was temporary, unless - * wal_receiver_create_temp_slot is disabled. We also need to handle - * the case where the previous run used a temporary slot but - * wal_receiver_create_temp_slot was changed in the meantime. In that - * case, we delete the old slot name in shared memory. (This would + * Create temporary replication slot if requested. In that + * case, we update slot name in shared memory. (This would The set of comments you are removing from walreceiver.c to decide if a temporary slot needs to be created or not should be moved to walreceiverfuncs.c as you move the logic from the WAL receiver startup phase to the moment the WAL receiver spawn is requested. I agree with the simplifications in WalReceiverMain() as you have switched wal_receiver_create_temp_slot to be PGC_POSTMASTER, so modifications are no longer a state that matter. It would be more consistent with primary_conn_info and primary_slot_name if wal_receiver_create_temp_slot is passed down as an argument of RequestXLogStreaming(). As per the discussion done on this thread, let's also switch the parameter default to be disabled. Peter, as the committer of 3297308, it would be good if you could chime in. -- Michael
Attachment
Hello > Thanks for posting this patch, Sergei. Here is a review to make > things move on. Thank you, here is updated patch > The set of comments you are removing from walreceiver.c to decide if a > temporary slot needs to be created or not should be moved to > walreceiverfuncs.c as you move the logic from the WAL receiver startup > phase to the moment the WAL receiver spawn is requested. I changed this comments because they describes behavior during change value of wal_receiver_create_temp_slot. But yes, I need to add some comments to RequestXLogStreaming. > It would be more consistent with primary_conn_info and > primary_slot_name if wal_receiver_create_temp_slot is passed down as > an argument of RequestXLogStreaming(). Yep, I thought about that. Changed. > As per the discussion done on this thread, let's also switch the > parameter default to be disabled. Done (my vote is also for disabling this option by default). regards, Sergei
Attachment
On Mon, Feb 17, 2020 at 04:57:04PM +0300, Sergei Kornilov wrote: > Thank you, here is updated patch Thanks > I changed this comments because they describes behavior during > change value of wal_receiver_create_temp_slot. But yes, I need to > add some comments to RequestXLogStreaming. I have reworked that part, adding more comments about the use of GUC parameters when establishing the connection to the primary for a WAL receiver. And also I have added an extra comment to walreceiver.c about the use of GUcs in general, to avoid this stuff again in the future. There were some extra nits with the format of postgresql.conf.sample. >> As per the discussion done on this thread, let's also switch the >> parameter default to be disabled. > > Done (my vote is also for disabling this option by default). We visibly tend to move in this direction, at least based on our discussion. Let's see where this leads. For now, I have registered this patch to next CF (https://commitfest.postgresql.org/27/2456/), with yourself as author and myself as reviewer, and then let's wait for mainly Peter E. and others for more input. -- Michael
Attachment
Hello > I have reworked that part, adding more comments about the use of GUC > parameters when establishing the connection to the primary for a WAL > receiver. And also I have added an extra comment to walreceiver.c > about the use of GUcs in general, to avoid this stuff again in the > future. There were some extra nits with the format of > postgresql.conf.sample. Thank you! I just noticed that you removed my proposed change to this condition in RequestXLogStreaming - if (slotname != NULL) + if (slotname != NULL && slotname[0] != '\0') We need this change to set is_temp_slot properly. PrimarySlotName GUC can usually be an empty string, so just "slotname !=NULL" is not enough. I attached patch with this change. regards, Sergei
Attachment
On Tue, Mar 17, 2020 at 11:39:11PM +0300, Sergei Kornilov wrote: > We need this change to set is_temp_slot properly. PrimarySlotName > GUC can usually be an empty string, so just "slotname != NULL" is > not enough. Yep, or a temporary slot would never be created even if there is no slot defined, and the priority goes to primary_slot_name if set. > I attached patch with this change. Thanks, I have added a new open item for v13 to track this effort: https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items -- Michael