Thread: Re: pgsql: Track last_inactive_time in pg_replication_slots.
On Mon, Mar 25, 2024 at 7:11 AM Amit Kapila <akapila@postgresql.org> wrote: > Track last_inactive_time in pg_replication_slots. > > This commit adds a new property called last_inactive_time for slots. It is > set to 0 whenever a slot is made active/acquired and set to the current > timestamp whenever the slot is inactive/released or restored from the disk. > Note that we don't set the last_inactive_time for the slots currently being > synced from the primary to the standby because such slots are typically > inactive as decoding is not allowed on those. So the field name is last_inactive_time, but if I'm reading this description right, it's actually the last time the slot was active, except for the weird exception for slots being synced. I'm wondering if this field needs to be renamed. And I'm suspicious that having an exception for slots being synced is a bad idea. That makes too much of a judgement about how the user will use this field. It's usually better to just expose the data, and if the user needs helps to make sense of that data, then give them that help separately. In this case, that would mean removing the exception, but making it easy to tell the difference between slots are inactive because they're being synced and slots that are inactive for some other reason. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 7:11 AM Amit Kapila <akapila@postgresql.org> wrote: > > Track last_inactive_time in pg_replication_slots. > > > > This commit adds a new property called last_inactive_time for slots. It is > > set to 0 whenever a slot is made active/acquired and set to the current > > timestamp whenever the slot is inactive/released or restored from the disk. > > Note that we don't set the last_inactive_time for the slots currently being > > synced from the primary to the standby because such slots are typically > > inactive as decoding is not allowed on those. > > So the field name is last_inactive_time, but if I'm reading this > description right, it's actually the last time the slot was active, > except for the weird exception for slots being synced. I'm wondering > if this field needs to be renamed. > We considered the other two names as last_inactive_at and last_active_time. For the first (last_inactive_at), there was an argument that most other fields that display time ends with _time. For the second (last_active_time), there was an argument that it could be misleading as one could think that it should be updated each time WAL record decoding is happening [1]. The other possibility is to name it last_used_time but I think it won't be much different from last_active_time. > And I'm suspicious that having an exception for slots being synced is > a bad idea. That makes too much of a judgement about how the user will > use this field. It's usually better to just expose the data, and if > the user needs helps to make sense of that data, then give them that > help separately. The reason we didn't set this for sync slots is that they won't be usable (one can't use them to decode WAL) unless standby is promoted [2]. But I see your point as well. So, I have copied the others involved in this discussion to see what they think. > > In this case, that would mean removing the exception, > but making it easy to tell the difference between slots are inactive > because they're being synced and slots that are inactive for some > other reason. > I think this can be differentiated with the help of 'synced' column in pg_replication_slots. [1] - https://www.postgresql.org/message-id/Zf1yx9QMbhgJ/Lfy%40ip-10-97-1-34.eu-west-3.compute.internal [2] - https://www.postgresql.org/message-id/CAJpy0uBGv85dFiWMnNLm6NuEs3eTVicsJCyRvMGbR8H%2BfOVBnA%40mail.gmail.com -- With Regards, Amit Kapila.
On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > We considered the other two names as last_inactive_at and > last_active_time. For the first (last_inactive_at), there was an > argument that most other fields that display time ends with _time. For > the second (last_active_time), there was an argument that it could be > misleading as one could think that it should be updated each time WAL > record decoding is happening [1]. The other possibility is to name it > last_used_time but I think it won't be much different from > last_active_time. I don't understand the bit about updating it each time WAL record decoding is happening. If it's the last active time, and the slot is currently active, then the answer is either "right now" or "currently undefined." I'd expect to see NULL in the system view in such a case. And if that's so, then there's nothing to update each time a record is decoded, because it's just still going to show NULL. Why does this field get set to the current time when the slot is restored from disk? -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > We considered the other two names as last_inactive_at and > > last_active_time. For the first (last_inactive_at), there was an > > argument that most other fields that display time ends with _time. For > > the second (last_active_time), there was an argument that it could be > > misleading as one could think that it should be updated each time WAL > > record decoding is happening [1]. The other possibility is to name it > > last_used_time but I think it won't be much different from > > last_active_time. > > I don't understand the bit about updating it each time WAL record > decoding is happening. If it's the last active time, and the slot is > currently active, then the answer is either "right now" or "currently > undefined." I'd expect to see NULL in the system view in such a case. > And if that's so, then there's nothing to update each time a record is > decoded, because it's just still going to show NULL. > IIUC, Bertrand's point was that users can interpret last_active_time as a value that gets updated each time they decode a change which is not what we are doing. So, this can confuse users. Your expectation of answer (NULL) when the slot is active is correct and that is what will happen. > Why does this field get set to the current time when the slot is > restored from disk? > It is because we don't want to include the time the server is down in the last_inactive_time. Say, if we are shutting down the server at time X and the server remains down for another two hours, we don't want to include those two hours as the slot inactive time. The related theory is that this field will be used to invalidate inactive slots based on a threshold (say inactive_timeout). Say, before the shutdown, we release the slot and set the current_time for last_inactive_time for each slot and persist that information as well. Now, if the server is down for a long time, we may invalidate the slots as soon as the server comes up. So, instead, we just set this field at the time we read slots for disk and then reset it to 0/NULL as soon as the slot became active. -- With Regards, Amit Kapila.
Hi, On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote: > On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > We considered the other two names as last_inactive_at and > > > last_active_time. For the first (last_inactive_at), there was an > > > argument that most other fields that display time ends with _time. For > > > the second (last_active_time), there was an argument that it could be > > > misleading as one could think that it should be updated each time WAL > > > record decoding is happening [1]. The other possibility is to name it > > > last_used_time but I think it won't be much different from > > > last_active_time. > > > > I don't understand the bit about updating it each time WAL record > > decoding is happening. If it's the last active time, and the slot is > > currently active, then the answer is either "right now" or "currently > > undefined." I'd expect to see NULL in the system view in such a case. > > And if that's so, then there's nothing to update each time a record is > > decoded, because it's just still going to show NULL. > > > > IIUC, Bertrand's point was that users can interpret last_active_time > as a value that gets updated each time they decode a change which is > not what we are doing. So, this can confuse users. Your expectation of > answer (NULL) when the slot is active is correct and that is what will > happen. Yeah, and so would be the confusion: why is last_active_time NULL while one is using the slot? > > Why does this field get set to the current time when the slot is > > restored from disk? > > > > It is because we don't want to include the time the server is down in > the last_inactive_time. Say, if we are shutting down the server at > time X and the server remains down for another two hours, we don't > want to include those two hours as the slot inactive time. The related > theory is that this field will be used to invalidate inactive slots > based on a threshold (say inactive_timeout). Say, before the shutdown, > we release the slot and set the current_time for last_inactive_time > for each slot and persist that information as well. Now, if the server > is down for a long time, we may invalidate the slots as soon as the > server comes up. So, instead, we just set this field at the time we > read slots for disk and then reset it to 0/NULL as soon as the slot > became active. Right, and we also want to invalidate the slot if not used duration > timeout, so that setting the field to zero when the slot is restored from disk is also not an option. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 8:46 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > > > On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > We considered the other two names as last_inactive_at and > > > > last_active_time. For the first (last_inactive_at), there was an > > > > argument that most other fields that display time ends with _time. For > > > > the second (last_active_time), there was an argument that it could be > > > > misleading as one could think that it should be updated each time WAL > > > > record decoding is happening [1]. The other possibility is to name it > > > > last_used_time but I think it won't be much different from > > > > last_active_time. > > > > > > I don't understand the bit about updating it each time WAL record > > > decoding is happening. If it's the last active time, and the slot is > > > currently active, then the answer is either "right now" or "currently > > > undefined." I'd expect to see NULL in the system view in such a case. > > > And if that's so, then there's nothing to update each time a record is > > > decoded, because it's just still going to show NULL. > > > > > > > IIUC, Bertrand's point was that users can interpret last_active_time > > as a value that gets updated each time they decode a change which is > > not what we are doing. So, this can confuse users. Your expectation of > > answer (NULL) when the slot is active is correct and that is what will > > happen. > > Yeah, and so would be the confusion: why is last_active_time NULL while one is > using the slot? > It is because we set it to zero when we acquire the slot and that value will remain the same till the slot is active. I am not sure if I understood your question so what I am saying might not make sense. -- With Regards, Amit Kapila.
Hi, On Mon, Mar 25, 2024 at 08:59:55PM +0530, Amit Kapila wrote: > On Mon, Mar 25, 2024 at 8:46 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote: > > > On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > > > > > On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > We considered the other two names as last_inactive_at and > > > > > last_active_time. For the first (last_inactive_at), there was an > > > > > argument that most other fields that display time ends with _time. For > > > > > the second (last_active_time), there was an argument that it could be > > > > > misleading as one could think that it should be updated each time WAL > > > > > record decoding is happening [1]. The other possibility is to name it > > > > > last_used_time but I think it won't be much different from > > > > > last_active_time. > > > > > > > > I don't understand the bit about updating it each time WAL record > > > > decoding is happening. If it's the last active time, and the slot is > > > > currently active, then the answer is either "right now" or "currently > > > > undefined." I'd expect to see NULL in the system view in such a case. > > > > And if that's so, then there's nothing to update each time a record is > > > > decoded, because it's just still going to show NULL. > > > > > > > > > > IIUC, Bertrand's point was that users can interpret last_active_time > > > as a value that gets updated each time they decode a change which is > > > not what we are doing. So, this can confuse users. Your expectation of > > > answer (NULL) when the slot is active is correct and that is what will > > > happen. > > > > Yeah, and so would be the confusion: why is last_active_time NULL while one is > > using the slot? > > > > It is because we set it to zero when we acquire the slot and that > value will remain the same till the slot is active. I am not sure if I > understood your question so what I am saying might not make sense. There is no "real" question, I was just highlighting the confusion in case we name the field "last_active_time". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 11:16 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > IIUC, Bertrand's point was that users can interpret last_active_time > > as a value that gets updated each time they decode a change which is > > not what we are doing. So, this can confuse users. Your expectation of > > answer (NULL) when the slot is active is correct and that is what will > > happen. > > Yeah, and so would be the confusion: why is last_active_time NULL while one is > using the slot? I agree that users could get confused here, but the solution to that shouldn't be to give the field a name that is the opposite of what it actually does. I expect a field called last_inactive_time to tell me the last time that the slot was inactive. Here, it tells me the last time that a currently-inactive slot previously *WAS* active. How can you justify calling that the last *INACTIVE* time? AFAICS, the user who has the confusion that you mention here is simply wrong. If they are looking at a field called "last active time" and the slot is active, then the correct answer is "right now" or "undefined" and that is what they will see. Sure, they might not understand that. But flipping the name of the field on its head cannot be the right way to help them. With the current naming, I expect to have the exact opposite confusion as your hypothetical confused user. I'm going to be looking at a slot that's currently inactive, and it's going to tell me that the last_inactive_time was at some time in the past. And I'm going to say "what the heck is going on here, the slot is inactive *right now*!" Half of me wonders whether we should avoid this whole problem by renaming it to something like last_state_change or last_state_change_time, or maybe just state_change like we do in pg_stat_activity, and making it mean the last time the slot flipped between active and inactive in either direction. I'm not sure if this is better, but unless I'm misunderstanding something, the current situation is terrible. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On Mon, Mar 25, 2024 at 11:49:00AM -0400, Robert Haas wrote: > On Mon, Mar 25, 2024 at 11:16 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > IIUC, Bertrand's point was that users can interpret last_active_time > > > as a value that gets updated each time they decode a change which is > > > not what we are doing. So, this can confuse users. Your expectation of > > > answer (NULL) when the slot is active is correct and that is what will > > > happen. > > > > Yeah, and so would be the confusion: why is last_active_time NULL while one is > > using the slot? > > I agree that users could get confused here, but the solution to that > shouldn't be to give the field a name that is the opposite of what it > actually does. I expect a field called last_inactive_time to tell me > the last time that the slot was inactive. Here, it tells me the last > time that a currently-inactive slot previously *WAS* active. How can > you justify calling that the last *INACTIVE* time? > > AFAICS, the user who has the confusion that you mention here is simply > wrong. If they are looking at a field called "last active time" and > the slot is active, then the correct answer is "right now" or > "undefined" and that is what they will see. Sure, they might not > understand that. But flipping the name of the field on its head cannot > be the right way to help them. > > With the current naming, I expect to have the exact opposite confusion > as your hypothetical confused user. I'm going to be looking at a slot > that's currently inactive, and it's going to tell me that the > last_inactive_time was at some time in the past. And I'm going to say > "what the heck is going on here, the slot is inactive *right now*!" > > Half of me wonders whether we should avoid this whole problem by > renaming it to something like last_state_change or > last_state_change_time, or maybe just state_change like we do in > pg_stat_activity, and making it mean the last time the slot flipped > between active and inactive in either direction. I'm not sure if this > is better, but unless I'm misunderstanding something, the current > situation is terrible. > Now that I read your arguments I think that last_<active|inactive>_time could be both missleading because at the end they rely on users "expectation". Would "released_time" sounds better? (at the end this is exactly what it does represent unless for the case where it is restored from disk for which the meaning would still makes sense to me though). It seems to me that released_time does not lead to any expectation then removing any confusion. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote: > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > And I'm suspicious that having an exception for slots being synced is > > a bad idea. That makes too much of a judgement about how the user will > > use this field. It's usually better to just expose the data, and if > > the user needs helps to make sense of that data, then give them that > > help separately. > > The reason we didn't set this for sync slots is that they won't be > usable (one can't use them to decode WAL) unless standby is promoted > [2]. But I see your point as well. So, I have copied the others > involved in this discussion to see what they think. Yeah I also see Robert's point. If we also sync the "last inactive time" field then we would need to take care of the corner case mentioned by Shveta in [1] during promotion. [1]: https://www.postgresql.org/message-id/CAJpy0uCLu%2BmqAwAMum%3DpXE9YYsy0BE7hOSw_Wno5vjwpFY%3D63g%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > Now that I read your arguments I think that last_<active|inactive>_time could be > both missleading because at the end they rely on users "expectation". Well, the user is always going to expect *something* -- that's just how language works. > Would "released_time" sounds better? (at the end this is exactly what it does > represent unless for the case where it is restored from disk for which the meaning > would still makes sense to me though). It seems to me that released_time does not > lead to any expectation then removing any confusion. Yeah, that's not bad. I mean, I don't agree that released_time doesn't lead to any expectation, but what it leads me to expect is that you're going to tell me the time at which the slot was released. So if it's currently active, then I see NULL, because it's not released; but if it's inactive, then I see the time at which it became so. In the same vein, I think deactivated_at or inactive_since might be good names to consider. I think they get at the same thing as released_time, but they avoid introducing a completely new word (release, as opposed to active/inactive). -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: > On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Would "released_time" sounds better? (at the end this is exactly what it does > > represent unless for the case where it is restored from disk for which the meaning > > would still makes sense to me though). It seems to me that released_time does not > > lead to any expectation then removing any confusion. > > Yeah, that's not bad. I mean, I don't agree that released_time doesn't > lead to any expectation, > but what it leads me to expect is that you're > going to tell me the time at which the slot was released. So if it's > currently active, then I see NULL, because it's not released; but if > it's inactive, then I see the time at which it became so. > > In the same vein, I think deactivated_at or inactive_since might be > good names to consider. I think they get at the same thing as > released_time, but they avoid introducing a completely new word > (release, as opposed to active/inactive). > Yeah, I'd vote for inactive_since then. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote: > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: >> In the same vein, I think deactivated_at or inactive_since might be >> good names to consider. I think they get at the same thing as >> released_time, but they avoid introducing a completely new word >> (release, as opposed to active/inactive). > > Yeah, I'd vote for inactive_since then. Having only skimmed some of the related discussions, I'm inclined to agree that inactive_since provides the clearest description for the column. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote: > > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: > >> In the same vein, I think deactivated_at or inactive_since might be > >> good names to consider. I think they get at the same thing as > >> released_time, but they avoid introducing a completely new word > >> (release, as opposed to active/inactive). > > > > Yeah, I'd vote for inactive_since then. > > Having only skimmed some of the related discussions, I'm inclined to agree > that inactive_since provides the clearest description for the column. I think we all have some agreement on inactive_since. So, I'm attaching the patch for that change. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, Mar 25, 2024 at 9:55 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > Would "released_time" sounds better? (at the end this is exactly what it does > > represent unless for the case where it is restored from disk for which the meaning > > would still makes sense to me though). It seems to me that released_time does not > > lead to any expectation then removing any confusion. > > Yeah, that's not bad. I mean, I don't agree that released_time doesn't > lead to any expectation, but what it leads me to expect is that you're > going to tell me the time at which the slot was released. So if it's > currently active, then I see NULL, because it's not released; but if > it's inactive, then I see the time at which it became so. > > In the same vein, I think deactivated_at or inactive_since might be > good names to consider. I think they get at the same thing as > released_time, but they avoid introducing a completely new word > (release, as opposed to active/inactive). > We have a consensus on inactive_since, so I'll make that change. I would also like to solicit your opinion on the other slot-level parameter we are planning to introduce. This new slot-level parameter will be named as inactive_timeout. This will indicate that once the slot is inactive for the inactive_timeout period, we will invalidate the slot. We are also discussing to have this parameter (inactive_timeout) as GUC [1]. We can have this new parameter both at the slot level and as well as a GUC, or just one of those. [1] - https://www.postgresql.org/message-id/20240325195443.GA2923888%40nathanxps13 -- With Regards, Amit Kapila.
On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > And I'm suspicious that having an exception for slots being synced is > > > a bad idea. That makes too much of a judgement about how the user will > > > use this field. It's usually better to just expose the data, and if > > > the user needs helps to make sense of that data, then give them that > > > help separately. > > > > The reason we didn't set this for sync slots is that they won't be > > usable (one can't use them to decode WAL) unless standby is promoted > > [2]. But I see your point as well. So, I have copied the others > > involved in this discussion to see what they think. > > Yeah I also see Robert's point. If we also sync the "last inactive time" field then > we would need to take care of the corner case mentioned by Shveta in [1] during > promotion. I have suggested one potential solution for that in [1]. Please have a look. [1]: https://www.postgresql.org/message-id/CAJpy0uB-yE%2BRiw7JQ4hW0%2BigJxvPc%2Brq%2B9c7WyTa1Jz7%2B2gAiA%40mail.gmail.com thanks Shveta
On Tue, Mar 26, 2024 at 1:50 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote: > > > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: > > >> In the same vein, I think deactivated_at or inactive_since might be > > >> good names to consider. I think they get at the same thing as > > >> released_time, but they avoid introducing a completely new word > > >> (release, as opposed to active/inactive). > > > > > > Yeah, I'd vote for inactive_since then. > > > > Having only skimmed some of the related discussions, I'm inclined to agree > > that inactive_since provides the clearest description for the column. > > I think we all have some agreement on inactive_since. So, I'm > attaching the patch for that change. pg_proc.dat needs to be changed to refer to 'inactive_since' instead of 'last_inactive_time' in the attached patch. thanks Shveta
On Tue, Mar 26, 2024 at 9:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote: > > > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > And I'm suspicious that having an exception for slots being synced is > > > > a bad idea. That makes too much of a judgement about how the user will > > > > use this field. It's usually better to just expose the data, and if > > > > the user needs helps to make sense of that data, then give them that > > > > help separately. > > > > > > The reason we didn't set this for sync slots is that they won't be > > > usable (one can't use them to decode WAL) unless standby is promoted > > > [2]. But I see your point as well. So, I have copied the others > > > involved in this discussion to see what they think. > > > > Yeah I also see Robert's point. If we also sync the "last inactive time" field then > > we would need to take care of the corner case mentioned by Shveta in [1] during > > promotion. > > I have suggested one potential solution for that in [1]. Please have a look. > > [1]: https://www.postgresql.org/message-id/CAJpy0uB-yE%2BRiw7JQ4hW0%2BigJxvPc%2Brq%2B9c7WyTa1Jz7%2B2gAiA%40mail.gmail.com I posted the v21 patch implementing the above idea in the other thread - https://www.postgresql.org/message-id/CALj2ACXRFx9g7A9RFJZF7eBe%3Dzxk7%3DapMRFuCgJJKYB7O%3Dvgwg%40mail.gmail.com. For ease, I'm also attaching the patch in here. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On 2024-Mar-26, Amit Kapila wrote: > We have a consensus on inactive_since, so I'll make that change. Sounds reasonable. So this is a timestamptz if the slot is inactive, NULL if active, right? What value is it going to have for sync slots? > I would also like to solicit your opinion on the other slot-level > parameter we are planning to introduce. This new slot-level parameter > will be named as inactive_timeout. Maybe inactivity_timeout? > This will indicate that once the slot is inactive for the > inactive_timeout period, we will invalidate the slot. We are also > discussing to have this parameter (inactive_timeout) as GUC [1]. We > can have this new parameter both at the slot level and as well as a > GUC, or just one of those. replication_slot_inactivity_timeout? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ Tom: There seems to be something broken here. Teodor: I'm in sackcloth and ashes... Fixed. http://postgr.es/m/482D1632.8010507@sigaev.ru
On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Mar-26, Amit Kapila wrote: > > > We have a consensus on inactive_since, so I'll make that change. > > Sounds reasonable. So this is a timestamptz if the slot is inactive, > NULL if active, right? > Yes. > What value is it going to have for sync slots? > The behavior will be the same for non-sync slots. In each sync cycle, we acquire/release the sync slots. So at the time of release, inactive_since will be updated. See email [1]. > > I would also like to solicit your opinion on the other slot-level > > parameter we are planning to introduce. This new slot-level parameter > > will be named as inactive_timeout. > > Maybe inactivity_timeout? > > > This will indicate that once the slot is inactive for the > > inactive_timeout period, we will invalidate the slot. We are also > > discussing to have this parameter (inactive_timeout) as GUC [1]. We > > can have this new parameter both at the slot level and as well as a > > GUC, or just one of those. > > replication_slot_inactivity_timeout? > So, it seems you are okay to have this parameter both at slot level and as a GUC. About names, let us see what others think. Thanks for the input on the names. [1] - https://www.postgresql.org/message-id/CAA4eK1KrPGwfZV9LYGidjxHeW%2BrxJ%3DE2ThjXvwRGLO%3DiLNuo%3DQ%40mail.gmail.com -- With Regards, Amit Kapila.
On 2024-Mar-26, Amit Kapila wrote: > On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Mar-26, Amit Kapila wrote: > > > I would also like to solicit your opinion on the other slot-level > > > parameter we are planning to introduce. This new slot-level parameter > > > will be named as inactive_timeout. > > > > Maybe inactivity_timeout? > > > > > This will indicate that once the slot is inactive for the > > > inactive_timeout period, we will invalidate the slot. We are also > > > discussing to have this parameter (inactive_timeout) as GUC [1]. We > > > can have this new parameter both at the slot level and as well as a > > > GUC, or just one of those. > > > > replication_slot_inactivity_timeout? > > So, it seems you are okay to have this parameter both at slot level > and as a GUC. Well, I think a GUC is good to have regardless of the slot parameter, because the GUC can be used as an instance-wide protection against going out of disk space because of broken replication. However, now that I think about it, I'm not really sure about invalidating a slot based on time rather on disk space, for which we already have a parameter; what's your rationale for that? The passage of time is not a very good measure, really, because the amount of WAL being protected has wildly varying production rate across time. I can only see a timeout being useful as a parameter if its default value is not the special disable value; say, the default timeout is 3 days (to be more precise -- the period from Friday to Monday, that is, between DBA leaving the office one week until discovering a problem when he returns early next week). This way we have a built-in mechanism that invalidates slots regardless of how big the WAL partition is. I'm less sure about the slot parameter; in what situation do you need to extend the life of one individual slot further than the life of all the other slots? (Of course, it makes no sense to set the per-slot param to a shorter period than the GUC: invalidating one slot ahead of the others is completely pointless.) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'. After collecting 500 such letters, he mused, a university somewhere in Arizona would probably grant him a degree. (Don Knuth)
Hi, On Tue, Mar 26, 2024 at 01:45:23PM +0530, Amit Kapila wrote: > On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2024-Mar-26, Amit Kapila wrote: > > > > > We have a consensus on inactive_since, so I'll make that change. > > > > Sounds reasonable. So this is a timestamptz if the slot is inactive, > > NULL if active, right? > > > > Yes. > > > What value is it going to have for sync slots? > > > > The behavior will be the same for non-sync slots. In each sync cycle, > we acquire/release the sync slots. So at the time of release, > inactive_since will be updated. See email [1]. I don't think we should set inactive_since to the current time at each sync cycle, see [1] as to why. What do you think? [1]: https://www.postgresql.org/message-id/ZgKGIDC5lttWTdJH%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 2:11 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Mar-26, Amit Kapila wrote: > > > On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > On 2024-Mar-26, Amit Kapila wrote: > > > > I would also like to solicit your opinion on the other slot-level > > > > parameter we are planning to introduce. This new slot-level parameter > > > > will be named as inactive_timeout. > > > > > > Maybe inactivity_timeout? > > > > > > > This will indicate that once the slot is inactive for the > > > > inactive_timeout period, we will invalidate the slot. We are also > > > > discussing to have this parameter (inactive_timeout) as GUC [1]. We > > > > can have this new parameter both at the slot level and as well as a > > > > GUC, or just one of those. > > > > > > replication_slot_inactivity_timeout? > > > > So, it seems you are okay to have this parameter both at slot level > > and as a GUC. > > Well, I think a GUC is good to have regardless of the slot parameter, > because the GUC can be used as an instance-wide protection against going > out of disk space because of broken replication. However, now that I > think about it, I'm not really sure about invalidating a slot based on > time rather on disk space, for which we already have a parameter; what's > your rationale for that? The passage of time is not a very good > measure, really, because the amount of WAL being protected has wildly > varying production rate across time. > The inactive slot not only blocks WAL from being removed but prevents the vacuum from proceeding. Also, there is a risk of transaction Id wraparound. See email [1] for more context. > I can only see a timeout being useful as a parameter if its default > value is not the special disable value; say, the default timeout is 3 > days (to be more precise -- the period from Friday to Monday, that is, > between DBA leaving the office one week until discovering a problem when > he returns early next week). This way we have a built-in mechanism that > invalidates slots regardless of how big the WAL partition is. > We can have a default value for this parameter but it has the potential to break the replication, so not sure what could be a good default value. > > I'm less sure about the slot parameter; in what situation do you need to > extend the life of one individual slot further than the life of all the > other slots? I was thinking of an idle slot scenario where a slot from one particular subscriber (or output plugin) is inactive due to some maintenance activity. But it should be okay to have a GUC for this for now. [1] - https://www.postgresql.org/message-id/20240325195443.GA2923888%40nathanxps13 -- With Regards, Amit Kapila.
On Tue, Mar 26, 2024 at 03:44:29PM +0530, Amit Kapila wrote: > On Tue, Mar 26, 2024 at 2:11 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> Well, I think a GUC is good to have regardless of the slot parameter, >> because the GUC can be used as an instance-wide protection against going >> out of disk space because of broken replication. However, now that I >> think about it, I'm not really sure about invalidating a slot based on >> time rather on disk space, for which we already have a parameter; what's >> your rationale for that? The passage of time is not a very good >> measure, really, because the amount of WAL being protected has wildly >> varying production rate across time. > > The inactive slot not only blocks WAL from being removed but prevents > the vacuum from proceeding. Also, there is a risk of transaction Id > wraparound. See email [1] for more context. FWIW I'd really prefer to have something like max_slot_xid_age for this. A time-based parameter would likely help with most cases, but transaction ID usage will vary widely from server to server, so it'd be nice to have something to protect against wraparound more directly. I don't object to a time-based setting as well, but that won't always work as well for this particular use-case, especially if we are relying on users to set a slot-level parameter. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 2024-Mar-26, Nathan Bossart wrote: > FWIW I'd really prefer to have something like max_slot_xid_age for this. A > time-based parameter would likely help with most cases, but transaction ID > usage will vary widely from server to server, so it'd be nice to have > something to protect against wraparound more directly. Yeah, I tend to agree that an XID-based limit makes more sense than a time-based one. > I don't object to a > time-based setting as well, but that won't always work as well for this > particular use-case, especially if we are relying on users to set a > slot-level parameter. I think slot-level parameters are mostly useless, because it takes just one slot where you forget to set it for disaster to strike. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Hi, On Tue, Mar 26, 2024 at 04:39:55PM +0100, Alvaro Herrera wrote: > On 2024-Mar-26, Nathan Bossart wrote: > > I don't object to a > > time-based setting as well, but that won't always work as well for this > > particular use-case, especially if we are relying on users to set a > > slot-level parameter. > > I think slot-level parameters are mostly useless, because it takes just > one slot where you forget to set it for disaster to strike. I think that's a fair point. So maybe we should focus on having a GUC first and later on re-think about having (or not) a slot based one (in addition to the GUC). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Mar-26, Nathan Bossart wrote: > > > FWIW I'd really prefer to have something like max_slot_xid_age for this. A > > time-based parameter would likely help with most cases, but transaction ID > > usage will vary widely from server to server, so it'd be nice to have > > something to protect against wraparound more directly. > > Yeah, I tend to agree that an XID-based limit makes more sense than a > time-based one. > So, do we want the time-based parameter or just max_slot_xid_age considering both will be GUC's? Yes, the xid_age based parameter sounds to be directly helpful for transaction ID wraparound or dead row cleanup, OTOH having a lot of inactive slots (which is possible in use cases where a tool creates a lot of slots and forgets to remove them, or the tool exits without cleaning up slots (say due to server shutdown)) also prohibit removing dead space which is not nice either? The one example that comes to mind is the pg_createsubscriber (committed for PG17) which creates one slot per database to convert standby to subscriber, now say it exits due to power shutdown then there could be a lot of dangling slots on the primary server. Also, say there is some bug in such a tool that doesn't allow proper cleanup of slots, the same problem can happen; yeah, this would be a problem of the tool but I think there is no harm in giving a way to avoid problems at the server end due to such slots. -- With Regards, Amit Kapila.
On Wed, Mar 27, 2024 at 10:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2024-Mar-26, Nathan Bossart wrote: > > > > > FWIW I'd really prefer to have something like max_slot_xid_age for this. A > > > time-based parameter would likely help with most cases, but transaction ID > > > usage will vary widely from server to server, so it'd be nice to have > > > something to protect against wraparound more directly. > > > > Yeah, I tend to agree that an XID-based limit makes more sense than a > > time-based one. > > > So, do we want the time-based parameter or just max_slot_xid_age > considering both will be GUC's? Yes, the xid_age based parameter > sounds to be directly helpful for transaction ID wraparound or dead > row cleanup, OTOH having a lot of inactive slots (which is possible in > use cases where a tool creates a lot of slots and forgets to remove > them, or the tool exits without cleaning up slots (say due to server > shutdown)) also prohibit removing dead space which is not nice either? I've personally seen the leftover slots problem on production systems where a timeout based invalidation mechanism could have been of more help to save time and reduce manual intervention. Usually, most if not all, migration/upgrade/other tools create slots, and if at all any errors occur or the operation gets cancelled midway, there's a chance that the slots can be leftover if such tools forgets to clean them up either because there was a bug or for whatever reasons. These are unintended/ghost slots for the database user unnecessarily holding up resources such as XIDs, dead rows and WAL; which might lead to XID wraparound or server crash if unnoticed. Although XID based GUC helps a lot, why do these unintended and unnoticed slots need to hold up the resources even before the XID age of say 1.5 or 2 billion is reached. With both GUCs max_slot_xid_age and replication_slot_inactive_timeout in place, I can set max_slot_xid_age = 1.5 billion or so and also set replication_slot_inactive_timeout = 2 days or so to make the database foolproof. > The one example that comes to mind is the pg_createsubscriber > (committed for PG17) which creates one slot per database to convert > standby to subscriber, now say it exits due to power shutdown then > there could be a lot of dangling slots on the primary server. Also, > say there is some bug in such a tool that doesn't allow proper cleanup > of slots, the same problem can happen; yeah, this would be a problem > of the tool but I think there is no harm in giving a way to avoid > problems at the server end due to such slots. Right. I can personally connect to this problem of leftover slots where manual intervention was needed to drop all such slots which is time-consuming and painful sometimes. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Wed, Mar 27, 2024 at 12:28:06PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 27, 2024 at 10:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > > > On 2024-Mar-26, Nathan Bossart wrote: > > > > > > > FWIW I'd really prefer to have something like max_slot_xid_age for this. A > > > > time-based parameter would likely help with most cases, but transaction ID > > > > usage will vary widely from server to server, so it'd be nice to have > > > > something to protect against wraparound more directly. > > > > > > Yeah, I tend to agree that an XID-based limit makes more sense than a > > > time-based one. > > > > > So, do we want the time-based parameter or just max_slot_xid_age > > considering both will be GUC's? Yes, the xid_age based parameter > > sounds to be directly helpful for transaction ID wraparound or dead > > row cleanup, OTOH having a lot of inactive slots (which is possible in > > use cases where a tool creates a lot of slots and forgets to remove > > them, or the tool exits without cleaning up slots (say due to server > > shutdown)) also prohibit removing dead space which is not nice either? > > I've personally seen the leftover slots problem on production systems > where a timeout based invalidation mechanism could have been of more > help to save time and reduce manual intervention. Usually, most if not > all, migration/upgrade/other tools create slots, and if at all any > errors occur or the operation gets cancelled midway, there's a chance > that the slots can be leftover if such tools forgets to clean them up > either because there was a bug or for whatever reasons. These are > unintended/ghost slots for the database user unnecessarily holding up > resources such as XIDs, dead rows and WAL; which might lead to XID > wraparound or server crash if unnoticed. Although XID based GUC helps > a lot, why do these unintended and unnoticed slots need to hold up the > resources even before the XID age of say 1.5 or 2 billion is reached. > > With both GUCs max_slot_xid_age and replication_slot_inactive_timeout > in place, I can set max_slot_xid_age = 1.5 billion or so and also set > replication_slot_inactive_timeout = 2 days or so to make the database > foolproof. Yeah, I think that both makes senses. The reason is that one depends of the database activity and slot activity (the xid age one) while the other (the timeout one) depends only of the slot activity. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 3:13 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > Yeah, I think that both makes senses. The reason is that one depends of the > database activity and slot activity (the xid age one) while the other (the > timeout one) depends only of the slot activity. FWIW, I thought the time-based one sounded more useful. I think it would be poor planning to say "well, if the slot reaches an XID age of a billion, kill it so we don't wrap around," because while that likely will prevent me from getting into wraparound trouble, my database is likely to become horribly bloated long before the cutoff is reached. I thought it would be easier to reason in terms of time: I don't expect a slave to ever be down for more than X period of time, say an hour or whatever, so if it is, forget about it. Or alternatively, I know that if a slave does go down for more than X period of time, I start to get bloat, so cut it off at that point and I'll rebuild it later. I feel like these are things where people's intuition is going to be much stronger when reckoning in units of wall-clock time, which everyone deals with every day in one way or another, rather than in XID-based units that are, at least in my view, just a lot less intuitive. For a previous example of where an XID threshold turned out not to be great, see vacuum_defer_cleanup_age, and in particular the commit message from where it was removed in 1118cd37eb61e6a2428f457a8b2026a7bb3f801a. The case here might not turn out to be quite comparable for one reason or another, but I do think that case is a cautionary tale. I'm sure the world won't end or anything if we end up with both thresholds, and I may be missing some reason why the XID threshold would be really great here. I just can't quite see why I'd ever recommend it to anyone. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Mar 27, 2024 at 10:33:28AM -0400, Robert Haas wrote: > FWIW, I thought the time-based one sounded more useful. I think it > would be poor planning to say "well, if the slot reaches an XID age of > a billion, kill it so we don't wrap around," because while that likely > will prevent me from getting into wraparound trouble, my database is > likely to become horribly bloated long before the cutoff is reached. I > thought it would be easier to reason in terms of time: I don't expect > a slave to ever be down for more than X period of time, say an hour or > whatever, so if it is, forget about it. Or alternatively, I know that > if a slave does go down for more than X period of time, I start to get > bloat, so cut it off at that point and I'll rebuild it later. I feel > like these are things where people's intuition is going to be much > stronger when reckoning in units of wall-clock time, which everyone > deals with every day in one way or another, rather than in XID-based > units that are, at least in my view, just a lot less intuitive. I don't disagree with this point in the context of a user who is managing a single server or just a handful of servers. They are going to understand their workload best and can reason about the right value for the timeout. I think they'd still benefit from having an XID-based setting as a backstop in case the timeout is still not sufficient to prevent wraparound, but it's not nearly as important in that case. IMHO the use-case where this doesn't work so well is when you have many, many servers to administer (e.g., a cloud provider). In those cases, picking a default timeout to try to prevent wraparound is going to be much less accurate, as any reasonable value you pick is still going to be insufficient in some cases. I think the XID-based parameter would be better here; if the server is at imminent risk of an outage due to wraparound, invalidating the slots is probably a reasonable course of action. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 11:06 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > IMHO the use-case where this doesn't work so well is when you have many, > many servers to administer (e.g., a cloud provider). In those cases, > picking a default timeout to try to prevent wraparound is going to be much > less accurate, as any reasonable value you pick is still going to be > insufficient in some cases. I think the XID-based parameter would be > better here; if the server is at imminent risk of an outage due to > wraparound, invalidating the slots is probably a reasonable course of > action. Well, I'm certainly willing to believe that your experience with administering servers in the cloud is superior to mine. I don't really understand why it makes a difference, though. Whether you have one server or many, I agree that it is reasonable to invalidate slots when XID wraparound looms. But also, whether you have one server or many, by the time wraparound looms, you will typically have crippling bloat as well. If preventing that bloat isn't important or there are other defenses against it, then I see the value of the XID-based cutoff as a backstop. And I will admit that in an on-prem installation, I've occasionally seen situations where lots of XIDs got burned without really causing a lot of bloat -- say, because there are heavily updated staging tables which are periodically truncated, and very little modification to long-lived data. I'm not really strongly against having an XID-based threshold if smart people such as yourself want it. I just think for a lot of users it's going to be fiddly to get right. -- Robert Haas EDB: http://www.enterprisedb.com