Thread: Introduce XID age and inactive timeout based replication slot invalidation
Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi,

Replication slots in postgres prevent the removal of resources they still require even when no connection is using them (i.e., when they are inactive). This consumes storage because neither the required WAL nor the required rows in user tables/system catalogs can be removed by VACUUM as long as a replication slot needs them. In extreme cases this can lead to transaction ID wraparound.

Currently postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via the max_slot_wal_keep_size GUC) that would need to be retained for the slots in case they become active again. However, the wraparound issue isn't effectively covered by max_slot_wal_keep_size - one can't tell postgres to invalidate a replication slot if it is blocking VACUUM. Also, it is often tricky to choose a default value for max_slot_wal_keep_size, because the amount of WAL that gets generated and the storage allocated for the database can vary.

Therefore, it is often easier for developers to do the following:
a) set an XID age (age of a slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which the slots get invalidated.
b) set a timeout of say 1, 2 or 3 days, after which the inactive slots get invalidated.

To implement (a), postgres needs a new GUC called max_slot_xid_age. The checkpointer then invalidates all the slots whose xmin (the oldest transaction that this slot needs the database to retain) or catalog_xmin (the oldest transaction affecting the system catalogs that this slot needs the database to retain) has reached the age specified by this setting.

To implement (b), postgres first needs to track, in the ReplicationSlotPersistentData structure, replication slot metrics such as the time at which the slot became inactive (inactive_at timestamptz) and the total number of times the slot became inactive in its lifetime (inactive_count numeric). It then needs a new timeout GUC called inactive_replication_slot_timeout. Whenever a slot becomes inactive, the current timestamp and the inactive count are stored in the ReplicationSlotPersistentData structure and persisted to disk. The checkpointer then invalidates all the slots that have been lying inactive for about inactive_replication_slot_timeout, measured from inactive_at.

Besides implementing (b), these two new metrics let developers improve their monitoring tools, since the metrics are exposed via the pg_replication_slots system view. For instance, one can build a monitoring tool that signals when replication slots have been lying inactive for a day or so using the inactive_at metric, and/or when a replication slot is becoming inactive too frequently using the inactive_count metric.

I’m attaching the v1 patch set as described below:
0001 - Tracks invalidation_reason in pg_replication_slots. This is needed because slots now have multiple reasons for slot invalidation.
0002 - Tracks the inactive replication slot information inactive_at and inactive_count.
0003 - Adds inactive_timeout based replication slot invalidation.
0004 - Adds XID based replication slot invalidation.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
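For illustration only, a rough sketch of how the proposed knobs and metrics might be used; the GUC names (max_slot_xid_age, inactive_replication_slot_timeout) and view columns (inactive_at, inactive_count) are just the ones proposed above and may well change during review:

    -- hypothetical settings; these GUCs exist only in the proposed patch set
    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- monitoring sketch: slots that have been inactive for more than a day
    SELECT slot_name, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_at IS NOT NULL
      AND inactive_at < now() - interval '1 day';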
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Thoughts?

Needed a rebase due to c393308b. Please find the attached v2 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Jan 27, 2024 at 1:18 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Needed a rebase due to c393308b. Please find the attached v2 patch set.

Needed a rebase due to commit 776621a (conflict in src/test/recovery/meson.build for new TAP test file added). Please find the attached v3 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Thu, Jan 11, 2024 at 10:48:13AM +0530, Bharath Rupireddy wrote:
> [...]

Thanks for the patch and +1 for the idea; I think adding those new "invalidation reasons" makes sense.

> I’m attaching the v1 patch set as described below:
> 0001 - Tracks invalidation_reason in pg_replication_slots. This is
> needed because slots now have multiple reasons for slot invalidation.
> 0002 - Tracks the inactive replication slot information inactive_at and
> inactive_count.
> 0003 - Adds inactive_timeout based replication slot invalidation.
> 0004 - Adds XID based replication slot invalidation.

I think it's better to have the XID one discussed/implemented before the inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering?

0004 -> 0002
0002 -> 0003
0003 -> 0004

As far as 0001 is concerned:

"
This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
"

I'm not sure I like the fact that "invalidations" and "conflicts" are merged into a single field. I'd vote to keep conflict_reason as it is and add a new invalidation_reason (and put "conflict" as the value when that is the case). The reason is that I think they are 2 different concepts (though they can be linked) and that it would be easier to check for conflicts (meaning conflict_reason is not NULL).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Dilip Kumar
Date:
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Thoughts?

+1 for the idea, here are some comments on 0002, I will review other patches soon and respond.

1.
+ <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>inactive_at</structfield> <type>timestamptz</type> + </para> + <para> + The time at which the slot became inactive. + <literal>NULL</literal> if the slot is currently actively being + used. + </para></entry> + </row> Maybe we can change the field name to 'last_inactive_at'? or maybe the comment can explain timestampt at which slot was last inactivated. I think since we are already maintaining the inactive_count so better to explicitly say this is the last invaliding time. 2. + /* + * XXX: Can inactive_count of type uint64 ever overflow? It takes + * about a half-billion years for inactive_count to overflow even + * if slot becomes inactive for every 1 millisecond. So, using + * pg_add_u64_overflow might be an overkill. + */ Correct we don't need to use pg_add_u64_overflow for this counter. 3. + + /* Convert to numeric. */ + snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count); + values[i++] = DirectFunctionCall3(numeric_in, + CStringGetDatum(buf), + ObjectIdGetDatum(0), + Int32GetDatum(-1)); What is the purpose of doing this? I mean inactive_count is 8 byte integer and you can define function outparameter as 'int8' which is 8 byte integer. Then you don't need to convert int to string and then to numeric? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Feb 6, 2024 at 2:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > Thoughts?
>
> +1 for the idea, here are some comments on 0002, I will review other
> patches soon and respond.

Thanks for looking at it.

> + <structfield>inactive_at</structfield> <type>timestamptz</type>
>
> Maybe we can change the field name to 'last_inactive_at'? Or maybe the
> comment can explain that this is the timestamp at which the slot last
> became inactive. Since we are already maintaining inactive_count, it is
> better to explicitly say this is the last inactive time.

last_inactive_at looks better, so I will use that in the next version of the patch.

> 2.
> + /*
> + * XXX: Can inactive_count of type uint64 ever overflow? It takes
> + * about a half-billion years for inactive_count to overflow even
> + * if slot becomes inactive for every 1 millisecond. So, using
> + * pg_add_u64_overflow might be an overkill.
> + */
>
> Correct, we don't need to use pg_add_u64_overflow for this counter.

Will remove this comment in the next version of the patch.

> + /* Convert to numeric. */
> + snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
> + values[i++] = DirectFunctionCall3(numeric_in,
> + CStringGetDatum(buf),
> + ObjectIdGetDatum(0),
> + Int32GetDatum(-1));
>
> What is the purpose of doing this? inactive_count is an 8-byte integer
> and you can define the function's out parameter as 'int8', which is also
> an 8-byte integer. Then you wouldn't need to convert the integer to a
> string and then to numeric?

Nope, it's of type uint64 (int8 is signed, so large values wouldn't fit), and reporting such counters as numeric is the way it's typically done elsewhere - see the code around /* Convert to numeric. */.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Thanks for the patch and +1 for the idea, I think adding those new > "invalidation reasons" make sense. Thanks for looking at it. > I think it's better to have the XID one being discussed/implemented before the > inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering? > > 0004 -> 0002 > 0002 -> 0003 > 0003 -> 0004 Done that way. > As far 0001: > > " > This commit renames conflict_reason to > invalidation_reason, and adds the support to show invalidation > reasons for both physical and logical slots. > " > > I'm not sure I like the fact that "invalidations" and "conflicts" are merged > into a single field. I'd vote to keep conflict_reason as it is and add a new > invalidation_reason (and put "conflict" as value when it is the case). The reason > is that I think they are 2 different concepts (could be linked though) and that > it would be easier to check for conflicts (means conflict_reason is not NULL). So, do you want conflict_reason for only logical slots, and a separate column for invalidation_reason for both logical and physical slots? Is there any strong reason to have two properties "conflict" and "invalidated" for slots? They both are the same internally, so why confuse the users? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 07, 2024 at 12:22:07AM +0530, Bharath Rupireddy wrote:
> On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > I'm not sure I like the fact that "invalidations" and "conflicts" are merged
> > into a single field. I'd vote to keep conflict_reason as it is and add a new
> > invalidation_reason (and put "conflict" as value when it is the case). The reason
> > is that I think they are 2 different concepts (could be linked though) and that
> > it would be easier to check for conflicts (means conflict_reason is not NULL).
>
> So, do you want conflict_reason for only logical slots, and a separate
> column for invalidation_reason for both logical and physical slots?

Yes, with "conflict" as the value in case of conflicts (and one would need to refer to conflict_reason to see the reason).

> Is there any strong reason to have two properties "conflict" and
> "invalidated" for slots?

I think "conflict" is an important topic and does contain several reasons. The slot "first" conflicts and that then leads to slot "invalidation".

> They both are the same internally, so why
> confuse the users?

I don't think that would confuse the users; I do think that would make it easier to check for conflicting slots.

I did not look closely at the code, just played a bit with the patch and was able to produce something like:

postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
  slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
-------------+-----------+--------+------------+------------+---------------------
 rep1        | physical  | f      |            | reserved   |
 master_slot | physical  | t      |    1482441 | unreserved | wal_removed
(2 rows)

Does that make sense, to have an "active/working" slot "invalidated"?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Feb 9, 2024 at 1:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> I think "conflict" is an important topic and does contain several reasons. The
> slot "first" conflicts and that then leads to slot "invalidation".
>
> > They both are the same internally, so why
> > confuse the users?
>
> I don't think that would confuse the users; I do think that would make it
> easier to check for conflicting slots.

I've added a separate column for invalidation reasons for now. I'll see how others think on this as time goes by.

> I did not look closely at the code, just played a bit with the patch and was able
> to produce something like:
>
> postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
>   slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
> -------------+-----------+--------+------------+------------+---------------------
>  rep1        | physical  | f      |            | reserved   |
>  master_slot | physical  | t      |    1482441 | unreserved | wal_removed
> (2 rows)
>
> Does that make sense, to have an "active/working" slot "invalidated"?

Thanks. Can you please provide the steps to generate this error? Are you setting max_slot_wal_keep_size on the primary to generate "wal_removed"?

Attached is the v5 patch set after rebasing and addressing review comments. Please review it further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Feb 20, 2024 at 12:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >> [...] and was able to produce something like:
> >
> > postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
> >   slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
> > -------------+-----------+--------+------------+------------+---------------------
> >  rep1        | physical  | f      |            | reserved   |
> >  master_slot | physical  | t      |    1482441 | unreserved | wal_removed
> > (2 rows)
> >
> > Does that make sense, to have an "active/working" slot "invalidated"?
>
> Thanks. Can you please provide the steps to generate this error? Are
> you setting max_slot_wal_keep_size on the primary to generate
> "wal_removed"?

I'm able to reproduce [1] the state [2] where the slot got invalidated first and its wal_status became unreserved, yet the slot keeps serving after the standby comes back online and catches up with the primary by getting the WAL files from the archive. There's a good reason for this state - https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/slotfuncs.c;h=d2fa5e669a32f19989b0d987d3c7329851a1272e;hb=ff9e1e764fcce9a34467d614611a34d4d2a91b50#l351. This intermittent state can only happen for physical slots, not for logical slots, because logical subscribers can't get the missing changes from the WAL stored in the archive.

And the fact seems to be that an invalidated slot can never become normal again, but it can still serve a standby if the standby is able to catch up by fetching the required WAL (this is the WAL the slot couldn't keep for the standby) from elsewhere (the archive via restore_command).

As far as the 0001 patch is concerned, it reports the invalidation_reason as long as slot_contents.data.invalidated != RS_INVAL_NONE. I think this is okay.

Thoughts?

[1]
./initdb -D db17
echo "max_wal_size = 128MB
max_slot_wal_keep_size = 64MB
archive_mode = on
archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a db17/postgresql.conf
./pg_ctl -D db17 -l logfile17 start
./psql -d postgres -p 5432 -c "SELECT pg_create_physical_replication_slot('sb_repl_slot', true, false);"
rm -rf sbdata logfilesbdata
./pg_basebackup -D sbdata
echo "port=5433
primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
primary_slot_name='sb_repl_slot'
restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p'" | tee -a sbdata/postgresql.conf
touch sbdata/standby.signal
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
./pg_ctl -D sbdata -l logfilesbdata stop
./psql -d postgres -p 5432 -c "SELECT pg_logical_emit_message(true, 'mymessage', repeat('aaaa', 10000000));"
./psql -d postgres -p 5432 -c "CHECKPOINT;"
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5432 -xc "SELECT * FROM pg_replication_slots;"

[2]
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name           | sb_repl_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 710667
xmin                |
catalog_xmin        |
restart_lsn         | 0/115D21A0
confirmed_flush_lsn |
wal_status          | unreserved
safe_wal_size       | 77782624
two_phase           | f
conflict_reason     |
failover            | f
synced              | f
invalidation_reason | wal_removed
last_inactive_at    |
inactive_count      | 1

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 21, 2024 at 10:55:00AM +0530, Bharath Rupireddy wrote:
> [...]
>
> As far as the 0001 patch is concerned, it reports the
> invalidation_reason as long as slot_contents.data.invalidated !=
> RS_INVAL_NONE. I think this is okay.
>
> Thoughts?

Yeah, looking at the code I agree that looks ok. OTOH, it looks confusing; maybe we should add a few words about it in the doc?

Looking at v5-0001:

+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>

My initial thought was to put the value "conflict" in this new field in case of conflict (rather than the detailed conflict reason). With the current proposal invalidation_reason could report the same thing as conflict_reason, which sounds weird to me.

Does it make sense to you to use "conflict" as the value in "invalidation_reason" when the slot has "conflict_reason" not NULL?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> > As far as the 0001 patch is concerned, it reports the
> > invalidation_reason as long as slot_contents.data.invalidated !=
> > RS_INVAL_NONE. I think this is okay.
> >
> > Thoughts?
>
> Yeah, looking at the code I agree that looks ok. OTOH, it looks confusing;
> maybe we should add a few words about it in the doc?

I'll think about it.

> Looking at v5-0001:
>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>invalidation_reason</structfield> <type>text</type>
> + </para>
> + <para>
>
> My initial thought was to put the value "conflict" in this new field in case of
> conflict (rather than the detailed conflict reason). With the current proposal
> invalidation_reason could report the same thing as conflict_reason, which sounds
> weird to me.
>
> Does it make sense to you to use "conflict" as the value in "invalidation_reason"
> when the slot has "conflict_reason" not NULL?

I'm thinking the other way around - how about we revert https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, that is, put back "conflict" as a boolean and introduce invalidation_reason in text form? So, for logical slots, whenever the "conflict" column is true, the reason is found in the invalidation_reason column. How does that sound? Again, the debate might be "conflict" vs "invalidation", but that looks clean IMHO.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote:
> On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > My initial thought was to put the value "conflict" in this new field in case of
> > conflict (rather than the detailed conflict reason). With the current proposal
> > invalidation_reason could report the same thing as conflict_reason, which sounds
> > weird to me.
> >
> > Does it make sense to you to use "conflict" as the value in "invalidation_reason"
> > when the slot has "conflict_reason" not NULL?
>
> I'm thinking the other way around - how about we revert
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
> that is, put back "conflict" as a boolean and introduce
> invalidation_reason in text form? So, for logical slots, whenever the
> "conflict" column is true, the reason is found in the invalidation_reason
> column. How does that sound?

Yeah, I think that looks fine too. We would need more changes though (like taking care of ddd5f4f54a, for example).

CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind 007693f2a3 and ddd5f4f54a).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Feb 22, 2024 at 1:44 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Does that make sense to you to use "conflict" as value in "invalidation_reason" > > > when the slot has "conflict_reason" not NULL? > > > > I'm thinking the other way around - how about we revert > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > > that is, put in place "conflict" as a boolean and introduce > > invalidation_reason the text form. So, for logical slots, whenever the > > "conflict" column is true, the reason is found in invaldiation_reason > > column? How does it sound? > > Yeah, I think that looks fine too. We would need more change (like take care of > ddd5f4f54a for example). > > CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind > 007693f2a3 and ddd5f4f54a). Yeah, let's wait for what others think about it. FWIW, I've had to rebase the patches due to 943f7ae1c. Please see the attached v6 patch set. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote: > I'm thinking the other way around - how about we revert > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > that is, put in place "conflict" as a boolean and introduce > invalidation_reason the text form. So, for logical slots, whenever the > "conflict" column is true, the reason is found in invaldiation_reason > column? How does it sound? Again the debate might be "conflict" vs > "invalidation", but that looks clean IMHO. Would you ever see "conflict" as false and "invalidation_reason" as non-null for a logical slot? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > > [....] how about we revert > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5, > > Would you ever see "conflict" as false and "invalidation_reason" as > non-null for a logical slot? No. Because both conflict and invalidation_reason are decided based on the invalidation reason i.e. value of slot_contents.data.invalidated. IOW, a logical slot that reports conflict as true must have been invalidated. Do you have any thoughts on reverting 007693f and introducing invalidation_reason? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >> Would you ever see "conflict" as false and "invalidation_reason" as >> non-null for a logical slot? > > No. Because both conflict and invalidation_reason are decided based on > the invalidation reason i.e. value of slot_contents.data.invalidated. > IOW, a logical slot that reports conflict as true must have been > invalidated. > > Do you have any thoughts on reverting 007693f and introducing > invalidation_reason? Unless I am misinterpreting some details, ISTM we could rename this column to invalidation_reason and use it for both logical and physical slots. I'm not seeing a strong need for another column. Perhaps I am missing something... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Michael Paquier
Date:
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: >> Do you have any thoughts on reverting 007693f and introducing >> invalidation_reason? > > Unless I am misinterpreting some details, ISTM we could rename this column > to invalidation_reason and use it for both logical and physical slots. I'm > not seeing a strong need for another column. Perhaps I am missing > something... And also, please don't be hasty in taking a decision that would involve a revert of 007693f without informing the committer of this commit about that. I am adding Amit Kapila in CC of this thread for awareness. -- Michael
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > >> Would you ever see "conflict" as false and "invalidation_reason" as > >> non-null for a logical slot? > > > > No. Because both conflict and invalidation_reason are decided based on > > the invalidation reason i.e. value of slot_contents.data.invalidated. > > IOW, a logical slot that reports conflict as true must have been > > invalidated. > > > > Do you have any thoughts on reverting 007693f and introducing > > invalidation_reason? > > Unless I am misinterpreting some details, ISTM we could rename this column > to invalidation_reason and use it for both logical and physical slots. I'm > not seeing a strong need for another column. Yeah having two columns was more for convenience purpose. Without the "conflict" one, a slot conflicting with recovery would be "a logical slot having a non NULL invalidation_reason". I'm also fine with one column if most of you prefer that way. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
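For illustration, under the single-column approach being discussed, checking for conflicting/invalidated logical slots could look roughly like this (invalidation_reason being the proposed column name, not yet committed):

    -- logical slots that have been invalidated (which, on a standby,
    -- includes slots that conflicted with recovery)
    SELECT slot_name, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_type = 'logical'
      AND invalidation_reason IS NOT NULL;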
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote: > > > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > >> Would you ever see "conflict" as false and "invalidation_reason" as > > >> non-null for a logical slot? > > > > > > No. Because both conflict and invalidation_reason are decided based on > > > the invalidation reason i.e. value of slot_contents.data.invalidated. > > > IOW, a logical slot that reports conflict as true must have been > > > invalidated. > > > > > > Do you have any thoughts on reverting 007693f and introducing > > > invalidation_reason? > > > > Unless I am misinterpreting some details, ISTM we could rename this column > > to invalidation_reason and use it for both logical and physical slots. I'm > > not seeing a strong need for another column. > > Yeah having two columns was more for convenience purpose. Without the "conflict" > one, a slot conflicting with recovery would be "a logical slot having a non NULL > invalidation_reason". > > I'm also fine with one column if most of you prefer that way. While we debate on the above, please find the attached v7 patch set after rebasing. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: >> > Unless I am misinterpreting some details, ISTM we could rename this column >> > to invalidation_reason and use it for both logical and physical slots. I'm >> > not seeing a strong need for another column. >> >> Yeah having two columns was more for convenience purpose. Without the "conflict" >> one, a slot conflicting with recovery would be "a logical slot having a non NULL >> invalidation_reason". >> >> I'm also fine with one column if most of you prefer that way. > > While we debate on the above, please find the attached v7 patch set > after rebasing. It looks like Bertrand is okay with reusing the same column for both logical and physical slots, which IIUC is what you initially proposed in v1 of the patch set. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > >> > Unless I am misinterpreting some details, ISTM we could rename this column > >> > to invalidation_reason and use it for both logical and physical slots. I'm > >> > not seeing a strong need for another column. > >> > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > >> invalidation_reason". > >> > >> I'm also fine with one column if most of you prefer that way. > > > > While we debate on the above, please find the attached v7 patch set > > after rebasing. > > It looks like Bertrand is okay with reusing the same column for both > logical and physical slots Yeah, I'm okay with one column. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > >> > Unless I am misinterpreting some details, ISTM we could rename this column > > >> > to invalidation_reason and use it for both logical and physical slots. I'm > > >> > not seeing a strong need for another column. > > >> > > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > > >> invalidation_reason". > > >> > > >> I'm also fine with one column if most of you prefer that way. > > > > > > While we debate on the above, please find the attached v7 patch set > > > after rebasing. > > > > It looks like Bertrand is okay with reusing the same column for both > > logical and physical slots > > Yeah, I'm okay with one column. Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 06, 2024 at 02:46:57PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. Thanks! A few comments: 1 === + The reason for the slot's invalidation. <literal>NULL</literal> if the + slot is currently actively being used. s/currently actively being used/not invalidated/ ? (I mean it could be valid and not being used). 2 === + the slot is marked as invalidated. In case of logical slots, it + represents the reason for the logical slot's conflict with recovery. s/the reason for the logical slot's conflict with recovery./the recovery conflict reason./ ? 3 === @@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check) * removed. */ res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " "FROM pg_catalog.pg_replication_slots " "WHERE slot_type = 'logical' AND " "database = current_database() AND " "temporary IS FALSE;", live_check ? "FALSE" : - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " + "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE " Yeah that's fine because there is logical slot filtering here. 4 === -GetSlotInvalidationCause(const char *conflict_reason) +GetSlotInvalidationCause(const char *invalidation_reason) Should we change the comment "Maps a conflict reason" above this function? 5 === -# Check conflict_reason is NULL for physical slot +# Check invalidation_reason is NULL for physical slot $res = $node_primary->safe_psql( 'postgres', qq[ - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] ); I don't think this test is needed anymore: it does not make that much sense since it's done after the primary database initialization and startup. 6 === @@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until( is( $node_standby->safe_psql( 'postgres', q[select bool_or(conflicting) from - (select conflict_reason is not NULL as conflicting + (select invalidation_reason is not NULL as conflicting from pg_replication_slots WHERE slot_type = 'logical')]), 'f', 'Logical slots are reported as non conflicting'); What about? " # Verify slots are reported as valid in pg_replication_slots is( $node_standby->safe_psql( 'postgres', q[select bool_or(invalidated) from (select invalidation_reason is not NULL as invalidated from pg_replication_slots WHERE slot_type = 'logical')]), 'f', 'Logical slots are reported as valid'); " Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 4, 2024 at 3:14 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
> > On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> Would you ever see "conflict" as false and "invalidation_reason" as
> >> non-null for a logical slot?
> >
> > No. Because both conflict and invalidation_reason are decided based on
> > the invalidation reason i.e. value of slot_contents.data.invalidated.
> > IOW, a logical slot that reports conflict as true must have been
> > invalidated.
> >
> > Do you have any thoughts on reverting 007693f and introducing
> > invalidation_reason?
>
> Unless I am misinterpreting some details, ISTM we could rename this column
> to invalidation_reason and use it for both logical and physical slots. I'm
> not seeing a strong need for another column. Perhaps I am missing
> something...
>

IIUC, the current conflict_reason is primarily used to identify logical slots on the standby that got invalidated due to a recovery-time conflict. On the primary, it will also show logical slots that got invalidated because the corresponding WAL got removed. Is that understanding correct? If so, we are already sort of overloading this column. However, adding more invalidation reasons that won't happen during recovery conflict handling will entirely change the purpose (as per the name we use) of this variable. I think invalidation_reason could describe this column correctly, but OTOH I guess it would lose its original meaning/purpose.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
>

@@ -1629,6 +1634,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
             }
         }
         break;
+       case RS_INVAL_INACTIVE_TIMEOUT:
+           if (s->data.last_inactive_at > 0)
+           {
+               TimestampTz now;
+
+               Assert(s->data.persistency == RS_PERSISTENT);
+               Assert(s->active_pid == 0);
+
+               now = GetCurrentTimestamp();
+               if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+                                              inactive_replication_slot_timeout * 1000))

You might want to consider its interaction with sync slots on the standby. Say there is no activity on the slots in terms of processing changes. Then we won't perform sync of such slots on the standby, showing them as inactive as per your new criteria, whereas the same slots could still be valid on the primary because the walsender is still active. This may be more of a theoretical point, as in a running system there will probably be some activity, but I think this needs some thoughts.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> IIUC, the current conflict_reason is primarily used to identify
> logical slots on the standby that got invalidated due to a recovery-time
> conflict. On the primary, it will also show logical slots that got
> invalidated because the corresponding WAL got removed. Is that
> understanding correct?

That's right.

> If so, we are already sort of overloading this
> column. However, adding more invalidation reasons that won't
> happen during recovery conflict handling will entirely change the
> purpose (as per the name we use) of this variable. I think
> invalidation_reason could describe this column correctly, but OTOH I
> guess it would lose its original meaning/purpose.

Hm. I get the concern. Are you okay with having invalidation_reason separately for both logical and physical slots? In such a case, logical slots that got invalidated on the standby will have duplicate info in conflict_reason and invalidation_reason - is this fine?

Another idea is to make 'conflict_reason text' a 'conflicting' boolean again (i.e., revert 007693f2a3), and have 'invalidation_reason text' for both logical and physical slots. So, whenever 'conflicting' is true, one can look at invalidation_reason for the reason for the conflict. How does this sound?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
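For illustration only, a rough sketch of how that two-column shape might be queried; the 'conflicting' and 'invalidation_reason' columns here are just the names proposed above and are not committed:

    -- hypothetical columns as proposed in this message
    SELECT slot_name, conflicting, invalidation_reason
    FROM pg_replication_slots
    WHERE conflicting OR invalidation_reason IS NOT NULL;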
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> You might want to consider its interaction with sync slots on the standby.
> Say there is no activity on the slots in terms of processing changes.
> Then we won't perform sync of such slots on the standby, showing
> them as inactive as per your new criteria, whereas the same slots could still
> be valid on the primary because the walsender is still active. This may be more
> of a theoretical point, as in a running system there will probably be
> some activity, but I think this needs some thoughts.

I believe the xmin and catalog_xmin of the sync slots on the standby keep advancing depending on the slots on the primary, no? If yes, the XID age based invalidation shouldn't be a problem.

I believe there are no walsenders started for the sync slots on the standbys, right? If yes, the inactive timeout based invalidation also shouldn't be a problem, because the inactive timeouts for a slot are tracked only for walsenders - they are the ones that typically hold replication slots for longer durations and for real replication use. We did a similar thing in a recent commit [1].

Is my understanding right? Do you still see any problems with it?

[1] commit 7c3fb505b14e86581b6a052075a294c78c91b123
Author: Amit Kapila <akapila@postgresql.org>
Date: Tue Nov 21 07:59:53 2023 +0530

    Log messages for replication slot acquisition and release.
    .........
    Note that these messages are emitted only for walsenders but not for
    backends. This is because walsenders are the ones that typically hold
    replication slots for longer durations, unlike backends which hold them
    for executing replication related functions.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
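As a quick way to sanity-check that on a standby (illustrative only; synced, active and active_pid are the existing pg_replication_slots columns shown in the output earlier in this thread):

    -- per the explanation above, synced slots on a standby are not acquired
    -- by a walsender, so active_pid is expected to be NULL while they are idle
    SELECT slot_name, synced, active, active_pid
    FROM pg_replication_slots
    WHERE synced;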
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 8, 2024 at 8:08 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > IIUC, the current conflict_reason is primarily used to identify
> > logical slots on the standby that got invalidated due to a recovery-time
> > conflict. On the primary, it will also show logical slots that got
> > invalidated because the corresponding WAL got removed. Is that
> > understanding correct?
>
> That's right.
>
> > If so, we are already sort of overloading this
> > column. However, adding more invalidation reasons that won't
> > happen during recovery conflict handling will entirely change the
> > purpose (as per the name we use) of this variable. I think
> > invalidation_reason could describe this column correctly, but OTOH I
> > guess it would lose its original meaning/purpose.
>
> Hm. I get the concern. Are you okay with having invalidation_reason
> separately for both logical and physical slots? In such a case,
> logical slots that got invalidated on the standby will have duplicate
> info in conflict_reason and invalidation_reason - is this fine?
>

If we have duplicate information in two columns, that could be confusing for users. BTW, doesn't the recovery conflict occur only because of the rows_removed and wal_level_insufficient reasons? The wal_removed reason or the new reasons you are proposing can't happen because of a recovery conflict. Am I missing something here?

> Another idea is to make 'conflict_reason text' a 'conflicting'
> boolean again (i.e., revert 007693f2a3), and have 'invalidation_reason
> text' for both logical and physical slots. So, whenever 'conflicting'
> is true, one can look at invalidation_reason for the reason for the
> conflict. How does this sound?
>

So, does this mean that conflicting will only be true for some of the reasons (say wal_level_insufficient, rows_removed, wal_removed) and for logical slots but not for others? I think that will also not eliminate the duplicate information, as the user could have deduced it from a single column.

--
With Regards,
Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. > > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? > Yes, your understanding is correct. I wanted us to consider having new parameters like 'inactive_replication_slot_timeout' at the slot level instead of as a GUC. I think this new parameter doesn't seem to be similar to 'max_slot_wal_keep_size', which leads to truncation of WAL globally and then invalidates the appropriate slots. OTOH, the 'inactive_replication_slot_timeout' doesn't appear to have a similar global effect. The other thing we should consider is: what if the checkpoint happens at an interval greater than 'inactive_replication_slot_timeout'? Shall we consider doing it via some other background process, or do we think the checkpointer is the best we can have? > Do you still see any problems with it? > Sorry, I haven't done any detailed review yet, so I can't say with confidence whether there is any problem or not w.r.t. sync slots. -- With Regards, Amit Kapila.
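To make the timing concern concrete, a configuration sketch (checkpoint_timeout is an existing GUC; inactive_replication_slot_timeout is only the GUC proposed in this patch set, so the second statement assumes the patch is applied): with settings like these, a slot could sit inactive well past the proposed timeout and still only be invalidated at the next checkpoint, i.e. with up to checkpoint_timeout of lag.

-- existing GUC
ALTER SYSTEM SET checkpoint_timeout = '30min';
-- proposed GUC from this patch set (not in released PostgreSQL)
ALTER SYSTEM SET inactive_replication_slot_timeout = '10min';
SELECT pg_reload_conf();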
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > Commit message says: "Currently postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will be needed for the slots in case they become active. However, choosing a default value for max_slot_wal_keep_size is tricky. Because the amount of WAL a customer generates, and their allocated storage will vary greatly in production, making it difficult to pin down a one-size-fits-all value. It is often easy for developers to set an XID age (age of slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which the slots get invalidated." I don't see how it will be easier for the user to choose the default value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be another parameter to allow vacuum to proceed removing the rows which otherwise it wouldn't have been as those would be required by some slot. Now, if this understanding is correct, we should probably make this invalidation happen by (auto)vacuum after computing the age based on this new parameter. -- With Regards, Amit Kapila.
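For reference, the XID age that such a parameter would be compared against can already be observed from the catalog (a minimal monitoring sketch; age() measures distance from the next transaction ID, and max_slot_xid_age is only the GUC proposed in this thread):

-- Slots whose xmin/catalog_xmin horizons are holding back vacuum,
-- ordered by how old those horizons have become.
SELECT slot_name,
       age(xmin)         AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC NULLS LAST;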
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Mar 11, 2024 at 04:09:27PM +0530, Amit Kapila wrote: > I don't see how it will be easier for the user to choose the default > value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, > I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be > another parameter to allow vacuum to proceed removing the rows which > otherwise it wouldn't have been as those would be required by some > slot. Yeah, the idea is to help prevent transaction ID wraparound, so I would expect max_slot_xid_age to ordinarily be set relatively high, i.e., 1.5B+. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. > > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? Do you still see any problems with it? Would that make sense to "simply" discard/prevent those kind of invalidations for "synced" slot on standby? I mean, do they make sense given the fact that those slots are not usable until the standby is promoted? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > You might want to consider its interaction with sync slots on standby. > > > Say, there is no activity on slots in terms of processing the changes > > > for slots. Now, we won't perform sync of such slots on standby showing > > > them inactive as per your new criteria where as same slots could still > > > be valid on primary as the walsender is still active. This may be more > > > of a theoretical point as in running system there will probably be > > > some activity but I think this needs some thougths. > > > > I believe the xmin and catalog_xmin of the sync slots on the standby > > keep advancing depending on the slots on the primary, no? If yes, the > > XID age based invalidation shouldn't be a problem. > > > > I believe there are no walsenders started for the sync slots on the > > standbys, right? If yes, the inactive timeout based invalidation also > > shouldn't be a problem. Because, the inactive timeouts for a slot are > > tracked only for walsenders because they are the ones that typically > > hold replication slots for longer durations and for real replication > > use. We did a similar thing in a recent commit [1]. > > > > Is my understanding right? Do you still see any problems with it? > > Would that make sense to "simply" discard/prevent those kind of invalidations > for "synced" slot on standby? I mean, do they make sense given the fact that > those slots are not usable until the standby is promoted? > AFAIR, we don't prevent similar invalidations due to 'max_slot_wal_keep_size' for sync slots, so why to prevent it for these new parameters? This will unnecessarily create inconsistency in the invalidation behavior. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > separately for both logical and physical slots? In such a case, > > logical slots that got invalidated on the standby will have duplicate > > info in conflict_reason and invalidation_reason, is this fine? > > > > If we have duplicate information in two columns that could be > confusing for users. BTW, isn't the recovery conflict occur only > because of rows_removed and wal_level_insufficient reasons? The > wal_removed or the new reasons you are proposing can't happen because > of recovery conflict. Am, I missing something here? My understanding aligns with yours that the rows_removed and wal_level_insufficient invalidations can occur only upon recovery conflict. FWIW, a test named 'synchronized slot has been invalidated' in 040_standby_failover_slots_sync.pl inappropriately uses conflict_reason = 'wal_removed' for a logical slot on the standby. As per the above understanding, it's inappropriate to use conflict_reason here because wal_removed invalidation doesn't conflict with recovery. > > Another idea is to make 'conflict_reason text' as a 'conflicting > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > text' for both logical and physical slots. So, whenever 'conflicting' > > is true, one can look at invalidation_reason for the reason for > > conflict. How does this sound? > > > > So, does this mean that conflicting will only be true for some of the > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > logical slots but not for others? I think that will also not eliminate > the duplicate information as user could have deduced that from single > column. So, how about we turn conflict_reason to only report the reasons that actually cause conflict with recovery for logical slots, something like below, and then have invalidation_cause as a generic column for all sorts of invalidation reasons for both logical and physical slots? ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated; if (slot_contents.data.database == InvalidOid || cause == RS_INVAL_NONE || (cause != RS_INVAL_HORIZON && cause != RS_INVAL_WAL_LEVEL)) { nulls[i++] = true; } else { Assert(cause == RS_INVAL_HORIZON || cause == RS_INVAL_WAL_LEVEL); values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]); } -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > You might want to consider its interaction with sync slots on standby. > > > > Say, there is no activity on slots in terms of processing the changes > > > > for slots. Now, we won't perform sync of such slots on standby showing > > > > them inactive as per your new criteria where as same slots could still > > > > be valid on primary as the walsender is still active. This may be more > > > > of a theoretical point as in running system there will probably be > > > > some activity but I think this needs some thougths. > > > > > > I believe the xmin and catalog_xmin of the sync slots on the standby > > > keep advancing depending on the slots on the primary, no? If yes, the > > > XID age based invalidation shouldn't be a problem. > > > > > > I believe there are no walsenders started for the sync slots on the > > > standbys, right? If yes, the inactive timeout based invalidation also > > > shouldn't be a problem. Because, the inactive timeouts for a slot are > > > tracked only for walsenders because they are the ones that typically > > > hold replication slots for longer durations and for real replication > > > use. We did a similar thing in a recent commit [1]. > > > > > > Is my understanding right? Do you still see any problems with it? > > > > Would that make sense to "simply" discard/prevent those kind of invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, Right, we'd invalidate them on the standby should the standby sync slot restart_lsn exceeds the limit. > so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Yeah, but I think wal removal has a direct impact on the slot usuability which is probably not the case with the new XID and Timeout ones. That's why I thought about handling them differently (but I'm also fine if that's not the case). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 12, 2024 at 5:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Would that make sense to "simply" discard/prevent those kind of invalidations > > for "synced" slot on standby? I mean, do they make sense given the fact that > > those slots are not usable until the standby is promoted? > > AFAIR, we don't prevent similar invalidations due to > 'max_slot_wal_keep_size' for sync slots, so why to prevent it for > these new parameters? This will unnecessarily create inconsistency in > the invalidation behavior. Right. +1 to keep the behaviour consistent for all invalidations. However, an assertion that inactive_timeout isn't set for synced slots on the standby isn't a bad idea because we rely on the fact that walsenders aren't started for synced slots. Again, I think it misses the consistency in the invalidation behaviour. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > AFAIR, we don't prevent similar invalidations due to > > 'max_slot_wal_keep_size' for sync slots, > > Right, we'd invalidate them on the standby should the standby sync slot restart_lsn > exceeds the limit. Right. Help me understand this a bit - is the wal_removed invalidation going to conflict with recovery on the standby? Per the discussion upthread, I'm trying to understand what invalidation reasons will exactly cause conflict with recovery? Is it just rows_removed and wal_level_insufficient invalidations? My understanding on the conflict with recovery and invalidation reason has been a bit off track. Perhaps, we need to clarify these two things in the docs for the end users as well? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Yes, your understanding is correct. I wanted us to consider having new > parameters like 'inactive_replication_slot_timeout' to be at > slot-level instead of GUC. I think this new parameter doesn't seem to > be the similar as 'max_slot_wal_keep_size' which leads to truncation > of WAL at global and then invalidates the appropriate slots. OTOH, the > 'inactive_replication_slot_timeout' doesn't appear to have a similar > global effect. last_inactive_at is tracked for each slot using which slots get invalidated based on inactive_replication_slot_timeout. It's like max_slot_wal_keep_size invalidating slots based on restart_lsn. In a way, both are similar, right? > The other thing we should consider is what if the > checkpoint happens at a timeout greater than > 'inactive_replication_slot_timeout'? In such a case, the slots get invalidated upon the next checkpoint, as by then (current checkpoint time - last_inactive_at) will be greater than inactive_replication_slot_timeout. > Shall, we consider doing it via > some other background process or do we think checkpointer is the best > we can have? The same problem exists if we do it with some other background process. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
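A rough monitoring query along the same lines (a sketch that assumes the last_inactive_at column added by this patch set; it only approximates what the checkpointer would check, since the actual invalidation happens during checkpoints):

-- Slots that have been inactive longer than some threshold and are
-- therefore candidates for invalidation at the next checkpoint.
SELECT slot_name, active, last_inactive_at,
       now() - last_inactive_at AS inactive_for
FROM pg_replication_slots
WHERE NOT active
  AND last_inactive_at < now() - interval '1 day';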
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 11, 2024 at 4:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I don't see how it will be easier for the user to choose the default > value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But, > I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be > another parameter to allow vacuum to proceed removing the rows which > otherwise it wouldn't have been as those would be required by some > slot. Now, if this understanding is correct, we should probably make > this invalidation happen by (auto)vacuum after computing the age based > on this new parameter. Currently, the patch computes the XID age in the checkpointer using the next XID (gets from ReadNextFullTransactionId()) and slot's xmin and catalog_xmin. I think the checkpointer is best because it already invalidates slots for wal_removed cause, and flushes all replication slots to disk. Moving this new invalidation functionality into some other background process such as autovacuum will not only burden that process' work but also mix up the unique functionality of that background process. Having said above, I'm open to ideas from others as I'm not so sure if there's any issue with checkpointer invalidating the slots for new reasons. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Hm. I get the concern. Are you okay with having inavlidation_reason > > > separately for both logical and physical slots? In such a case, > > > logical slots that got invalidated on the standby will have duplicate > > > info in conflict_reason and invalidation_reason, is this fine? > > > > > > > If we have duplicate information in two columns that could be > > confusing for users. BTW, isn't the recovery conflict occur only > > because of rows_removed and wal_level_insufficient reasons? The > > wal_removed or the new reasons you are proposing can't happen because > > of recovery conflict. Am, I missing something here? > > My understanding aligns with yours that the rows_removed and > wal_level_insufficient invalidations can occur only upon recovery > conflict. > > FWIW, a test named 'synchronized slot has been invalidated' in > 040_standby_failover_slots_sync.pl inappropriately uses > conflict_reason = 'wal_removed' logical slot on standby. As per the > above understanding, it's inappropriate to use conflict_reason here > because wal_removed invalidation doesn't conflict with recovery. > > > > Another idea is to make 'conflict_reason text' as a 'conflicting > > > boolean' again (revert 007693f2a3), and have 'invalidation_reason > > > text' for both logical and physical slots. So, whenever 'conflicting' > > > is true, one can look at invalidation_reason for the reason for > > > conflict. How does this sound? > > > > > > > So, does this mean that conflicting will only be true for some of the > > reasons (say wal_level_insufficient, rows_removed, wal_removed) and > > logical slots but not for others? I think that will also not eliminate > > the duplicate information as user could have deduced that from single > > column. > > So, how about we turn conflict_reason to only report the reasons that > actually cause conflict with recovery for logical slots, something > like below, and then have invalidation_cause as a generic column for > all sorts of invalidation reasons for both logical and physical slots? > If our above understanding is correct then coflict_reason will be a subset of invalidation_reason. If so, whatever way we arrange this information, there will be some sort of duplicity unless we just have one column 'invalidation_reason' and update the docs to interpret it correctly for conflicts. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > so why to prevent it for > > these new parameters? This will unnecessarily create inconsistency in > > the invalidation behavior. > > Yeah, but I think wal removal has a direct impact on the slot usuability which > is probably not the case with the new XID and Timeout ones. > BTW, doesn't the XID based parameter 'max_slot_xid_age' have similarity with 'max_slot_wal_keep_size'? I think it will impact the rows we remove based on xid horizons. Don't we need to consider it while vacuum computes the xid horizons in ComputeXidHorizons(), similar to what we do for WAL w.r.t 'max_slot_wal_keep_size'? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 12, 2024 at 10:10 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Yes, your understanding is correct. I wanted us to consider having new > > parameters like 'inactive_replication_slot_timeout' to be at > > slot-level instead of GUC. I think this new parameter doesn't seem to > > be the similar as 'max_slot_wal_keep_size' which leads to truncation > > of WAL at global and then invalidates the appropriate slots. OTOH, the > > 'inactive_replication_slot_timeout' doesn't appear to have a similar > > global effect. > > last_inactive_at is tracked for each slot using which slots get > invalidated based on inactive_replication_slot_timeout. It's like > max_slot_wal_keep_size invalidating slots based on restart_lsn. In a > way, both are similar, right? > There is some similarity but 'max_slot_wal_keep_size' leads to truncation of WAL which in turn leads to invalidation of slots. Here, I am also trying to be cautious in adding a GUC unless it is required or having a slot-level parameter doesn't serve the need. Having said that, I see that there is an argument that we should follow the path of 'max_slot_wal_keep_size' GUC and there is some value to it but still I think avoiding a new GUC for inactivity in the slot would outweigh. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote: > > > >> > Unless I am misinterpreting some details, ISTM we could rename this column > > > >> > to invalidation_reason and use it for both logical and physical slots. I'm > > > >> > not seeing a strong need for another column. > > > >> > > > >> Yeah having two columns was more for convenience purpose. Without the "conflict" > > > >> one, a slot conflicting with recovery would be "a logical slot having a non NULL > > > >> invalidation_reason". > > > >> > > > >> I'm also fine with one column if most of you prefer that way. > > > > > > > > While we debate on the above, please find the attached v7 patch set > > > > after rebasing. > > > > > > It looks like Bertrand is okay with reusing the same column for both > > > logical and physical slots > > > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. JFYI, the patch does not apply to the head. There is a conflict in multiple files. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
> JFYI, the patch does not apply to the head. There is a conflict in > multiple files. For review purposes, I applied v8 to the March 6 code-base. I have yet to review in detail, please find my initial thoughts: 1) I found that 'inactive_replication_slot_timeout' works only if there was any walsender ever started for that slot . The logic is under 'am_walsender' check. Is this intentional? If I create a slot and use only pg_logical_slot_get_changes or pg_replication_slot_advance on it, it never gets invalidated due to timeout. While, when I set 'max_slot_xid_age' or say 'max_slot_wal_keep_size' to a lower value, the said slot is invalidated correctly with 'xid_aged' and 'wal_removed' reasons respectively. Example: With inactive_replication_slot_timeout=1min, test1_3 is the slot for which there is no walsender and only advance and get_changes SQL functions were called; test1_4 is the one for which pg_recvlogical was run for a second. test1_3 | 785 | | reserved | | t | | test1_4 | 798 | | lost | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 | And when inactive_replication_slot_timeout=0 and max_slot_xid_age=10 test1_3 | 785 | | lost | xid_aged | t | | test1_4 | 798 | | lost | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 | 2) The msg for patch 3 says: -------------- a) when replication slots is lying inactive for a day or so using last_inactive_at metric, b) when a replication slot is becoming inactive too frequently using last_inactive_at metric. -------------- I think in b, you want to refer to inactive_count instead of last_inactive_at? 3) I do not see invalidation_reason updated for 2 new reasons in system-views.sgml thanks Shveta
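A minimal way to reproduce the SQL-only usage described in point 1 (a sketch; the functions are standard, the test_decoding output plugin from contrib is assumed to be available, and the point is that no walsender ever acquires the slot, so, per the behaviour described above, the proposed inactive timeout would never invalidate it):

-- Create a logical slot and drive it purely through SQL functions;
-- no walsender is involved for this slot.
SELECT pg_create_logical_replication_slot('test1_3', 'test_decoding');
SELECT * FROM pg_logical_slot_get_changes('test1_3', NULL, NULL);
SELECT pg_replication_slot_advance('test1_3', pg_current_wal_lsn());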
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 12, 2024 at 09:19:35PM +0530, Bharath Rupireddy wrote: > On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > AFAIR, we don't prevent similar invalidations due to > > > 'max_slot_wal_keep_size' for sync slots, > > > > Right, we'd invalidate them on the standby should the standby sync slot restart_lsn > > exceeds the limit. > > Right. Help me understand this a bit - is the wal_removed invalidation > going to conflict with recovery on the standby? I don't think so, as it's not directly related to recovery. The slot will be invalided on the standby though. > Per the discussion upthread, I'm trying to understand what > invalidation reasons will exactly cause conflict with recovery? Is it > just rows_removed and wal_level_insufficient invalidations? Yes, that's the ones added in be87200efd. See the error messages on a standby: == wal removal postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub4_slot" DETAIL: This slot has been invalidated because it exceeded the maximum reserved size. == wal level postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';; conflict_reason ------------------------ wal_level_insufficient (1 row) postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub5_slot" DETAIL: This slot has been invalidated because it was conflicting with recovery. == rows removal postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';; conflict_reason ----------------- rows_removed (1 row) postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0'); ERROR: can no longer get changes from replication slot "lsub6_slot" DETAIL: This slot has been invalidated because it was conflicting with recovery. As you can see, only wal level and rows removal are mentioning conflict with recovery. So, are we already "wrong" mentioning "wal_removed" in conflict_reason? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't perform sync of such slots on standby showing > > them inactive as per your new criteria where as same slots could still > > be valid on primary as the walsender is still active. This may be more > > of a theoretical point as in running system there will probably be > > some activity but I think this needs some thougths. > > I believe the xmin and catalog_xmin of the sync slots on the standby > keep advancing depending on the slots on the primary, no? If yes, the > XID age based invalidation shouldn't be a problem. If the user has not enabled slot-sync worker and is relying on the SQL function pg_sync_replication_slots(), then the xmin and catalog_xmin of synced slots may not keep on advancing. These will be advanced only on next run of function. But meanwhile the synced slots may be invalidated due to 'xid_aged'. Then the next time, when user runs pg_sync_replication_slots() again, the invalidated slots will be dropped and will be recreated by this SQL function (provided they are valid on primary and are invalidated on standby alone). I am not stating that it is a problem, but we need to think if this is what we want. Secondly, the behaviour is not same with 'inactive_timeout' invalidation. Synced slots are immune to 'inactive_timeout' invalidation as this invalidation happens only in walsender, while these are not immune to 'xid_aged' invalidation. So again, needs some thoughts here. > I believe there are no walsenders started for the sync slots on the > standbys, right? If yes, the inactive timeout based invalidation also > shouldn't be a problem. Because, the inactive timeouts for a slot are > tracked only for walsenders because they are the ones that typically > hold replication slots for longer durations and for real replication > use. We did a similar thing in a recent commit [1]. > > Is my understanding right? Do you still see any problems with it? I have explained the situation above for us to think over it better. thanks Shveta
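For context, the manual path being referred to looks roughly like this on a suitably configured standby (a sketch; pg_sync_replication_slots() comes from the slot-sync work, while invalidation_reason and the 'xid_aged' value are only what this patch set proposes):

-- On the standby: sync slots on demand, then check whether any synced
-- slot got invalidated in between (e.g. with the proposed 'xid_aged').
SELECT pg_sync_replication_slots();
SELECT slot_name, synced, invalidation_reason
FROM pg_replication_slots
WHERE synced;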
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > So, how about we turn conflict_reason to only report the reasons that > > actually cause conflict with recovery for logical slots, something > > like below, and then have invalidation_cause as a generic column for > > all sorts of invalidation reasons for both logical and physical slots? > > If our above understanding is correct then coflict_reason will be a > subset of invalidation_reason. If so, whatever way we arrange this > information, there will be some sort of duplicity unless we just have > one column 'invalidation_reason' and update the docs to interpret it > correctly for conflicts. Yes, there will be some sort of duplicity if we emit conflict_reason as a text field. However, I still think the better way is to turn conflict_reason text to conflict boolean and set it to true only on rows_removed and wal_level_insufficient invalidations. When conflict boolean is true, one (including all the tests that we've added recently) can look for invalidation_reason text field for the reason. This sounds reasonable to me as opposed to we just mentioning in the docs that "if invalidation_reason is rows_removed or wal_level_insufficient it's the reason for conflict with recovery". Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
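Under that proposal, a query against pg_replication_slots would look something like the sketch below (illustrating the intended shape of the view, not committed columns):

-- 'conflicting' would be true only for rows_removed / wal_level_insufficient;
-- 'invalidation_reason' would carry the cause for every kind of invalidation,
-- for both logical and physical slots.
SELECT slot_name, slot_type, conflicting, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;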
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 12:51 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > See the error messages on a standby: > > == wal removal > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub4_slot" > DETAIL: This slot has been invalidated because it exceeded the maximum reserved size. > > == wal level > > postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';; > conflict_reason > ------------------------ > wal_level_insufficient > (1 row) > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub5_slot" > DETAIL: This slot has been invalidated because it was conflicting with recovery. > > == rows removal > > postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';; > conflict_reason > ----------------- > rows_removed > (1 row) > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0'); > ERROR: can no longer get changes from replication slot "lsub6_slot" > DETAIL: This slot has been invalidated because it was conflicting with recovery. > > As you can see, only wal level and rows removal are mentioning conflict with > recovery. > > So, are we already "wrong" mentioning "wal_removed" in conflict_reason? It looks like yes. So, how about we fix it the way proposed here - https://www.postgresql.org/message-id/CALj2ACVd_dizYQiZwwUfsb%2BhG-fhGYo_kEDq0wn_vNwQvOrZHg%40mail.gmail.com? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > JFYI, the patch does not apply to the head. There is a conflict in > multiple files. Thanks for looking into this. I noticed that the v8 patches needed rebase. Before I go do anything with the patches, I'm trying to gain consensus on the design. Following is the summary of design choices we've discussed so far: 1) conflict_reason vs invalidation_reason. 2) When to compute the XID age? 3) Where to do the invalidations? Is it in the checkpointer or autovacuum or some other process? 4) Interaction of these new invalidations with sync slots on the standby. I hope to get on to these one after the other. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > So, how about we turn conflict_reason to only report the reasons that > > > actually cause conflict with recovery for logical slots, something > > > like below, and then have invalidation_cause as a generic column for > > > all sorts of invalidation reasons for both logical and physical slots? > > > > If our above understanding is correct then coflict_reason will be a > > subset of invalidation_reason. If so, whatever way we arrange this > > information, there will be some sort of duplicity unless we just have > > one column 'invalidation_reason' and update the docs to interpret it > > correctly for conflicts. > > Yes, there will be some sort of duplicity if we emit conflict_reason > as a text field. However, I still think the better way is to turn > conflict_reason text to conflict boolean and set it to true only on > rows_removed and wal_level_insufficient invalidations. When conflict > boolean is true, one (including all the tests that we've added > recently) can look for invalidation_reason text field for the reason. > This sounds reasonable to me as opposed to we just mentioning in the > docs that "if invalidation_reason is rows_removed or > wal_level_insufficient it's the reason for conflict with recovery". > Fair point. I think we can go either way. Bertrand, Nathan, and others, do you have an opinion on this matter? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > JFYI, the patch does not apply to the head. There is a conflict in > > multiple files. > > Thanks for looking into this. I noticed that the v8 patches needed > rebase. Before I go do anything with the patches, I'm trying to gain > consensus on the design. Following is the summary of design choices > we've discussed so far: > 1) conflict_reason vs invalidation_reason. > 2) When to compute the XID age? > I feel we should focus on two things (a) one is to introduce a new column invalidation_reason, and (b) let's try to first complete invalidation due to timeout. We can look into XID stuff if time permits, remember, we don't have ample time left. With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > as a text field. However, I still think the better way is to turn > > conflict_reason text to conflict boolean and set it to true only on > > rows_removed and wal_level_insufficient invalidations. When conflict > > boolean is true, one (including all the tests that we've added > > recently) can look for invalidation_reason text field for the reason. > > This sounds reasonable to me as opposed to we just mentioning in the > > docs that "if invalidation_reason is rows_removed or > > wal_level_insufficient it's the reason for conflict with recovery". > > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? While we wait to hear from others on this, I'm attaching the v9 patch set implementing the above idea (check 0001 patch). Please have a look. I'll come back to the other review comments soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > > as a text field. However, I still think the better way is to turn > > > conflict_reason text to conflict boolean and set it to true only on > > > rows_removed and wal_level_insufficient invalidations. When conflict > > > boolean is true, one (including all the tests that we've added > > > recently) can look for invalidation_reason text field for the reason. > > > This sounds reasonable to me as opposed to we just mentioning in the > > > docs that "if invalidation_reason is rows_removed or > > > wal_level_insufficient it's the reason for conflict with recovery". +1 on maintaining both conflicting and invalidation_reason > > Fair point. I think we can go either way. Bertrand, Nathan, and > > others, do you have an opinion on this matter? > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. Thanks for the patch. JFYI, patch09 does not apply to HEAD, some recent commit caused the conflict. Some trivial comments on patch001 (yet to review other patches) 1) info.c: - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " Can we revert back to 'conflicting as invalid' since it is a query for logical slots only. 2) 040_standby_failover_slots_sync.pl: - q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} + q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} Here too, can we have 'NOT conflicting' instead of ' invalidation_reason IS NULL' as it is a logical slot test. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity > with 'max_slot_wal_keep_size'? I think it will impact the rows we > removed based on xid horizons. Don't we need to consider it while > vacuum computing the xid horizons in ComputeXidHorizons() similar to > what we do for WAL w.r.t 'max_slot_wal_keep_size'? I'm having a hard time understanding why we'd need something up there in ComputeXidHorizons(). Can you elaborate it a bit please? What's proposed with max_slot_xid_age is that during checkpoint we look at slot's xmin and catalog_xmin, and the current system txn id. Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses max_slot_xid_age, we invalidate the slot. Let me illustrate how all this works: 1. Setup a primary and standby with hot_standby_feedback set to on on standby. For instance, check my scripts at [1]. 2. Stop the standby to make the slot inactive on the primary. Check the slot is holding xmin of 738. ./pg_ctl -D sbdata -l logfilesbdata stop postgres=# SELECT * FROM pg_replication_slots; -[ RECORD 1 ]-------+------------- slot_name | sb_repl_slot plugin | slot_type | physical datoid | database | temporary | f active | f active_pid | xmin | 738 catalog_xmin | restart_lsn | 0/3000000 confirmed_flush_lsn | wal_status | reserved safe_wal_size | two_phase | f conflict_reason | failover | f synced | f 3. Start consuming the XIDs on the primary with the following script for instance ./psql -d postgres -p 5432 DROP TABLE tab_int; CREATE TABLE tab_int (a int); do $$ begin for i in 1..268435 loop -- use an exception block so that each iteration eats an XID begin insert into tab_int values (i); exception when division_by_zero then null; end; end loop; end$$; 4. Make some dead rows in the table. update tab_int set a = a+1; delete from tab_int where a%4=0; postgres=# SELECT n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del FROM pg_stat_user_tables WHERE relname = 'tab_int'; -[ RECORD 1 ]------ n_dead_tup | 335544 n_tup_ins | 268435 n_tup_upd | 268435 n_tup_del | 67109 5. Try vacuuming to delete the dead rows, observe 'tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable'. The dead rows can't be removed because the inactive slot is holding an xmin, see 'removable cutoff: 738, which was 268441 XIDs old when operation ended'. postgres=# vacuum verbose tab_int; INFO: vacuuming "postgres.public.tab_int" INFO: finished vacuuming "postgres.public.tab_int": index scans: 0 pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total) tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable removable cutoff: 738, which was 268441 XIDs old when operation ended frozen: 0 pages from table (0.00% of total) had 0 tuples frozen index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s buffer usage: 4759 hits, 0 misses, 0 dirtied WAL usage: 0 records, 0 full page images, 0 bytes system usage: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.07 s VACUUM 6. Now, repeat the above steps but with setting max_slot_xid_age = 200000 on the primary. 7. Do a checkpoint to invalidate the slot. 
postgres=# checkpoint; CHECKPOINT postgres=# SELECT * FROM pg_replication_slots; -[ RECORD 1 ]-------+------------- slot_name | sb_repl_slot plugin | slot_type | physical datoid | database | temporary | f active | f active_pid | xmin | 738 catalog_xmin | restart_lsn | 0/3000000 confirmed_flush_lsn | wal_status | lost safe_wal_size | two_phase | f conflicting | failover | f synced | f invalidation_reason | xid_aged 8. And, then vacuum the table, observe 'tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable'. postgres=# vacuum verbose tab_int; INFO: vacuuming "postgres.public.tab_int" INFO: finished vacuuming "postgres.public.tab_int": index scans: 0 pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total) tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable removable cutoff: 269179, which was 0 XIDs old when operation ended new relfrozenxid: 269179, which is 268441 XIDs ahead of previous value frozen: 1189 pages from table (50.04% of total) had 201326 tuples frozen index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed avg read rate: 0.000 MB/s, avg write rate: 193.100 MB/s buffer usage: 4760 hits, 0 misses, 2381 dirtied WAL usage: 5942 records, 2378 full page images, 8343275 bytes system usage: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s VACUUM [1] cd /home/ubuntu/postgres/pg17/bin ./pg_ctl -D db17 -l logfile17 stop rm -rf db17 logfile17 rm -rf /home/ubuntu/postgres/pg17/bin/archived_wal mkdir /home/ubuntu/postgres/pg17/bin/archived_wal ./initdb -D db17 echo "archive_mode = on archive_command='cp %p /home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a db17/postgresql.conf ./pg_ctl -D db17 -l logfile17 start ./psql -d postgres -p 5432 -c "SELECT pg_create_physical_replication_slot('sb_repl_slot', true, false);" rm -rf sbdata logfilesbdata ./pg_basebackup -D sbdata echo "port=5433 primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu' primary_slot_name='sb_repl_slot' restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p' hot_standby_feedback = on" | tee -a sbdata/postgresql.conf touch sbdata/standby.signal ./pg_ctl -D sbdata -l logfilesbdata start ./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();" -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. > patch002: 1) I would like to understand the purpose of 'inactive_count'? Is it only for users for monitoring purposes? We are not using it anywhere internally. I shutdown the instance 5 times and found that 'inactive_count' became 5 for all the slots created on that instance. Is this intentional? I mean we can not really use them if the instance is down. I felt it should increment the inactive_count only if during the span of instance, they were actually inactive i.e. no streaming or replication happening through them. 2) slot.c: + case RS_INVAL_XID_AGE: + { + if (TransactionIdIsNormal(s->data.xmin)) + { + .......... + } + if (TransactionIdIsNormal(s->data.catalog_xmin)) + { + .......... + } + } Can we optimize this code? It has duplicate code for processing s->data.catalog_xmin and s->data.xmin. Can we create a sub-function for this purpose and call it twice here? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 15, 2024 at 10:15 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > wal_level_insufficient it's the reason for conflict with recovery". > > +1 on maintaining both conflicting and invalidation_reason Thanks. > Thanks for the patch. JFYI, patch09 does not apply to HEAD, some > recent commit caused the conflict. Yep, the conflict is in src/test/recovery/meson.build and is because of e6927270cd18d535b77cbe79c55c6584351524be. > Some trivial comments on patch001 (yet to review other patches) Thanks for looking into this. > 1) > info.c: > > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > > Can we revert back to 'conflicting as invalid' since it is a query for > logical slots only. I guess, no. There the intention is to check for invalid logical slots not just for the conflicting ones. The logical slots can get invalidated due to other reasons as well. > 2) > 040_standby_failover_slots_sync.pl: > > - q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM > pg_replication_slots WHERE slot_name = 'lsub1_slot';} > + q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary > FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';} > > Here too, can we have 'NOT conflicting' instead of ' > invalidation_reason IS NULL' as it is a logical slot test. I guess no. The tests are ensuring the slot on the standby isn't invalidated. In general, one needs to use the 'conflicting' column from pg_replication_slots when the intention is to look for reasons for conflicts, otherwise use the 'invalidation_reason' column for invalidations. Please see the attached v10 patch set after resolving the merge conflict and fixing an indentation warning in the TAP test file. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: >> On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> > > So, how about we turn conflict_reason to only report the reasons that >> > > actually cause conflict with recovery for logical slots, something >> > > like below, and then have invalidation_cause as a generic column for >> > > all sorts of invalidation reasons for both logical and physical slots? >> > >> > If our above understanding is correct then coflict_reason will be a >> > subset of invalidation_reason. If so, whatever way we arrange this >> > information, there will be some sort of duplicity unless we just have >> > one column 'invalidation_reason' and update the docs to interpret it >> > correctly for conflicts. >> >> Yes, there will be some sort of duplicity if we emit conflict_reason >> as a text field. However, I still think the better way is to turn >> conflict_reason text to conflict boolean and set it to true only on >> rows_removed and wal_level_insufficient invalidations. When conflict >> boolean is true, one (including all the tests that we've added >> recently) can look for invalidation_reason text field for the reason. >> This sounds reasonable to me as opposed to we just mentioning in the >> docs that "if invalidation_reason is rows_removed or >> wal_level_insufficient it's the reason for conflict with recovery". > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? WFM -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > So, how about we turn conflict_reason to only report the reasons that > > > > actually cause conflict with recovery for logical slots, something > > > > like below, and then have invalidation_cause as a generic column for > > > > all sorts of invalidation reasons for both logical and physical slots? > > > > > > If our above understanding is correct then coflict_reason will be a > > > subset of invalidation_reason. If so, whatever way we arrange this > > > information, there will be some sort of duplicity unless we just have > > > one column 'invalidation_reason' and update the docs to interpret it > > > correctly for conflicts. > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > as a text field. However, I still think the better way is to turn > > conflict_reason text to conflict boolean and set it to true only on > > rows_removed and wal_level_insufficient invalidations. When conflict > > boolean is true, one (including all the tests that we've added > > recently) can look for invalidation_reason text field for the reason. > > This sounds reasonable to me as opposed to we just mentioning in the > > docs that "if invalidation_reason is rows_removed or > > wal_level_insufficient it's the reason for conflict with recovery". > > > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinion on this matter? Sounds like a good approach to me and one will be able to quickly identify if a conflict occured. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 15, 2024 at 12:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > patch002: > > 1) > I would like to understand the purpose of 'inactive_count'? Is it only > for users for monitoring purposes? We are not using it anywhere > internally. inactive_count metric helps detect unstable replication slots connections that have a lot of disconnections. It's not used for the inactive_timeout based slot invalidation mechanism. > I shutdown the instance 5 times and found that 'inactive_count' became > 5 for all the slots created on that instance. Is this intentional? Yes, it's incremented on shutdown (and for that matter upon every slot release) for all the slots that are tied to walsenders. > I mean we can not really use them if the instance is down. I felt it > should increment the inactive_count only if during the span of > instance, they were actually inactive i.e. no streaming or replication > happening through them. inactive_count is persisted to disk- upon clean shutdown, so, once the slots become active again, one gets to see the metric and deduce some info on disconnections. Having said that, I'm okay to hear from others on the inactive_count metric being added. > 2) > slot.c: > + case RS_INVAL_XID_AGE: > > Can we optimize this code? It has duplicate code for processing > s->data.catalog_xmin and s->data.xmin. Can we create a sub-function > for this purpose and call it twice here? Good idea. Done that way. > 2) > The msg for patch 3 says: > -------------- > a) when replication slots is lying inactive for a day or so using > last_inactive_at metric, > b) when a replication slot is becoming inactive too frequently using > last_inactive_at metric. > -------------- > I think in b, you want to refer to inactive_count instead of last_inactive_at? Right. Changed. > 3) > I do not see invalidation_reason updated for 2 new reasons in system-views.sgml Nice catch. Added them now. I've also responded to Bertrand's comments here. On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > A few comments: > > 1 === > > + The reason for the slot's invalidation. <literal>NULL</literal> if the > + slot is currently actively being used. > > s/currently actively being used/not invalidated/ ? (I mean it could be valid > and not being used). Changed. > 3 === > > res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > "FROM pg_catalog.pg_replication_slots " > - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " > + "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE " > > Yeah that's fine because there is logical slot filtering here. Right. And, we really are looking for invalid slots there, so use of invalidation_reason is much more correct than conflicting. > 4 === > > -GetSlotInvalidationCause(const char *conflict_reason) > +GetSlotInvalidationCause(const char *invalidation_reason) > > Should we change the comment "Maps a conflict reason" above this function? Changed. 
> 5 === > > -# Check conflict_reason is NULL for physical slot > +# Check invalidation_reason is NULL for physical slot > $res = $node_primary->safe_psql( > 'postgres', qq[ > - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > ); > > > I don't think this test is needed anymore: it does not make that much sense since > it's done after the primary database initialization and startup. It is now turned into a test verifying 'conflicting boolean' is null for the physical slot. Isn't that okay? > 6 === > > 'Logical slots are reported as non conflicting'); > > What about? > > " > # Verify slots are reported as valid in pg_replication_slots > 'Logical slots are reported as valid'); > " Changed. Please see the attached v11 patch set with all the above review comments addressed. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
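For what it's worth, the "sub-function" discussed above could end up looking roughly like this; a minimal sketch only, where the function name SlotXidAgeExceedsLimit and the max_slot_xid_age GUC are taken from this thread's proposal rather than from committed code:

#include "postgres.h"
#include "access/transam.h"

/* proposed GUC from this thread; assumed to be a plain integer age, 0 = disabled */
extern int	max_slot_xid_age;

/*
 * Sketch only: report whether the given slot horizon (xmin or catalog_xmin)
 * is older than max_slot_xid_age.  Intended to be called once for xmin and
 * once for catalog_xmin, as suggested above.
 */
static bool
SlotXidAgeExceedsLimit(TransactionId xid)
{
	int32		age;

	/* an unset horizon can never age out */
	if (!TransactionIdIsNormal(xid))
		return false;

	/* modulo-2^32 distance from the next XID, like the age() SQL function */
	age = (int32) (ReadNextTransactionId() - xid);

	return age > max_slot_xid_age;
}

InvalidatePossiblyObsoleteSlot() could then call such a helper once for s->data.xmin and once for s->data.catalog_xmin, which is the shape of the de-duplication asked for above.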
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 15, 2024 at 10:45 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > BTW, is XID the based parameter 'max_slot_xid_age' not have similarity > > with 'max_slot_wal_keep_size'? I think it will impact the rows we > > removed based on xid horizons. Don't we need to consider it while > > vacuum computing the xid horizons in ComputeXidHorizons() similar to > > what we do for WAL w.r.t 'max_slot_wal_keep_size'? > > I'm having a hard time understanding why we'd need something up there > in ComputeXidHorizons(). Can you elaborate it a bit please? > > What's proposed with max_slot_xid_age is that during checkpoint we > look at slot's xmin and catalog_xmin, and the current system txn id. > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > max_slot_xid_age, we invalidate the slot. > I can see that in your patch (in function InvalidatePossiblyObsoleteSlot()). As per my understanding, we need something similar for slot xids in ComputeXidHorizons() as we are doing for WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN location required by slots and then adjust it for 'max_slot_wal_keep_size'. On similar lines, currently in ComputeXidHorizons(), we compute the minimum xid required by slots (procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin) but then don't adjust it for 'max_slot_xid_age'. I could be missing something here, but it is better to keep discussing this and, in the meantime, move ahead with the other parameter 'inactive_replication_slot_timeout', which in my view can be kept at the slot level instead of as a GUC; OTOH, we need to see the arguments on both sides and then decide which makes more sense. -- With Regards, Amit Kapila.
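To make the analogy with KeepLogSeg() a bit more concrete, here is a rough sketch of the kind of clamping being suggested; this is not from the posted patches, and both the exact integration point inside ComputeXidHorizons() and the use of the proposed max_slot_xid_age GUC are assumptions:

/*
 * Sketch only: clamp the slot-provided xmin the same way KeepLogSeg()
 * clamps the slot-required LSN with max_slot_wal_keep_size, so that vacuum
 * never honours a slot horizon older than the configured age.
 */
if (max_slot_xid_age > 0 && TransactionIdIsNormal(slot_xmin))
{
	TransactionId limit_xid = ReadNextTransactionId();

	/* back up by max_slot_xid_age, staying within normal XID space */
	limit_xid -= max_slot_xid_age;
	if (!TransactionIdIsNormal(limit_xid))
		limit_xid = FirstNormalTransactionId;

	/* don't let an over-aged slot hold the horizon back any further */
	if (TransactionIdPrecedes(slot_xmin, limit_xid))
		slot_xmin = limit_xid;
}

The same treatment would presumably apply to slot_catalog_xmin; how that interacts with the invalidation done by the checkpointer is what the rest of this exchange is about.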
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > 'max_slot_xid_age'. I could be missing something in this but it is > better to keep discussing this and try to move with another parameter > 'inactive_replication_slot_timeout' which according to me can be kept > at slot level instead of a GUC but OTOH we need to see the arguments > on both side and then decide which makes more sense. Hm. Are you suggesting inactive_timeout to be a slot level parameter similar to 'failover' property added recently by c393308b69d229b664391ac583b9e07418d411b6 and 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can set inactive_timeout while creating the slot either via pg_create_physical_replication_slot() or pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or ALTER_REPLICATION_SLOT command, and postgres tracks the last_inactive_at for every slot based on which the slot gets invalidated. If this understanding is right, I can go ahead and work towards it. Alternatively, we can go the route of making GUC a list of key-value pairs of {slot_name, inactive_timeout}, but this kind of GUC for setting slot level parameters is going to be the first of its kind, so I'd prefer the above approach. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > 'max_slot_xid_age'. I could be missing something in this but it is > > better to keep discussing this and try to move with another parameter > > 'inactive_replication_slot_timeout' which according to me can be kept > > at slot level instead of a GUC but OTOH we need to see the arguments > > on both side and then decide which makes more sense. > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > similar to 'failover' property added recently by > c393308b69d229b664391ac583b9e07418d411b6 and > 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can > set inactive_timeout while creating the slot either via > pg_create_physical_replication_slot() or > pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or > ALTER_REPLICATION_SLOT command, and postgres tracks the > last_inactive_at for every slot based on which the slot gets > invalidated. If this understanding is right, I can go ahead and work > towards it. > Yeah, I have something like that in mind. You can prepare the patch but it would be good if others involved in this thread can also share their opinion. > Alternatively, we can go the route of making GUC a list of key-value > pairs of {slot_name, inactive_timeout}, but this kind of GUC for > setting slot level parameters is going to be the first of its kind, so > I'd prefer the above approach. > I would prefer a slot-level parameter in this case rather than a GUC. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > What's proposed with max_slot_xid_age is that during checkpoint we > > look at slot's xmin and catalog_xmin, and the current system txn id. > > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > > max_slot_xid_age, we invalidate the slot. > > > > I can see that in your patch (in function > InvalidatePossiblyObsoleteSlot()). As per my understanding, we need > something similar for slot xids in ComputeXidHorizons() as we are > doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN > location required by slots and then adjust it for > 'max_slot_wal_keep_size'. On similar lines, currently in > ComputeXidHorizons(), we compute the minimum xid required by slots > (procArray->replication_slot_xmin and > procArray->replication_slot_catalog_xmin) but then don't adjust it for > 'max_slot_xid_age'. I could be missing something in this but it is > better to keep discussing this

After invalidating slots because of max_slot_xid_age, the procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin are recomputed immediately in InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin. And, later the XID horizons in ComputeXidHorizons are computed before the vacuum on each table via GetOldestNonRemovableTransactionId. Aren't these enough? Do you want the XID horizons recomputed immediately, something like the below?

/* Invalidate replication slots based on xmin or catalog_xmin age */
if (max_slot_xid_age > 0)
{
    if (InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0, InvalidOid,
                                           InvalidTransactionId))
    {
        ComputeXidHorizonsResult horizons;

        /*
         * Some slots have been invalidated; update the XID horizons
         * as a side-effect.
         */
        ComputeXidHorizons(&horizons);
    }
}

-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 18, 2024 at 9:58 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > What's proposed with max_slot_xid_age is that during checkpoint we > > > look at slot's xmin and catalog_xmin, and the current system txn id. > > > Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses > > > max_slot_xid_age, we invalidate the slot. > > > > > > > I can see that in your patch (in function > > InvalidatePossiblyObsoleteSlot()). As per my understanding, we need > > something similar for slot xids in ComputeXidHorizons() as we are > > doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN > > location required by slots and then adjust it for > > 'max_slot_wal_keep_size'. On similar lines, currently in > > ComputeXidHorizons(), we compute the minimum xid required by slots > > (procArray->replication_slot_xmin and > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > 'max_slot_xid_age'. I could be missing something in this but it is > > better to keep discussing this > > After invalidating slots because of max_slot_xid_age, the > procArray->replication_slot_xmin and > procArray->replication_slot_catalog_xmin are recomputed immediately in > InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin. > And, later the XID horizons in ComputeXidHorizons are computed before > the vacuum on each table via GetOldestNonRemovableTransactionId. > Aren't these enough? IIUC, this will be delayed by one vacuum cycle rather than being done when the slot's xmin age is crossed and it can be invalidated. > Do you want the XID horizons recomputed immediately, something like the below? I haven't thought of the exact logic but we can try to mimic the handling similar to WAL. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote: > I've also responded to Bertrand's comments here. Thanks! > > On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > 5 === > > > > -# Check conflict_reason is NULL for physical slot > > +# Check invalidation_reason is NULL for physical slot > > $res = $node_primary->safe_psql( > > 'postgres', qq[ > > - SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > > + SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';] > > ); > > > > > > I don't think this test is needed anymore: it does not make that much sense since > > it's done after the primary database initialization and startup. > > It is now turned into a test verifying 'conflicting boolean' is null > for the physical slot. Isn't that okay? Yeah makes more sense now, thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 18, 2024 at 08:50:56AM +0530, Amit Kapila wrote: > On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > procArray->replication_slot_catalog_xmin) but then don't adjust it for > > > 'max_slot_xid_age'. I could be missing something in this but it is > > > better to keep discussing this and try to move with another parameter > > > 'inactive_replication_slot_timeout' which according to me can be kept > > > at slot level instead of a GUC but OTOH we need to see the arguments > > > on both side and then decide which makes more sense. > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > similar to 'failover' property added recently by > > c393308b69d229b664391ac583b9e07418d411b6 and > > 73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can > > set inactive_timeout while creating the slot either via > > pg_create_physical_replication_slot() or > > pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or > > ALTER_REPLICATION_SLOT command, and postgres tracks the > > last_inactive_at for every slot based on which the slot gets > > invalidated. If this understanding is right, I can go ahead and work > > towards it. > > > > Yeah, I have something like that in mind. You can prepare the patch > but it would be good if others involved in this thread can also share > their opinion. I think it makes sense to put the inactive_timeout granularity at the slot level (as the activity could vary a lot, say between one slot linked to a subscription and one linked to some plugins). As far as max_slot_xid_age is concerned, I've the feeling that a new GUC is good enough. > > Alternatively, we can go the route of making GUC a list of key-value > > pairs of {slot_name, inactive_timeout}, but this kind of GUC for > > setting slot level parameters is going to be the first of its kind, so > > I'd prefer the above approach. > > > > I would prefer a slot-level parameter in this case rather than a GUC. Yeah, same here. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
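For reference, the GUC route for max_slot_xid_age would amount to a single entry along these lines; a sketch of a guc_tables.c ConfigureNamesInt entry only, where the category, limits, and description text are assumptions:

/* Sketch only: proposed max_slot_xid_age GUC, 0 disables the check. */
{
	{"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
		gettext_noop("Age of a slot's xmin or catalog_xmin at which the slot is invalidated."),
		NULL
	},
	&max_slot_xid_age,
	0, 0, 2000000000,
	NULL, NULL, NULL
},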
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote: > Please see the attached v11 patch set with all the above review > comments addressed. Thanks! Looking at 0001:

1 ===

+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check

Worth adding back the physical slot mention "Always NULL for physical slots."?

2 ===

@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflict_reason,
+            L.conflicting,
             L.failover,
-            L.synced
+            L.synced,
+            L.invalidation_reason

What about making invalidation_reason close to conflict_reason?

3 ===

- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to

s/a invalidation/an invalidation/?

4 ===

While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in InvalidatePossiblyObsoleteSlot()?

5 ===

+ * rows_removed and wal_level_insufficient are only two reasons

s/are only two/are the only two/?

Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote: > On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > > > JFYI, the patch does not apply to the head. There is a conflict in > > > multiple files. > > > > Thanks for looking into this. I noticed that the v8 patches needed > > rebase. Before I go do anything with the patches, I'm trying to gain > > consensus on the design. Following is the summary of design choices > > we've discussed so far: > > 1) conflict_reason vs invalidation_reason. > > 2) When to compute the XID age? > > > > I feel we should focus on two things (a) one is to introduce a new > column invalidation_reason, and (b) let's try to first complete > invalidation due to timeout. We can look into XID stuff if time > permits, remember, we don't have ample time left. Agree. While it makes sense to invalidate slots for wal removal in CreateCheckPoint() (because this is the place where wal is removed), I'm not sure this is the right place for the 2 new cases.

Let's focus on the timeout one as proposed above (probably the simplest one): as it is purely related to time and activity, what about invalidating such slots when:

- their usage resumes
- pg_get_replication_slots() is called

The idea is to invalidate the slot when one resumes activity on it or wants to get information about it (and among other things wants to know if the slot is valid or not). Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote: > > On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. > > > > > > > > JFYI, the patch does not apply to the head. There is a conflict in > > > > multiple files. > > > > > > Thanks for looking into this. I noticed that the v8 patches needed > > > rebase. Before I go do anything with the patches, I'm trying to gain > > > consensus on the design. Following is the summary of design choices > > > we've discussed so far: > > > 1) conflict_reason vs invalidation_reason. > > > 2) When to compute the XID age? > > > > > > > I feel we should focus on two things (a) one is to introduce a new > > column invalidation_reason, and (b) let's try to first complete > > invalidation due to timeout. We can look into XID stuff if time > > permits, remember, we don't have ample time left. > > Agree. While it makes sense to invalidate slots for wal removal in > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > sure this is the right place for the 2 new cases. > > Let's focus on the timeout one as proposed above (as probably the simplest one): > as this one is purely related to time and activity what about to invalidate them > when?: > > - their usage resume > - in pg_get_replication_slots() > > The idea is to invalidate the slot when one resumes activity on it or wants to > get information about it (and among other things wants to know if the slot is > valid or not). > Trying to invalidate at those two places makes sense to me but we still need to cover the cases where it takes very long to resume the slot activity and the dangling slot cases where the activity is never resumed. How about, apart from the above two places, also trying to invalidate in CheckPointReplicationSlots(), where we are traversing all the slots? This could prevent invalid slots from being marked as dirty. BTW, how will the user use 'inactive_count' to know whether a replication slot is becoming inactive too frequently? The patch just keeps incrementing this counter; one will never know how many times the slot became inactive in the last 'n' minutes unless there is some monitoring tool that keeps capturing this counter from time to time and calculates the frequency in some way. Even if this is useful, it is not clear to me whether we need to store 'inactive_count' in the slot's persistent data. I understand it could be a metric required by the user but wouldn't it be better to track this via pg_stat_replication_slots such that we don't need to store this in the slot's persistent data? If this understanding is correct, I would say let's remove 'inactive_count' as well from the main patch and discuss it separately. -- With Regards, Amit Kapila.
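To illustrate the CheckPointReplicationSlots() route, the traversal could boil down to a loop like the following; a sketch only, where last_inactive_at, inactive_timeout and the helper name are taken from this thread's proposal (not existing code) and the actual invalidation bookkeeping is elided:

#include "postgres.h"
#include "replication/slot.h"
#include "utils/timestamp.h"

/*
 * Sketch only: walk all in-use slots during checkpoint and invalidate the
 * ones that have stayed inactive longer than their inactive_timeout.
 */
static void
InvalidateInactiveReplicationSlots(void)
{
	TimestampTz now = GetCurrentTimestamp();

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		if (!s->in_use || s->active_pid != 0)
			continue;			/* unused, or currently acquired by someone */

		/* last_inactive_at and inactive_timeout are fields proposed here */
		if (s->data.inactive_timeout > 0 &&
			s->data.last_inactive_at != 0 &&
			TimestampDifferenceExceeds(s->data.last_inactive_at, now,
									   s->data.inactive_timeout * 1000 /* assumed seconds */ ))
		{
			/* mark the slot invalidated; details elided in this sketch */
		}
	}
	LWLockRelease(ReplicationSlotControlLock);
}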
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Agree. While it makes sense to invalidate slots for wal removal in > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > sure this is the right place for the 2 new cases. > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > as this one is purely related to time and activity what about to invalidate them > > when?: > > > > - their usage resume > > - in pg_get_replication_slots() > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > get information about it (and among other things wants to know if the slot is > > valid or not). > > > > Trying to invalidate at those two places makes sense to me but we > still need to cover the cases where it takes very long to resume the > slot activity and the dangling slot cases where the activity is never > resumed. I understand it's better to have the slot reflecting its real status internally, but is it a real issue if that's not the case until the activity on it is resumed? (just asking, not saying we should not) > How about apart from the above two places, trying to > invalidate in CheckPointReplicationSlots() where we are traversing all > the slots? I think that's a good place but there is still a window of time (that could also be "large" depending on the activity and the checkpoint frequency) during which the slot is not known as invalid internally. But yeah, at least we know that we'll mark it as invalid at some point... BTW:

if (am_walsender)
{
+       if (slot->data.persistency == RS_PERSISTENT)
+       {
+               SpinLockAcquire(&slot->mutex);
+               slot->data.last_inactive_at = GetCurrentTimestamp();
+               slot->data.inactive_count++;
+               SpinLockRelease(&slot->mutex);

I'm also feeling the same concern as Shveta mentioned in [1]: that a "normal" backend using pg_logical_slot_get_changes() or friends would not set the last_inactive_at. [1]: https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
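One way to address that concern would be to move the bookkeeping out of the am_walsender branch and into the common release path. A minimal sketch, where last_inactive_at and inactive_count are the fields proposed in this thread and the helper name is made up for illustration:

#include "postgres.h"
#include "replication/slot.h"
#include "storage/spin.h"
#include "utils/timestamp.h"

/*
 * Sketch only: helper meant to be called from ReplicationSlotRelease() for
 * every process, so that a backend using pg_logical_slot_get_changes() and
 * friends also refreshes the inactivity bookkeeping, not just walsenders.
 */
static void
ReplicationSlotSetInactiveSince(ReplicationSlot *slot)
{
	if (slot->data.persistency != RS_PERSISTENT)
		return;

	SpinLockAcquire(&slot->mutex);
	slot->data.last_inactive_at = GetCurrentTimestamp();
	slot->data.inactive_count++;
	/* make sure the new values reach disk with the next slot flush */
	slot->just_dirtied = true;
	slot->dirty = true;
	SpinLockRelease(&slot->mutex);
}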
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > Agree. While it makes sense to invalidate slots for wal removal in > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > sure this is the right place for the 2 new cases. > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > as this one is purely related to time and activity what about to invalidate them > > > when?: > > > > > > - their usage resume > > > - in pg_get_replication_slots() > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > get information about it (and among other things wants to know if the slot is > > > valid or not). > > > > > > > Trying to invalidate at those two places makes sense to me but we > > still need to cover the cases where it takes very long to resume the > > slot activity and the dangling slot cases where the activity is never > > resumed. > > I understand it's better to have the slot reflecting its real status internally > but it is a real issue if that's not the case until the activity on it is resumed? > (just asking, not saying we should not) > Sorry, I didn't understand your point. Can you try to explain by example? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote: > On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > Agree. While it makes sense to invalidate slots for wal removal in > > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > > sure this is the right place for the 2 new cases. > > > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > > as this one is purely related to time and activity what about to invalidate them > > > > when?: > > > > > > > > - their usage resume > > > > - in pg_get_replication_slots() > > > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > > get information about it (and among other things wants to know if the slot is > > > > valid or not). > > > > > > > > > > Trying to invalidate at those two places makes sense to me but we > > > still need to cover the cases where it takes very long to resume the > > > slot activity and the dangling slot cases where the activity is never > > > resumed. > > > > I understand it's better to have the slot reflecting its real status internally > > but it is a real issue if that's not the case until the activity on it is resumed? > > (just asking, not saying we should not) > > > > Sorry, I didn't understand your point. Can you try to explain by example? Sorry if that was not clear, let me try to rephrase it first: what issue to you see if the invalidation of such a slot occurs only when its usage resume or when pg_get_replication_slots() is triggered? I understand that this could lead to the slot not being invalidated (maybe forever) but is that an issue for an inactive slot? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > similar to 'failover' property added recently by > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > Yeah, I have something like that in mind. You can prepare the patch > > but it would be good if others involved in this thread can also share > > their opinion. > > I think it makes sense to put the inactive_timeout granularity at the slot > level (as the activity could vary a lot say between one slot linked to a > subcription and one linked to some plugins). As far max_slot_xid_age I've the > feeling that a new GUC is good enough.

Well, here I'm implementing the above idea. The attached v12 patches mainly have the following changes:

1. inactive_timeout is now slot-level, that is, one can set it while creating the slot either via SQL functions or via replication commands or via subscription.
2. last_inactive_at and inactive_timeout are now tracked in the on-disk replication slot data structure.
3. last_inactive_at is now set even for non-walsenders whenever the slot is released, as opposed to initial versions of the patches setting it only for walsenders.
4. slot's inactive_timeout parameter is now migrated to the new cluster with pg_upgrade.
5. slot's inactive_timeout parameter is now synced to the standby when failover is enabled for the slot.
6. Test cases are added to cover most of the above cases including new invalidation mechanisms.

Following are some open points:

1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
2. Where to do XID age invalidation exactly if not the checkpointer.
3. How to go about recomputing XID horizons based on max_slot_xid_age. Do the slot's horizons need to be adjusted in ComputeXidHorizons()?
4. New invalidation mechanisms' interaction with the slot sync feature.
5. Review comments on 0001 from Bertrand.

Please see the attached v12 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v12-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v12-0002-Track-last_inactive_at-for-replication-slots.patch
- v12-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v12-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v12-0005-Add-inactive_timeout-option-to-subscriptions.patch
- v12-0006-Add-inactive_timeout-based-replication-slot-inva.patch
- v12-0007-Add-XID-age-based-replication-slot-invalidation.patch
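For clarity, point 2 above (the v12 on-disk tracking) would amount to something like the following additions to ReplicationSlotPersistentData; this is a sketch of the shape only, the field names come from this thread and the exact types and units in the posted patches may differ:

typedef struct ReplicationSlotPersistentData
{
	/*
	 * ... existing fields such as name, database, persistency, xmin,
	 * catalog_xmin, restart_lsn, invalidated, two_phase, failover ...
	 */

	/*
	 * Proposed in this thread (v12 shape): when the slot last became
	 * inactive, and the per-slot timeout (assumed to be in seconds,
	 * 0 = disabled) after which an inactive slot may be invalidated.
	 */
	TimestampTz last_inactive_at;
	int			inactive_timeout;
} ReplicationSlotPersistentData;

Since this changes the on-disk slot format, the slot file version (SLOT_VERSION in slot.c) would presumably need to be bumped as well.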
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 19, 2024 at 6:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote: > > On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > Agree. While it makes sense to invalidate slots for wal removal in > > > > > CreateCheckPoint() (because this is the place where wal is removed), I 'm not > > > > > sure this is the right place for the 2 new cases. > > > > > > > > > > Let's focus on the timeout one as proposed above (as probably the simplest one): > > > > > as this one is purely related to time and activity what about to invalidate them > > > > > when?: > > > > > > > > > > - their usage resume > > > > > - in pg_get_replication_slots() > > > > > > > > > > The idea is to invalidate the slot when one resumes activity on it or wants to > > > > > get information about it (and among other things wants to know if the slot is > > > > > valid or not). > > > > > > > > > > > > > Trying to invalidate at those two places makes sense to me but we > > > > still need to cover the cases where it takes very long to resume the > > > > slot activity and the dangling slot cases where the activity is never > > > > resumed. > > > > > > I understand it's better to have the slot reflecting its real status internally > > > but it is a real issue if that's not the case until the activity on it is resumed? > > > (just asking, not saying we should not) > > > > > > > Sorry, I didn't understand your point. Can you try to explain by example? > > Sorry if that was not clear, let me try to rephrase it first: what issue to you > see if the invalidation of such a slot occurs only when its usage resume or > when pg_get_replication_slots() is triggered? I understand that this could lead > to the slot not being invalidated (maybe forever) but is that an issue for an > inactive slot? > It has the risk of preventing WAL and row removal. I think this is the primary reason we are at the first place planning to have such a parameter. So, we should have some way to invalidate it even when the walsender/backend process doesn't use it again. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Following are some open points: > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > I have suggested to do it at the time of CheckpointReplicationSlots() and Bertrand suggested to do it whenever we resume using the slot. I think we should follow both the suggestions. > 2. Where to do XID age invalidation exactly if not the checkpointer. > 3. How to go about recomputing XID horizons based on max_slot_xid_age. > Does the slot's horizon's need to be adjusted in ComputeXidHorizons()? > I suggest postponing the patch for xid based invalidation for a later discussion. > 4. New invalidation mechanisms interaction with slot sync feature. > Yeah, this is important. My initial thoughts are that synced slots shouldn't be invalidated on the standby due to timeout. > 5. Review comments on 0001 from Bertrand. > > Please see the attached v12 patches. > Thanks for quickly updating the patches. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 18, 2024 at 3:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > Looking at 0001: Thanks for reviewing. > 1 === > > + True if this logical slot conflicted with recovery (and so is now > + invalidated). When this column is true, check > > Worth to add back the physical slot mention "Always NULL for physical slots."? Will change. > 2 === > > @@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS > L.wal_status, > L.safe_wal_size, > L.two_phase, > - L.conflict_reason, > + L.conflicting, > L.failover, > - L.synced > + L.synced, > + L.invalidation_reason > > What about making invalidation_reason close to conflict_reason? Not required I think. One can pick the required columns in the SELECT clause anyways. > 3 === > > - * Maps a conflict reason for a replication slot to > + * Maps a invalidation reason for a replication slot to > > s/a invalidation/an invalidation/? Will change. > 4 === > > While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in > InvalidatePossiblyObsoleteSlot()? That's inline with our understanding about conflict vs invalidation, and keeps the function generic. Will change. > 5 === > > + * rows_removed and wal_level_insufficient are only two reasons > > s/are only two/are the only two/? Will change.. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Following are some open points: > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > and Bertrand suggested to do it whenever we resume using the slot. I > think we should follow both the suggestions. Agree. I also think that pg_get_replication_slots() would be a good place, so that queries would return the right invalidation status. > > 4. New invalidation mechanisms interaction with slot sync feature. > > > > Yeah, this is important. My initial thoughts are that synced slots > shouldn't be invalidated on the standby due to timeout. +1 Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar to 'failover' property added recently by > > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > > > Yeah, I have something like that in mind. You can prepare the patch > > > but it would be good if others involved in this thread can also share > > > their opinion. > > > > I think it makes sense to put the inactive_timeout granularity at the slot > > level (as the activity could vary a lot say between one slot linked to a > > subcription and one linked to some plugins). As far max_slot_xid_age I've the > > feeling that a new GUC is good enough. > > Well, here I'm implementing the above idea. Thanks! > The attached v12 patches > majorly have the following changes: > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > replication slot data structure. Should last_inactive_at be tracked on disk? Say the engine is down for a period of time > inactive_timeout then the slot will be invalidated after the engine re-start (if no activity before we invalidate the slot). Should the time the engine is down be counted as "inactive" time? I've the feeling it should not, and that we should only take into account inactive time while the engine is up. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar to 'failover' property added recently by > > > > c393308b69d229b664391ac583b9e07418d411b6 and > > > > 73292404370c9900a96e2bebdc7144f7010339cf? > > > > > > Yeah, I have something like that in mind. You can prepare the patch > > > but it would be good if others involved in this thread can also share > > > their opinion. > > > > I think it makes sense to put the inactive_timeout granularity at the slot > > level (as the activity could vary a lot say between one slot linked to a > > subcription and one linked to some plugins). As far max_slot_xid_age I've the > > feeling that a new GUC is good enough. > > Well, here I'm implementing the above idea. The attached v12 patches > majorly have the following changes: > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would allow to change the timeout property? That would allow users to alter this property without the need to make a replication connection. But the issue is that it would make it inconsistent with the new inactivetimeout in the subscription that is added in "v12-0005". But do we need to display subinactivetimeout in pg_subscription (and even allow it at subscription creation / alter) after all? (I've the feeling there is less such a need as compare to subfailover, subtwophasestate for example). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Following are some open points: > > > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > > and Bertrand suggested to do it whenever we resume using the slot. I > > think we should follow both the suggestions. > > Agree. I also think that pg_get_replication_slots() would be a good place, so > that queries would return the right invalidation status.

I've addressed the review comments and am attaching the v13 patches with the following changes:

1. Invalidate replication slot due to inactive_timeout:
1.1 In CheckpointReplicationSlots() to help with automatic invalidation.
1.2 In pg_get_replication_slots to help readers see the latest slot information.
1.3 In ReplicationSlotAcquire for walsenders, as typically walsenders are the ones that use slots for longer durations for streaming standbys and logical subscribers.
1.4 In ReplicationSlotAcquire when called from pg_logical_slot_get_changes_guts, to help logical decoding clients by disallowing decoding from invalidated slots.
1.5 In ReplicationSlotAcquire when called from pg_replication_slot_advance, to disallow advancing invalidated slots.
2. Have a new input parameter bool check_for_invalidation for ReplicationSlotAcquire(). When true, check for inactive_timeout invalidation and, if invalidated, error out.
3. Have a new function to just do inactive_timeout invalidation.
4. Do not update last_inactive_at for failover slots on the standby, so as not to invalidate failover slots on the standby.
5. In ReplicationSlotAcquire(), invalidate the slot before making it active.
6. Make last_inactive_at a shared-memory parameter as opposed to an on-disk parameter, so that server downtime is not counted as inactive time.
7. Let the failover slot on the standby and pg_upgraded slots get the inactive_timeout parameter from the primary and the old cluster respectively.

Please see the attached v13 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v13-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v13-0002-Track-last_inactive_at-for-replication-slots-in-.patch
- v13-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v13-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v13-0005-Add-inactive_timeout-option-to-subscriptions.patch
- v13-0006-Add-inactive_timeout-based-replication-slot-inva.patch
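As a reference for items 1.3-1.5 and 2 above, the acquire-time check could look roughly like this; a sketch only, where the check_for_invalidation parameter, the proposed slot fields and the error wording are assumptions based on the description above, not the exact v13 code:

/*
 * Sketch only: inside ReplicationSlotAcquire(), after the named slot has
 * been found but before it is marked active, optionally invalidate and
 * refuse slots that have exceeded their inactive_timeout.
 */
if (check_for_invalidation &&
	s->data.inactive_timeout > 0 &&
	s->last_inactive_at != 0 &&
	TimestampDifferenceExceeds(s->last_inactive_at,
							   GetCurrentTimestamp(),
							   s->data.inactive_timeout * 1000 /* assumed seconds */ ))
{
	/* mark the slot invalidated first (details elided), then error out */
	ereport(ERROR,
			(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			 errmsg("cannot acquire invalidated replication slot \"%s\"",
					NameStr(s->data.name)),
			 errdetail("This slot has been invalidated because it was inactive for longer than the configured inactive_timeout.")));
}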
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", > shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would > allow to change the timeout property? > > That would allow users to alter this property without the need to make a > replication connection. +1 to add a new SQL function pg_alter_replication_slot(). It helps first create the slots and then later decide the appropriate inactive_timeout. It might grow into altering other slot parameters such as failover (I'm not sure if altering failover property on the primary after a while makes it the right candidate for syncing on the standby). Perhaps, we can add it for altering just inactive_timeout for now and be done with it. FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the failover property for logical slots, that's why it emits an error "cannot use ALTER_REPLICATION_SLOT with a physical replication slot" > But the issue is that it would make it inconsistent with the new inactivetimeout > in the subscription that is added in "v12-0005". Can you please elaborate what the inconsistency it causes with inactivetimeout? > But do we need to display > subinactivetimeout in pg_subscription (and even allow it at subscription creation > / alter) after all? (I've the feeling there is less such a need as compare to > subfailover, subtwophasestate for example). Maybe we don't need to. One can always trace down to the replication slot associated with the subscription on the publisher, and get to know what the slot's inactive_timeout setting is. However, it looks to me that it avoids one going to the publisher to know the inactive_timeout value for a subscription. Moreover, we are allowing the inactive_timeout to be set via CREATE/ALTER SUBSCRIPTION command, I believe there's nothing wrong if it's also part of the pg_subscription catalog. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > replication slot data structure. > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > of time > inactive_timeout then the slot will be invalidated after the engine > re-start (if no activity before we invalidate the slot). Should the time the > engine is down be counted as "inactive" time? I've the feeling it should not, and > that we should only take into account inactive time while the engine is up. > Good point. The question is how do we achieve this without persisting the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot had some valid value before we shut down but it still didn't cross the configured 'inactive_timeout' value, so, we won't be able to invalidate it. Now, after the restart, as we don't know the last_inactive_at's value before the shutdown, we will initialize it with 0 (this is what Bharath seems to have done in the latest v13-0002* patch). After this, even if walsender or backend never acquires the slot, we won't invalidate it. OTOH, if we track 'last_inactive_at' on the disk, after, restart, we could initialize it to the current time if the value is non-zero. Do you have any better ideas? -- With Regards, Amit Kapila.
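A small sketch of that suggestion, assuming last_inactive_at stays on disk and that RestoreSlotFromDisk() in slot.c (the existing startup path that reloads slots) is where this would happen:

/*
 * Sketch only: when reloading a slot at startup, don't count the downtime
 * as inactive time.  If the slot was inactive before the shutdown (non-zero
 * last_inactive_at), restart the clock from "now"; otherwise leave it unset
 * until the slot is first released.
 */
if (slot->data.inactive_timeout > 0 && slot->data.last_inactive_at != 0)
	slot->data.last_inactive_at = GetCurrentTimestamp();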
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 5:19 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Regarding v12-0004: "Allow setting inactive_timeout in the replication command", > > shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would > > allow to change the timeout property? > > > > That would allow users to alter this property without the need to make a > > replication connection. > > +1 to add a new SQL function pg_alter_replication_slot(). > I also don't see any obvious problem with such an API. However, this is not a good time to invent new APIs. Let's keep the feature simple and then we can extend it in the next version after more discussion and probably by that time we will get some feedback from the field as well. > > It helps > first create the slots and then later decide the appropriate > inactive_timeout. It might grow into altering other slot parameters > such as failover (I'm not sure if altering failover property on the > primary after a while makes it the right candidate for syncing on the > standby). Perhaps, we can add it for altering just inactive_timeout > for now and be done with it. > > FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the > failover property for logical slots, that's why it emits an error > "cannot use ALTER_REPLICATION_SLOT with a physical replication slot" > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > in the subscription that is added in "v12-0005". > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > I think the inconsistency can arise from the fact that on publisher one can change the inactive_timeout for the slot corresponding to a subscription but the subscriber won't know, so it will still show the old value. If we want we can document this as a limitation and let users be aware of it. However, I feel at this stage, let's not even expose this from the subscription or maybe we can discuss it once/if we are done with other patches. Anyway, if one wants to use this feature with a subscription, she can create a slot first on the publisher with inactive_timeout value and then associate such a slot with a required subscription. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I also don't see any obvious problem with such an API. However, this > is not a good time to invent new APIs. Let's keep the feature simple > and then we can extend it in the next version after more discussion > and probably by that time we will get some feedback from the field as > well. I couldn't agree more. > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > in the subscription that is added in "v12-0005". > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > I think the inconsistency can arise from the fact that on publisher > one can change the inactive_timeout for the slot corresponding to a > subscription but the subscriber won't know, so it will still show the > old value. Understood. > If we want we can document this as a limitation and let > users be aware of it. However, I feel at this stage, let's not even > expose this from the subscription or maybe we can discuss it once/if > we are done with other patches. Anyway, if one wants to use this > feature with a subscription, she can create a slot first on the > publisher with inactive_timeout value and then associate such a slot > with a required subscription. If we are not exposing it via subscription (meaning, we don't consider v13-0004 and v13-0005 patches), I feel we can have a new SQL API pg_alter_replication_slot(int inactive_timeout) for now just altering the inactive_timeout of a given slot. With this approach, one can do either of the following: 1) Create a slot with SQL API with inactive_timeout set, and use it for subscriptions or for streaming standbys. 2) Create a slot with SQL API without inactive_timeout set, use it for subscriptions or for streaming standbys, and set inactive_timeout later via pg_alter_replication_slot() depending on how the slot is consumed 3) Create a subscription with create_slot=true, and set inactive_timeout via pg_alter_replication_slot() depending on how the slot is consumed. This approach seems consistent and minimal to start with. If we agree on this, I'll drop both 0004 and 0005 that are allowing inactive_timeout to be set via replication commands and via create/alter subscription respectively, and implement pg_alter_replication_slot(). FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
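Just to illustrate how small that SQL API could be, a rough sketch of the backing C function; the function name comes from this discussion, while the signature, permission check and persistence steps are simplified assumptions:

#include "postgres.h"
#include "fmgr.h"
#include "replication/slot.h"
#include "storage/spin.h"

/*
 * Sketch only: pg_alter_replication_slot(slot_name name, inactive_timeout int4)
 * alters just the proposed inactive_timeout property of an existing slot.
 */
Datum
pg_alter_replication_slot(PG_FUNCTION_ARGS)
{
	Name		slotname = PG_GETARG_NAME(0);
	int32		inactive_timeout = PG_GETARG_INT32(1);

	CheckSlotPermissions();

	/* take the slot so nobody can drop or modify it concurrently */
	ReplicationSlotAcquire(NameStr(*slotname), true);

	SpinLockAcquire(&MyReplicationSlot->mutex);
	MyReplicationSlot->data.inactive_timeout = inactive_timeout;
	SpinLockRelease(&MyReplicationSlot->mutex);

	/* persist the change before letting go of the slot */
	ReplicationSlotMarkDirty();
	ReplicationSlotSave();
	ReplicationSlotRelease();

	PG_RETURN_VOID();
}

The catalog entry and user-facing documentation are omitted here; the point is just that altering a single slot property would not require a replication connection.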
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 8:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > replication slot data structure. > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > of time > inactive_timeout then the slot will be invalidated after the engine > > re-start (if no activity before we invalidate the slot). Should the time the > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > that we should only take into account inactive time while the engine is up. > > > > Good point. The question is how do we achieve this without persisting > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > had some valid value before we shut down but it still didn't cross the > configured 'inactive_timeout' value, so, we won't be able to > invalidate it. Now, after the restart, as we don't know the > last_inactive_at's value before the shutdown, we will initialize it > with 0 (this is what Bharath seems to have done in the latest > v13-0002* patch). After this, even if walsender or backend never > acquires the slot, we won't invalidate it. OTOH, if we track > 'last_inactive_at' on the disk, after, restart, we could initialize it > to the current time if the value is non-zero. Do you have any better > ideas? This sounds reasonable to me at least. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > replication slot data structure. > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > of time > inactive_timeout then the slot will be invalidated after the engine > > re-start (if no activity before we invalidate the slot). Should the time the > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > that we should only take into account inactive time while the engine is up. > > > > Good point. The question is how do we achieve this without persisting > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > had some valid value before we shut down but it still didn't cross the > configured 'inactive_timeout' value, so, we won't be able to > invalidate it. Now, after the restart, as we don't know the > last_inactive_at's value before the shutdown, we will initialize it > with 0 (this is what Bharath seems to have done in the latest > v13-0002* patch). After this, even if walsender or backend never > acquires the slot, we won't invalidate it. OTOH, if we track > 'last_inactive_at' on the disk, after, restart, we could initialize it > to the current time if the value is non-zero. Do you have any better > ideas? > I think that setting last_inactive_at when we restart makes sense if the slot has been active previously. I think the idea is that, because the slot is holding xmin/catalog_xmin, we don't want it to prevent row removal for longer than the timeout. So what about relying on xmin/catalog_xmin instead, like this:

- For physical slots, if xmin is set then set last_inactive_at to the current time at restart (else zero).
- For logical slots, it's not the same, as the catalog_xmin is set at slot creation time. So what about setting last_inactive_at to the current time at restart but also at creation time for logical slots? (Setting it to zero at creation time (as we do in v13) does not look right, given the fact that it's "already" holding a catalog_xmin.)

That way, we'd ensure that we are not holding rows for longer than the timeout and we don't need to persist last_inactive_at. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote: > On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > > in the subscription that is added in "v12-0005". > > > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > > > I think the inconsistency can arise from the fact that on publisher > > one can change the inactive_timeout for the slot corresponding to a > > subscription but the subscriber won't know, so it will still show the > > old value. Yeah, that was what I had in mind. > > If we want we can document this as a limitation and let > > users be aware of it. However, I feel at this stage, let's not even > > expose this from the subscription or maybe we can discuss it once/if > > we are done with other patches. I agree, it's important to expose it for things like "failover" but I think we can get rid of it for the timeout one. >> Anyway, if one wants to use this > > feature with a subscription, she can create a slot first on the > > publisher with inactive_timeout value and then associate such a slot > > with a required subscription. Right. > > If we are not exposing it via subscription (meaning, we don't consider > v13-0004 and v13-0005 patches), I feel we can have a new SQL API > pg_alter_replication_slot(int inactive_timeout) for now just altering > the inactive_timeout of a given slot. Agree, that seems more "natural" that going through a replication connection. > With this approach, one can do either of the following: > 1) Create a slot with SQL API with inactive_timeout set, and use it > for subscriptions or for streaming standbys. Yes. > 2) Create a slot with SQL API without inactive_timeout set, use it for > subscriptions or for streaming standbys, and set inactive_timeout > later via pg_alter_replication_slot() depending on how the slot is > consumed Yes. > 3) Create a subscription with create_slot=true, and set > inactive_timeout via pg_alter_replication_slot() depending on how the > slot is consumed. Yes. We could also do the above 3 and altering the timeout with a replication connection but the SQL API seems more natural to me. > > This approach seems consistent and minimal to start with. > > If we agree on this, I'll drop both 0004 and 0005 that are allowing > inactive_timeout to be set via replication commands and via > create/alter subscription respectively, and implement > pg_alter_replication_slot(). +1 on this. > FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard. Also I think we should ensure that one could "only" alter the timeout property for the time being (if not that could lead to the subscription inconsistency mentioned above). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > replication slot data structure. > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > re-start (if no activity before we invalidate the slot). Should the time the > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > that we should only take into account inactive time while the engine is up. > > > > > > > Good point. The question is how do we achieve this without persisting > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > had some valid value before we shut down but it still didn't cross the > > configured 'inactive_timeout' value, so, we won't be able to > > invalidate it. Now, after the restart, as we don't know the > > last_inactive_at's value before the shutdown, we will initialize it > > with 0 (this is what Bharath seems to have done in the latest > > v13-0002* patch). After this, even if walsender or backend never > > acquires the slot, we won't invalidate it. OTOH, if we track > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > to the current time if the value is non-zero. Do you have any better > > ideas? > > > > I think that setting last_inactive_at when we restart makes sense if the slot > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > and that we don't want to prevent rows removal longer that the timeout. > > So what about relying on xmin/catalog_xmin instead that way? > That doesn't sound like a great idea because xmin/catalog_xmin values won't tell us before restart whether it was active or not. It could have been inactive for long time before restart but the xmin values could still be valid. What about we always set 'last_inactive_at' at restart (if the slot's inactive_timeout has non-zero value) and reset it as soon as someone acquires that slot? Now, if the slot doesn't get acquired till 'inactive_timeout', checkpointer will invalidate the slot. -- With Regards, Amit Kapila.
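To make that flow concrete, here is a rough sketch of the per-slot check the checkpointer could run. This is not code from the posted patches: the helper name is invented, and it assumes an in-memory last_inactive_at TimestampTz on the slot (per the proposal above) plus an inactive_timeout stored in seconds in ReplicationSlotPersistentData.

static bool
SlotInactiveTimeoutElapsed(ReplicationSlot *slot, TimestampTz now)
{
    TimestampTz last_inactive_at;
    int         inactive_timeout;

    /* Read both fields under the slot's spinlock for a consistent view. */
    SpinLockAcquire(&slot->mutex);
    last_inactive_at = slot->last_inactive_at;
    inactive_timeout = slot->data.inactive_timeout;
    SpinLockRelease(&slot->mutex);

    /* Slot is currently acquired, or timeout-based invalidation is off. */
    if (last_inactive_at == 0 || inactive_timeout == 0)
        return false;

    return TimestampDifferenceExceeds(last_inactive_at, now,
                                      inactive_timeout * 1000);
}

A slot for which this returns true would then presumably be handed to the usual invalidation machinery with a new invalidation cause, much like wal_removed is today.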
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote: > > On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > But the issue is that it would make it inconsistent with the new inactivetimeout > > > > > in the subscription that is added in "v12-0005". > > > > > > > > Can you please elaborate what the inconsistency it causes with inactivetimeout? > > > > > > > I think the inconsistency can arise from the fact that on publisher > > > one can change the inactive_timeout for the slot corresponding to a > > > subscription but the subscriber won't know, so it will still show the > > > old value. > > Yeah, that was what I had in mind. > > > > If we want we can document this as a limitation and let > > > users be aware of it. However, I feel at this stage, let's not even > > > expose this from the subscription or maybe we can discuss it once/if > > > we are done with other patches. > > I agree, it's important to expose it for things like "failover" but I think we > can get rid of it for the timeout one. > > >> Anyway, if one wants to use this > > > feature with a subscription, she can create a slot first on the > > > publisher with inactive_timeout value and then associate such a slot > > > with a required subscription. > > Right. > > > > > If we are not exposing it via subscription (meaning, we don't consider > > v13-0004 and v13-0005 patches), I feel we can have a new SQL API > > pg_alter_replication_slot(int inactive_timeout) for now just altering > > the inactive_timeout of a given slot. > > Agree, that seems more "natural" that going through a replication connection. > > > With this approach, one can do either of the following: > > 1) Create a slot with SQL API with inactive_timeout set, and use it > > for subscriptions or for streaming standbys. > > Yes. > > > 2) Create a slot with SQL API without inactive_timeout set, use it for > > subscriptions or for streaming standbys, and set inactive_timeout > > later via pg_alter_replication_slot() depending on how the slot is > > consumed > > Yes. > > > 3) Create a subscription with create_slot=true, and set > > inactive_timeout via pg_alter_replication_slot() depending on how the > > slot is consumed. > > Yes. > > We could also do the above 3 and altering the timeout with a replication > connection but the SQL API seems more natural to me. > If we want to go with this then I think we should at least ensure that if one specified timeout via CREATE_REPLICATION_SLOT or ALTER_REPLICATION_SLOT that should be honored. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > > replication slot data structure. > > > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > > re-start (if no activity before we invalidate the slot). Should the time the > > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > > that we should only take into account inactive time while the engine is up. > > > > > > > > > > Good point. The question is how do we achieve this without persisting > > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > > had some valid value before we shut down but it still didn't cross the > > > configured 'inactive_timeout' value, so, we won't be able to > > > invalidate it. Now, after the restart, as we don't know the > > > last_inactive_at's value before the shutdown, we will initialize it > > > with 0 (this is what Bharath seems to have done in the latest > > > v13-0002* patch). After this, even if walsender or backend never > > > acquires the slot, we won't invalidate it. OTOH, if we track > > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > > to the current time if the value is non-zero. Do you have any better > > > ideas? > > > > > > > I think that setting last_inactive_at when we restart makes sense if the slot > > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > > and that we don't want to prevent rows removal longer that the timeout. > > > > So what about relying on xmin/catalog_xmin instead that way? > > > > That doesn't sound like a great idea because xmin/catalog_xmin values > won't tell us before restart whether it was active or not. It could > have been inactive for long time before restart but the xmin values > could still be valid. Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer than timeout. My concern was that we set catalog_xmin at logical slot creation time. So if we set last_inactive_at to zero at creation time and the slot is not used for a long period of time > timeout, then I think it's not helping there. > What about we always set 'last_inactive_at' at > restart (if the slot's inactive_timeout has non-zero value) and reset > it as soon as someone acquires that slot? Now, if the slot doesn't get > acquired till 'inactive_timeout', checkpointer will invalidate the > slot. Yeah that sounds good to me, but I think we should set last_inactive_at at creation time too, if not: - physical slot could remain valid for long time after creation (which is fine) but the behavior would change at restart. - logical slot would have the "issue" reported above (holding catalog_xmin). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 11:53:32AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > We could also do the above 3 and altering the timeout with a replication > > connection but the SQL API seems more natural to me. > > > > If we want to go with this then I think we should at least ensure that > if one specified timeout via CREATE_REPLICATION_SLOT or > ALTER_REPLICATION_SLOT that should be honored. Yeah, agree. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 05:05:46AM +0530, Bharath Rupireddy wrote: > On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote: > > > On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > Following are some open points: > > > > > > > > 1. Where to do inactive_timeout invalidation exactly if not the checkpointer. > > > > > > > I have suggested to do it at the time of CheckpointReplicationSlots() > > > and Bertrand suggested to do it whenever we resume using the slot. I > > > think we should follow both the suggestions. > > > > Agree. I also think that pg_get_replication_slots() would be a good place, so > > that queries would return the right invalidation status. > > I've addressed review comments and attaching the v13 patches with the > following changes: Thanks! v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that in the pg_replication_slots view, the invalidation_reason is "far away" from the conflicting field. I understand that one could query the fields individually but when describing the view or reading the doc, it seems more appropriate to see them closer. Also as "failover" and "synced" are also new in version 17, there is no risk to break order by "17,18" kind of queries (which are the failover and sync positions). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that > in the pg_replication_slots view, the invalidation_reason is "far away" from the > conflicting field. I understand that one could query the fields individually but > when describing the view or reading the doc, it seems more appropriate to see > them closer. Also as "failover" and "synced" are also new in version 17, there > is no risk to break order by "17,18" kind of queries (which are the failover > and sync positions). Hm, yeah, I can change that in the next version of the patches. Thanks. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 12:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote: > > On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > > > > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > > > > > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk > > > > > > replication slot data structure. > > > > > > > > > > Should last_inactive_at be tracked on disk? Say the engine is down for a period > > > > > of time > inactive_timeout then the slot will be invalidated after the engine > > > > > re-start (if no activity before we invalidate the slot). Should the time the > > > > > engine is down be counted as "inactive" time? I've the feeling it should not, and > > > > > that we should only take into account inactive time while the engine is up. > > > > > > > > > > > > > Good point. The question is how do we achieve this without persisting > > > > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot > > > > had some valid value before we shut down but it still didn't cross the > > > > configured 'inactive_timeout' value, so, we won't be able to > > > > invalidate it. Now, after the restart, as we don't know the > > > > last_inactive_at's value before the shutdown, we will initialize it > > > > with 0 (this is what Bharath seems to have done in the latest > > > > v13-0002* patch). After this, even if walsender or backend never > > > > acquires the slot, we won't invalidate it. OTOH, if we track > > > > 'last_inactive_at' on the disk, after, restart, we could initialize it > > > > to the current time if the value is non-zero. Do you have any better > > > > ideas? > > > > > > > > > > I think that setting last_inactive_at when we restart makes sense if the slot > > > has been active previously. I think the idea is because it's holding xmin/catalog_xmin > > > and that we don't want to prevent rows removal longer that the timeout. > > > > > > So what about relying on xmin/catalog_xmin instead that way? > > > > > > > That doesn't sound like a great idea because xmin/catalog_xmin values > > won't tell us before restart whether it was active or not. It could > > have been inactive for long time before restart but the xmin values > > could still be valid. > > Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer > than timeout. > > My concern was that we set catalog_xmin at logical slot creation time. So if we > set last_inactive_at to zero at creation time and the slot is not used for a long > period of time > timeout, then I think it's not helping there. > But, we do call ReplicationSlotRelease() after slot creation. For example, see CreateReplicationSlot(). So wouldn't that take care of the case you are worried about? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > My concern was that we set catalog_xmin at logical slot creation time. So if we > > set last_inactive_at to zero at creation time and the slot is not used for a long > > period of time > timeout, then I think it's not helping there. > > But, we do call ReplicationSlotRelease() after slot creation. For > example, see CreateReplicationSlot(). So wouldn't that take care of > the case you are worried about? Right. That's true even for pg_create_physical_replication_slot and pg_create_logical_replication_slot. AFAICS, setting it to the current timestamp in ReplicationSlotRelease suffices unless I'm missing something. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 2:44 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that > > in the pg_replication_slots view, the invalidation_reason is "far away" from the > > conflicting field. I understand that one could query the fields individually but > > when describing the view or reading the doc, it seems more appropriate to see > > them closer. Also as "failover" and "synced" are also new in version 17, there > > is no risk to break order by "17,18" kind of queries (which are the failover > > and sync positions). > > Hm, yeah, I can change that in the next version of the patches. Thanks. > This makes sense to me. Apart from this, few more comments on 0001. 1. --- a/src/bin/pg_upgrade/info.c +++ b/src/bin/pg_upgrade/info.c @@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check) * removed. */ res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, " - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " "FROM pg_catalog.pg_replication_slots " "WHERE slot_type = 'logical' AND " "database = current_database() AND " "temporary IS FALSE;", live_check ? "FALSE" : - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " + "(CASE WHEN conflicting THEN FALSE " I think here at both places we need to change 'conflict_reason' to 'conflicting'. 2. + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>invalidation_reason</structfield> <type>text</type> + </para> + <para> + The reason for the slot's invalidation. It is set for both logical and + physical slots. <literal>NULL</literal> if the slot is not invalidated. + Possible values are: + <itemizedlist spacing="compact"> + <listitem> + <para> + <literal>wal_removed</literal> means that the required WAL has been + removed. + </para> + </listitem> + <listitem> + <para> + <literal>rows_removed</literal> means that the required rows have + been removed. + </para> + </listitem> + <listitem> + <para> + <literal>wal_level_insufficient</literal> means that the + primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to + perform logical decoding. + </para> Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for physical slots? If not, then it is not clear from above text. 3. -# Verify slots are reported as non conflicting in pg_replication_slots +# Verify slots are reported as valid in pg_replication_slots is( $node_standby->safe_psql( 'postgres', q[select bool_or(conflicting) from - (select conflict_reason is not NULL as conflicting - from pg_replication_slots WHERE slot_type = 'logical')]), + (select conflicting from pg_replication_slots + where slot_type = 'logical')]), 'f', - 'Logical slots are reported as non conflicting'); + 'Logical slots are reported as valid'); I don't think we need to change the comment or success message in this test. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Thu, Mar 21, 2024 at 04:13:31PM +0530, Bharath Rupireddy wrote: > On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > My concern was that we set catalog_xmin at logical slot creation time. So if we > > > set last_inactive_at to zero at creation time and the slot is not used for a long > > > period of time > timeout, then I think it's not helping there. > > > > But, we do call ReplicationSlotRelease() after slot creation. For > > example, see CreateReplicationSlot(). So wouldn't that take care of > > the case you are worried about? > > Right. That's true even for pg_create_physical_replication_slot and > pg_create_logical_replication_slot. AFAICS, setting it to the current > timestamp in ReplicationSlotRelease suffices unless I'm missing > something. Right, but we have: " if (set_last_inactive_at && slot->data.persistency == RS_PERSISTENT) { /* * There's no point in allowing failover slots to get invalidated * based on slot's inactive_timeout parameter on standby. The failover * slots simply get synced from the primary on the standby. */ if (!(RecoveryInProgress() && slot->data.failover)) { SpinLockAcquire(&slot->mutex); slot->last_inactive_at = GetCurrentTimestamp(); SpinLockRelease(&slot->mutex); } } " while we set set_last_inactive_at to false at creation time so that last_inactive_at is not set to GetCurrentTimestamp(). We should set set_last_inactive_at to true if a timeout is provided during the slot creation. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 21, 2024 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > This makes sense to me. Apart from this, few more comments on 0001. Thanks for looking into it. > 1. > - "%s as caught_up, conflict_reason IS NOT NULL as invalid " > + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " > live_check ? "FALSE" : > - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " > + "(CASE WHEN conflicting THEN FALSE " > > I think here at both places we need to change 'conflict_reason' to > 'conflicting'. Basically, the idea there is to not live_check for invalidated logical slots. It has nothing to do with conflicting. Up until now, conflict_reason is also reporting wal_removed (although wrongly including rows_removed, wal_level_insufficient, the two reasons for conflicts). So, I think invalidation_reason is right for invalid column. Also, I think we need to change conflicting to invalidation_reason for live_check. So, I've changed that to use invalidation_reason for both columns. > 2. > > Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for > physical slots? No. They can only occur for logical slots, check InvalidatePossiblyObsoleteSlot, only the logical slots get invalidated. > If not, then it is not clear from above text. I've stated that "It is set only for logical slots." for rows_removed and wal_level_insufficient. Other reasons can occur for both slots. > 3. > -# Verify slots are reported as non conflicting in pg_replication_slots > +# Verify slots are reported as valid in pg_replication_slots > is( $node_standby->safe_psql( > 'postgres', > q[select bool_or(conflicting) from > - (select conflict_reason is not NULL as conflicting > - from pg_replication_slots WHERE slot_type = 'logical')]), > + (select conflicting from pg_replication_slots > + where slot_type = 'logical')]), > 'f', > - 'Logical slots are reported as non conflicting'); > + 'Logical slots are reported as valid'); > > I don't think we need to change the comment or success message in this test. Yes. There the intention of the test case is to verify logical slots are reported as non conflicting. So, I changed them. Please find the v14-0001 patch for now. I'll post the other patches soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the v14-0001 patch for now. I'll post the other patches soon. > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 and then I'll push it. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 10:49:17AM +0530, Amit Kapila wrote: > On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. Thanks! > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > and then I'll push it. LGTM too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > Please find the v14-0001 patch for now. > > Thanks! > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > and then I'll push it. > > LGTM too. Thanks. Here I'm implementing the following: 0001 Track invalidation_reason in pg_replication_slots 0002 Track last_inactive_at in pg_replication_slots 0003 Allow setting inactive_timeout for replication slots via SQL API 0004 Introduce new SQL function pg_alter_replication_slot 0005 Allow setting inactive_timeout in the replication command 0006 Add inactive_timeout based replication slot invalidation 1. Keep last_inactive_at as a shared memory variable, but always set it at restart if the slot's inactive_timeout has non-zero value and reset it as soon as someone acquires that slot so that if the slot doesn't get acquired till inactive_timeout, checkpointer will invalidate the slot. 2. Ensure with pg_alter_replication_slot one could "only" alter the timeout property for the time being; if not, that could lead to the subscription inconsistency. 3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about using an existing slot to leverage the inactive_timeout feature. 4. last_inactive_at should also be set to the current time during slot creation because if one creates a slot and does nothing with it then it's the time it starts to be inactive. 5. We don't set last_inactive_at to GetCurrentTimestamp() for failover slots. 6. Leave out the patch that added support for inactive_timeout in subscriptions. Please see the attached v14 patch set. No change in the attached v14-0001 from the previous patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v14-0001-Track-invalidation_reason-in-pg_replication_slot.patch
- v14-0002-Track-last_inactive_at-in-pg_replication_slots.patch
- v14-0003-Allow-setting-inactive_timeout-for-replication-s.patch
- v14-0004-Introduce-new-SQL-funtion-pg_alter_replication_s.patch
- v14-0005-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v14-0006-Add-inactive_timeout-based-replication-slot-inva.patch
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. > > > > Thanks! > > > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > > and then I'll push it. > > > > LGTM too. > > Thanks. Here I'm implementing the following: Thanks! > 0001 Track invalidation_reason in pg_replication_slots > 0002 Track last_inactive_at in pg_replication_slots > 0003 Allow setting inactive_timeout for replication slots via SQL API > 0004 Introduce new SQL funtion pg_alter_replication_slot > 0005 Allow setting inactive_timeout in the replication command > 0006 Add inactive_timeout based replication slot invalidation > > 1. Keep it last_inactive_at as a shared memory variable, but always > set it at restart if the slot's inactive_timeout has non-zero value > and reset it as soon as someone acquires that slot so that if the slot > doesn't get acquired till inactive_timeout, checkpointer will > invalidate the slot. > 4. last_inactive_at should also be set to the current time during slot > creation because if one creates a slot and does nothing with it then > it's the time it starts to be inactive. I did not look at the code yet but just tested the behavior. It works as you describe it but I think this behavior is weird because: - when we create a slot without a timeout then last_inactive_at is set. I think that's fine, but then: - when we restart the engine, then last_inactive_at is gone (as timeout is not set). I think last_inactive_at should be set also at engine restart even if there is no timeout. I don't think we should link both. Changing my mind here on this subject due to the testing. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > 0001 Track invalidation_reason in pg_replication_slots > > 0002 Track last_inactive_at in pg_replication_slots > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > 0005 Allow setting inactive_timeout in the replication command > > 0006 Add inactive_timeout based replication slot invalidation > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > set it at restart if the slot's inactive_timeout has non-zero value > > and reset it as soon as someone acquires that slot so that if the slot > > doesn't get acquired till inactive_timeout, checkpointer will > > invalidate the slot. > > 4. last_inactive_at should also be set to the current time during slot > > creation because if one creates a slot and does nothing with it then > > it's the time it starts to be inactive. > > I did not look at the code yet but just tested the behavior. It works as you > describe it but I think this behavior is weird because: > > - when we create a slot without a timeout then last_inactive_at is set. I think > that's fine, but then: > - when we restart the engine, then last_inactive_at is gone (as timeout is not > set). > > I think last_inactive_at should be set also at engine restart even if there is > no timeout. I think it is the opposite. Why do we need to set 'last_inactive_at' when inactive_timeout is not set? BTW, haven't we discussed that we don't need to set 'last_inactive_at' at the time of slot creation as it is sufficient to set it at the time ReplicationSlotRelease()? A few other comments: ================== 1. @@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS L.invalidation_reason, L.failover, L.synced, - L.last_inactive_at + L.last_inactive_at, + L.inactive_timeout I think it would be better to keep 'inactive_timeout' ahead of 'last_inactive_at' as that is the primary field. In major versions, we don't have to strictly keep the new fields at the end. In this case, it seems better to keep these two new fields after two_phase so that these are before invalidation_reason where we can show the invalidation due to these fields. 2. void -ReplicationSlotRelease(void) +ReplicationSlotRelease(bool set_last_inactive_at) Why do we need a parameter here? Can't we directly check from the slot whether 'inactive_timeout' has a non-zero value? 3. + /* + * There's no point in allowing failover slots to get invalidated + * based on slot's inactive_timeout parameter on standby. The failover + * slots simply get synced from the primary on the standby. + */ + if (!(RecoveryInProgress() && slot->data.failover)) I think you need to check 'sync' flag instead of 'failover'. Generally, failover marker slots should be invalidated either on primary or standby unless on standby the 'failover' marked slot is synced from the primary. 4. I feel the patches should be arranged like 0003->0001, 0002->0002, 0006->0003. We can leave remaining for the time being till we get these three patches (all three need to be committed as one but it is okay to keep them separate for review) committed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Please find the v14-0001 patch for now. > > > > Thanks! > > > > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001 > > > and then I'll push it. > > > > LGTM too. > > > Please see the attached v14 patch set. No change in the attached > v14-0001 from the previous patch. Looking at v14-0002: 1 === @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) ConditionVariableBroadcast(&slot->active_cv); } + if (slot->data.persistency == RS_PERSISTENT) + { + SpinLockAcquire(&slot->mutex); + slot->last_inactive_at = GetCurrentTimestamp(); + SpinLockRelease(&slot->mutex); + } I'm not sure we should do system calls while we're holding a spinlock. Assign a variable before? 2 === Also, what about moving this here? " if (slot->data.persistency == RS_PERSISTENT) { /* * Mark persistent slot inactive. We're not freeing it, just * disconnecting, but wake up others that may be waiting for it. */ SpinLockAcquire(&slot->mutex); slot->active_pid = 0; SpinLockRelease(&slot->mutex); ConditionVariableBroadcast(&slot->active_cv); } " That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". 3 === @@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name) slot->in_use = true; slot->active_pid = 0; + slot->last_inactive_at = 0; I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I think it's better to do it in 0002 (and not taking care of inactive_timeout). 4 === Track last_inactive_at in pg_replication_slots doc/src/sgml/system-views.sgml | 11 +++++++++++ src/backend/catalog/system_views.sql | 3 ++- src/backend/replication/slot.c | 16 ++++++++++++++++ src/backend/replication/slotfuncs.c | 7 ++++++- src/include/catalog/pg_proc.dat | 6 +++--- src/include/replication/slot.h | 3 +++ src/test/regress/expected/rules.out | 5 +++-- 7 files changed, 44 insertions(+), 7 deletions(-) Worth to add some tests too (or we postpone them in future commits because we're confident enough they will follow soon)? 5 === Most of the fields that reflect a time (not duration) in the system views are xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use something like "last_inactive_time"? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
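Putting comments 1 and 2 together, the release path could end up looking roughly like the following (a sketch based on the RS_PERSISTENT block quoted above, not the actual v14 code):

if (slot->data.persistency == RS_PERSISTENT)
{
    /* Take the timestamp before the spinlock, per comment 1 above. */
    TimestampTz now = GetCurrentTimestamp();

    /*
     * Mark persistent slot inactive. We're not freeing it, just
     * disconnecting, but wake up others that may be waiting for it.
     */
    SpinLockAcquire(&slot->mutex);
    slot->active_pid = 0;
    slot->last_inactive_at = now;
    SpinLockRelease(&slot->mutex);
    ConditionVariableBroadcast(&slot->active_cv);
}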
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > 0002 Track last_inactive_at in pg_replication_slots > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > 0005 Allow setting inactive_timeout in the replication command > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > set it at restart if the slot's inactive_timeout has non-zero value > > > and reset it as soon as someone acquires that slot so that if the slot > > > doesn't get acquired till inactive_timeout, checkpointer will > > > invalidate the slot. > > > 4. last_inactive_at should also be set to the current time during slot > > > creation because if one creates a slot and does nothing with it then > > > it's the time it starts to be inactive. > > > > I did not look at the code yet but just tested the behavior. It works as you > > describe it but I think this behavior is weird because: > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > that's fine, but then: > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > set). > > > > I think last_inactive_at should be set also at engine restart even if there is > > no timeout. > > I think it is the opposite. Why do we need to set 'last_inactive_at' > when inactive_timeout is not set? I think those are unrelated, one could want to know when a slot has been inactive even if no timeout is set. I understand that for this patch series we have in mind to use them both to invalidate slots but I think that there is use case to not use both in correlation. Also not setting last_inactive_at could give the "false" impression that the slot is active. > BTW, haven't we discussed that we > don't need to set 'last_inactive_at' at the time of slot creation as > it is sufficient to set it at the time ReplicationSlotRelease()? Right. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > 1 === > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > ConditionVariableBroadcast(&slot->active_cv); > } > > + if (slot->data.persistency == RS_PERSISTENT) > + { > + SpinLockAcquire(&slot->mutex); > + slot->last_inactive_at = GetCurrentTimestamp(); > + SpinLockRelease(&slot->mutex); > + } > > I'm not sure we should do system calls while we're holding a spinlock. > Assign a variable before? > > 2 === > > Also, what about moving this here? > > " > if (slot->data.persistency == RS_PERSISTENT) > { > /* > * Mark persistent slot inactive. We're not freeing it, just > * disconnecting, but wake up others that may be waiting for it. > */ > SpinLockAcquire(&slot->mutex); > slot->active_pid = 0; > SpinLockRelease(&slot->mutex); > ConditionVariableBroadcast(&slot->active_cv); > } > " > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > That sounds like a good idea. Also, don't we need to consider physical slots where we don't reserve WAL during slot creation? I don't think there is a need to set inactive_at for such slots. If we agree, probably checking restart_lsn should suffice the need to know whether the WAL is reserved or not. > > 5 === > > Most of the fields that reflect a time (not duration) in the system views are > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > something like "last_inactive_time"? > How about naming it as last_active_time? This will indicate the time at which the slot was last active. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > > 0002 Track last_inactive_at in pg_replication_slots > > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > > 0005 Allow setting inactive_timeout in the replication command > > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > > set it at restart if the slot's inactive_timeout has non-zero value > > > > and reset it as soon as someone acquires that slot so that if the slot > > > > doesn't get acquired till inactive_timeout, checkpointer will > > > > invalidate the slot. > > > > 4. last_inactive_at should also be set to the current time during slot > > > > creation because if one creates a slot and does nothing with it then > > > > it's the time it starts to be inactive. > > > > > > I did not look at the code yet but just tested the behavior. It works as you > > > describe it but I think this behavior is weird because: > > > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > > that's fine, but then: > > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > > set). > > > > > > I think last_inactive_at should be set also at engine restart even if there is > > > no timeout. > > > > I think it is the opposite. Why do we need to set 'last_inactive_at' > > when inactive_timeout is not set? > > I think those are unrelated, one could want to know when a slot has been inactive > even if no timeout is set. I understand that for this patch series we have in mind > to use them both to invalidate slots but I think that there is use case to not > use both in correlation. Also not setting last_inactive_at could give the "false" > impression that the slot is active. > I see your point and agree with this. I feel we can commit this part first then, probably that is the reason Bharath has kept it as a separate patch. It would be good add the use case for this patch in the commit message. A minor comment: if (SlotIsLogical(s)) pgstat_acquire_replslot(s); + if (s->data.persistency == RS_PERSISTENT) + { + SpinLockAcquire(&s->mutex); + s->last_inactive_at = 0; + SpinLockRelease(&s->mutex); + } + I think this part of the change needs a comment. -- With Regards, Amit Kapila.
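For that last point, one possible wording of the acquire-side reset could be (sketch only, not the patch's actual comment):

if (s->data.persistency == RS_PERSISTENT)
{
    /*
     * The slot is being acquired, so it is no longer inactive; clear
     * last_inactive_at so that the time the slot spends in active use
     * is not counted towards inactive_timeout.
     */
    SpinLockAcquire(&s->mutex);
    s->last_inactive_at = 0;
    SpinLockRelease(&s->mutex);
}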
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > 1 === > > > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > > ConditionVariableBroadcast(&slot->active_cv); > > } > > > > + if (slot->data.persistency == RS_PERSISTENT) > > + { > > + SpinLockAcquire(&slot->mutex); > > + slot->last_inactive_at = GetCurrentTimestamp(); > > + SpinLockRelease(&slot->mutex); > > + } > > > > I'm not sure we should do system calls while we're holding a spinlock. > > Assign a variable before? > > > > 2 === > > > > Also, what about moving this here? > > > > " > > if (slot->data.persistency == RS_PERSISTENT) > > { > > /* > > * Mark persistent slot inactive. We're not freeing it, just > > * disconnecting, but wake up others that may be waiting for it. > > */ > > SpinLockAcquire(&slot->mutex); > > slot->active_pid = 0; > > SpinLockRelease(&slot->mutex); > > ConditionVariableBroadcast(&slot->active_cv); > > } > > " > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > That sounds like a good idea. Also, don't we need to consider physical > slots where we don't reserve WAL during slot creation? I don't think > there is a need to set inactive_at for such slots. If the slot is not active, why shouldn't we set inactive_at? I can understand that such a slots do not present "any risks" but I think we should still set inactive_at (also to not give the false impression that the slot is active). > > 5 === > > > > Most of the fields that reflect a time (not duration) in the system views are > > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > > something like "last_inactive_time"? > > > > How about naming it as last_active_time? This will indicate the time > at which the slot was last active. I thought about it too but I think it could be missleading as one could think that it should be updated each time WAL record decoding is happening. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 04:16:19PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote: > > > On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > > > > 0001 Track invalidation_reason in pg_replication_slots > > > > > 0002 Track last_inactive_at in pg_replication_slots > > > > > 0003 Allow setting inactive_timeout for replication slots via SQL API > > > > > 0004 Introduce new SQL funtion pg_alter_replication_slot > > > > > 0005 Allow setting inactive_timeout in the replication command > > > > > 0006 Add inactive_timeout based replication slot invalidation > > > > > > > > > > 1. Keep it last_inactive_at as a shared memory variable, but always > > > > > set it at restart if the slot's inactive_timeout has non-zero value > > > > > and reset it as soon as someone acquires that slot so that if the slot > > > > > doesn't get acquired till inactive_timeout, checkpointer will > > > > > invalidate the slot. > > > > > 4. last_inactive_at should also be set to the current time during slot > > > > > creation because if one creates a slot and does nothing with it then > > > > > it's the time it starts to be inactive. > > > > > > > > I did not look at the code yet but just tested the behavior. It works as you > > > > describe it but I think this behavior is weird because: > > > > > > > > - when we create a slot without a timeout then last_inactive_at is set. I think > > > > that's fine, but then: > > > > - when we restart the engine, then last_inactive_at is gone (as timeout is not > > > > set). > > > > > > > > I think last_inactive_at should be set also at engine restart even if there is > > > > no timeout. > > > > > > I think it is the opposite. Why do we need to set 'last_inactive_at' > > > when inactive_timeout is not set? > > > > I think those are unrelated, one could want to know when a slot has been inactive > > even if no timeout is set. I understand that for this patch series we have in mind > > to use them both to invalidate slots but I think that there is use case to not > > use both in correlation. Also not setting last_inactive_at could give the "false" > > impression that the slot is active. > > > > I see your point and agree with this. I feel we can commit this part > first then, Agree that in this case the current ordering makes sense (as setting last_inactive_at would be completly unrelated to the timeout). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Fri, Mar 22, 2024 at 7:15 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > > > Please find the v14-0001 patch for now.
> >
> > Thanks!
> >
> > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001
> > > and then I'll push it.
> >
> > LGTM too.
> Thanks. Here I'm implementing the following:
> 0001 Track invalidation_reason in pg_replication_slots
> 0002 Track last_inactive_at in pg_replication_slots
> 0003 Allow setting inactive_timeout for replication slots via SQL API
> 0004 Introduce new SQL function pg_alter_replication_slot
> 0005 Allow setting inactive_timeout in the replication command
> 0006 Add inactive_timeout based replication slot invalidation
> 1. Keep last_inactive_at as a shared memory variable, but always
> set it at restart if the slot's inactive_timeout has non-zero value
> and reset it as soon as someone acquires that slot so that if the slot
> doesn't get acquired till inactive_timeout, checkpointer will
> invalidate the slot.
> 2. Ensure with pg_alter_replication_slot one could "only" alter the
> timeout property for the time being; if not, that could lead to the
> subscription inconsistency.
> 3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about
> using an existing slot to leverage the inactive_timeout feature.
> 4. last_inactive_at should also be set to the current time during slot
> creation because if one creates a slot and does nothing with it then
> it's the time it starts to be inactive.
> 5. We don't set last_inactive_at to GetCurrentTimestamp() for failover slots.
> 6. Leave out the patch that added support for inactive_timeout in subscriptions.
> Please see the attached v14 patch set. No change in the attached
> v14-0001 from the previous patch.
Some comments:
1. In patch 0005:
In ReplicationSlotAlter():
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
Can't you make it shorter like below:
lock_acquired = false;
if (MyReplicationSlot->data.failover != failover || MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
SpinLockAcquire(&MyReplicationSlot->mutex);
lock_acquired = true;
}
if (MyReplicationSlot->data.failover != failover) {
MyReplicationSlot->data.failover = failover;
}
if (MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
MyReplicationSlot->data.inactive_timeout = inactive_timeout;
}
if (lock_acquired) {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
ReplicationSlotSave();
}
2. In patch 0005: why change the walrcv_alter_slot option? It doesn't seem to be used anywhere; is there a use case for it? If required, would the intention be to add this as a Create Subscription option?
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote: > > > > > > 1 === > > > > > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > > > ConditionVariableBroadcast(&slot->active_cv); > > > } > > > > > > + if (slot->data.persistency == RS_PERSISTENT) > > > + { > > > + SpinLockAcquire(&slot->mutex); > > > + slot->last_inactive_at = GetCurrentTimestamp(); > > > + SpinLockRelease(&slot->mutex); > > > + } > > > > > > I'm not sure we should do system calls while we're holding a spinlock. > > > Assign a variable before? > > > > > > 2 === > > > > > > Also, what about moving this here? > > > > > > " > > > if (slot->data.persistency == RS_PERSISTENT) > > > { > > > /* > > > * Mark persistent slot inactive. We're not freeing it, just > > > * disconnecting, but wake up others that may be waiting for it. > > > */ > > > SpinLockAcquire(&slot->mutex); > > > slot->active_pid = 0; > > > SpinLockRelease(&slot->mutex); > > > ConditionVariableBroadcast(&slot->active_cv); > > > } > > > " > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > slots where we don't reserve WAL during slot creation? I don't think > > there is a need to set inactive_at for such slots. > > If the slot is not active, why shouldn't we set inactive_at? I can understand > that such a slots do not present "any risks" but I think we should still set > inactive_at (also to not give the false impression that the slot is active). > But OTOH, there is a chance that we will invalidate such slots even though they have never reserved WAL in the first place which doesn't appear to be a good thing. > > > 5 === > > > > > > Most of the fields that reflect a time (not duration) in the system views are > > > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > > > something like "last_inactive_time"? > > > > > > > How about naming it as last_active_time? This will indicate the time > > at which the slot was last active. > > I thought about it too but I think it could be missleading as one could think that > it should be updated each time WAL record decoding is happening. > Fair enough. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote: > On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > > slots where we don't reserve WAL during slot creation? I don't think > > > there is a need to set inactive_at for such slots. > > > > If the slot is not active, why shouldn't we set inactive_at? I can understand > > that such a slots do not present "any risks" but I think we should still set > > inactive_at (also to not give the false impression that the slot is active). > > > > But OTOH, there is a chance that we will invalidate such slots even > though they have never reserved WAL in the first place which doesn't > appear to be a good thing. That's right, but I don't think that is a problem. I think we should treat inactive_at as an independent field (as if the timeout one did not exist at all) and just focus on its meaning (slot being inactive). If one sets a timeout (> 0) and gets an invalidation then I think it works as designed (even if the slot does not present any "risk" as it does not hold any rows or WAL). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Looking at v14-0002: Thanks for reviewing. I agree that 0002 with last_inactive_at can go independently and be of use on its own in addition to helping implement inactive_timeout based invalidation. > 1 === > > @@ -691,6 +699,13 @@ ReplicationSlotRelease(void) > ConditionVariableBroadcast(&slot->active_cv); > } > > + if (slot->data.persistency == RS_PERSISTENT) > + { > + SpinLockAcquire(&slot->mutex); > + slot->last_inactive_at = GetCurrentTimestamp(); > + SpinLockRelease(&slot->mutex); > + } > > I'm not sure we should do system calls while we're holding a spinlock. > Assign a variable before? Can do that. Then, the last_inactive_at = current_timestamp + mutex acquire time. But, that shouldn't be a problem than doing system calls while holding the mutex. So, done that way. > 2 === > > Also, what about moving this here? > > " > if (slot->data.persistency == RS_PERSISTENT) > { > /* > * Mark persistent slot inactive. We're not freeing it, just > * disconnecting, but wake up others that may be waiting for it. > */ > SpinLockAcquire(&slot->mutex); > slot->active_pid = 0; > SpinLockRelease(&slot->mutex); > ConditionVariableBroadcast(&slot->active_cv); > } > " > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". Ugh. Done that now. > 3 === > > @@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name) > > slot->in_use = true; > slot->active_pid = 0; > + slot->last_inactive_at = 0; > > I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I > think it's better to do it in 0002 (and not taking care of inactive_timeout). Done. > 4 === > > Track last_inactive_at in pg_replication_slots > > doc/src/sgml/system-views.sgml | 11 +++++++++++ > src/backend/catalog/system_views.sql | 3 ++- > src/backend/replication/slot.c | 16 ++++++++++++++++ > src/backend/replication/slotfuncs.c | 7 ++++++- > src/include/catalog/pg_proc.dat | 6 +++--- > src/include/replication/slot.h | 3 +++ > src/test/regress/expected/rules.out | 5 +++-- > 7 files changed, 44 insertions(+), 7 deletions(-) > > Worth to add some tests too (or we postpone them in future commits because we're > confident enough they will follow soon)? Yes. Added some tests in a new TAP test file named src/test/recovery/t/043_replslot_misc.pl. This new file can be used to add miscellaneous replication tests in future as well. I couldn't find a better place in existing test files - tried having the new tests for physical slots in t/001_stream_rep.pl and I didn't find a right place for logical slots. > 5 === > > Most of the fields that reflect a time (not duration) in the system views are > xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use > something like "last_inactive_time"? Yeah, I can see that. So, I changed it to last_inactive_time. I agree with treating last_inactive_time as a separate property of the slot having its own use in addition to helping implement inactive_timeout based invalidation. I think it can go separately. I tried to address the review comments received for this patch alone and attached v15-0001. I'll post other patches soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
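To make review comments 1 and 2 above concrete, here is a minimal sketch of the consolidated release path they are asking for. It assumes the patch's last_inactive_at field and is illustrative only, not the exact hunk that ended up in v15:

    if (slot->data.persistency == RS_PERSISTENT)
    {
        /* Read the clock before taking the spinlock; no system call under the lock. */
        TimestampTz now = GetCurrentTimestamp();

        /*
         * Mark persistent slot inactive.  We're not freeing it, just
         * disconnecting, but wake up others that may be waiting for it.
         */
        SpinLockAcquire(&slot->mutex);
        slot->active_pid = 0;
        slot->last_inactive_at = now;
        SpinLockRelease(&slot->mutex);
        ConditionVariableBroadcast(&slot->active_cv);
    }

With this shape the persistency test is done only once, and the only cost is that last_inactive_at may lag the actual release by the time it takes to acquire the mutex, which is the trade-off discussed above.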
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 22, 2024 at 7:17 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote: > > On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote: > > > > > > > > > > That would avoid testing twice "slot->data.persistency == RS_PERSISTENT". > > > > > > > > > > > > > That sounds like a good idea. Also, don't we need to consider physical > > > > slots where we don't reserve WAL during slot creation? I don't think > > > > there is a need to set inactive_at for such slots. > > > > > > If the slot is not active, why shouldn't we set inactive_at? I can understand > > > that such a slots do not present "any risks" but I think we should still set > > > inactive_at (also to not give the false impression that the slot is active). > > > > > > > But OTOH, there is a chance that we will invalidate such slots even > > though they have never reserved WAL in the first place which doesn't > > appear to be a good thing. > > That's right but I don't think it is not a good thing. I think we should treat > inactive_at as an independent field (like if the timeout one does not exist at > all) and just focus on its meaning (slot being inactive). If one sets a timeout > (> 0) and gets an invalidation then I think it works as designed (even if the > slot does not present any "risk" as it does not hold any rows or WAL). > Fair point. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Mar 23, 2024 at 3:02 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Worth to add some tests too (or we postpone them in future commits because we're > > confident enough they will follow soon)? > > Yes. Added some tests in a new TAP test file named > src/test/recovery/t/043_replslot_misc.pl. This new file can be used to > add miscellaneous replication tests in future as well. I couldn't find > a better place in existing test files - tried having the new tests for > physical slots in t/001_stream_rep.pl and I didn't find a right place > for logical slots. > How about adding the test in 019_replslot_limit? It is not a direct fit but I feel later we can even add 'invalid_timeout' related tests in this file which will use last_inactive_time feature. It is also possible that some of the tests added by the 'invalid_timeout' feature will obviate the need for some of these tests. Review of v15 ============== 1. @@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS L.conflicting, L.invalidation_reason, L.failover, - L.synced + L.synced, + L.last_inactive_time FROM pg_get_replication_slots() AS L As mentioned previously, let's keep these new fields before conflicting and after two_phase. 2. +# Get last_inactive_time value after slot's creation. Note that the slot is still +# inactive unless it's used by the standby below. +my $last_inactive_time_1 = $primary->safe_psql('postgres', + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) +); We should check $last_inactive_time_1 to be a valid value and add a similar check for logical slots. 3. BTW, why don't we set last_inactive_time for temporary slots (RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If so, then I think we should set last_inactive_time for those as well and later allow them to be invalidated based on timeout parameter. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > How about adding the test in 019_replslot_limit? It is not a direct > fit but I feel later we can even add 'invalid_timeout' related tests > in this file which will use last_inactive_time feature. I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl can have last_inactive_time tests, and later invalid_timeout ones too. This way 019_replslot_limit.pl is not cluttered. > It is also > possible that some of the tests added by the 'invalid_timeout' feature > will obviate the need for some of these tests. Might be. But I prefer to keep both of these tests separate, though in the same file 043_replslot_misc.pl, because we cover some corner cases where the last_inactive_time is set upon loading the slot from disk. > Review of v15 > ============== > 1. > @@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS > L.conflicting, > L.invalidation_reason, > L.failover, > - L.synced > + L.synced, > + L.last_inactive_time > FROM pg_get_replication_slots() AS L > > As mentioned previously, let's keep these new fields before > conflicting and after two_phase. Sorry, I missed that comment (out of a flood of comments, really :)). Now, done that way. > 2. > +# Get last_inactive_time value after slot's creation. Note that the > slot is still > +# inactive unless it's used by the standby below. > +my $last_inactive_time_1 = $primary->safe_psql('postgres', > + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) > +); > > We should check $last_inactive_time_1 to be a valid value and add a > similar check for logical slots. That's taken care of by the type cast we do, right? Isn't that enough? is( $primary->safe_psql( 'postgres', qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;] ), 't', 'last inactive time for an inactive physical slot is updated correctly'); For instance, setting last_inactive_time_1 to an invalid value fails with the following error: error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for type timestamp with time zone: "foo" LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > 3. BTW, why don't we set last_inactive_time for temporary slots > (RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If > so, then I think we should set last_inactive_time for those as well > and later allow them to be invalidated based on timeout parameter. WFM. Done that way. Please see the attached v16 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sat, Mar 23, 2024 at 01:11:50PM +0530, Bharath Rupireddy wrote: > On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > How about adding the test in 019_replslot_limit? It is not a direct > > fit but I feel later we can even add 'invalid_timeout' related tests > > in this file which will use last_inactive_time feature. > > I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl > can have last_inactive_time tests, and later invalid_timeout ones too. > This way 019_replslot_limit.pl is not cluttered. I share the same opinion as Amit: I think 019_replslot_limit would be a better place, because I see the timeout as another kind of limit. > > > It is also > > possible that some of the tests added by the 'invalid_timeout' feature > > will obviate the need for some of these tests. > > Might be. But, I prefer to keep both these tests separate but in the > same file 043_replslot_misc.pl. Because we cover some corner cases the > last_inactive_time is set upon loading the slot from disk. Right, but I think that this test does not necessarily have to be in the same .pl as the one testing the timeout. It could be added to one of the existing .pl files, like 001_stream_rep.pl for example. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Mar 23, 2024 at 2:34 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > How about adding the test in 019_replslot_limit? It is not a direct > > > fit but I feel later we can even add 'invalid_timeout' related tests > > > in this file which will use last_inactive_time feature. > > > > I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl > > can have last_inactive_time tests, and later invalid_timeout ones too. > > This way 019_replslot_limit.pl is not cluttered. > > I share the same opinion as Amit: I think 019_replslot_limit would be a better > place, because I see the timeout as another kind of limit. Hm. Done that way. Please see the attached v17 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Mar 23, 2024 at 1:12 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 2. > > +# Get last_inactive_time value after slot's creation. Note that the > > slot is still > > +# inactive unless it's used by the standby below. > > +my $last_inactive_time_1 = $primary->safe_psql('postgres', > > + qq(SELECT last_inactive_time FROM pg_replication_slots WHERE > > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;) > > +); > > > > We should check $last_inactive_time_1 to be a valid value and add a > > similar check for logical slots. > > That's taken care by the type cast we do, right? Isn't that enough? > > is( $primary->safe_psql( > 'postgres', > qq[SELECT last_inactive_time > > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE > slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;] > ), > 't', > 'last inactive time for an inactive physical slot is updated correctly'); > > For instance, setting last_inactive_time_1 to an invalid value fails > with the following error: > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > type timestamp with time zone: "foo" > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > It would be found at a later point. It would be probably better to verify immediately after the test that fetches the last_inactive_time value. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > For instance, setting last_inactive_time_1 to an invalid value fails > > with the following error: > > > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > > type timestamp with time zone: "foo" > > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > > > > It would be found at a later point. It would be probably better to > verify immediately after the test that fetches the last_inactive_time > value. Agree. I've added a few more checks explicitly to verify the last_inactive_time is sane with the following: qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz > '$slot_creation_time'::timestamptz;] I've attached the v18 patch set here. I've also addressed earlier review comments from Amit, Ajin Cherian. Note that I've added new invalidation mechanism tests in a separate TAP test file just because I don't want to clutter or bloat any of the existing files and spread tests for physical slots and logical slots into separate existing TAP files. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
- v18-0001-Track-last_inactive_time-in-pg_replication_slots.patch
- v18-0002-Allow-setting-inactive_timeout-for-replication-s.patch
- v18-0003-Introduce-new-SQL-funtion-pg_alter_replication_s.patch
- v18-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch
- v18-0005-Add-inactive_timeout-based-replication-slot-inva.patch
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > For instance, setting last_inactive_time_1 to an invalid value fails > > > with the following error: > > > > > > error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for > > > type timestamp with time zone: "foo" > > > LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli... > > > > > > > It would be found at a later point. It would be probably better to > > verify immediately after the test that fetches the last_inactive_time > > value. > > Agree. I've added a few more checks explicitly to verify the > last_inactive_time is sane with the following: > > qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) > AND '$last_inactive_time'::timestamptz > > '$slot_creation_time'::timestamptz;] > Such a test looks reasonable, but shall we add an 'equal to' in the second part of the test (like '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;)? This is just to be sure that even if the test ran fast enough to give the same time, the test shouldn't fail. I think it won't matter for correctness as well. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Such a test looks reasonable but shall we add equal to in the second > part of the test (like '$last_inactive_time'::timestamptz >= > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. > Apart from this, I have made minor changes in the comments. See and let me know what you think of attached. -- With Regards, Amit Kapila.
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > I've attached the v18 patch set here. Thanks for the patches. Please find a few comments: patch 001: -------- 1) slot.h: + /* The time at which this slot become inactive */ + TimestampTz last_inactive_time; become -->became --------- patch 002: 2) slotsync.c: ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY, remote_slot->two_phase, remote_slot->failover, - true); + true, 0); + slot->data.inactive_timeout = remote_slot->inactive_timeout; Is there a reason we are not passing 'remote_slot->inactive_timeout' to ReplicationSlotCreate() directly? --------- 3) slotfuncs.c pg_create_logical_replication_slot(): + int inactive_timeout = PG_GETARG_INT32(5); Can we mention here that timeout is in seconds either in comment or rename variable to inactive_timeout_secs? Please do this for create_physical_replication_slot(), create_logical_replication_slot(), pg_create_physical_replication_slot() as well. --------- 4) + int inactive_timeout; /* The amount of time in seconds the slot + * is allowed to be inactive. */ } LogicalSlotInfo; Do we need to mention "before getting invalidated" like other places (in last patch)? ---------- 5) Same at these two places. "before getting invalidated" to be added in the last patch, otherwise the info is incomplete. + + /* The amount of time in seconds the slot is allowed to be inactive */ + int inactive_timeout; } ReplicationSlotPersistentData; + * inactive_timeout: The amount of time in seconds the slot is allowed to be + * inactive. */ void ReplicationSlotCreate(const char *name, bool db_specific, Same here. "before getting invalidated" ? -------- Reviewing more.. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > I've attached the v18 patch set here. > I have a question. Don't we allow creating subscriptions on an existing slot with a non-null 'inactive_timeout' set, such that the slot's 'inactive_timeout' is retained even after subscription creation? I tried this:

===================
--On publisher, create slot with 120sec inactive_timeout:
SELECT * FROM pg_create_logical_replication_slot('logical_slot1', 'pgoutput', false, true, true, 120);

--On subscriber, create sub using logical_slot1
create subscription mysubnew1_1 connection 'dbname=newdb1 host=localhost user=shveta port=5433' publication mypubnew1_1 WITH (failover = true, create_slot=false, slot_name='logical_slot1');

--Before creating sub, pg_replication_slots output:
   slot_name   | failover | synced | active | temp | conf |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+----------------------------------+------------------
 logical_slot1 | t        | f      | f      | f    | f    | 2024-03-25 11:11:55.375736+05:30 | 120

--After creating sub, pg_replication_slots output (inactive_timeout is 0 now):
   slot_name   | failover | synced | active | temp | conf | lat | inactive_timeout
---------------+----------+--------+--------+------+------+-----+------------------
 logical_slot1 | t        | f      | t      | f    | f    |     | 0
===================

In CreateSubscription, we call 'walrcv_alter_slot()' / 'ReplicationSlotAlter()' when create_slot is false. This call ends up setting inactive_timeout from 120sec to 0. Is it intentional? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Such a test looks reasonable but shall we add equal to in the second > > part of the test (like '$last_inactive_time'::timestamptz >= > > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. Agree. I added that in v19 patch. I was having that concern in my mind. That's the reason I wasn't capturing current_time something like below for the same worry that current_timestamp might be the same (or nearly the same) as the slot creation time. That's why I ended up capturing current_timestamp in a separate query than clubbing it up with pg_create_physical_replication_slot. SELECT current_timestamp FROM pg_create_physical_replication_slot('foo'); > Apart from this, I have made minor changes in the comments. See and > let me know what you think of attached. LGTM. I've merged the diff into v19 patch. Please find the attached v19 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > I've attached the v18 patch set here. I have one concern: for synced slots on standby, how do we disallow invalidation due to inactive-timeout immediately after promotion? For synced slots, last_inactive_time and inactive_timeout are both set. Let's say I bring down primary for promotion of standby and then promote standby, there are chances that it may end up invalidating synced slots (considering standby is not brought down during promotion and thus inactive_timeout may already be past 'last_inactive_time'). I tried with a smaller inactive_timeout:

--Shutdown primary to prepare for planned promotion.

--On standby, one synced slot with last_inactive_time (lat) as 12:21
   slot_name   | failover | synced | active | temp | conf | res |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+-----+----------------------------------+------------------
 logical_slot1 | t        | t      | f      | f    | f    |     | 2024-03-25 12:21:09.020757+05:30 | 60

--wait for some time, now the time is 12:24
postgres=# select now();
               now
----------------------------------
 2024-03-25 12:24:17.616716+05:30

-- promote immediately:
./pg_ctl -D ../../standbydb/ promote -w

--on promoted standby:
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f

--synced slot is invalidated immediately on promotion.
   slot_name   | failover | synced | active | temp | conf |       res        |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+------------------+----------------------------------+------------------
 logical_slot1 | t        | t      | f      | f    | f    | inactive_timeout | 2024-03-25 12:21:09.020757+05:30 |

thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > I've attached the v18 patch set here. > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let's say I bring down primary for promotion of standby and then > promote standby, there are chances that it may end up invalidating > synced slots (considering standby is not brought down during promotion > and thus inactive_timeout may already be past 'last_inactive_time'). > This raises the question of whether we need to set 'last_inactive_time' for synced slots on the standby at all. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 12:25:21PM +0530, Bharath Rupireddy wrote: > On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Such a test looks reasonable but shall we add equal to in the second > > > part of the test (like '$last_inactive_time'::timestamptz >= > > > > '$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the sametime, the test shouldn't fail. I think it won't matter for correctness as well. > > Agree. I added that in v19 patch. I was having that concern in my > mind. That's the reason I wasn't capturing current_time something like > below for the same worry that current_timestamp might be the same (or > nearly the same) as the slot creation time. That's why I ended up > capturing current_timestamp in a separate query than clubbing it up > with pg_create_physical_replication_slot. > > SELECT current_timestamp FROM pg_create_physical_replication_slot('foo'); > > > Apart from this, I have made minor changes in the comments. See and > > let me know what you think of attached. > Thanks! v19-0001 LGTM, just one Nit comment for 019_replslot_limit.pl: The code for "Get last_inactive_time value after the slot's creation" and "Check that the captured time is sane" is somehow duplicated: is it worth creating 2 functions? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > I've attached the v18 patch set here. > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Yeah, and I can see last_inactive_time is moving on the standby (while not the case on the primary), probably due to the sync worker slot acquisition/release which does not seem right. > Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > This raises the question of whether we need to set > 'last_inactive_time' synced slots on the standby? Yeah, I think that last_inactive_time should stay at 0 on synced slots on the standby because such slots are not usable anyway (until the standby gets promoted). So, I think that last_inactive_time does not make sense if the slot never had the chance to be active. OTOH I think the timeout invalidation (if any) should be synced from primary. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > > > I've attached the v18 patch set here. > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > > > Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > This raises the question of whether we need to set > > 'last_inactive_time' synced slots on the standby? > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > standby because such slots are not usable anyway (until the standby gets promoted). > > So, I think that last_inactive_time does not make sense if the slot never had > the chance to be active. > > OTOH I think the timeout invalidation (if any) should be synced from primary. Yes, I too feel that last_inactive_time makes sense only when the slot is available to be used. Synced slots are not available to be used until the standby is promoted, and thus setting last_inactive_time can be skipped for synced slots. But once a slot on the primary is invalidated due to inactive-timeout, that invalidation should be synced to the standby (which is happening currently). thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 02:07:21PM +0530, shveta malik wrote: > On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > > > > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > > > > > > > > > I've attached the v18 patch set here. > > > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > > set. > > > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > > case on the primary), probably due to the sync worker slot acquisition/release > > which does not seem right. > > > > > Let's say I bring down primary for promotion of standby and then > > > > promote standby, there are chances that it may end up invalidating > > > > synced slots (considering standby is not brought down during promotion > > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > > > > This raises the question of whether we need to set > > > 'last_inactive_time' synced slots on the standby? > > > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > > standby because such slots are not usable anyway (until the standby gets promoted). > > > > So, I think that last_inactive_time does not make sense if the slot never had > > the chance to be active. > > > > OTOH I think the timeout invalidation (if any) should be synced from primary. > > Yes, even I feel that last_inactive_time makes sense only when the > slot is available to be used. Synced slots are not available to be > used until standby is promoted and thus last_inactive_time can be > skipped to be set for synced_slots. But once primay is invalidated due > to inactive-timeout, that invalidation should be synced to standby > (which is happening currently). > yeah, syncing the invalidation and always keeping last_inactive_time to zero for synced slots looks right to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > Yes, you are right, last_inactive_time keeps on moving for synced slots on standby. Once I disabled the slot-sync worker, it stays constant. Then it only changes if I call pg_sync_replication_slots(). On a different note, I noticed that we allow altering inactive_timeout for synced slots on standby, and then overwrite it with the primary's value in the next sync cycle. Steps:

====================
--Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 120

--Alter on standby
SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);

--Check pg_replication_slots:
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 900

--Run sync function
SELECT pg_sync_replication_slots();

--check again, inactive_timeout is set back to primary's value.
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      | 120
====================

I feel altering a synced slot's inactive_timeout should be prohibited on the standby. It should always be in sync with the primary. Thoughts? I am listing the concerns raised by me: 1) create-subscription with create_slot=false overwriting inactive_timeout of existing slot ([1]) 2) last_inactive_time set for synced slots may result in invalidation of slot on promotion. ([2]) 3) alter replication slot to alter inactive_timeout for synced slots on standby, should this be allowed? [1]: https://www.postgresql.org/message-id/CAJpy0uAqBi%2BGbNn2ngJ-A_Z905CD3ss896bqY2ACUjGiF1Gkng%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAJpy0uCLu%2BmqAwAMum%3DpXE9YYsy0BE7hOSw_Wno5vjwpFY%3D63g%40mail.gmail.com thanks Shveta
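As an illustration of how concern (3) could be enforced, here is a rough sketch of a guard in ReplicationSlotAlter(); the placement and the error wording are made up for this example and are not taken from the posted patches:

    /*
     * Hypothetical guard: reject altering a slot that is being synced from
     * the primary while the server is still a standby.  Assumes it runs in
     * ReplicationSlotAlter() once MyReplicationSlot has been acquired.
     */
    if (RecoveryInProgress() && MyReplicationSlot->data.synced)
        ereport(ERROR,
                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                errmsg("cannot alter replication slot \"%s\"", name),
                errdetail("This slot is being synchronized from the primary server."));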
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Mar 25, 2024 at 02:39:50PM +0530, shveta malik wrote: > I am listing the concerns raised by me: > 3) alter replication slot to alter inactive_timout for synced slots on > standby, should this be allowed? I don't think it should be allowed. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > > > Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > This raises the question of whether we need to set > > 'last_inactive_time' synced slots on the standby? > > Yeah, I think that last_inactive_time should stay at 0 on synced slots on the > standby because such slots are not usable anyway (until the standby gets promoted). > > So, I think that last_inactive_time does not make sense if the slot never had > the chance to be active. Right. Done that way i.e. not setting the last_inactive_time for slots both while releasing the slot and restoring from the disk. Also, I've added a TAP function to check if the captured times are sane per Bertrand's review comment. Please see the attached v20 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review comment. > > Please see the attached v20 patch. Thanks for the patch. The issue of unnecessary invalidation of synced slots on promotion is resolved in this patch. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review comment. > > Please see the attached v20 patch. > Pushed, after minor changes. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Mar 25, 2024 at 2:40 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > > case on the primary), probably due to the sync worker slot acquisition/release > > which does not seem right. > > > > Yes, you are right, last_inactive_time keeps on moving for synced > slots on standby. Once I disabled slot-sync worker, then it is > constant. Then it only changes if I call pg_sync_replication_slots(). > > On a different note, I noticed that we allow altering > inactive_timeout for synced-slots on standby. And again overwrite it > with the primary's value in the next sync cycle. Steps: > > ==================== > --Check pg_replication_slots for synced slot on standby, inactive_timeout is 120 > slot_name | failover | synced | active | inactive_timeout > ---------------+----------+--------+--------+------------------ > logical_slot1 | t | t | f | 120 > > --Alter on standby > SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900); > I think we should keep pg_alter_replication_slot() as the last priority among the remaining patches for this release. Let's try to first finish the primary functionality of inactive_timeout patch. Otherwise, I agree that the problem reported by you should be fixed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I think we should keep pg_alter_replication_slot() as the last > priority among the remaining patches for this release. Let's try to > first finish the primary functionality of inactive_timeout patch. > Otherwise, I agree that the problem reported by you should be fixed. Noted. Will focus on v18-002 patch now. I was debugging the flow and just noticed that RecoveryInProgress() always returns 'true' during StartupReplicationSlots()-->RestoreSlotFromDisk() (even on primary) as 'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at that time. The 'xlogctl->SharedRecoveryState' is changed to 'RECOVERY_STATE_DONE' on primary and to 'RECOVERY_STATE_ARCHIVE' on standby at a later stage in StartupXLOG() (after we are done loading slots). The impact of this is, the condition in RestoreSlotFromDisk() in v20-001: if (!(RecoveryInProgress() && slot->data.synced)) slot->last_inactive_time = GetCurrentTimestamp(); is merely equivalent to: if (!slot->data.synced) slot->last_inactive_time = GetCurrentTimestamp(); Thus on primary, after restart, last_inactive_at is set correctly, while on promoted standby (new primary), last_inactive_at is always NULL after restart for the synced slots. thanks Shveta
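Given that observation, the restore-time stamping might as well not consult RecoveryInProgress() at all; a minimal sketch of the equivalent RestoreSlotFromDisk() hunk follows (field name as in the patch, illustrative only, not the committed code):

    /*
     * Sketch: SharedRecoveryState is still RECOVERY_STATE_CRASH when slots
     * are restored, so RecoveryInProgress() adds no information here and the
     * synced flag alone decides whether the slot gets stamped.
     */
    slot->in_use = true;
    slot->active_pid = 0;
    if (!slot->data.synced)
        slot->last_inactive_time = GetCurrentTimestamp();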
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
I apologize that I haven't been able to keep up with this thread for a while, but I'm happy to see the continued interest in $SUBJECT. On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote: > This commit particularly lets one specify the inactive_timeout for > a slot via SQL functions pg_create_physical_replication_slot and > pg_create_logical_replication_slot. Off-list, Bharath brought to my attention that the current proposal was to set the timeout at the slot level. While I think that is an entirely reasonable thing to support, the main use-case I have in mind for this feature is for an administrator that wants to prevent inactive slots from causing problems (e.g., transaction ID wraparound) on a server or a number of servers. For that use-case, I think a GUC would be much more convenient. Perhaps there could be a default inactive slot timeout GUC that would be used in the absence of a slot-level setting. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
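One way the two levels could coexist, sketched purely as an illustration (the GUC name replication_slot_inactive_timeout and the helper are made up for this example and are not part of any posted patch):

    /* Hypothetical server-wide default, in seconds; 0 disables the behavior. */
    int         replication_slot_inactive_timeout = 0;

    /*
     * Effective timeout for a slot: the per-slot setting (the patch's
     * ReplicationSlotPersistentData.inactive_timeout) wins, and the GUC is
     * only a fallback when the slot has no setting of its own.
     */
    static int
    effective_inactive_timeout(ReplicationSlot *slot)
    {
        if (slot->data.inactive_timeout > 0)
            return slot->data.inactive_timeout;
        return replication_slot_inactive_timeout;
    }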
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let's say I bring down primary for promotion of standby and then > promote standby, there are chances that it may end up invalidating > synced slots (considering standby is not brought down during promotion > and thus inactive_timeout may already be past 'last_inactive_time'). > On standby, if we decide to maintain valid last_inactive_time for synced slots, then invalidation is correctly restricted in InvalidateSlotForInactiveTimeout() for synced slots using the check: if (RecoveryInProgress() && slot->data.synced) return false; But immediately after promotion, we cannot rely on the above check, and thus there is a possibility of synced slot invalidation. To maintain consistent behavior regarding the setting of last_inactive_time for synced slots, similar to user slots, one potential solution to prevent this invalidation issue is to update the last_inactive_time of all synced slots within the ShutDownSlotSync() function during FinishWalRecovery(). This approach ensures that promotion doesn't immediately invalidate slots, and henceforth, we possess a correct last_inactive_time as a basis for invalidation going forward. This will be equivalent to updating last_inactive_time during restart (but without an actual restart during promotion). A plus point of maintaining last_inactive_time for synced slots is that it can tell the user when the sync was last attempted on that particular slot by the background slot sync worker or the SQL function. Thoughts? thanks Shveta
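A rough sketch of that idea follows; it mirrors how slot.c walks the shared slot array and uses the patch's last_inactive_time field, but it is illustrative only, not the change that was eventually posted:

    /*
     * Sketch: stamp every synced slot just before the slot sync machinery
     * shuts down during promotion, so inactive_timeout is measured from the
     * promotion time rather than from a stale (or unset) value.
     */
    static void
    update_synced_slots_last_inactive_time(void)
    {
        TimestampTz now = GetCurrentTimestamp();

        LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        for (int i = 0; i < max_replication_slots; i++)
        {
            ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

            if (!s->in_use || !s->data.synced)
                continue;

            SpinLockAcquire(&s->mutex);
            s->last_inactive_time = now;
            SpinLockRelease(&s->mutex);
        }

        LWLockRelease(ReplicationSlotControlLock);
    }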
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 1:24 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > > On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote: > > This commit particularly lets one specify the inactive_timeout for > > a slot via SQL functions pg_create_physical_replication_slot and > > pg_create_logical_replication_slot. > > Off-list, Bharath brought to my attention that the current proposal was to > set the timeout at the slot level. While I think that is an entirely > reasonable thing to support, the main use-case I have in mind for this > feature is for an administrator that wants to prevent inactive slots from > causing problems (e.g., transaction ID wraparound) on a server or a number > of servers. For that use-case, I think a GUC would be much more > convenient. Perhaps there could be a default inactive slot timeout GUC > that would be used in the absence of a slot-level setting. Thoughts? > Yeah, that is a valid point. One of the reasons for keeping it at slot level was to allow different subscribers/output plugins to have a different setting for invalid_timeout for their respective slots based on their usage. Now, having it as a GUC also has some valid use cases as pointed out by you but I am not sure having both at slot level and at GUC level is required. I was a bit inclined to have it at slot level for now and then based on some field usage report we can later add GUC as well. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > On standby, if we decide to maintain valid last_inactive_time for > synced slots, then invalidation is correctly restricted in > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > if (RecoveryInProgress() && slot->data.synced) > return false; > > But immediately after promotion, we can not rely on the above check > and thus possibility of synced slots invalidation is there. To > maintain consistent behavior regarding the setting of > last_inactive_time for synced slots, similar to user slots, one > potential solution to prevent this invalidation issue is to update the > last_inactive_time of all synced slots within the ShutDownSlotSync() > function during FinishWalRecovery(). This approach ensures that > promotion doesn't immediately invalidate slots, and henceforth, we > possess a correct last_inactive_time as a basis for invalidation going > forward. This will be equivalent to updating last_inactive_time during > restart (but without actual restart during promotion). > The plus point of maintaining last_inactive_time for synced slots > could be, this can provide data to the user on when last time the sync > was attempted on that particular slot by background slot sync worker > or SQl function. Thoughts? Please find the attached v21 patch implementing the above idea. It also has changes for renaming last_inactive_time to inactive_since. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote: > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I have one concern, for synced slots on standby, how do we disallow > > invalidation due to inactive-timeout immediately after promotion? > > > > For synced slots, last_inactive_time and inactive_timeout are both > > set. Let's say I bring down primary for promotion of standby and then > > promote standby, there are chances that it may end up invalidating > > synced slots (considering standby is not brought down during promotion > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > On standby, if we decide to maintain valid last_inactive_time for > synced slots, then invalidation is correctly restricted in > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > if (RecoveryInProgress() && slot->data.synced) > return false; Right. > But immediately after promotion, we can not rely on the above check > and thus possibility of synced slots invalidation is there. To > maintain consistent behavior regarding the setting of > last_inactive_time for synced slots, similar to user slots, one > potential solution to prevent this invalidation issue is to update the > last_inactive_time of all synced slots within the ShutDownSlotSync() > function during FinishWalRecovery(). This approach ensures that > promotion doesn't immediately invalidate slots, and henceforth, we > possess a correct last_inactive_time as a basis for invalidation going > forward. This will be equivalent to updating last_inactive_time during > restart (but without actual restart during promotion). > The plus point of maintaining last_inactive_time for synced slots > could be, this can provide data to the user on when last time the sync > was attempted on that particular slot by background slot sync worker > or SQl function. Thoughts? Yeah, another plus point is that if the primary is down then one could look at the synced "inactive_since" on the standby to get an idea of it (depends on the last sync though). The issue that I can see with your proposal is: what if one synced the slots manually (with pg_sync_replication_slots()) but does not use the sync worker? Then I think ShutDownSlotSync() is not going to help in that case. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > I've attached the v18 patch set here. I've also addressed earlier > review comments from Amit, Ajin Cherian. Note that I've added new > invalidation mechanism tests in a separate TAP test file just because > I don't want to clutter or bloat any of the existing files and spread > tests for physical slots and logical slots into separate existing TAP > files. > Review comments on v18_0002 and v18_0005 ======================================= 1. ReplicationSlotCreate(const char *name, bool db_specific, ReplicationSlotPersistency persistency, - bool two_phase, bool failover, bool synced) + bool two_phase, bool failover, bool synced, + int inactive_timeout) { ReplicationSlot *slot = NULL; int i; @@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific, errmsg("cannot enable failover for a temporary replication slot")); } + if (inactive_timeout > 0) + { + /* + * Do not allow users to set inactive_timeout for temporary slots, + * because temporary slots will not be saved to the disk. + */ + if (persistency == RS_TEMPORARY) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot set inactive_timeout for a temporary replication slot")); + } We have decided to update inactive_since for temporary slots. So, unless there is some reason, we should allow inactive_timeout to also be set for temporary slots. 2. --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1024,6 +1024,7 @@ CREATE VIEW pg_replication_slots AS L.safe_wal_size, L.two_phase, L.last_inactive_time, + L.inactive_timeout, Shall we keep inactive_timeout before last_inactive_time/inactive_since? I don't have any strong reason to propose that way apart from that the former is provided by the user. 3. @@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS) slot_contents = *slot; SpinLockRelease(&slot->mutex); + /* + * Here's an opportunity to invalidate inactive replication slots + * based on timeout, so let's do it. + */ + if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true)) + invalidated = true; I don't think we should try to invalidate the slots in pg_get_replication_slots. This function's purpose is to get the current information on slots and has no intention to perform any work for slots. Any error due to invalidation won't be what the user would be expecting here. 4. +static bool +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, + bool need_control_lock, + bool need_mutex) { ... ... + if (need_control_lock) + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED)); + + /* + * Check if the slot needs to be invalidated due to inactive_timeout. We + * do this with the spinlock held to avoid race conditions -- for example + * the restart_lsn could move forward, or the slot could be dropped. + */ + if (need_mutex) + SpinLockAcquire(&slot->mutex); ... I find this combination of parameters a bit strange. Because, say if need_mutex is false and need_control_lock is true then that means this function will acquire LWlock after acquiring spinlock which is unacceptable. Now, this may not happen in practice as the callers won't pass such a combination but still, this functionality should be improved. -- With Regards, Amit Kapila.
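For point 4, one ordering-safe shape would be to let the caller say only whether it already holds ReplicationSlotControlLock and to always take (and release) the spinlock inside, so an LWLock is never requested while a spinlock is held. The sketch below is illustrative only; the field and function names follow the patch under review, and TimestampDifferenceExceeds() is used for the actual timeout test:

    /*
     * Sketch: does this slot's inactivity exceed its configured timeout?
     * The clock is read before taking the spinlock, and the control lock
     * (if requested) is always taken before the spinlock.
     */
    static bool
    slot_inactive_timeout_expired(ReplicationSlot *slot, bool need_control_lock)
    {
        TimestampTz now = GetCurrentTimestamp();
        bool        expired = false;

        if (need_control_lock)
            LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        SpinLockAcquire(&slot->mutex);
        if (slot->data.inactive_timeout > 0 &&
            slot->inactive_since > 0 &&
            TimestampDifferenceExceeds(slot->inactive_since, now,
                                       slot->data.inactive_timeout * 1000))
            expired = true;
        SpinLockRelease(&slot->mutex);

        if (need_control_lock)
            LWLockRelease(ReplicationSlotControlLock);

        return expired;
    }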
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 05:55:11AM +0000, Bertrand Drouvot wrote: > Hi, > > On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > On standby, if we decide to maintain valid last_inactive_time for > > synced slots, then invalidation is correctly restricted in > > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > > > if (RecoveryInProgress() && slot->data.synced) > > return false; > > Right. > > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Yeah, another plus point is that if the primary is down then one could look > at the synced "active_since" on the standby to get an idea of it (depends of the > last sync though). > > The issue that I can see with your proposal is: what if one synced the slots > manually (with pg_sync_replication_slots()) but does not use the sync worker? > Then I think ShutDownSlotSync() is not going to help in that case. It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off), so that sounds ok to me (I should have checked the code, I was under the impression ShutDownSlotSync() was not called if sync_replication_slots = off). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > The issue that I can see with your proposal is: what if one synced the slots > > manually (with pg_sync_replication_slots()) but does not use the sync worker? > > Then I think ShutDownSlotSync() is not going to help in that case. > > It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off), > so that sounds ok to me (I should have checked the code, I was under the impression > ShutDownSlotSync() was not called if sync_replication_slots = off). Right, it is called irrespective of sync_replication_slots. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > I have one concern, for synced slots on standby, how do we disallow > > > invalidation due to inactive-timeout immediately after promotion? > > > > > > For synced slots, last_inactive_time and inactive_timeout are both > > > set. Let's say I bring down primary for promotion of standby and then > > > promote standby, there are chances that it may end up invalidating > > > synced slots (considering standby is not brought down during promotion > > > and thus inactive_timeout may already be past 'last_inactive_time'). > > > > > > > On standby, if we decide to maintain valid last_inactive_time for > > synced slots, then invalidation is correctly restricted in > > InvalidateSlotForInactiveTimeout() for synced slots using the check: > > > > if (RecoveryInProgress() && slot->data.synced) > > return false; > > > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Please find the attached v21 patch implementing the above idea. It > also has changes for renaming last_inactive_time to inactive_since. > Thanks for the patch. I have tested this patch alone, and it does what it says. One additional thing which I noticed is that now it sets inactive_since for temp slots as well, but that idea looks fine to me. I could not test the 'invalidation on promotion bug' with this change, as that needed rebasing of the rest of the patches. A few trivial things: 1) Commit msg: ensures the value is set to current timestamp during the shutdown to help correctly interpret the time if the standby gets promoted without a restart. shutdown --> shutdown of slot sync worker (as it was not clear if it is instance shutdown or something else) 2) 'The time since the slot has became inactive'. has became-->has become or just became Please check it in all the files. There are multiple places. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 11:07:51AM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote: > > But immediately after promotion, we can not rely on the above check > > and thus possibility of synced slots invalidation is there. To > > maintain consistent behavior regarding the setting of > > last_inactive_time for synced slots, similar to user slots, one > > potential solution to prevent this invalidation issue is to update the > > last_inactive_time of all synced slots within the ShutDownSlotSync() > > function during FinishWalRecovery(). This approach ensures that > > promotion doesn't immediately invalidate slots, and henceforth, we > > possess a correct last_inactive_time as a basis for invalidation going > > forward. This will be equivalent to updating last_inactive_time during > > restart (but without actual restart during promotion). > > The plus point of maintaining last_inactive_time for synced slots > > could be, this can provide data to the user on when last time the sync > > was attempted on that particular slot by background slot sync worker > > or SQl function. Thoughts? > > Please find the attached v21 patch implementing the above idea. It > also has changes for renaming last_inactive_time to inactive_since. Thanks! A few comments: 1 === One trailing whitespace: Applying: Fix review comments for slot's last_inactive_time property .git/rebase-apply/patch:433: trailing whitespace. # got a valid inactive_since value representing the last slot sync time. warning: 1 line adds whitespace errors. 2 === It looks like inactive_since is set to the current timestamp on the standby each time the sync worker does a cycle: primary: postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; slot_name | inactive_since -------------+------------------------------- lsub27_slot | 2024-03-26 07:39:19.745517+00 lsub28_slot | 2024-03-26 07:40:24.953826+00 standby: postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; slot_name | inactive_since -------------+------------------------------- lsub27_slot | 2024-03-26 07:43:56.387324+00 lsub28_slot | 2024-03-26 07:43:56.387338+00 I don't think that should be the case. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > 2 === > > It looks like inactive_since is set to the current timestamp on the standby > each time the sync worker does a cycle: > > primary: > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > slot_name | inactive_since > -------------+------------------------------- > lsub27_slot | 2024-03-26 07:39:19.745517+00 > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > standby: > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > slot_name | inactive_since > -------------+------------------------------- > lsub27_slot | 2024-03-26 07:43:56.387324+00 > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > I don't think that should be the case. > But why? This is exactly what we discussed in another thread where we agreed to update inactive_since even for sync slots. In each sync cycle, we acquire/release the slot, so the inactive_since gets updated. See synchronize_one_slot(). -- With Regards, Amit Kapila.
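[A minimal sketch of the release path being referred to above, simplified from the patch under discussion and not the exact code: releasing the slot is what stamps inactive_since, so every sync cycle that acquires and releases a synced slot refreshes the value.]

    /* Take the timestamp before the spinlock to avoid a system call under it. */
    TimestampTz now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);
    slot->active_pid = 0;          /* the slot is no longer acquired ... */
    slot->inactive_since = now;    /* ... so it became inactive "now" */
    SpinLockRelease(&slot->mutex);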
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > 2 === > > > > It looks like inactive_since is set to the current timestamp on the standby > > each time the sync worker does a cycle: > > > > primary: > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > slot_name | inactive_since > > -------------+------------------------------- > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > standby: > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > slot_name | inactive_since > > -------------+------------------------------- > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > I don't think that should be the case. > > > > But why? This is exactly what we discussed in another thread where we > agreed to update inactive_since even for sync slots. Hum, I thought we agreed to "sync" it and to "update it to current time" only at promotion time. I don't think updating inactive_since to current time during each cycle makes sense (I mean I understand the use case: being able to say when slots have been sync, but if this is what we want then we should consider an extra view or an extra field but not relying on the inactive_since one). If the primary goes down, not updating inactive_since to the current time could also provide benefit such as knowing the inactive_since of the primary slots (from the standby) the last time it has been synced. If we update it to the current time then this information is lost. > In each sync > cycle, we acquire/release the slot, so the inactive_since gets > updated. See synchronize_one_slot(). Right, and I think we should put an extra condition if in recovery. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
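[For clarity, a sketch of what such an in-recovery condition could look like in the release path; this is one possible shape, not the patch itself.]

    TimestampTz now = 0;

    /* Skip stamping inactive_since for synced slots while in recovery. */
    if (!(RecoveryInProgress() && slot->data.synced))
        now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);
    if (now > 0)
        slot->inactive_since = now;   /* otherwise keep the value synced from the primary */
    SpinLockRelease(&slot->mutex);

[With this shape, a synced slot on the standby keeps whatever inactive_since was copied from the primary and only gets a locally generated value once it is no longer a synced slot in recovery.]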
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Review comments on v18_0002 and v18_0005 > ======================================= > > 1. > We have decided to update inactive_since for temporary slots. So, > unless there is some reason, we should allow inactive_timeout to also > be set for temporary slots. WFM. A temporary slot that's inactive for a long time before even the server isn't shutdown can utilize this inactive_timeout based invalidation mechanism. And, I'd also vote for we being consistent for temporary and synced slots. > L.last_inactive_time, > + L.inactive_timeout, > > Shall we keep inactive_timeout before > last_inactive_time/inactive_since? I don't have any strong reason to > propose that way apart from that the former is provided by the user. Done. > + if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true)) > + invalidated = true; > > I don't think we should try to invalidate the slots in > pg_get_replication_slots. This function's purpose is to get the > current information on slots and has no intention to perform any work > for slots. Any error due to invalidation won't be what the user would > be expecting here. Agree. Removed. > 4. > +static bool > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, > + bool need_control_lock, > + bool need_mutex) > { > ... > ... > + if (need_control_lock) > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > + > + Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED)); > + > + /* > + * Check if the slot needs to be invalidated due to inactive_timeout. We > + * do this with the spinlock held to avoid race conditions -- for example > + * the restart_lsn could move forward, or the slot could be dropped. > + */ > + if (need_mutex) > + SpinLockAcquire(&slot->mutex); > ... > > I find this combination of parameters a bit strange. Because, say if > need_mutex is false and need_control_lock is true then that means this > function will acquire LWlock after acquiring spinlock which is > unacceptable. Now, this may not happen in practice as the callers > won't pass such a combination but still, this functionality should be > improved. Right. Either we need two locks or not. So, changed it to use just one bool need_locks, upon set both control lock and spin lock are acquired and released. On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > patch 002: > > 2) > slotsync.c: > > ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY, > remote_slot->two_phase, > remote_slot->failover, > - true); > + true, 0); > > + slot->data.inactive_timeout = remote_slot->inactive_timeout; > > Is there a reason we are not passing 'remote_slot->inactive_timeout' > to ReplicationSlotCreate() directly? The slot there gets created temporarily for which we were not supporting inactive_timeout being set. But, in the latest v22 patch we are supporting, so passing the remote_slot->inactive_timeout directly. > 3) > slotfuncs.c > pg_create_logical_replication_slot(): > + int inactive_timeout = PG_GETARG_INT32(5); > > Can we mention here that timeout is in seconds either in comment or > rename variable to inactive_timeout_secs? > > Please do this for create_physical_replication_slot(), > create_logical_replication_slot(), > pg_create_physical_replication_slot() as well. Added /* in seconds */ next the variable declaration. > --------- > 4) > + int inactive_timeout; /* The amount of time in seconds the slot > + * is allowed to be inactive. 
*/ > } LogicalSlotInfo; > > Do we need to mention "before getting invalided" like other places > (in last patch)? Done. > 5) > Same at these two places. "before getting invalided" to be added in > the last patch otherwise the info is incompleted. > > + > + /* The amount of time in seconds the slot is allowed to be inactive */ > + int inactive_timeout; > } ReplicationSlotPersistentData; > > > + * inactive_timeout: The amount of time in seconds the slot is allowed to be > + * inactive. > */ > void > ReplicationSlotCreate(const char *name, bool db_specific, > Same here. "before getting invalidated" ? Done. On Tue, Mar 26, 2024 at 12:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > Please find the attached v21 patch implementing the above idea. It > > also has changes for renaming last_inactive_time to inactive_since. > > Thanks for the patch. I have tested this patch alone, and it does what > it says. One additional thing which I noticed is that now it sets > inactive_since for temp slots as well, but that idea looks fine to me. Right. Let's be consistent by treating all slots the same. > I could not test 'invalidation on promotion bug' with this change, as > that needed rebasing of the rest of the patches. Please use the v22 patch set. > Few trivial things: > > 1) > Commti msg: > > ensures the value is set to current timestamp during the > shutdown to help correctly interpret the time if the standby gets > promoted without a restart. > > shutdown --> shutdown of slot sync worker (as it was not clear if it > is instance shutdown or something else) Changed it to "shutdown of slot sync machinery" to be consistent with the comments. > 2) > 'The time since the slot has became inactive'. > > has became-->has become > or just became > > Please check it in all the files. There are multiple places. Fixed. Please see the attached v23 patches. I've addressed all the review comments received so far from Amit and Shveta. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > 1) > > Commti msg: > > > > ensures the value is set to current timestamp during the > > shutdown to help correctly interpret the time if the standby gets > > promoted without a restart. > > > > shutdown --> shutdown of slot sync worker (as it was not clear if it > > is instance shutdown or something else) > > Changed it to "shutdown of slot sync machinery" to be consistent with > the comments. Thanks for addressing the comments. Just to give more clarity here (so that you take a informed decision), I am not sure if we actually shut down slot-sync machinery. We only shot down slot sync worker. Slot-sync machinery can still be used using 'pg_sync_replication_slots' SQL function. I can easily reproduce the scenario where SQL function and reset_synced_slots_info() are going in parallel where the latter hits 'Assert(s->active_pid == 0)' due to the fact that parallel SQL sync function is active on that slot. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote: > Please use the v22 patch set. Thanks! 1 === +reset_synced_slots_info(void) I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()? 2 === + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + /* Check if it is a synchronized slot */ + if (s->in_use && s->data.synced) + { + TimestampTz now; + + Assert(SlotIsLogical(s)); + Assert(s->active_pid == 0); + + /* + * Set the time since the slot has become inactive after shutting + * down slot sync machinery. This helps correctly interpret the + * time if the standby gets promoted without a restart. We get the + * current time beforehand to avoid a system call while holding + * the lock. + */ + now = GetCurrentTimestamp(); What about moving "now = GetCurrentTimestamp()" outside of the for loop? (it would be less costly and probably good enough). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
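[To illustrate the second point, a sketch of the same loop with the timestamp taken once before scanning the slot array; the function name and the surrounding locking follow the quoted hunk and are assumptions, not the actual v22 code.]

    static void
    reset_synced_slots_info(void)
    {
        /* One system call for the whole pass, instead of one per synced slot. */
        TimestampTz now = GetCurrentTimestamp();

        LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

        for (int i = 0; i < max_replication_slots; i++)
        {
            ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

            /* Only synchronized slots are touched here. */
            if (s->in_use && s->data.synced)
            {
                Assert(SlotIsLogical(s));
                Assert(s->active_pid == 0);

                SpinLockAcquire(&s->mutex);
                s->inactive_since = now;
                SpinLockRelease(&s->mutex);
            }
        }

        LWLockRelease(ReplicationSlotControlLock);
    }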
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > 2 === > > > > > > It looks like inactive_since is set to the current timestamp on the standby > > > each time the sync worker does a cycle: > > > > > > primary: > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > slot_name | inactive_since > > > -------------+------------------------------- > > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > > > standby: > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > slot_name | inactive_since > > > -------------+------------------------------- > > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > > > I don't think that should be the case. > > > > > > > But why? This is exactly what we discussed in another thread where we > > agreed to update inactive_since even for sync slots. > > Hum, I thought we agreed to "sync" it and to "update it to current time" > only at promotion time. I think there may have been some misunderstanding here. But now if I rethink this, I am fine with 'inactive_since' getting synced from primary to standby. But if we do that, we need to add docs stating "inactive_since" represents primary's inactivity and not standby's slots inactivity for synced slots. The reason for this clarification is that the synced slot might be generated much later, yet 'inactive_since' is synced from the primary, potentially indicating a time considerably earlier than when the synced slot was actually created. Another approach could be that "inactive_since" for synced slot actually gives its own inactivity data rather than giving primary's slot data. We update inactive_since on standby only at 3 occasions: 1) at the time of creation of the synced slot. 2) during standby restart. 3) during promotion of standby. I have attached a sample patch for this idea as.txt file. I am fine with any of these approaches. One gives data synced from primary for synced slots, while another gives actual inactivity data of synced slots. thanks Shveta
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 03:17:36PM +0530, shveta malik wrote: > On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote: > > > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > 2 === > > > > > > > > It looks like inactive_since is set to the current timestamp on the standby > > > > each time the sync worker does a cycle: > > > > > > > > primary: > > > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > > slot_name | inactive_since > > > > -------------+------------------------------- > > > > lsub27_slot | 2024-03-26 07:39:19.745517+00 > > > > lsub28_slot | 2024-03-26 07:40:24.953826+00 > > > > > > > > standby: > > > > > > > > postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't'; > > > > slot_name | inactive_since > > > > -------------+------------------------------- > > > > lsub27_slot | 2024-03-26 07:43:56.387324+00 > > > > lsub28_slot | 2024-03-26 07:43:56.387338+00 > > > > > > > > I don't think that should be the case. > > > > > > > > > > But why? This is exactly what we discussed in another thread where we > > > agreed to update inactive_since even for sync slots. > > > > Hum, I thought we agreed to "sync" it and to "update it to current time" > > only at promotion time. > > I think there may have been some misunderstanding here. Indeed ;-) > But now if I > rethink this, I am fine with 'inactive_since' getting synced from > primary to standby. But if we do that, we need to add docs stating > "inactive_since" represents primary's inactivity and not standby's > slots inactivity for synced slots. Yeah sure. > The reason for this clarification > is that the synced slot might be generated much later, yet > 'inactive_since' is synced from the primary, potentially indicating a > time considerably earlier than when the synced slot was actually > created. Right. > Another approach could be that "inactive_since" for synced slot > actually gives its own inactivity data rather than giving primary's > slot data. We update inactive_since on standby only at 3 occasions: > 1) at the time of creation of the synced slot. > 2) during standby restart. > 3) during promotion of standby. > > I have attached a sample patch for this idea as.txt file. Thanks! > I am fine with any of these approaches. One gives data synced from > primary for synced slots, while another gives actual inactivity data > of synced slots. What about another approach?: inactive_since gives data synced from primary for synced slots and another dedicated field (could be added later...) could represent what you suggest as the other option. Another cons of updating inactive_since at the current time during each slot sync cycle is that calling GetCurrentTimestamp() very frequently (during each sync cycle of very active slots) could be too costly. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Mar 26, 2024 at 3:12 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote: > > Please use the v22 patch set. > > Thanks! > > 1 === > > +reset_synced_slots_info(void) > > I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()? > *shutdown_update() sounds generic. How about update_synced_slots_inactive_time()? I think it is a bit longer but conveys the meaning. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Tue, Mar 26, 2024 at 7:57 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Please see the attached v23 patches. I've addressed all the review
> comments received so far from Amit and Shveta.
In patch 0003:
+        SpinLockAcquire(&slot->mutex);
+    }
+
+    Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+    if (slot->inactive_since > 0 &&
+        slot->data.inactive_timeout > 0)
+    {
+        TimestampTz now;
+
+        /* inactive_since is only tracked for inactive slots */
+        Assert(slot->active_pid == 0);
+
+        now = GetCurrentTimestamp();
+        if (TimestampDifferenceExceeds(slot->inactive_since, now,
+                                       slot->data.inactive_timeout * 1000))
+            inavidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+    }
+
+    if (need_locks)
+    {
+        SpinLockRelease(&slot->mutex);
Here, GetCurrentTimestamp() is still called with the spinlock held. Maybe take the timestamp prior to acquiring the spinlock.
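[One possible shape of that reordering, sketched here using the identifiers from the quoted hunk; this is a simplification, not the actual patch text.]

    TimestampTz now = GetCurrentTimestamp();   /* no system call while holding the spinlock */

    if (need_locks)
        SpinLockAcquire(&slot->mutex);

    Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));

    if (slot->inactive_since > 0 &&
        slot->data.inactive_timeout > 0)
    {
        /* inactive_since is only tracked for inactive slots */
        Assert(slot->active_pid == 0);

        if (TimestampDifferenceExceeds(slot->inactive_since, now,
                                       slot->data.inactive_timeout * 1000))
            invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
    }

    if (need_locks)
        SpinLockRelease(&slot->mutex);

[The trade-off is an unconditional GetCurrentTimestamp() call even when the timeout check does not apply.]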
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > > I think there may have been some misunderstanding here. > > Indeed ;-) > > > But now if I > > rethink this, I am fine with 'inactive_since' getting synced from > > primary to standby. But if we do that, we need to add docs stating > > "inactive_since" represents primary's inactivity and not standby's > > slots inactivity for synced slots. > > Yeah sure. > > > The reason for this clarification > > is that the synced slot might be generated much later, yet > > 'inactive_since' is synced from the primary, potentially indicating a > > time considerably earlier than when the synced slot was actually > > created. > > Right. > > > Another approach could be that "inactive_since" for synced slot > > actually gives its own inactivity data rather than giving primary's > > slot data. We update inactive_since on standby only at 3 occasions: > > 1) at the time of creation of the synced slot. > > 2) during standby restart. > > 3) during promotion of standby. > > > > I have attached a sample patch for this idea as.txt file. > > Thanks! > > > I am fine with any of these approaches. One gives data synced from > > primary for synced slots, while another gives actual inactivity data > > of synced slots. > > What about another approach?: inactive_since gives data synced from primary for > synced slots and another dedicated field (could be added later...) could > represent what you suggest as the other option. Yes, okay with me. I think there is some confusion here as well. In my second approach above, I have not suggested anything related to sync-worker. We can think on that later if we really need another field which give us sync time. In my second approach, I have tried to avoid updating inactive_since for synced slots during sync process. We update that field during creation of synced slot so that inactive_since reflects correct info even for synced slots (rather than copying from primary). Please have a look at my patch and let me know your thoughts. I am fine with copying it from primary as well and documenting this behaviour. > Another cons of updating inactive_since at the current time during each slot > sync cycle is that calling GetCurrentTimestamp() very frequently > (during each sync cycle of very active slots) could be too costly. Right. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > What about another approach?: inactive_since gives data synced from primary for > > synced slots and another dedicated field (could be added later...) could > > represent what you suggest as the other option. > > Yes, okay with me. I think there is some confusion here as well. In my > second approach above, I have not suggested anything related to > sync-worker. We can think on that later if we really need another > field which give us sync time. In my second approach, I have tried to > avoid updating inactive_since for synced slots during sync process. We > update that field during creation of synced slot so that > inactive_since reflects correct info even for synced slots (rather > than copying from primary). Please have a look at my patch and let me > know your thoughts. I am fine with copying it from primary as well and > documenting this behaviour. I took a look at your patch. --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) SpinLockAcquire(&slot->mutex); slot->effective_catalog_xmin = xmin_horizon; slot->data.catalog_xmin = xmin_horizon; + slot->inactive_since = GetCurrentTimestamp(); SpinLockRelease(&slot->mutex); If we just sync inactive_since value for synced slots while in recovery from the primary, so be it. Why do we need to update it to the current time when the slot is being created? We don't expose slot creation time, no? Aren't we fine if we just sync the value from primary and document that fact? After the promotion, we can reset it to the current time so that it gets its own time. Do you see any issues with it? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > What about another approach?: inactive_since gives data synced from primary for > > > synced slots and another dedicated field (could be added later...) could > > > represent what you suggest as the other option. > > > > Yes, okay with me. I think there is some confusion here as well. In my > > second approach above, I have not suggested anything related to > > sync-worker. We can think on that later if we really need another > > field which give us sync time. In my second approach, I have tried to > > avoid updating inactive_since for synced slots during sync process. We > > update that field during creation of synced slot so that > > inactive_since reflects correct info even for synced slots (rather > > than copying from primary). Please have a look at my patch and let me > > know your thoughts. I am fine with copying it from primary as well and > > documenting this behaviour. > > I took a look at your patch. > > --- a/src/backend/replication/logical/slotsync.c > +++ b/src/backend/replication/logical/slotsync.c > @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > remote_dbid) > SpinLockAcquire(&slot->mutex); > slot->effective_catalog_xmin = xmin_horizon; > slot->data.catalog_xmin = xmin_horizon; > + slot->inactive_since = GetCurrentTimestamp(); > SpinLockRelease(&slot->mutex); > > If we just sync inactive_since value for synced slots while in > recovery from the primary, so be it. Why do we need to update it to > the current time when the slot is being created? If we update inactive_since at synced slot's creation or during restart (skipping setting it during sync), then this time reflects actual 'inactive_since' for that particular synced slot. Isn't that a clear info for the user and in alignment of what the name 'inactive_since' actually suggests? > We don't expose slot > creation time, no? No, we don't. But for synced slot, that is the time since that slot is inactive (unless promoted), so we are exposing inactive_since and not creation time. >Aren't we fine if we just sync the value from > primary and document that fact? After the promotion, we can reset it > to the current time so that it gets its own time. Do you see any > issues with it? Yes, we can do that. But curious to know, do we see any additional benefit of reflecting primary's inactive_since at standby which I might be missing? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 04:49:18PM +0530, shveta malik wrote: > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > What about another approach?: inactive_since gives data synced from primary for > > > > synced slots and another dedicated field (could be added later...) could > > > > represent what you suggest as the other option. > > > > > > Yes, okay with me. I think there is some confusion here as well. In my > > > second approach above, I have not suggested anything related to > > > sync-worker. We can think on that later if we really need another > > > field which give us sync time. In my second approach, I have tried to > > > avoid updating inactive_since for synced slots during sync process. We > > > update that field during creation of synced slot so that > > > inactive_since reflects correct info even for synced slots (rather > > > than copying from primary). Please have a look at my patch and let me > > > know your thoughts. I am fine with copying it from primary as well and > > > documenting this behaviour. > > > > I took a look at your patch. > > > > --- a/src/backend/replication/logical/slotsync.c > > +++ b/src/backend/replication/logical/slotsync.c > > @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > > remote_dbid) > > SpinLockAcquire(&slot->mutex); > > slot->effective_catalog_xmin = xmin_horizon; > > slot->data.catalog_xmin = xmin_horizon; > > + slot->inactive_since = GetCurrentTimestamp(); > > SpinLockRelease(&slot->mutex); > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? > > If we update inactive_since at synced slot's creation or during > restart (skipping setting it during sync), then this time reflects > actual 'inactive_since' for that particular synced slot. Isn't that a > clear info for the user and in alignment of what the name > 'inactive_since' actually suggests? > > > We don't expose slot > > creation time, no? > > No, we don't. But for synced slot, that is the time since that slot is > inactive (unless promoted), so we are exposing inactive_since and not > creation time. > > >Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. Do you see any > > issues with it? > > Yes, we can do that. But curious to know, do we see any additional > benefit of reflecting primary's inactive_since at standby which I > might be missing? In case the primary goes down, then one could use the value on the standby to get the value coming from the primary. I think that could be useful info to have. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? We don't expose slot > > creation time, no? Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. > > I'm attaching v24 patches. It implements the above idea proposed > upthread for synced slots. I've now separated > s/last_inactive_time/inactive_since and synced slots behaviour. Please > have a look. Thanks! ==== v24-0001 It's now pure mechanical changes and it looks good to me. ==== v24-0002 1 === This commit does two things: 1) Updates inactive_since for sync slots with the value received from the primary's slot. Tested it and it does that. 2 === 2) Ensures the value is set to current timestamp during the shutdown of slot sync machinery to help correctly interpret the time if the standby gets promoted without a restart. Tested it and it does that. 3 === +/* + * Reset the synced slots info such as inactive_since after shutting + * down the slot sync machinery. + */ +static void +update_synced_slots_inactive_time(void) Looks like the comment "reset" is not matching the name of the function and what it does. 4 === + /* + * We get the current time beforehand and only once to avoid + * system calls overhead while holding the lock. + */ + if (now == 0) + now = GetCurrentTimestamp(); Also +1 of having GetCurrentTimestamp() just called one time within the loop. 5 === - if (!(RecoveryInProgress() && slot->data.synced)) + if (!(InRecovery && slot->data.synced)) slot->inactive_since = GetCurrentTimestamp(); else slot->inactive_since = 0; Not related to this change but more the way RestoreSlotFromDisk() behaves here: For a sync slot on standby it will be set to zero and then later will be synchronized with the one coming from the primary. I think that's fine to have it to zero for this window of time. Now, if the standby is down and one sets sync_replication_slots to off, then inactive_since will be set to zero on the standby at startup and not synchronized (unless one triggers a manual sync). I also think that's fine but it might be worth to document this behavior (that after a standby startup inactive_since is zero until the next sync...). 6 === + print "HI $slot_name $name $inactive_since $slot_creation_time\n"; garbage? 7 === +# Capture and validate inactive_since of a given slot. +sub capture_and_validate_slot_inactive_since +{ + my ($node, $slot_name, $slot_creation_time) = @_; + my $name = $node->name; We know have capture_and_validate_slot_inactive_since at 2 places: 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. Worth to create a sub in Cluster.pm? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot is being created? We don't expose slot > > creation time, no? Aren't we fine if we just sync the value from > > primary and document that fact? After the promotion, we can reset it > > to the current time so that it gets its own time. > > I'm attaching v24 patches. It implements the above idea proposed > upthread for synced slots. I've now separated > s/last_inactive_time/inactive_since and synced slots behaviour. Please > have a look. Thanks for the patches. Few trivial comments for v24-002: 1) slot.c: + * data from the remote slot. We use InRecovery flag instead of + * RecoveryInProgress() as it always returns true even for normal + * server startup. a) Not clear what 'it' refers to. Better to use 'the latter' b) Is it better to mention the primary here: 'as the latter always returns true even on the primary server during startup'. 2) update_local_synced_slot(): - strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0) + strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 && + remote_slot->inactive_since == slot->inactive_since) When this code was written initially, the intent was to do strcmp at the end (only if absolutely needed). It will be good if we maintain the same and add new checks before strcmp. 3) update_synced_slots_inactive_time(): This assert is removed, is it intentional? Assert(s->active_pid == 0); 4) 040_standby_failover_slots_sync.pl: +# Capture the inactive_since of the slot from the standby the logical failover +# slots are synced/created on the standby. The comment is unclear, something seems missing. thanks Shveta
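[As an illustration of comment 2 above, the idea is simply to keep the cheap scalar comparisons ahead of the string comparison; the field list and the return semantics here are abbreviated assumptions, not the actual update_local_synced_slot() code.]

    /* Illustrative only: cheap comparisons first, strcmp() last. */
    if (remote_slot->restart_lsn == slot->data.restart_lsn &&
        remote_slot->catalog_xmin == slot->data.catalog_xmin &&
        remote_slot->inactive_since == slot->inactive_since &&
        strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
        return false;   /* nothing changed for this slot; skip the local update */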
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > I'm attaching v24 patches. It implements the above idea proposed > > upthread for synced slots. > > ==== v24-0002 > > 1 === > > This commit does two things: > 1) Updates inactive_since for sync slots with the value > received from the primary's slot. > > Tested it and it does that. Thanks. I've added a test case for this. > 2 === > > 2) Ensures the value is set to current timestamp during the > shutdown of slot sync machinery to help correctly interpret the > time if the standby gets promoted without a restart. > > Tested it and it does that. Thanks. I've added a test case for this. > 3 === > > +/* > + * Reset the synced slots info such as inactive_since after shutting > + * down the slot sync machinery. > + */ > +static void > +update_synced_slots_inactive_time(void) > > Looks like the comment "reset" is not matching the name of the function and > what it does. Changed. I've also changed the function name to update_synced_slots_inactive_since to be precise on what it exactly does. > 4 === > > + /* > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > + */ > + if (now == 0) > + now = GetCurrentTimestamp(); > > Also +1 of having GetCurrentTimestamp() just called one time within the loop. Right. > 5 === > > - if (!(RecoveryInProgress() && slot->data.synced)) > + if (!(InRecovery && slot->data.synced)) > slot->inactive_since = GetCurrentTimestamp(); > else > slot->inactive_since = 0; > > Not related to this change but more the way RestoreSlotFromDisk() behaves here: > > For a sync slot on standby it will be set to zero and then later will be > synchronized with the one coming from the primary. I think that's fine to have > it to zero for this window of time. Right. > Now, if the standby is down and one sets sync_replication_slots to off, > then inactive_since will be set to zero on the standby at startup and not > synchronized (unless one triggers a manual sync). I also think that's fine but > it might be worth to document this behavior (that after a standby startup > inactive_since is zero until the next sync...). Isn't this behaviour applicable for other slot parameters that the slot syncs from the remote slot on the primary? I've added the following note in the comments when we update inactive_since in RestoreSlotFromDisk. * Note that for synced slots after the standby starts up (i.e. after * the slots are loaded from the disk), the inactive_since will remain * zero until the next slot sync cycle. */ if (!(InRecovery && slot->data.synced)) slot->inactive_since = GetCurrentTimestamp(); else slot->inactive_since = 0; > 6 === > > + print "HI $slot_name $name $inactive_since $slot_creation_time\n"; > > garbage? Removed. > 7 === > > +# Capture and validate inactive_since of a given slot. > +sub capture_and_validate_slot_inactive_since > +{ > + my ($node, $slot_name, $slot_creation_time) = @_; > + my $name = $node->name; > > We know have capture_and_validate_slot_inactive_since at 2 places: > 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. > > Worth to create a sub in Cluster.pm? I'd second that thought for now. We might have to debate first if it's useful for all the nodes even without replication, and if yes, the naming stuff and all that. Historically, we've had such duplicated functions until recently, for instance advance_wal and log_contains. We moved them over to a common perl library Cluster.pm very recently. 
I'm sure we can come back later to move it to Cluster.pm. On Wed, Mar 27, 2024 at 9:02 AM shveta malik <shveta.malik@gmail.com> wrote: > > 1) > slot.c: > + * data from the remote slot. We use InRecovery flag instead of > + * RecoveryInProgress() as it always returns true even for normal > + * server startup. > > a) Not clear what 'it' refers to. Better to use 'the latter' > b) Is it better to mention the primary here: > 'as the latter always returns true even on the primary server during startup'. Modified. > 2) > update_local_synced_slot(): > > - strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0) > + strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 && > + remote_slot->inactive_since == slot->inactive_since) > > When this code was written initially, the intent was to do strcmp at > the end (only if absolutely needed). It will be good if we maintain > the same and add new checks before strcmp. Done. > 3) > update_synced_slots_inactive_time(): > > This assert is removed, is it intentional? > Assert(s->active_pid == 0); Yes, the slot can get acquired in the corner case when someone runs pg_sync_replication_slots concurrently at this time. I'm referring to the issue reported upthread. We don't prevent one running pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? Maybe we should prevent that otherwise some of the slots are synced and the standby gets promoted while others are yet-to-be-synced. > 4) > 040_standby_failover_slots_sync.pl: > > +# Capture the inactive_since of the slot from the standby the logical failover > +# slots are synced/created on the standby. > > The comment is unclear, something seems missing. Nice catch. Yes, that was wrong. I've modified it now. Please find the attached v25-0001 (made this 0001 patch now as inactive_since patch is committed) patch with the above changes. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > 3) > > update_synced_slots_inactive_time(): > > > > This assert is removed, is it intentional? > > Assert(s->active_pid == 0); > > Yes, the slot can get acquired in the corner case when someone runs > pg_sync_replication_slots concurrently at this time. I'm referring to > the issue reported upthread. We don't prevent one running > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > Maybe we should prevent that otherwise some of the slots are synced > and the standby gets promoted while others are yet-to-be-synced. > We should do something about it but that shouldn't be done in this patch. We can handle it separately and then add such an assert. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > 3) > > > update_synced_slots_inactive_time(): > > > > > > This assert is removed, is it intentional? > > > Assert(s->active_pid == 0); > > > > Yes, the slot can get acquired in the corner case when someone runs > > pg_sync_replication_slots concurrently at this time. I'm referring to > > the issue reported upthread. We don't prevent one running > > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > > Maybe we should prevent that otherwise some of the slots are synced > > and the standby gets promoted while others are yet-to-be-synced. > > > > We should do something about it but that shouldn't be done in this > patch. We can handle it separately and then add such an assert. Agreed. Once this patch is concluded, I can fix the slot sync shutdown issue and will also add this 'assert' back. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 10:24 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > 3) > > > > update_synced_slots_inactive_time(): > > > > > > > > This assert is removed, is it intentional? > > > > Assert(s->active_pid == 0); > > > > > > Yes, the slot can get acquired in the corner case when someone runs > > > pg_sync_replication_slots concurrently at this time. I'm referring to > > > the issue reported upthread. We don't prevent one running > > > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right? > > > Maybe we should prevent that otherwise some of the slots are synced > > > and the standby gets promoted while others are yet-to-be-synced. > > > > > > > We should do something about it but that shouldn't be done in this > > patch. We can handle it separately and then add such an assert. > > Agreed. Once this patch is concluded, I can fix the slot sync shutdown > issue and will also add this 'assert' back. Agreed. Thanks. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > We can think on that later if we really need another > > field which give us sync time. > > I think that calling GetCurrentTimestamp() so frequently could be too costly, so > I'm not sure we should. Agreed. > > In my second approach, I have tried to > > avoid updating inactive_since for synced slots during sync process. We > > update that field during creation of synced slot so that > > inactive_since reflects correct info even for synced slots (rather > > than copying from primary). > > Yeah, and I think we could create a dedicated field with this information > if we feel the need. Okay. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v25-0001 (made this 0001 patch now as > inactive_since patch is committed) patch with the above changes. Fixed an issue in synchronize_slots where DatumGetLSN is being used in place of DatumGetTimestampTz. Found this via CF bot member [1], not on my dev system. Please find the attached v6 patch. [1] [05:14:39.281] #7 DatumGetLSN (X=<optimized out>) at ../src/include/utils/pg_lsn.h:24 [05:14:39.281] No locals. [05:14:39.281] #8 synchronize_slots (wrconn=wrconn@entry=0x583cd170) at ../src/backend/replication/logical/slotsync.c:757 [05:14:39.281] isnull = false [05:14:39.281] remote_slot = 0x583ce1a8 [05:14:39.281] d = <optimized out> [05:14:39.281] col = 10 [05:14:39.281] slotRow = {25, 25, 3220, 3220, 28, 16, 16, 25, 25, 1184} [05:14:39.281] res = 0x583cd1b8 [05:14:39.281] tupslot = 0x583ce11c [05:14:39.281] remote_slot_list = 0x0 [05:14:39.281] some_slot_updated = false [05:14:39.281] started_tx = false [05:14:39.281] query = 0x57692bc4 "SELECT slot_name, plugin, confirmed_flush_lsn, restart_lsn, catalog_xmin, two_phase, failover, database, invalidation_reason, inactive_since FROM pg_catalog.pg_replication_slots WHERE failover and NOT"... [05:14:39.281] __func__ = "synchronize_slots" [05:14:39.281] #9 0x56ff9d1e in SyncReplicationSlots (wrconn=0x583cd170) at ../src/backend/replication/logical/slotsync.c:1504 -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
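[For readers following along, a sketch of the kind of fix being described; the column position and variable names are inferred from the stack trace above and are assumptions rather than the exact patch hunk.]

    /* inactive_since is the 10th (timestamptz) column of the slot-sync query. */
    bool    isnull;
    Datum   d = slot_getattr(tupslot, 10, &isnull);

    remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);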
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 10:08:33AM +0530, Bharath Rupireddy wrote: > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > - if (!(RecoveryInProgress() && slot->data.synced)) > > + if (!(InRecovery && slot->data.synced)) > > slot->inactive_since = GetCurrentTimestamp(); > > else > > slot->inactive_since = 0; > > > > Not related to this change but more the way RestoreSlotFromDisk() behaves here: > > > > For a sync slot on standby it will be set to zero and then later will be > > synchronized with the one coming from the primary. I think that's fine to have > > it to zero for this window of time. > > Right. > > > Now, if the standby is down and one sets sync_replication_slots to off, > > then inactive_since will be set to zero on the standby at startup and not > > synchronized (unless one triggers a manual sync). I also think that's fine but > > it might be worth to document this behavior (that after a standby startup > > inactive_since is zero until the next sync...). > > Isn't this behaviour applicable for other slot parameters that the > slot syncs from the remote slot on the primary? No they are persisted on disk. If not, we'd not know where to resume the decoding from on the standby in case primary is down and/or sync is off. > I've added the following note in the comments when we update > inactive_since in RestoreSlotFromDisk. > > * Note that for synced slots after the standby starts up (i.e. after > * the slots are loaded from the disk), the inactive_since will remain > * zero until the next slot sync cycle. > */ > if (!(InRecovery && slot->data.synced)) > slot->inactive_since = GetCurrentTimestamp(); > else > slot->inactive_since = 0; I think we should add some words in the doc too and also about what the meaning of inactive_since on the standby is (as suggested by Shveta in [1]). [1]: https://www.postgresql.org/message-id/CAJpy0uDkTW%2Bt1k3oPkaipFBzZePfFNB5DmiA%3D%3DpxRGcAdpF%3DPg%40mail.gmail.com > > 7 === > > > > +# Capture and validate inactive_since of a given slot. > > +sub capture_and_validate_slot_inactive_since > > +{ > > + my ($node, $slot_name, $slot_creation_time) = @_; > > + my $name = $node->name; > > > > We know have capture_and_validate_slot_inactive_since at 2 places: > > 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl. > > > > Worth to create a sub in Cluster.pm? > > I'd second that thought for now. We might have to debate first if it's > useful for all the nodes even without replication, and if yes, the > naming stuff and all that. Historically, we've had such duplicated > functions until recently, for instance advance_wal and log_contains. > We > moved them over to a common perl library Cluster.pm very recently. I'm > sure we can come back later to move it to Cluster.pm. I thought that would be the right time not to introduce duplicated code. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
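[To make that distinction concrete, an abridged view of where the two fields live under the patch; the member list is trimmed to the ones discussed in this thread and is a reconstruction, not the full structs.]

    typedef struct ReplicationSlotPersistentData
    {
        /* other persisted members omitted */

        /* The amount of time in seconds the slot is allowed to be inactive */
        int         inactive_timeout;   /* written to disk with the slot */
    } ReplicationSlotPersistentData;

    typedef struct ReplicationSlot
    {
        /* other in-memory members omitted */

        ReplicationSlotPersistentData data;   /* the persisted part */

        TimestampTz inactive_since;           /* in shared memory only */
    } ReplicationSlot;

[Because inactive_since is not part of the persisted data, a synced slot comes back with it set to zero after a standby restart until the next synchronization, whereas restart_lsn, catalog_xmin and the other persisted parameters are restored from disk.]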
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Fixed an issue in synchronize_slots where DatumGetLSN is being used in > place of DatumGetTimestampTz. Found this via CF bot member [1], not on > my dev system. > > Please find the attached v6 patch. Thanks for the patch. Few trivial things: ---------- 1) system-views.sgml: a) "Note that the slots" --> "Note that the slots on the standbys," --it is good to mention "standbys" as synced could be true on primary as well (promoted standby) b) If you plan to add more info which Bertrand suggested, then it will be better to make a <note> section instead of using "Note" 2) commit msg: "The impact of this on a promoted standby inactive_since is always NULL for all synced slots even after server restart. " Sentence looks broken. --------- Apart from the above trivial things, v26-001 looks good to me. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patch. Few trivial things: Thanks for reviewing. > ---------- > 1) > system-views.sgml: > > a) "Note that the slots" --> "Note that the slots on the standbys," > --it is good to mention "standbys" as synced could be true on primary > as well (promoted standby) Done. > b) If you plan to add more info which Bertrand suggested, then it will > be better to make a <note> section instead of using "Note" I added the note that Bertrand specified upthread. But, I couldn't find an instance of adding <note> ... </note> within a table. Hence with "Note that ...." statments just like any other notes in the system-views.sgml. pg_replication_slot in system-vews.sgml renders as table, so having <note> ... </note> may not be a great idea. > 2) > commit msg: > > "The impact of this > on a promoted standby inactive_since is always NULL for all > synced slots even after server restart. > " > Sentence looks broken. > --------- Reworded. > Apart from the above trivial things, v26-001 looks good to me. Please check the attached v27 patch which also has Bertrand's comment on deduplicating the TAP function. I've now moved it to Cluster.pm. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 02:55:17PM +0530, Bharath Rupireddy wrote: > Please check the attached v27 patch which also has Bertrand's comment > on deduplicating the TAP function. I've now moved it to Cluster.pm. Thanks! 1 === + Note that the slots on the standbys that are being synced from a + primary server (whose <structfield>synced</structfield> field is + <literal>true</literal>), will get the + <structfield>inactive_since</structfield> value from the + corresponding remote slot on the primary. Also, note that for the + synced slots on the standby, after the standby starts up (i.e. after + the slots are loaded from the disk), the inactive_since will remain + zero until the next slot sync cycle. Not sure we should mention the "(i.e. after the slots are loaded from the disk)" and also "cycle" (as that does not sound right in case of manual sync). My proposal (in text) but feel free to reword it: Note that the slots on the standbys that are being synced from a primary server (whose synced field is true), will get the inactive_since value from the corresponding remote slot on the primary. Also, after the standby starts up, the inactive_since (for such synced slots) will remain zero until the next synchronization. 2 === +=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname) get_slot_inactive_since_value instead? 3 === +against given reference time. s/given reference/optional given reference/? Apart from the above, LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Thanks for the patch. Few trivial things: > > Thanks for reviewing. > > > ---------- > > 1) > > system-views.sgml: > > > > a) "Note that the slots" --> "Note that the slots on the standbys," > > --it is good to mention "standbys" as synced could be true on primary > > as well (promoted standby) > > Done. > > > b) If you plan to add more info which Bertrand suggested, then it will > > be better to make a <note> section instead of using "Note" > > I added the note that Bertrand specified upthread. But, I couldn't > find an instance of adding <note> ... </note> within a table. Hence > with "Note that ...." statments just like any other notes in the > system-views.sgml. pg_replication_slot in system-vews.sgml renders as > table, so having <note> ... </note> may not be a great idea. > > > 2) > > commit msg: > > > > "The impact of this > > on a promoted standby inactive_since is always NULL for all > > synced slots even after server restart. > > " > > Sentence looks broken. > > --------- > > Reworded. > > > Apart from the above trivial things, v26-001 looks good to me. > > Please check the attached v27 patch which also has Bertrand's comment > on deduplicating the TAP function. I've now moved it to Cluster.pm. > Thanks for the patch. Regarding doc, I have few comments. + Note that the slots on the standbys that are being synced from a + primary server (whose <structfield>synced</structfield> field is + <literal>true</literal>), will get the + <structfield>inactive_since</structfield> value from the + corresponding remote slot on the primary. Also, note that for the + synced slots on the standby, after the standby starts up (i.e. after + the slots are loaded from the disk), the inactive_since will remain + zero until the next slot sync cycle. a) "inactive_since will remain zero" Since it is user exposed info and the user finds it NULL in pg_replication_slots, shall we mention NULL instead of 0? b) Since we are referring to the sync cycle here, I feel it will be good to give a link to that page. + zero until the next slot sync cycle (see + <xref linkend="logicaldecoding-replication-slots-synchronization"/> for + slot synchronization details). thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > 1 === > > My proposal (in text) but feel free to reword it: > > Note that the slots on the standbys that are being synced from a > primary server (whose synced field is true), will get the inactive_since value > from the corresponding remote slot on the primary. Also, after the standby starts > up, the inactive_since (for such synced slots) will remain zero until the next > synchronization. WFM. > 2 === > > +=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname) > > get_slot_inactive_since_value instead? Ugh. Changed. > 3 === > > +against given reference time. > > s/given reference/optional given reference/? Done. > Apart from the above, LGTM. Thanks for reviewing. On Wed, Mar 27, 2024 at 3:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patch. Regarding doc, I have few comments. Thanks for reviewing. > a) "inactive_since will remain zero" > Since it is user exposed info and the user finds it NULL in > pg_replication_slots, shall we mention NULL instead of 0? Right. Changed. > b) Since we are referring to the sync cycle here, I feel it will be > good to give a link to that page. > + zero until the next slot sync cycle (see > + <xref linkend="logicaldecoding-replication-slots-synchronization"/> for > + slot synchronization details). WFM. Please see the attached v28 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > Please see the attached v28 patch. Thanks! 1 === sorry I missed it in the previous review if (!(RecoveryInProgress() && slot->data.synced)) + { now = GetCurrentTimestamp(); + update_inactive_since = true; + } + else + update_inactive_since = false; I think update_inactive_since is not needed, we could rely on (now > 0) instead. 2 === +=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname) + +Get inactive_since column value for a given replication slot validating it +against optional reference time. + +=cut + +sub get_slot_inactive_since_value +{ shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)" instead? Apart from the above, LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > > > Please see the attached v28 patch. > > Thanks! > > 1 === sorry I missed it in the previous review > > if (!(RecoveryInProgress() && slot->data.synced)) > + { > now = GetCurrentTimestamp(); > + update_inactive_since = true; > + } > + else > + update_inactive_since = false; > > I think update_inactive_since is not needed, we could rely on (now > 0) instead. I thought of doing that, but it comes at the expense of readability, so I prefer to keep a variable. However, I renamed the variable to the more meaningful is_slot_being_synced. > 2 === > > +=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname) > + > +Get inactive_since column value for a given replication slot validating it > +against optional reference time. > + > +=cut > + > +sub get_slot_inactive_since_value > +{ > > shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)" > instead? Ugh. Changed. > Apart from the above, LGTM. Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the standby for sync slots. 0002 implementing inactive timeout GUC based invalidation mechanism. Please have a look. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
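For readers following the exchange above, the hunk being discussed amounts to roughly the following. This is an illustrative sketch rather than the exact v29 patch text: the helper name record_inactive_since_on_release is invented here, and in the real patch the logic lives inline in the slot release/restore paths.

#include "postgres.h"

#include "access/xlog.h"		/* RecoveryInProgress() */
#include "replication/slot.h"
#include "storage/spin.h"
#include "utils/timestamp.h"

/*
 * Sketch: record the release time in inactive_since, except for slots that
 * are being synced on a standby, which keep 0 (shown as NULL in
 * pg_replication_slots) until the next synchronization.
 */
static void
record_inactive_since_on_release(ReplicationSlot *slot)
{
	bool		is_slot_being_synced = RecoveryInProgress() && slot->data.synced;
	TimestampTz now = 0;

	if (!is_slot_being_synced)
		now = GetCurrentTimestamp();

	SpinLockAcquire(&slot->mutex);
	slot->inactive_since = now;
	SpinLockRelease(&slot->mutex);
}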
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot > > > Please see the attached v28 patch. > > > > Thanks! > > > > 1 === sorry I missed it in the previous review > > > > if (!(RecoveryInProgress() && slot->data.synced)) > > + { > > now = GetCurrentTimestamp(); > > + update_inactive_since = true; > > + } > > + else > > + update_inactive_since = false; > > > > I think update_inactive_since is not needed, we could rely on (now > 0) instead. > > Thought of using it, but, at the expense of readability. I prefer to > use a variable instead. That's fine too. > However, I changed the variable to be more meaningful to is_slot_being_synced. Yeah makes sense and even easier to read. v29-0001 LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks for the patches. v29-001 looks good to me. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote: > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks! Regarding 0002: Some testing: T1 === When the slot is invalidated on the primary, then the reason is propagated to the sync slot (if any). That's fine but we are loosing the inactive_since on the standby: Primary: postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+-------------------------------+-------------+--------------------- lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout (1 row) Standby: postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+----------------+-------------+--------------------- lsub29_slot | | f | inactive_timeout (1 row) I think in this case it should always reflect the value from the primary (so that one can understand why it is invalidated). T2 === And it is set to a value during promotion: postgres=# select pg_promote(); pg_promote ------------ t (1 row) postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; slot_name | inactive_since | conflicting | invalidation_reason -------------+------------------------------+-------------+--------------------- lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout (1 row) I think when it is invalidated it should always reflect the value from the primary (so that one can understand why it is invalidated). T3 === As far the slot invalidation on the primary: postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0'); ERROR: cannot acquire invalidated replication slot "lsub29_slot" Can we make the message more consistent with what can be found in CreateDecodingContext() for example? T4 === Also, it looks like querying pg_replication_slots() does not trigger an invalidation: I think it should if the slot is not invalidated yet (and matches the invalidation criteria). Code review: CR1 === + Invalidate replication slots that are inactive for longer than this + amount of time. If this value is specified without units, it is taken s/Invalidate/Invalidates/? Should we mention the relationship with inactive_since? CR2 === + * + * If check_for_invalidation is true, the slot is checked for invalidation + * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, + bool check_for_invalidation) s/check_for_invalidation/check_for_timeout_invalidation/? CR3 === + if (slot->inactive_since == 0 || + replication_slot_inactive_timeout == 0) + return false; Better to test replication_slot_inactive_timeout first? (I mean there is no point of testing inactive_since if replication_slot_inactive_timeout == 0) CR4 === + if (slot->inactive_since > 0 && + replication_slot_inactive_timeout > 0) + { Same. So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do something like: if (replication_slot_inactive_timeout == 0) return false; else if (slot->inactive_since > 0) . . . . 
else return false; That would avoid checking replication_slot_inactive_timeout and inactive_since multiple times. CR5 === + * held to avoid race conditions -- for example the restart_lsn could move + * forward, or the slot could be dropped. Does the restart_lsn example makes sense here? CR6 === +static bool +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) +{ InvalidatePossiblyInactiveSlot() maybe? CR7 === + /* Make sure the invalidated state persists across server restart */ + slot->just_dirtied = true; + slot->dirty = true; + SpinLockRelease(&slot->mutex); Maybe we could create a new function say MarkGivenReplicationSlotDirty() with a slot as parameter, that ReplicationSlotMarkDirty could call too? Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease). CR8 === + if (persist_state) + { + char path[MAXPGPATH]; + + sprintf(path, "pg_replslot/%s", NameStr(slot->data.name)); + SaveSlotToPath(slot, path, ERROR); + } Maybe we could create a new function say GivenReplicationSlotSave() with a slot as parameter, that ReplicationSlotSave() could call too? CR9 === + if (check_for_invalidation) + { + /* The slot is ours by now */ + Assert(s->active_pid == MyProcPid); + + /* + * Well, the slot is not yet ours really unless we check for the + * invalidation below. + */ + s->active_pid = 0; + if (InvalidateReplicationSlotForInactiveTimeout(s, true, true)) + { + /* + * If the slot has been invalidated, recalculate the resource + * limits. + */ + ReplicationSlotsComputeRequiredXmin(false); + ReplicationSlotsComputeRequiredLSN(); + + /* Might need it for slot clean up on error, so restore it */ + s->active_pid = MyProcPid; + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot acquire invalidated replication slot \"%s\"", + NameStr(MyReplicationSlot->data.name)))); + } + s->active_pid = MyProcPid; Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the places where we set the active_pid). CR10 === @@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, if (SlotIsLogical(s)) invalidation_cause = cause; break; + case RS_INVAL_INACTIVE_TIMEOUT: + if (InvalidateReplicationSlotForInactiveTimeout(s, false, false)) + invalidation_cause = cause; + break; InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use an Assert here and in the caller too? CR11 === +++ b/src/test/recovery/t/050_invalidate_slots.pl why not using 019_replslot_limit.pl? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
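To make the ordering suggested in CR3 and CR4 concrete, the check can be structured roughly as below. This is only a sketch, not the actual patch code: the function name is invented, and it assumes the proposed replication_slot_inactive_timeout GUC is an integer number of seconds declared elsewhere.

#include "postgres.h"

#include "replication/slot.h"
#include "utils/timestamp.h"

/* proposed GUC from the patch; assumed here to be in seconds */
extern int	replication_slot_inactive_timeout;

/*
 * Sketch of the CR3/CR4 suggestion: test the cheap GUC first, then
 * inactive_since, so neither is evaluated more often than necessary.
 */
static bool
slot_inactive_timeout_reached(ReplicationSlot *slot)
{
	if (replication_slot_inactive_timeout == 0)
		return false;
	else if (slot->inactive_since > 0)
		return TimestampDifferenceExceeds(slot->inactive_since,
										  GetCurrentTimestamp(),
										  replication_slot_inactive_timeout * 1000);
	else
		return false;
}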
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. > Commit message states: "why we can't just update inactive_since for synced slots on the standby with the value received from remote slot on the primary. This is consistent with any other slot parameter i.e. all of them are synced from the primary." The inactive_since is not consistent with other slot parameters which we copy. We don't perform anything related to those other parameters like say two_phase phase which can change that property. However, we do acquire the slot, advance the slot (as per recent discussion [1]), and release it. Since these operations can impact inactive_since, it seems to me that inactive_since is not the same as other parameters. It can have a different value than the primary. Why would anyone want to know the value of inactive_since from primary after the standby is promoted? Now, the other concern is that calling GetCurrentTimestamp() could be costly when the values for the slot are not going to be updated but if that happens we can optimize such that before acquiring the slot we can have some minimal pre-checks to ensure whether we need to update the slot or not. [1] - https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
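The "minimal pre-checks" idea mentioned above could look something like this: compare the values that would be copied from the remote slot before acquiring the local slot, and skip the acquire/update/release cycle (and its GetCurrentTimestamp() call) when nothing changed. The struct and field names below are assumptions made for illustration; they are not taken from the actual slot sync code.

#include "postgres.h"

#include "access/xlogdefs.h"
#include "replication/slot.h"
#include "storage/spin.h"

/* Assumed shape of the data fetched from the primary, for illustration only. */
typedef struct RemoteSlotSketch
{
	XLogRecPtr	restart_lsn;
	XLogRecPtr	confirmed_lsn;
	TransactionId catalog_xmin;
} RemoteSlotSketch;

/*
 * Sketch of a pre-check: return true only if something we would sync
 * actually differs, so the caller can skip acquiring the slot otherwise.
 */
static bool
synced_slot_needs_update(ReplicationSlot *slot, RemoteSlotSketch *remote)
{
	bool		needs_update;

	SpinLockAcquire(&slot->mutex);
	needs_update = remote->restart_lsn != slot->data.restart_lsn ||
		remote->confirmed_lsn != slot->data.confirmed_flush ||
		remote->catalog_xmin != slot->data.catalog_xmin;
	SpinLockRelease(&slot->mutex);

	return needs_update;
}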
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > > standby for sync slots. > > > > Commit message states: "why we can't just update inactive_since for > synced slots on the standby with the value received from remote slot > on the primary. This is consistent with any other slot parameter i.e. > all of them are synced from the primary." > > The inactive_since is not consistent with other slot parameters which > we copy. We don't perform anything related to those other parameters > like say two_phase phase which can change that property. However, we > do acquire the slot, advance the slot (as per recent discussion [1]), > and release it. Since these operations can impact inactive_since, it > seems to me that inactive_since is not the same as other parameters. > It can have a different value than the primary. Why would anyone want > to know the value of inactive_since from primary after the standby is > promoted? I think it can be useful "before" it is promoted and in case the primary is down. I agree that tracking the activity time of a synced slot can be useful, why not creating a dedicated field for that purpose (and keep inactive_since a perfect "copy" of the primary)? > Now, the other concern is that calling GetCurrentTimestamp() > could be costly when the values for the slot are not going to be > updated but if that happens we can optimize such that before acquiring > the slot we can have some minimal pre-checks to ensure whether we need > to update the slot or not. Right, but for a very active slot it is likely that we call GetCurrentTimestamp() during almost each sync cycle. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > I think it can be useful "before" it is promoted and in case the primary is down. > It is not clear to me what is user going to do by checking the inactivity time for slots when the corresponding server is down. I thought the idea was to check such slots and see if they need to be dropped or enabled again to avoid excessive disk usage, etc. > I agree that tracking the activity time of a synced slot can be useful, why > not creating a dedicated field for that purpose (and keep inactive_since a > perfect "copy" of the primary)? > We can have a separate field for this but not sure if it is worth it. > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > during almost each sync cycle. > True, but if we have to save a slot to disk each time to persist the changes (for an active slot) then probably GetCurrentTimestamp() shouldn't be costly enough to matter. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > Commit message states: "why we can't just update inactive_since for > > > synced slots on the standby with the value received from remote slot > > > on the primary. This is consistent with any other slot parameter i.e. > > > all of them are synced from the primary." > > > > > > The inactive_since is not consistent with other slot parameters which > > > we copy. We don't perform anything related to those other parameters > > > like say two_phase phase which can change that property. However, we > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > and release it. Since these operations can impact inactive_since, it > > > seems to me that inactive_since is not the same as other parameters. > > > It can have a different value than the primary. Why would anyone want > > > to know the value of inactive_since from primary after the standby is > > > promoted? > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > It is not clear to me what is user going to do by checking the > inactivity time for slots when the corresponding server is down. Say a failover needs to be done, then it could be useful to know for which slots the activity needs to be resumed (thinking about external logical decoding plugin, not about pub/sub here). If one see an inactive slot (since long "enough") then he can start to reasonate about what to do with it. > I thought the idea was to check such slots and see if they need to be > dropped or enabled again to avoid excessive disk usage, etc. Yeah that's the case but it does not mean inactive_since can't be useful in other ways. Also, say the slot has been invalidated on the primary (due to inactivity timeout), primary is down and there is a failover. By keeping the inactive_since from the primary, one could know when the inactivity that lead to the timeout started. Again, more concerned about external logical decoding plugin than pub/sub here. > > I agree that tracking the activity time of a synced slot can be useful, why > > not creating a dedicated field for that purpose (and keep inactive_since a > > perfect "copy" of the primary)? > > > > We can have a separate field for this but not sure if it is worth it. OTOH I'm not sure that erasing this information from the primary is useful. I think that 2 fields would be the best option and would be less subject of misinterpretation. > > > Now, the other concern is that calling GetCurrentTimestamp() > > > could be costly when the values for the slot are not going to be > > > updated but if that happens we can optimize such that before acquiring > > > the slot we can have some minimal pre-checks to ensure whether we need > > > to update the slot or not. > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > during almost each sync cycle. > > > > True, but if we have to save a slot to disk each time to persist the > changes (for an active slot) then probably GetCurrentTimestamp() > shouldn't be costly enough to matter. Right, persisting the changes to disk would be even more costly. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Regarding 0002: Thanks for reviewing it. > Some testing: > > T1 === > > When the slot is invalidated on the primary, then the reason is propagated to > the sync slot (if any). That's fine but we are loosing the inactive_since on the > standby: > > Primary: > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+-------------------------------+-------------+--------------------- > lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout > (1 row) > > Standby: > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+----------------+-------------+--------------------- > lsub29_slot | | f | inactive_timeout > (1 row) > > I think in this case it should always reflect the value from the primary (so > that one can understand why it is invalidated). I'll come back to this as soon as we all agree on inactive_since behavior for synced slots. > T2 === > > And it is set to a value during promotion: > > postgres=# select pg_promote(); > pg_promote > ------------ > t > (1 row) > > postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot'; > slot_name | inactive_since | conflicting | invalidation_reason > -------------+------------------------------+-------------+--------------------- > lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout > (1 row) > > I think when it is invalidated it should always reflect the value from the > primary (so that one can understand why it is invalidated). I'll come back to this as soon as we all agree on inactive_since behavior for synced slots. > T3 === > > As far the slot invalidation on the primary: > > postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0'); > ERROR: cannot acquire invalidated replication slot "lsub29_slot" > > Can we make the message more consistent with what can be found in CreateDecodingContext() > for example? Hm, that makes sense because slot acquisition and release is something internal to the server. > T4 === > > Also, it looks like querying pg_replication_slots() does not trigger an > invalidation: I think it should if the slot is not invalidated yet (and matches > the invalidation criteria). There's a different opinion on this, check comment #3 from https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com. > Code review: > > CR1 === > > + Invalidate replication slots that are inactive for longer than this > + amount of time. If this value is specified without units, it is taken > > s/Invalidate/Invalidates/? Done. > Should we mention the relationship with inactive_since? Done. > CR2 === > > + * > + * If check_for_invalidation is true, the slot is checked for invalidation > + * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours. > */ > void > -ReplicationSlotAcquire(const char *name, bool nowait) > +ReplicationSlotAcquire(const char *name, bool nowait, > + bool check_for_invalidation) > > > s/check_for_invalidation/check_for_timeout_invalidation/? Done. 
> CR3 === > > + if (slot->inactive_since == 0 || > + replication_slot_inactive_timeout == 0) > + return false; > > Better to test replication_slot_inactive_timeout first? (I mean there is no > point of testing inactive_since if replication_slot_inactive_timeout == 0) > > CR4 === > > + if (slot->inactive_since > 0 && > + replication_slot_inactive_timeout > 0) > + { > > Same. > > So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do > something like: > > if (replication_slot_inactive_timeout == 0) > return false; > else if (slot->inactive_since > 0) > . > else > return false; > > That would avoid checking replication_slot_inactive_timeout and inactive_since > multiple times. Done. > CR5 === > > + * held to avoid race conditions -- for example the restart_lsn could move > + * forward, or the slot could be dropped. > > Does the restart_lsn example makes sense here? No, it doesn't. Modified that. > CR6 === > > +static bool > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) > +{ > > InvalidatePossiblyInactiveSlot() maybe? I think we will lose the essence i.e. timeout from the suggested function name, otherwise just the inactive doesn't give a clearer meaning. I kept it that way unless anyone suggests otherwise. > CR7 === > > + /* Make sure the invalidated state persists across server restart */ > + slot->just_dirtied = true; > + slot->dirty = true; > + SpinLockRelease(&slot->mutex); > > Maybe we could create a new function say MarkGivenReplicationSlotDirty() > with a slot as parameter, that ReplicationSlotMarkDirty could call too? Done that. > Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in > InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease). Done that. > CR8 === > > + if (persist_state) > + { > + char path[MAXPGPATH]; > + > + sprintf(path, "pg_replslot/%s", NameStr(slot->data.name)); > + SaveSlotToPath(slot, path, ERROR); > + } > > Maybe we could create a new function say GivenReplicationSlotSave() > with a slot as parameter, that ReplicationSlotSave() could call too? Done that. > CR9 === > > + if (check_for_invalidation) > + { > + /* The slot is ours by now */ > + Assert(s->active_pid == MyProcPid); > + > + /* > + * Well, the slot is not yet ours really unless we check for the > + * invalidation below. > + */ > + s->active_pid = 0; > + if (InvalidateReplicationSlotForInactiveTimeout(s, true, true)) > + { > + /* > + * If the slot has been invalidated, recalculate the resource > + * limits. > + */ > + ReplicationSlotsComputeRequiredXmin(false); > + ReplicationSlotsComputeRequiredLSN(); > + > + /* Might need it for slot clean up on error, so restore it */ > + s->active_pid = MyProcPid; > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("cannot acquire invalidated replication slot \"%s\"", > + NameStr(MyReplicationSlot->data.name)))); > + } > + s->active_pid = MyProcPid; > > Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the > places where we set the active_pid). Hm, yes. But, shall I acquire the mutex, set active_pid to 0 for a moment just to satisfy Assert(slot->active_pid == 0); in InvalidateReplicationSlotForInactiveTimeout and InvalidateSlotForInactiveTimeout? I just removed the assertions because being replication_slot_inactive_timeout > 0 and inactive_since > 0 is enough for these functions to think and decide on inactive timeout invalidation. 
> CR10 === > > @@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, > if (SlotIsLogical(s)) > invalidation_cause = cause; > break; > + case RS_INVAL_INACTIVE_TIMEOUT: > + if (InvalidateReplicationSlotForInactiveTimeout(s, false, false)) > + invalidation_cause = cause; > + break; > > InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use > an Assert here and in the caller too? Done. > CR11 === > > +++ b/src/test/recovery/t/050_invalidate_slots.pl > > why not using 019_replslot_limit.pl? I understand that 019_replslot_limit covers wal_removed related invalidations. But, I don't want to kludge it with a bunch of other tests. The new tests anyway need a bunch of new nodes and a couple of helper functions. Any future invalidation mechanisms can be added here in this new file. Also, having a separate file quickly helps isolate any test failures that BF animals might report in future. I don't think a separate test file here hurts anyone unless there's a strong reason against it. Please see the attached v30 patch. 0002 is where all of the above review comments have been addressed. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
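For reference, the CR7 refactoring agreed above amounts to roughly the following; this is a sketch of the idea, and the exact v30 code may differ in details.

#include "postgres.h"

#include "replication/slot.h"
#include "storage/spin.h"

/*
 * Sketch of CR7: a variant of the dirty-marking code that takes the slot as
 * a parameter, so both the existing ReplicationSlotMarkDirty() and the
 * inactive-timeout invalidation path can share it.
 */
static void
MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
{
	SpinLockAcquire(&slot->mutex);
	slot->just_dirtied = true;
	slot->dirty = true;
	SpinLockRelease(&slot->mutex);
}

void
ReplicationSlotMarkDirty(void)
{
	Assert(MyReplicationSlot != NULL);
	MarkGivenReplicationSlotDirty(MyReplicationSlot);
}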
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Commit message states: "why we can't just update inactive_since for > synced slots on the standby with the value received from remote slot > on the primary. This is consistent with any other slot parameter i.e. > all of them are synced from the primary." > > The inactive_since is not consistent with other slot parameters which > we copy. We don't perform anything related to those other parameters > like say two_phase phase which can change that property. However, we > do acquire the slot, advance the slot (as per recent discussion [1]), > and release it. Since these operations can impact inactive_since, it > seems to me that inactive_since is not the same as other parameters. > It can have a different value than the primary. Why would anyone want > to know the value of inactive_since from primary after the standby is > promoted? After thinking about it for a while now, it feels to me that the synced slots (slots on the standby that are being synced from the primary) can have their own inactive_since value. Fundamentally, inactive_since is set to 0 when the slot is acquired and set to the current time when the slot is released, no matter who acquires and releases it - be it walsenders for replication, or backends for slot advance, or backends for slot sync using pg_sync_replication_slots, or backends for other slot functions, or the background sync worker. Remember the earlier patch was updating inactive_since just for walsenders, but then the suggestion was to update it unconditionally - https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. Whoever syncs the slot *actually* acquires the slot i.e. makes it theirs, syncs it from the primary, and releases it. IMO, no differentiation is to be made for synced slots. There was a suggestion on using inactive_since of the synced slot on the standby to know the inactivity of the slot on the primary. If one wants to do that, they had better look at/monitor the primary slot info/logs/pg_replication_slots/whatever. I really don't see a point in having two different meanings for a single property of a replication slot - inactive_since for a regular slot tells since when this slot has become inactive, and for a synced slot since when the corresponding remote slot has become inactive. I think this will confuse users for sure. Also, if inactive_since is changing on the primary very frequently and none of the other parameters are changing, then if we copy inactive_since to the synced slots, the standby will just be doing *sync* work (marking the slots dirty and saving them to disk) only to update inactive_since. I think this is unnecessary behaviour for sure. Coming to a future patch for inactive timeout based slot invalidation, we can either allow invalidation without any differentiation for synced slots or restrict invalidation to avoid more sync work. For instance, if the inactive timeout is kept low on the standby, the sync worker will be doing more work as it drops and recreates a slot repeatedly if it keeps getting invalidated. Another thing is that the standby takes independent invalidation decisions for synced slots. AFAICS, invalidation due to wal_removal is the sole reason (out of all available invalidation reasons) for a synced slot to get invalidated independently of the primary. 
Check https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com for the suggestion that we had better not differentiate invalidation decisions for synced slots. The assumption of letting synced slots have their own inactive_since not only simplifies the code, but also looks less confusing and more meaningful to the user. The only code that we put in on top of the committed code is to use InRecovery in place of RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised by Shveta upthread. > Now, the other concern is that calling GetCurrentTimestamp() > could be costly when the values for the slot are not going to be > updated but if that happens we can optimize such that before acquiring > the slot we can have some minimal pre-checks to ensure whether we need > to update the slot or not. > > [1] - https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com A quick test with a function to measure the cost of GetCurrentTimestamp [1] on my Ubuntu dev system (an AWS EC2 c5.4xlarge instance) gives me [2]. It took 0.388 ms, 2.269 ms, 21.144 ms, 209.333 ms, 2091.174 ms, 20908.942 ms for 10K, 100K, 1 million, 10 million, 100 million, and 1 billion calls respectively. Costs might be different on various systems with different OSes, but it gives us a rough idea. If we are too concerned about the cost of GetCurrentTimestamp(), a possible approach is to just not set inactive_since for slots being synced on the standby. Just let the first acquisition and release after the promotion do that job. We can always call this out in the docs saying "replication slots on the streaming standbys which are being synced from the primary are not inactive in practice, so the inactive_since is always NULL for them unless the standby is promoted". [1] Datum pg_get_current_timestamp(PG_FUNCTION_ARGS) { int loops = PG_GETARG_INT32(0); TimestampTz ctime; for (int i = 0; i < loops; i++) ctime = GetCurrentTimestamp(); PG_RETURN_TIMESTAMPTZ(ctime); } [2] postgres=# \timing Timing is on. postgres=# SELECT pg_get_current_timestamp(1000000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:07:57.374797+00 (1 row) Time: 20908.942 ms (00:20.909) postgres=# SELECT pg_get_current_timestamp(100000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:21.038064+00 (1 row) Time: 2091.174 ms (00:02.091) postgres=# SELECT pg_get_current_timestamp(10000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:24.329949+00 (1 row) Time: 209.333 ms postgres=# SELECT pg_get_current_timestamp(1000000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:26.978016+00 (1 row) Time: 21.144 ms postgres=# SELECT pg_get_current_timestamp(100000); pg_get_current_timestamp ------------------------------- 2024-03-30 19:08:29.142248+00 (1 row) Time: 2.269 ms postgres=# SELECT pg_get_current_timestamp(10000); pg_get_current_timestamp ------------------------------ 2024-03-30 19:08:31.34621+00 (1 row) Time: 0.388 ms -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > > > Commit message states: "why we can't just update inactive_since for > > > > synced slots on the standby with the value received from remote slot > > > > on the primary. This is consistent with any other slot parameter i.e. > > > > all of them are synced from the primary." > > > > > > > > The inactive_since is not consistent with other slot parameters which > > > > we copy. We don't perform anything related to those other parameters > > > > like say two_phase phase which can change that property. However, we > > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > > and release it. Since these operations can impact inactive_since, it > > > > seems to me that inactive_since is not the same as other parameters. > > > > It can have a different value than the primary. Why would anyone want > > > > to know the value of inactive_since from primary after the standby is > > > > promoted? > > > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > > > > It is not clear to me what is user going to do by checking the > > inactivity time for slots when the corresponding server is down. > > Say a failover needs to be done, then it could be useful to know for which > slots the activity needs to be resumed (thinking about external logical decoding > plugin, not about pub/sub here). If one see an inactive slot (since long "enough") > then he can start to reasonate about what to do with it. > > > I thought the idea was to check such slots and see if they need to be > > dropped or enabled again to avoid excessive disk usage, etc. > > Yeah that's the case but it does not mean inactive_since can't be useful in other > ways. > > Also, say the slot has been invalidated on the primary (due to inactivity timeout), > primary is down and there is a failover. By keeping the inactive_since from > the primary, one could know when the inactivity that lead to the timeout started. > So, this means at promotion, we won't set the current_time for inactive_since which is not what the currently proposed patch is doing. Moreover, doing the invalidation on promoted standby based on inactive_since of the primary node sounds debatable because the inactive_timeout could be different on the new node (promoted standby). > Again, more concerned about external logical decoding plugin than pub/sub here. > > > > I agree that tracking the activity time of a synced slot can be useful, why > > > not creating a dedicated field for that purpose (and keep inactive_since a > > > perfect "copy" of the primary)? > > > > > > > We can have a separate field for this but not sure if it is worth it. > > OTOH I'm not sure that erasing this information from the primary is useful. I > think that 2 fields would be the best option and would be less subject of > misinterpretation. > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > could be costly when the values for the slot are not going to be > > > > updated but if that happens we can optimize such that before acquiring > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > to update the slot or not. 
> > > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > > during almost each sync cycle. > > > > > > > True, but if we have to save a slot to disk each time to persist the > > changes (for an active slot) then probably GetCurrentTimestamp() > > shouldn't be costly enough to matter. > > Right, persisting the changes to disk would be even more costly. > The point I was making is that currently after copying the remote_node's values, we always persist the slots to disk, so the cost of current_time shouldn't be much. Now, if the values won't change then probably there is some cost but in most cases (active slots), the values will always change. Also, if all the slots are inactive then we will slow down the speed of sync. We also need to consider if we want to copy the value of inactive_since from the primary and if that is the only value changed then shall we persist the slot or not? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Apr 01, 2024 at 09:04:43AM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > > > > > Commit message states: "why we can't just update inactive_since for > > > > > synced slots on the standby with the value received from remote slot > > > > > on the primary. This is consistent with any other slot parameter i.e. > > > > > all of them are synced from the primary." > > > > > > > > > > The inactive_since is not consistent with other slot parameters which > > > > > we copy. We don't perform anything related to those other parameters > > > > > like say two_phase phase which can change that property. However, we > > > > > do acquire the slot, advance the slot (as per recent discussion [1]), > > > > > and release it. Since these operations can impact inactive_since, it > > > > > seems to me that inactive_since is not the same as other parameters. > > > > > It can have a different value than the primary. Why would anyone want > > > > > to know the value of inactive_since from primary after the standby is > > > > > promoted? > > > > > > > > I think it can be useful "before" it is promoted and in case the primary is down. > > > > > > > > > > It is not clear to me what is user going to do by checking the > > > inactivity time for slots when the corresponding server is down. > > > > Say a failover needs to be done, then it could be useful to know for which > > slots the activity needs to be resumed (thinking about external logical decoding > > plugin, not about pub/sub here). If one see an inactive slot (since long "enough") > > then he can start to reasonate about what to do with it. > > > > > I thought the idea was to check such slots and see if they need to be > > > dropped or enabled again to avoid excessive disk usage, etc. > > > > Yeah that's the case but it does not mean inactive_since can't be useful in other > > ways. > > > > Also, say the slot has been invalidated on the primary (due to inactivity timeout), > > primary is down and there is a failover. By keeping the inactive_since from > > the primary, one could know when the inactivity that lead to the timeout started. > > > > So, this means at promotion, we won't set the current_time for > inactive_since which is not what the currently proposed patch is > doing. Yeah, that's why I made the comment T2 in [1]. > Moreover, doing the invalidation on promoted standby based on > inactive_since of the primary node sounds debatable because the > inactive_timeout could be different on the new node (promoted > standby). I think that if the slot is not invalidated before the promotion then we should erase the value from the primary and use the promotion time. > > Again, more concerned about external logical decoding plugin than pub/sub here. > > > > > > I agree that tracking the activity time of a synced slot can be useful, why > > > > not creating a dedicated field for that purpose (and keep inactive_since a > > > > perfect "copy" of the primary)? > > > > > > > > > > We can have a separate field for this but not sure if it is worth it. > > > > OTOH I'm not sure that erasing this information from the primary is useful. I > > think that 2 fields would be the best option and would be less subject of > > misinterpretation. 
> > > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > > could be costly when the values for the slot are not going to be > > > > > updated but if that happens we can optimize such that before acquiring > > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > > to update the slot or not. > > > > > > > > Right, but for a very active slot it is likely that we call GetCurrentTimestamp() > > > > during almost each sync cycle. > > > > > > > > > > True, but if we have to save a slot to disk each time to persist the > > > changes (for an active slot) then probably GetCurrentTimestamp() > > > shouldn't be costly enough to matter. > > > > Right, persisting the changes to disk would be even more costly. > > > > The point I was making is that currently after copying the > remote_node's values, we always persist the slots to disk, so the cost > of current_time shouldn't be much. Oh right, I missed this (was focusing only on inactive_since that we don't persist to disk IIRC). BTW, If we are going this way, maybe we could accept a bit less accuracy and use GetCurrentTransactionStopTimestamp() instead? > Now, if the values won't change > then probably there is some cost but in most cases (active slots), the > values will always change. Right. > Also, if all the slots are inactive then we > will slow down the speed of sync. Yes. > We also need to consider if we want > to copy the value of inactive_since from the primary and if that is > the only value changed then shall we persist the slot or not? Good point, then I don't think we should as inactive_since is not persisted on disk. [1]: https://www.postgresql.org/message-id/ZgU70MjdOfO6l0O0%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Mon, Apr 01, 2024 at 08:47:59AM +0530, Bharath Rupireddy wrote: > On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > After thinking about it for a while now, it feels to me that the > synced slots (slots on the standby that are being synced from the > primary) can have their own inactive_sicne value. Fundamentally, > inactive_sicne is set to 0 when slot is acquired and set to current > time when slot is released, no matter who acquires and releases it - > be it walsenders for replication, or backends for slot advance, or > backends for slot sync using pg_sync_replication_slots, or backends > for other slot functions, or background sync worker. Remember the > earlier patch was updating inactive_since just for walsenders, but > then the suggestion was to update it unconditionally - > https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. > Whoever syncs the slot, *acutally* acquires the slot i.e. makes it > theirs, syncs it from the primary, and releases it. IMO, no > differentiation is to be made for synced slots. > > There was a suggestion on using inactive_since of the synced slot on > the standby to know the inactivity of the slot on the primary. If one > wants to do that, they better look at/monitor the primary slot > info/logs/pg_replication_slot/whatever. Yeah but the use case was in case the primary is down for whatever reason. > I really don't see a point in > having two different meanings for a single property of a replication > slot - inactive_since for a regular slot tells since when this slot > has become inactive, and for a synced slot since when the > corresponding remote slot has become inactive. I think this will > confuse users for sure. I'm not sure as we are speaking about "synced" slots. I can also see some confusion if this value is not "synced". > Also, if inactive_since is being changed on the primary so frequently, > and none of the other parameters are changing, if we copy > inactive_since to the synced slots, then standby will just be doing > *sync* work (mark the slots dirty and save to disk) for updating > inactive_since. I think this is unnecessary behaviour for sure. Right, I think we should avoid the save slot to disk in that case (question raised by Amit in [1]). > Coming to a future patch for inactive timeout based slot invalidation, > we can either allow invalidation without any differentiation for > synced slots or restrict invalidation to avoid more sync work. 
For > instance, if inactive timeout is kept low on the standby, the sync > worker will be doing more work as it drops and recreates a slot > repeatedly if it keeps getting invalidated. Another thing is that the > standby takes independent invalidation decisions for synced slots. > AFAICS, invalidation due to wal_removal is the only sole reason (out > of all available invalidation reasons) for a synced slot to get > invalidated independently of the primary. Check > https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com > for the suggestion on we better not differentiaing invalidation > decisions for synced slots. Yeah, I think the invalidation decision on the standby is highly linked to what inactive_since on the standby is: synced from primary or not. > The assumption of letting synced slots have their own inactive_since > not only simplifies the code, but also looks less-confusing and more > meaningful to the user. I'm not sure at all. But if the majority of us thinks it's the case then let's go that way. > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. Also maybe we could accept a bit less accuracy and use GetCurrentTransactionStopTimestamp() instead? > If we are too much concerned about the cost of GetCurrentTimestamp(), > a possible approach is just don't set inactive_since for slots being > synced on the standby. > Just let the first acquisition and release > after the promotion do that job. We can always call this out in the > docs saying "replication slots on the streaming standbys which are > being synced from the primary are not inactive in practice, so the > inactive_since is always NULL for them unless the standby is > promoted". I think that was the initial behavior that lead to Robert's remark (see [2]): " And I'm suspicious that having an exception for slots being synced is a bad idea. That makes too much of a judgement about how the user will use this field. It's usually better to just expose the data, and if the user needs helps to make sense of that data, then give them that help separately. " [1]: https://www.postgresql.org/message-id/CAA4eK1JtKieWMivbswYg5FVVB5FugCftLvQKVsxh%3Dm_8nk04vw%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CA%2BTgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Sun, Mar 31, 2024 at 10:25:46AM +0530, Bharath Rupireddy wrote: > On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > I think in this case it should always reflect the value from the primary (so > > that one can understand why it is invalidated). > > I'll come back to this as soon as we all agree on inactive_since > behavior for synced slots. Makes sense. Also if the majority of us thinks it's not needed for inactive_since to be an exact copy of the primary, then let's go that way. > > I think when it is invalidated it should always reflect the value from the > > primary (so that one can understand why it is invalidated). > > I'll come back to this as soon as we all agree on inactive_since > behavior for synced slots. Yeah. > > T4 === > > > > Also, it looks like querying pg_replication_slots() does not trigger an > > invalidation: I think it should if the slot is not invalidated yet (and matches > > the invalidation criteria). > > There's a different opinion on this, check comment #3 from > https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com. Oh right, I can see Amit's point too. Let's put pg_replication_slots() out of the game then. > > CR6 === > > > > +static bool > > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks) > > +{ > > > > InvalidatePossiblyInactiveSlot() maybe? > > I think we will lose the essence i.e. timeout from the suggested > function name, otherwise just the inactive doesn't give a clearer > meaning. I kept it that way unless anyone suggests otherwise. Right. OTOH I think that "Possibly" adds some nuance (like InvalidatePossiblyObsoleteSlot() is already doing). > Please see the attached v30 patch. 0002 is where all of the above > review comments have been addressed. Thanks! FYI, I did not look at the content yet, just replied to the above comments. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Commit message states: "why we can't just update inactive_since for > > synced slots on the standby with the value received from remote slot > > on the primary. This is consistent with any other slot parameter i.e. > > all of them are synced from the primary." > > > > The inactive_since is not consistent with other slot parameters which > > we copy. We don't perform anything related to those other parameters > > like say two_phase phase which can change that property. However, we > > do acquire the slot, advance the slot (as per recent discussion [1]), > > and release it. Since these operations can impact inactive_since, it > > seems to me that inactive_since is not the same as other parameters. > > It can have a different value than the primary. Why would anyone want > > to know the value of inactive_since from primary after the standby is > > promoted? > > After thinking about it for a while now, it feels to me that the > synced slots (slots on the standby that are being synced from the > primary) can have their own inactive_sicne value. Fundamentally, > inactive_sicne is set to 0 when slot is acquired and set to current > time when slot is released, no matter who acquires and releases it - > be it walsenders for replication, or backends for slot advance, or > backends for slot sync using pg_sync_replication_slots, or backends > for other slot functions, or background sync worker. Remember the > earlier patch was updating inactive_since just for walsenders, but > then the suggestion was to update it unconditionally - > https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com. > Whoever syncs the slot, *acutally* acquires the slot i.e. makes it > theirs, syncs it from the primary, and releases it. IMO, no > differentiation is to be made for synced slots. FWIW, coming to this thread late, I think that the inactive_since should not be synchronized from the primary. The wall clocks are different on the primary and the standby so having the primary's timestamp on the standby can confuse users, especially when there is a big clock drift. Also, as Amit mentioned, inactive_since seems not to be consistent with other parameters we copy. The replication_slot_inactive_timeout feature should work on the standby independent from the primary, like other slot invalidation mechanisms, and it should be based on its own local clock. > Coming to a future patch for inactive timeout based slot invalidation, > we can either allow invalidation without any differentiation for > synced slots or restrict invalidation to avoid more sync work. For > instance, if inactive timeout is kept low on the standby, the sync > worker will be doing more work as it drops and recreates a slot > repeatedly if it keeps getting invalidated. Another thing is that the > standby takes independent invalidation decisions for synced slots. > AFAICS, invalidation due to wal_removal is the only sole reason (out > of all available invalidation reasons) for a synced slot to get > invalidated independently of the primary. Check > https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com > for the suggestion on we better not differentiaing invalidation > decisions for synced slots. 
> > The assumption of letting synced slots have their own inactive_since > not only simplifies the code, but also looks less-confusing and more > meaningful to the user. The only code that we put in on top of the > committed code is to use InRecovery in place of > RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised > by Shveta upthread. If we want to invalidate the synced slots due to the timeout, I think we need to define what is "inactive" for synced slots. Suppose that the slotsync worker updates the local (synced) slot's inactive_since whenever releasing the slot, irrespective of the actual LSNs (or other slot parameters) having been updated. I think that this idea cannot handle a slot that is not acquired on the primary. In this case, the remote slot is inactive but the local slot is regarded as active. WAL files are piled up on the standby (and on the primary) as the slot's LSNs don't move forward. I think we want to regard such a slot as "inactive" also on the standby and invalidate it because of the timeout. > > > Now, the other concern is that calling GetCurrentTimestamp() > > could be costly when the values for the slot are not going to be > > updated but if that happens we can optimize such that before acquiring > > the slot we can have some minimal pre-checks to ensure whether we need > > to update the slot or not. If we use such pre-checks, another problem might happen; it cannot handle a case where the slot is acquired on the primary but its LSNs don't move forward. Imagine a logical replication conflict happened on the subscriber, and the logical replication enters the retry loop. In this case, the remote slot's inactive_since gets updated for every retry, but it looks inactive from the standby since the slot LSNs don't change. Therefore, only the local slot could be invalidated due to the timeout but probably we don't want to regard such a slot as "inactive". Another idea I came up with is that the slotsync worker updates the local slot's inactive_since to the local timestamp only when the remote slot might have got inactive. If the remote slot is acquired by someone, the local slot's inactive_since is also NULL. If the remote slot gets inactive, the slotsync worker sets the local timestamp to the local slot's inactive_since. Since the remote slot could be acquired and released before the slotsync worker gets the remote slot data again, if the remote slot's inactive_since > the local slot's inactive_since, the slotsync worker updates the local one. IOW, we detect whether the remote slot was acquired and released since the last synchronization, by checking the remote slot's inactive_since. This idea seems to handle these cases I mentioned unless I'm missing something, but it requires for the slotsync worker to update inactive_since in a different way than other parameters. Or a simple solution is that the slotsync worker updates inactive_since as it does for non-synced slots, and disables timeout-based slot invalidation for synced slots. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
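As a concrete illustration of the WAL-retention concern mentioned above, the amount of WAL a slot is still holding back can already be observed with existing columns and functions; the following is only a monitoring sketch, not part of the patch set (on a standby, substitute pg_last_wal_replay_lsn() for pg_current_wal_lsn()):

-- how much WAL each slot is still retaining
SELECT slot_name, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

A slot whose restart_lsn stops advancing shows up here with a growing retained_wal, regardless of what its inactive_since says.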
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > FWIW, coming to this thread late, I think that the inactive_since > should not be synchronized from the primary. The wall clocks are > different on the primary and the standby so having the primary's > timestamp on the standby can confuse users, especially when there is a > big clock drift. Also, as Amit mentioned, inactive_since seems not to > be consistent with other parameters we copy. The > replication_slot_inactive_timeout feature should work on the standby > independent from the primary, like other slot invalidation mechanisms, > and it should be based on its own local clock. Thanks for sharing your thoughts! So, it looks like that most of us agree to not sync inactive_since from the primary, I'm fine with that. > If we want to invalidate the synced slots due to the timeout, I think > we need to define what is "inactive" for synced slots. > > Suppose that the slotsync worker updates the local (synced) slot's > inactive_since whenever releasing the slot, irrespective of the actual > LSNs (or other slot parameters) having been updated. I think that this > idea cannot handle a slot that is not acquired on the primary. In this > case, the remote slot is inactive but the local slot is regarded as > active. WAL files are piled up on the standby (and on the primary) as > the slot's LSNs don't move forward. I think we want to regard such a > slot as "inactive" also on the standby and invalidate it because of > the timeout. I think that makes sense to somehow link inactive_since on the standby to the actual LSNs (or other slot parameters) being updated or not. > > > Now, the other concern is that calling GetCurrentTimestamp() > > > could be costly when the values for the slot are not going to be > > > updated but if that happens we can optimize such that before acquiring > > > the slot we can have some minimal pre-checks to ensure whether we need > > > to update the slot or not. > > If we use such pre-checks, another problem might happen; it cannot > handle a case where the slot is acquired on the primary but its LSNs > don't move forward. Imagine a logical replication conflict happened on > the subscriber, and the logical replication enters the retry loop. In > this case, the remote slot's inactive_since gets updated for every > retry, but it looks inactive from the standby since the slot LSNs > don't change. Therefore, only the local slot could be invalidated due > to the timeout but probably we don't want to regard such a slot as > "inactive". > > Another idea I came up with is that the slotsync worker updates the > local slot's inactive_since to the local timestamp only when the > remote slot might have got inactive. If the remote slot is acquired by > someone, the local slot's inactive_since is also NULL. If the remote > slot gets inactive, the slotsync worker sets the local timestamp to > the local slot's inactive_since. Since the remote slot could be > acquired and released before the slotsync worker gets the remote slot > data again, if the remote slot's inactive_since > the local slot's > inactive_since, the slotsync worker updates the local one. Then I think we would need to be careful about time zone comparison. > IOW, we > detect whether the remote slot was acquired and released since the > last synchronization, by checking the remote slot's inactive_since. 
> This idea seems to handle these cases I mentioned unless I'm missing > something, but it requires for the slotsync worker to update > inactive_since in a different way than other parameters. > > Or a simple solution is that the slotsync worker updates > inactive_since as it does for non-synced slots, and disables > timeout-based slot invalidation for synced slots. Yeah, I think the main question to help us decide is: do we want to invalidate "inactive" synced slots locally (in addition to synchronizing the invalidation from the primary)? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
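On the time zone comparison point above: timestamptz values compare on the underlying absolute instant, so the displayed zone itself is not a problem; the remaining risk is real clock drift between the two machines. A quick illustration (not from the patch):

-- both literals denote the same instant, so this returns true
SELECT '2024-04-03 09:00:00+02'::timestamptz = '2024-04-03 07:00:00+00'::timestamptz;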
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Or a simple solution is that the slotsync worker updates > > inactive_since as it does for non-synced slots, and disables > > timeout-based slot invalidation for synced slots. > > Yeah, I think the main question to help us decide is: do we want to invalidate > "inactive" synced slots locally (in addition to synchronizing the invalidation > from the primary)? I think this approach looks way simpler than the other one. The other approach of linking inactive_since on the standby for synced slots to whether the actual LSNs (or other slot parameters) are being updated looks more complicated, and might not go down well with the end user. However, we need to be able to say why we don't invalidate synced slots due to inactive timeout, unlike the wal_removed invalidation that can happen right now on the standby for synced slots. This leads us to actually define what it means for a slot to be active. Is syncing the data from the remote slot considered the slot being active? On the other hand, it may not sound great if we don't invalidate synced slots due to inactive timeout even though they hold resources such as WAL and XIDs. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Tue, Apr 02, 2024 at 12:41:35PM +0530, Bharath Rupireddy wrote: > On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > Or a simple solution is that the slotsync worker updates > > > inactive_since as it does for non-synced slots, and disables > > > timeout-based slot invalidation for synced slots. > > > > Yeah, I think the main question to help us decide is: do we want to invalidate > > "inactive" synced slots locally (in addition to synchronizing the invalidation > > from the primary)? > > I think this approach looks way simpler than the other one. The other > approach of linking inactive_since on the standby for synced slots to > the actual LSNs (or other slot parameters) being updated or not looks > more complicated, and might not go well with the end user. However, > we need to be able to say why we don't invalidate synced slots due to > inactive timeout unlike the wal_removed invalidation that can happen > right now on the standby for synced slots. This leads us to define > actually what a slot being active means. Is syncing the data from the > remote slot considered as the slot being active? > > On the other hand, it may not sound great if we don't invalidate > synced slots due to inactive timeout even though they hold resources > such as WAL and XIDs. Right and the "only" benefit then would be to give an idea as to when the last sync did occur on the local slot. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > > > FWIW, coming to this thread late, I think that the inactive_since > > should not be synchronized from the primary. The wall clocks are > > different on the primary and the standby so having the primary's > > timestamp on the standby can confuse users, especially when there is a > > big clock drift. Also, as Amit mentioned, inactive_since seems not to > > be consistent with other parameters we copy. The > > replication_slot_inactive_timeout feature should work on the standby > > independent from the primary, like other slot invalidation mechanisms, > > and it should be based on its own local clock. > > Thanks for sharing your thoughts! So, it looks like that most of us agree to not > sync inactive_since from the primary, I'm fine with that. +1 on not syncing slots from primary. > > If we want to invalidate the synced slots due to the timeout, I think > > we need to define what is "inactive" for synced slots. > > > > Suppose that the slotsync worker updates the local (synced) slot's > > inactive_since whenever releasing the slot, irrespective of the actual > > LSNs (or other slot parameters) having been updated. I think that this > > idea cannot handle a slot that is not acquired on the primary. In this > > case, the remote slot is inactive but the local slot is regarded as > > active. WAL files are piled up on the standby (and on the primary) as > > the slot's LSNs don't move forward. I think we want to regard such a > > slot as "inactive" also on the standby and invalidate it because of > > the timeout. > > I think that makes sense to somehow link inactive_since on the standby to > the actual LSNs (or other slot parameters) being updated or not. > > > > > Now, the other concern is that calling GetCurrentTimestamp() > > > > could be costly when the values for the slot are not going to be > > > > updated but if that happens we can optimize such that before acquiring > > > > the slot we can have some minimal pre-checks to ensure whether we need > > > > to update the slot or not. > > > > If we use such pre-checks, another problem might happen; it cannot > > handle a case where the slot is acquired on the primary but its LSNs > > don't move forward. Imagine a logical replication conflict happened on > > the subscriber, and the logical replication enters the retry loop. In > > this case, the remote slot's inactive_since gets updated for every > > retry, but it looks inactive from the standby since the slot LSNs > > don't change. Therefore, only the local slot could be invalidated due > > to the timeout but probably we don't want to regard such a slot as > > "inactive". > > > > Another idea I came up with is that the slotsync worker updates the > > local slot's inactive_since to the local timestamp only when the > > remote slot might have got inactive. If the remote slot is acquired by > > someone, the local slot's inactive_since is also NULL. If the remote > > slot gets inactive, the slotsync worker sets the local timestamp to > > the local slot's inactive_since. Since the remote slot could be > > acquired and released before the slotsync worker gets the remote slot > > data again, if the remote slot's inactive_since > the local slot's > > inactive_since, the slotsync worker updates the local one. > > Then I think we would need to be careful about time zone comparison. Yes. 
Also, we need to consider the case where a user is relying on pg_sync_replication_slots() and has not enabled the slot-sync worker. In such a case, if the synced slot's inactive_since is derived from the inactivity of the remote slot, it might not be updated very frequently (it depends on when the user actually runs the SQL function) and may thus be misleading. OTOH, if inactive_since of synced slots represents their own inactivity, then it will give correct info even when the SQL function is run after a long time and the slot-sync worker is disabled. > > IOW, we > > detect whether the remote slot was acquired and released since the > > last synchronization, by checking the remote slot's inactive_since. > > This idea seems to handle these cases I mentioned unless I'm missing > > something, but it requires for the slotsync worker to update > > inactive_since in a different way than other parameters. > > > > Or a simple solution is that the slotsync worker updates > > inactive_since as it does for non-synced slots, and disables > > timeout-based slot invalidation for synced slots. I like this idea better; it also takes care of the case where the user is relying on the sync function rather than the worker and does not want the slots to get invalidated between two sync function calls. > Yeah, I think the main question to help us decide is: do we want to invalidate > "inactive" synced slots locally (in addition to synchronizing the invalidation > from the primary)? thanks Shveta
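For readers following along, the manual path referred to above is simply the SQL function run on the standby; the sketch below assumes the usual slot-sync prerequisites (dbname in primary_conninfo, hot_standby_feedback, a physical slot on the primary) are already configured and that the build includes the committed inactive_since column:

-- one-off synchronization instead of the slot-sync worker
SELECT pg_sync_replication_slots();
-- then inspect the synced slots and their locally tracked inactivity
SELECT slot_name, synced, active, inactive_since FROM pg_replication_slots WHERE synced;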
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Or a simple solution is that the slotsync worker updates > > > inactive_since as it does for non-synced slots, and disables > > > timeout-based slot invalidation for synced slots. > > I like this idea better, it takes care of such a case too when the > user is relying on sync-function rather than worker and does not want > to get the slots invalidated in between 2 sync function calls. Please find the attached v31 patches implementing the above idea: - synced slots get their own inactive_since just like any other slot - synced slots don't get invalidated due to inactive timeout because such slots are not considered active at all, as they don't perform logical decoding (of course, they will perform it in fast_forward mode to fix the other data loss issue, but they don't generate changes for them to be called *active* slots) - a synced slot's inactive_since is set to the current timestamp after the standby gets promoted, to help interpret inactive_since correctly just like for any other slot. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
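One rough way to observe the third bullet above on a test setup (a sketch only; it assumes a throwaway standby you can promote and a build with these patches applied):

-- on the standby, note the synced slots' inactive_since before promotion
SELECT slot_name, inactive_since FROM pg_replication_slots WHERE synced;
SELECT pg_promote();
-- shortly afterwards, inactive_since should have moved to the promotion time
SELECT slot_name, inactive_since FROM pg_replication_slots WHERE synced;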
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: Thanks! Some comments related to v31-0001: === testing the behavior T1 === > - synced slots get their on inactive_since just like any other slot It behaves as described. T2 === > - synced slots inactive_since is set to current timestamp after the > standby gets promoted to help inactive_since interpret correctly just > like any other slot. It behaves as described. CR1 === + <structfield>inactive_since</structfield> value will get updated + after every synchronization indicates the last synchronization time? (I think that after every synchronization could lead to confusion). CR2 === + /* + * Set the time since the slot has become inactive after shutting + * down slot sync machinery. This helps correctly interpret the + * time if the standby gets promoted without a restart. + */ It looks to me that this comment is not at the right place because there is nothing after the comment that indicates that we shutdown the "slot sync machinery". Maybe a better place is before the function definition and mention that this is currently called when we shutdown the "slot sync machinery"? CR3 === + * We get the current time beforehand and only once to avoid + * system calls overhead while holding the lock. s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? CR4 === + * Set the time since the slot has become inactive. We get the current + * time beforehand to avoid system call overhead while holding the lock Same. CR5 === + # Check that the captured time is sane + if (defined $reference_time) + { s/Check that the captured time is sane/Check that the inactive_since is sane/? Sorry if some of those comments could have been done while I did review v29-0001. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: Thanks! Some comments regarding v31-0002: === testing the behavior T1 === > - synced slots don't get invalidated due to inactive timeout because > such slots not considered active at all as they don't perform logical > decoding (of course, they will perform in fast_forward mode to fix the > other data loss issue, but they don't generate changes for them to be > called as *active* slots) It behaves as described. OTOH non synced logical slots on the standby and physical slots on the standby are invalidated which is what is expected. T2 === In case the slot is invalidated on the primary, primary: postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; slot_name | inactive_since | invalidation_reason -----------+-------------------------------+--------------------- s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout then on the standby we get: standby: postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; slot_name | inactive_since | invalidation_reason -----------+------------------------------+--------------------- s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout shouldn't the slot be dropped/recreated instead of updating inactive_since? === code CR1 === + Invalidates replication slots that are inactive for longer the + specified amount of time s/for longer the/for longer that/? CR2 === + <literal>true</literal>) as such synced slots don't actually perform + logical decoding. We're switching in fast forward logical due to [1], so I'm not sure that's 100% accurate here. I'm not sure we need to specify a reason. CR3 === + errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeoutparameter."))); I think we can remove "parameter" (see for example the error message in validate_remote_info()) and reduce it a bit, something like? "This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"? CR4 === + appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeoutparameter.")); Same. CR5 === + /* + * This function isn't expected to be called for inactive timeout based + * invalidation. A separate function InvalidateInactiveReplicationSlot is + * to be used for that. Do you think it's worth to explain why? CR6 === + if (replication_slot_inactive_timeout == 0) + return false; + else if (slot->inactive_since > 0) "else" is not needed here. CR7 === + SpinLockAcquire(&slot->mutex); + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. We do this with the spinlock + * held to avoid race conditions -- for example the inactive_since + * could change, or the slot could be dropped. 
+ */ + now = GetCurrentTimestamp(); We should not call GetCurrentTimestamp() while holding a spinlock. CR8 === +# Testcase start: Invalidate streaming standby's slot as well as logical +# failover slot on primary due to inactive timeout GUC. Also, check the logical s/inactive timeout GUC/replication_slot_inactive_timeout/? CR9 === +# Start: Helper functions used for this test file +# End: Helper functions used for this test file I think that's the first TAP test with this comment. Not saying we should not but why did you feel the need to add those? [1]: https://www.postgresql.org/message-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > > > > timeout-based slot invalidation for synced slots. > > > > I like this idea better, it takes care of such a case too when the > > user is relying on sync-function rather than worker and does not want > > to get the slots invalidated in between 2 sync function calls. > > Please find the attached v31 patches implementing the above idea: > Thanks for the patches, please find few comments: v31-001: 1) system-views.sgml: value will get updated after every synchronization from the corresponding remote slot on the primary. --This is confusing. It will be good to rephrase it. 2) update_synced_slots_inactive_since() --May be, we should mention in the header that this function is called only during promotion. 3) 040_standby_failover_slots_sync.pl: We capture inactive_since_on_primary when we do this for the first time at #175 ALTER SUBSCRIPTION regress_mysub1 DISABLE" But we again recreate the sub and disable it at line #280. Do you think we shall get inactive_since_on_primary again here, to be compared with inactive_since_on_new_primary later? v31-002: (I had reviewed v29-002 but missed to post comments, I think these are still applicable) 1) I think replication_slot_inactivity_timeout was recommended here (instead of replication_slot_inactive_timeout, so please give it a thought): https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql 2) Commit msg: a) "It is often easy for developers to set a timeout of say 1 or 2 or 3 days at slot level, after which the inactive slots get dropped." Shall we say invalidated rather than dropped? b) "To achieve the above, postgres introduces a GUC allowing users set inactive timeout and then a slot stays inactive for this much amount of time it invalidates the slot." Broken sentence. <have not reviewed 002 patch in detail yet> thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Or a simple solution is that the slotsync worker updates > > > > > inactive_since as it does for non-synced slots, and disables > > > > > timeout-based slot invalidation for synced slots. > > > > > > I like this idea better, it takes care of such a case too when the > > > user is relying on sync-function rather than worker and does not want > > > to get the slots invalidated in between 2 sync function calls. > > > > Please find the attached v31 patches implementing the above idea: > > Thanks! > > Some comments related to v31-0001: > > === testing the behavior > > T1 === > > > - synced slots get their on inactive_since just like any other slot > > It behaves as described. > > T2 === > > > - synced slots inactive_since is set to current timestamp after the > > standby gets promoted to help inactive_since interpret correctly just > > like any other slot. > > It behaves as described. > > CR1 === > > + <structfield>inactive_since</structfield> value will get updated > + after every synchronization > > indicates the last synchronization time? (I think that after every synchronization > could lead to confusion). > +1. > CR2 === > > + /* > + * Set the time since the slot has become inactive after shutting > + * down slot sync machinery. This helps correctly interpret the > + * time if the standby gets promoted without a restart. > + */ > > It looks to me that this comment is not at the right place because there is > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > Maybe a better place is before the function definition and mention that this is > currently called when we shutdown the "slot sync machinery"? > Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we have some existing issues where we don't ensure that no one is running sync API before shutdown is complete but I think we can deal with that separately and here we can still have an Assert. > CR3 === > > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? > Is it valid to say that there is overhead of this call while holding spinlock? Because I don't think at the time of promotion we expect any other concurrent slot activity. The first reason seems good enough. One other observation: --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -42,6 +42,7 @@ #include "access/transam.h" #include "access/xlog_internal.h" #include "access/xlogrecovery.h" +#include "access/xlogutils.h" Is there a reason for this inclusion? I don't see any change which should need this one. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Or a simple solution is that the slotsync worker updates > > > > > inactive_since as it does for non-synced slots, and disables > > > > > timeout-based slot invalidation for synced slots. > > > > > > I like this idea better, it takes care of such a case too when the > > > user is relying on sync-function rather than worker and does not want > > > to get the slots invalidated in between 2 sync function calls. > > > > Please find the attached v31 patches implementing the above idea: > > > > Thanks for the patches, please find few comments: > > v31-001: > > 1) > system-views.sgml: > value will get updated after every synchronization from the > corresponding remote slot on the primary. > > --This is confusing. It will be good to rephrase it. > > 2) > update_synced_slots_inactive_since() > > --May be, we should mention in the header that this function is called > only during promotion. > > 3) 040_standby_failover_slots_sync.pl: > We capture inactive_since_on_primary when we do this for the first time at #175 > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > But we again recreate the sub and disable it at line #280. > Do you think we shall get inactive_since_on_primary again here, to be > compared with inactive_since_on_new_primary later? > I think so. Few additional comments on tests: 1. +is( $standby1->safe_psql( + 'postgres', + "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" Shall we do <= check as we are doing in the main function get_slot_inactive_since_value as the time duration is less so it can be the same as well? Similarly, please check other tests. 2. +=item $node->get_slot_inactive_since_value(self, slot_name, reference_time) + +Get inactive_since column value for a given replication slot validating it +against optional reference time. + +=cut + +sub get_slot_inactive_since_value I see that all callers validate against reference time. It is better to name it validate_slot_inactive_since rather than using get_* as the main purpose is to validate the passed value. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Please find the attached v31 patches implementing the above idea: > > Some comments related to v31-0001: > > === testing the behavior > > T1 === > > > - synced slots get their on inactive_since just like any other slot > > It behaves as described. > > T2 === > > > - synced slots inactive_since is set to current timestamp after the > > standby gets promoted to help inactive_since interpret correctly just > > like any other slot. > > It behaves as described. Thanks for testing. > CR1 === > > + <structfield>inactive_since</structfield> value will get updated > + after every synchronization > > indicates the last synchronization time? (I think that after every synchronization > could lead to confusion). Done. > CR2 === > > + /* > + * Set the time since the slot has become inactive after shutting > + * down slot sync machinery. This helps correctly interpret the > + * time if the standby gets promoted without a restart. > + */ > > It looks to me that this comment is not at the right place because there is > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > Maybe a better place is before the function definition and mention that this is > currently called when we shutdown the "slot sync machinery"? Done. > CR3 === > > + * We get the current time beforehand and only once to avoid > + * system calls overhead while holding the lock. > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? Done. > CR4 === > > + * Set the time since the slot has become inactive. We get the current > + * time beforehand to avoid system call overhead while holding the lock > > Same. Done. > CR5 === > > + # Check that the captured time is sane > + if (defined $reference_time) > + { > > s/Check that the captured time is sane/Check that the inactive_since is sane/? > > Sorry if some of those comments could have been done while I did review v29-0001. Done. On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patches, please find few comments: > > v31-001: > > 1) > system-views.sgml: > value will get updated after every synchronization from the > corresponding remote slot on the primary. > > --This is confusing. It will be good to rephrase it. Done as per Bertrand's suggestion. > 2) > update_synced_slots_inactive_since() > > --May be, we should mention in the header that this function is called > only during promotion. Done as per Bertrand's suggestion. > 3) 040_standby_failover_slots_sync.pl: > We capture inactive_since_on_primary when we do this for the first time at #175 > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > But we again recreate the sub and disable it at line #280. > Do you think we shall get inactive_since_on_primary again here, to be > compared with inactive_since_on_new_primary later? Hm. Done that. Recapturing both slot_creation_time_on_primary and inactive_since_on_primary before and after CREATE SUBSCRIPTION creates the slot again on the primary/publisher. On Wed, Apr 3, 2024 at 3:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > CR2 === > > > > + /* > > + * Set the time since the slot has become inactive after shutting > > + * down slot sync machinery. This helps correctly interpret the > > + * time if the standby gets promoted without a restart. 
> > + */ > > > > It looks to me that this comment is not at the right place because there is > > nothing after the comment that indicates that we shutdown the "slot sync machinery". > > > > Maybe a better place is before the function definition and mention that this is > > currently called when we shutdown the "slot sync machinery"? > > > Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we > have some existing issues where we don't ensure that no one is running > sync API before shutdown is complete but I think we can deal with that > separately and here we can still have an Assert. That can work to ensure the slot sync worker isn't running as SlotSyncCtx->pid gets updated only for the slot sync worker. I added this assertion for now. We need to ensure (in a separate patch and thread) there is no backend acquiring it and performing sync while the slot sync worker is shutting down. Otherwise, some of the slots can get resynced and some are not while we are shutting down the slot sync worker as part of the standby promotion which might leave the slots in an inconsistent state. > > CR3 === > > > > + * We get the current time beforehand and only once to avoid > > + * system calls overhead while holding the lock. > > > > s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/? > > > Is it valid to say that there is overhead of this call while holding > spinlock? Because I don't think at the time of promotion we expect any > other concurrent slot activity. The first reason seems good enough. No slot activity but why GetCurrentTimestamp needs to be called every time in a loop. > One other observation: > --- a/src/backend/replication/slot.c > +++ b/src/backend/replication/slot.c > @@ -42,6 +42,7 @@ > #include "access/transam.h" > #include "access/xlog_internal.h" > #include "access/xlogrecovery.h" > +#include "access/xlogutils.h" > > Is there a reason for this inclusion? I don't see any change which > should need this one. Not anymore. It was earlier needed for using the InRecovery flag in the then approach. On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > 3) 040_standby_failover_slots_sync.pl: > > We capture inactive_since_on_primary when we do this for the first time at #175 > > ALTER SUBSCRIPTION regress_mysub1 DISABLE" > > > > But we again recreate the sub and disable it at line #280. > > Do you think we shall get inactive_since_on_primary again here, to be > > compared with inactive_since_on_new_primary later? > > > > I think so. Modified this to recapture the times before and after the slot gets recreated. > Few additional comments on tests: > 1. > +is( $standby1->safe_psql( > + 'postgres', > + "SELECT '$inactive_since_on_primary'::timestamptz < > '$inactive_since_on_standby'::timestamptz AND > + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" > > Shall we do <= check as we are doing in the main function > get_slot_inactive_since_value as the time duration is less so it can > be the same as well? Similarly, please check other tests. I get you. If the tests are so fast that losing a bit of precision might cause tests to fail. So, I'v added equality check for all the tests. > 2. > +=item $node->get_slot_inactive_since_value(self, slot_name, reference_time) > + > +Get inactive_since column value for a given replication slot validating it > +against optional reference time. 
> + > +=cut > + > +sub get_slot_inactive_since_value > > I see that all callers validate against reference time. It is better > to name it validate_slot_inactive_since rather than using get_* as the > main purpose is to validate the passed value. Existing callers yes. Also, I've removed the reference time as an optional parameter. Per an offlist chat with Amit, I've added the following note in synchronize_one_slot: @@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) * overwriting 'invalidated' flag to remote_slot's value. See * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly * if the slot is not acquired by other processes. + * + * XXX: If it ever turns out that slot acquire/release is costly for + * cases when none of the slot property is changed then we can do a + * pre-check to ensure that at least one of the slot property is + * changed before acquiring the slot. */ ReplicationSlotAcquire(remote_slot->name, true); Please find the attached v32-0001 patch with the above review comments addressed. I'm working on review comments for 0002. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 05:12:12PM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz < > > '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;" > > > > Shall we do <= check as we are doing in the main function > > get_slot_inactive_since_value as the time duration is less so it can > > be the same as well? Similarly, please check other tests. > > I get you. If the tests are so fast that losing a bit of precision > might cause tests to fail. So, I'v added equality check for all the > tests. > Please find the attached v32-0001 patch with the above review comments > addressed. Thanks! Just one comment on v32-0001: +# Synced slot on the standby must get its own inactive_since. +is( $standby1->safe_psql( + 'postgres', + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" + ), + "t", + 'synchronized slot has got its own inactive_since'); + By using <= we are not testing that it must get its own inactive_since (as we allow them to be equal in the test). I think we should just add some usleep() where appropriate and deny equality during the tests on inactive_since. Except for the above, v32-0001 LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Just one comment on v32-0001: > > +# Synced slot on the standby must get its own inactive_since. > +is( $standby1->safe_psql( > + 'postgres', > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > + ), > + "t", > + 'synchronized slot has got its own inactive_since'); > + > > By using <= we are not testing that it must get its own inactive_since (as we > allow them to be equal in the test). I think we should just add some usleep() > where appropriate and deny equality during the tests on inactive_since. Thanks. It looks like we can ignore the equality in all of the inactive_since comparisons. IIUC, all the TAP tests do run with primary and standbys on the single BF animals. And, it looks like assigning the inactive_since timestamps to perl variables is giving the microseconds precision level (./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since 2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL tests relying on stats_reset timestamps without equality. So, I've left the equality for the inactive_since tests. > Except for the above, v32-0001 LGTM. Thanks. Please see the attached v33-0001 patch after removing equality on inactive_since TAP tests. On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Some comments regarding v31-0002: > > === testing the behavior > > T1 === > > > - synced slots don't get invalidated due to inactive timeout because > > such slots not considered active at all as they don't perform logical > > decoding (of course, they will perform in fast_forward mode to fix the > > other data loss issue, but they don't generate changes for them to be > > called as *active* slots) > > It behaves as described. OTOH non synced logical slots on the standby and > physical slots on the standby are invalidated which is what is expected. Right. > T2 === > > In case the slot is invalidated on the primary, > > primary: > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > slot_name | inactive_since | invalidation_reason > -----------+-------------------------------+--------------------- > s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout > > then on the standby we get: > > standby: > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > slot_name | inactive_since | invalidation_reason > -----------+------------------------------+--------------------- > s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout > > shouldn't the slot be dropped/recreated instead of updating inactive_since? The sync slots that are invalidated on the primary aren't dropped and recreated on the standby. There's no point in doing so because invalidated slots on the primary can't be made useful. However, I found that the synced slot is acquired and released unnecessarily after the invalidation_reason is synced from the primary. I added a skip check in synchronize_one_slot to skip acquiring and releasing the slot if it's locally found inactive. With this, inactive_since won't get updated for invalidated sync slots on the standby as we don't acquire and release the slot. 
> === code > > CR1 === > > + Invalidates replication slots that are inactive for longer the > + specified amount of time > > s/for longer the/for longer that/? Fixed. > CR2 === > > + <literal>true</literal>) as such synced slots don't actually perform > + logical decoding. > > We're switching in fast forward logical due to [1], so I'm not sure that's 100% > accurate here. I'm not sure we need to specify a reason. Fixed. > CR3 === > > + errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeoutparameter."))); > > I think we can remove "parameter" (see for example the error message in > validate_remote_info()) and reduce it a bit, something like? > > "This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"? Done. > CR4 === > > + appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeoutparameter.")); > > Same. Done. Changed it to "The slot has been inactive for more than replication_slot_inactive_timeout." > CR5 === > > + /* > + * This function isn't expected to be called for inactive timeout based > + * invalidation. A separate function InvalidateInactiveReplicationSlot is > + * to be used for that. > > Do you think it's worth to explain why? Hm, I just wanted to point out the actual function here. I modified it to something like the following, if others feel we don't need that, I can remove it. /* * Use InvalidateInactiveReplicationSlot for inactive timeout based * invalidation. */ > CR6 === > > + if (replication_slot_inactive_timeout == 0) > + return false; > + else if (slot->inactive_since > 0) > > "else" is not needed here. Nothing wrong there, but removed. > CR7 === > > + SpinLockAcquire(&slot->mutex); > + > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. We do this with the spinlock > + * held to avoid race conditions -- for example the inactive_since > + * could change, or the slot could be dropped. > + */ > + now = GetCurrentTimestamp(); > > We should not call GetCurrentTimestamp() while holding a spinlock. I was thinking why to add up the wait time to acquire LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);. Now that I moved it up before the spinlock but after the LWLockAcquire. > CR8 === > > +# Testcase start: Invalidate streaming standby's slot as well as logical > +# failover slot on primary due to inactive timeout GUC. Also, check the logical > > s/inactive timeout GUC/replication_slot_inactive_timeout/? Done. > CR9 === > > +# Start: Helper functions used for this test file > +# End: Helper functions used for this test file > > I think that's the first TAP test with this comment. Not saying we should not but > why did you feel the need to add those? Hm. Removed. > [1]: https://www.postgresql.org/message-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > v31-002: > (I had reviewed v29-002 but missed to post comments, I think these > are still applicable) > > 1) I think replication_slot_inactivity_timeout was recommended here > (instead of replication_slot_inactive_timeout, so please give it a > thought): > https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql Yeah. It's synonymous with inactive_since. 
If others have an opinion to have replication_slot_inactivity_timeout, I'm fine with it. > 2) Commit msg: > a) > "It is often easy for developers to set a timeout of say 1 > or 2 or 3 days at slot level, after which the inactive slots get > dropped." > > Shall we say invalidated rather than dropped? Right. Done that. > b) > "To achieve the above, postgres introduces a GUC allowing users > set inactive timeout and then a slot stays inactive for this much > amount of time it invalidates the slot." > > Broken sentence. Reworded it a bit. Please find the attached v33 patches. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
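To make the 0002 behavior concrete, a minimal usage sketch follows; the GUC name and whether it is reloadable are as proposed in the patch and still under discussion, so treat this as illustrative only:

-- proposed in 0002; name may still change (e.g. replication_slot_inactivity_timeout)
ALTER SYSTEM SET replication_slot_inactive_timeout = '1d';
SELECT pg_reload_conf();
-- later, see which slots the checkpointer invalidated and why
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'inactive_timeout';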
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Wed, Apr 03, 2024 at 08:28:04PM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Just one comment on v32-0001: > > > > +# Synced slot on the standby must get its own inactive_since. > > +is( $standby1->safe_psql( > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > > + ), > > + "t", > > + 'synchronized slot has got its own inactive_since'); > > + > > > > By using <= we are not testing that it must get its own inactive_since (as we > > allow them to be equal in the test). I think we should just add some usleep() > > where appropriate and deny equality during the tests on inactive_since. > > > Except for the above, v32-0001 LGTM. > > Thanks. Please see the attached v33-0001 patch after removing equality > on inactive_since TAP tests. Thanks! v33-0001 LGTM. > On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Some comments regarding v31-0002: > > > > T2 === > > > > In case the slot is invalidated on the primary, > > > > primary: > > > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > > slot_name | inactive_since | invalidation_reason > > -----------+-------------------------------+--------------------- > > s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout > > > > then on the standby we get: > > > > standby: > > > > postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1'; > > slot_name | inactive_since | invalidation_reason > > -----------+------------------------------+--------------------- > > s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout > > > > shouldn't the slot be dropped/recreated instead of updating inactive_since? > > The sync slots that are invalidated on the primary aren't dropped and > recreated on the standby. Yeah, right (I was confused with synced slots that are invalidated locally). > However, I > found that the synced slot is acquired and released unnecessarily > after the invalidation_reason is synced from the primary. I added a > skip check in synchronize_one_slot to skip acquiring and releasing the > slot if it's locally found inactive. With this, inactive_since won't > get updated for invalidated sync slots on the standby as we don't > acquire and release the slot. CR1 === Yeah, I can see: @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) " name slot \"%s\" already exists on the standby", remote_slot->name)); + /* + * Skip the sync if the local slot is already invalidated. We do this + * beforehand to save on slot acquire and release. + */ + if (slot->data.invalidated != RS_INVAL_NONE) + return false; Thanks to the drop_local_obsolete_slots() call I think we are not missing the case where the slot has been invalidated on the primary, invalidation reason has been synced on the standby and later the slot is dropped/ recreated manually on the primary (then it should be dropped/recreated on the standby too). Also it seems we are not missing the case where a sync slot is invalidated locally due to wal removal (it should be dropped/recreated). > > > CR5 === > > > > + /* > > + * This function isn't expected to be called for inactive timeout based > > + * invalidation. 
A separate function InvalidateInactiveReplicationSlot is > > + * to be used for that. > > > > Do you think it's worth to explain why? > > Hm, I just wanted to point out the actual function here. I modified it > to something like the following, if others feel we don't need that, I > can remove it. Sorry If I was not clear but I meant to say "Do you think it's worth to explain why we decided to create a dedicated function"? (currently we "just" explain that we created one). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Wed, Apr 3, 2024 at 11:58 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the attached v33 patches. @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) if (SlotSyncCtx->pid == InvalidPid) { SpinLockRelease(&SlotSyncCtx->mutex); + update_synced_slots_inactive_since(); return; } SpinLockRelease(&SlotSyncCtx->mutex); @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) } SpinLockRelease(&SlotSyncCtx->mutex); + + update_synced_slots_inactive_since(); } Why do we want to update all synced slots' inactive_since values at shutdown in spite of updating the value every time when releasing the slot? It seems to contradict the fact that inactive_since is updated when releasing or restoring the slot. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > if (SlotSyncCtx->pid == InvalidPid) > { > SpinLockRelease(&SlotSyncCtx->mutex); > + update_synced_slots_inactive_since(); > return; > } > SpinLockRelease(&SlotSyncCtx->mutex); > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > } > > SpinLockRelease(&SlotSyncCtx->mutex); > + > + update_synced_slots_inactive_since(); > } > > Why do we want to update all synced slots' inactive_since values at > shutdown in spite of updating the value every time when releasing the > slot? It seems to contradict the fact that inactive_since is updated > when releasing or restoring the slot. It is to get inactive_since right for the case where the standby is promoted without a restart, similar to when a standby is promoted with a restart, in which case inactive_since is set to the current time in RestoreSlotFromDisk. Imagine the slot was last synced at time t1 and then a few hours passed before the standby is promoted without a restart. If we don't set inactive_since to the current time in ShutDownSlotSync in this case, the inactive timeout invalidation mechanism can kick in immediately. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
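To see why a stale inactive_since matters here, this is essentially the value the timeout check compares against (illustrative query only): if inactive_since still carried the last sync time from hours before the promotion, inactive_for would already exceed a small timeout the moment the invalidation check runs.

SELECT slot_name, inactive_since, now() - inactive_since AS inactive_for
FROM pg_replication_slots
WHERE NOT active;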
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Apr 3, 2024 at 8:28 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Just one comment on v32-0001: > > > > +# Synced slot on the standby must get its own inactive_since. > > +is( $standby1->safe_psql( > > + 'postgres', > > + "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND > > + '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;" > > + ), > > + "t", > > + 'synchronized slot has got its own inactive_since'); > > + > > > > By using <= we are not testing that it must get its own inactive_since (as we > > allow them to be equal in the test). I think we should just add some usleep() > > where appropriate and deny equality during the tests on inactive_since. > > Thanks. It looks like we can ignore the equality in all of the > inactive_since comparisons. IIUC, all the TAP tests do run with > primary and standbys on the single BF animals. And, it looks like > assigning the inactive_since timestamps to perl variables is giving > the microseconds precision level > (./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since > 2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL > tests relying on stats_reset timestamps without equality. So, I've > left the equality for the inactive_since tests. > > > Except for the above, v32-0001 LGTM. > > Thanks. Please see the attached v33-0001 patch after removing equality > on inactive_since TAP tests. > The v33-0001 looks good to me. I have made minor changes in the comments/commit message and removed one part of the test which was a bit confusing and didn't seem to add much value. Let me know what you think of the attached? -- With Regards, Amit Kapila.
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > The v33-0001 looks good to me. I have made minor changes in the > comments/commit message and removed one part of the test which was a > bit confusing and didn't seem to add much value. Let me know what you > think of the attached? Thanks for the changes. v34-0001 LGTM. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > if (SlotSyncCtx->pid == InvalidPid) > > { > > SpinLockRelease(&SlotSyncCtx->mutex); > > + update_synced_slots_inactive_since(); > > return; > > } > > SpinLockRelease(&SlotSyncCtx->mutex); > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > } > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > + > > + update_synced_slots_inactive_since(); > > } > > > > Why do we want to update all synced slots' inactive_since values at > > shutdown in spite of updating the value every time when releasing the > > slot? It seems to contradict the fact that inactive_since is updated > > when releasing or restoring the slot. > > It is to get the inactive_since right for the cases where the standby > is promoted without a restart similar to when a standby is promoted > with restart in which case the inactive_since is set to current time > in RestoreSlotFromDisk. > > Imagine the slot is synced last time at time t1 and then a few hours > passed, the standby is promoted without a restart. If we don't set > inactive_since to current time in this case in ShutDownSlotSync, the > inactive timeout invalidation mechanism can kick in immediately. > Thank you for the explanation! I understood the needs. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > > if (SlotSyncCtx->pid == InvalidPid) > > > { > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > + update_synced_slots_inactive_since(); > > > return; > > > } > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > > } > > > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > + > > > + update_synced_slots_inactive_since(); > > > } > > > > > > Why do we want to update all synced slots' inactive_since values at > > > shutdown in spite of updating the value every time when releasing the > > > slot? It seems to contradict the fact that inactive_since is updated > > > when releasing or restoring the slot. > > > > It is to get the inactive_since right for the cases where the standby > > is promoted without a restart similar to when a standby is promoted > > with restart in which case the inactive_since is set to current time > > in RestoreSlotFromDisk. > > > > Imagine the slot is synced last time at time t1 and then a few hours > > passed, the standby is promoted without a restart. If we don't set > > inactive_since to current time in this case in ShutDownSlotSync, the > > inactive timeout invalidation mechanism can kick in immediately. > > > > Thank you for the explanation! I understood the needs. > Do you want to review the v34_0001* further or shall I proceed with the commit of the same? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Thu, Apr 4, 2024 at 5:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void) > > > > if (SlotSyncCtx->pid == InvalidPid) > > > > { > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > + update_synced_slots_inactive_since(); > > > > return; > > > > } > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void) > > > > } > > > > > > > > SpinLockRelease(&SlotSyncCtx->mutex); > > > > + > > > > + update_synced_slots_inactive_since(); > > > > } > > > > > > > > Why do we want to update all synced slots' inactive_since values at > > > > shutdown in spite of updating the value every time when releasing the > > > > slot? It seems to contradict the fact that inactive_since is updated > > > > when releasing or restoring the slot. > > > > > > It is to get the inactive_since right for the cases where the standby > > > is promoted without a restart similar to when a standby is promoted > > > with restart in which case the inactive_since is set to current time > > > in RestoreSlotFromDisk. > > > > > > Imagine the slot is synced last time at time t1 and then a few hours > > > passed, the standby is promoted without a restart. If we don't set > > > inactive_since to current time in this case in ShutDownSlotSync, the > > > inactive timeout invalidation mechanism can kick in immediately. > > > > > > > Thank you for the explanation! I understood the needs. > > > > Do you want to review the v34_0001* further or shall I proceed with > the commit of the same? Thanks for asking. The v34-0001 patch looks good to me. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 4, 2024 at 11:12 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > The v33-0001 looks good to me. I have made minor changes in the > > comments/commit message and removed one part of the test which was a > > bit confusing and didn't seem to add much value. Let me know what you > > think of the attached? > > Thanks for the changes. v34-0001 LGTM. > I was doing a final review before pushing 0001 and found that 'inactive_since' could be set twice during startup after promotion, once while restoring slots and then via ShutDownSlotSync(). The reason is that ShutDownSlotSync() will be invoked in normal startup on primary though it won't do anything apart from setting inactive_since if we have synced slots. I think you need to check 'StandbyMode' in update_synced_slots_inactive_since() and return if the same is not set. We can't use 'InRecovery' flag as that will be set even during crash recovery. Can you please test this once unless you don't agree with the above theory? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Thanks for the changes. v34-0001 LGTM. > > I was doing a final review before pushing 0001 and found that > 'inactive_since' could be set twice during startup after promotion, > once while restoring slots and then via ShutDownSlotSync(). The reason > is that ShutDownSlotSync() will be invoked in normal startup on > primary though it won't do anything apart from setting inactive_since > if we have synced slots. I think you need to check 'StandbyMode' in > update_synced_slots_inactive_since() and return if the same is not > set. We can't use 'InRecovery' flag as that will be set even during > crash recovery. > > Can you please test this once unless you don't agree with the above theory? Nice catch. I've verified that update_synced_slots_inactive_since is called even for normal server startups/crash recovery. I've added a check to exit if the StandbyMode isn't set. Please find the attached v35 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
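For anyone skimming, the v35 change amounts to an early return; a minimal sketch, assuming the helper sketched earlier and that StandbyMode is the existing recovery flag, is:

static void
update_synced_slots_inactive_since(void)
{
    /*
     * ShutDownSlotSync() also runs during a normal startup or crash
     * recovery on the primary, where there are no synced slots to stamp,
     * so bail out unless the server was actually in standby mode.
     * (Sketch of the v35 guard, not the literal patch text.)
     */
    if (!StandbyMode)
        return;

    /* ... stamp inactive_since on synced slots as sketched earlier ... */
}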
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after promotion, > > once while restoring slots and then via ShutDownSlotSync(). The reason > > is that ShutDownSlotSync() will be invoked in normal startup on > > primary though it won't do anything apart from setting inactive_since > > if we have synced slots. I think you need to check 'StandbyMode' in > > update_synced_slots_inactive_since() and return if the same is not > > set. We can't use 'InRecovery' flag as that will be set even during > > crash recovery. > > > > Can you please test this once unless you don't agree with the above theory? > > Nice catch. I've verified that update_synced_slots_inactive_since is > called even for normal server startups/crash recovery. I've added a > check to exit if the StandbyMode isn't set. > > Please find the attached v35 patch. Thanks for the patch. Tested it , works well. Few cosmetic changes needed: in 040 test file: 1) # Capture the inactive_since of the slot from the primary. Note that the slot # will be inactive since the corresponding subscription is disabled.. 2 .. at the end. Replace with one. 2) # Synced slot on the standby must get its own inactive_since. . not needed in single line comment (to be consistent with neighbouring comments) 3) update_synced_slots_inactive_since(): if (!StandbyMode) return; It will be good to add comments here. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > > shouldn't the slot be dropped/recreated instead of updating inactive_since? > > > > The sync slots that are invalidated on the primary aren't dropped and > > recreated on the standby. > > Yeah, right (I was confused with synced slots that are invalidated locally). > > > However, I > > found that the synced slot is acquired and released unnecessarily > > after the invalidation_reason is synced from the primary. I added a > > skip check in synchronize_one_slot to skip acquiring and releasing the > > slot if it's locally found inactive. With this, inactive_since won't > > get updated for invalidated sync slots on the standby as we don't > > acquire and release the slot. > > CR1 === > > Yeah, I can see: > > @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) > " name slot \"%s\" already exists on the standby", > remote_slot->name)); > > + /* > + * Skip the sync if the local slot is already invalidated. We do this > + * beforehand to save on slot acquire and release. > + */ > + if (slot->data.invalidated != RS_INVAL_NONE) > + return false; > > Thanks to the drop_local_obsolete_slots() call I think we are not missing the case > where the slot has been invalidated on the primary, invalidation reason has been > synced on the standby and later the slot is dropped/ recreated manually on the > primary (then it should be dropped/recreated on the standby too). > > Also it seems we are not missing the case where a sync slot is invalidated > locally due to wal removal (it should be dropped/recreated). Right. > > > CR5 === > > > > > > + /* > > > + * This function isn't expected to be called for inactive timeout based > > > + * invalidation. A separate function InvalidateInactiveReplicationSlot is > > > + * to be used for that. > > > > > > Do you think it's worth to explain why? > > > > Hm, I just wanted to point out the actual function here. I modified it > > to something like the following, if others feel we don't need that, I > > can remove it. > > Sorry If I was not clear but I meant to say "Do you think it's worth to explain > why we decided to create a dedicated function"? (currently we "just" explain that > we created one). We added a new function (InvalidateInactiveReplicationSlot) to invalidate slot based on inactive timeout because 1) we do the inactive timeout invalidation at slot level as opposed to InvalidateObsoleteReplicationSlots which does loop over all the slots, 2) InvalidatePossiblyObsoleteSlot does release the lock in some cases, has a lot of unneeded code for inactive timeout invalidation check, 3) we want some control over saving the slot to disk because we hook the inactive timeout invalidation into the loop that checkpoints the slot info to the disk in CheckPointReplicationSlots. I've added a comment atop InvalidateInactiveReplicationSlot. Please find the attached v36 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
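As an outline of point 3 above (riding along with the checkpoint-time save loop), the shape being described is roughly the following. InvalidateInactiveReplicationSlot() is the name used in this discussion; the rest is paraphrased from CheckPointReplicationSlots() and is a sketch, not the patch itself.

/* Inside CheckPointReplicationSlots() -- outline only */
LWLockAcquire(ReplicationSlotAllocationLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
    ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
    char        path[MAXPGPATH];

    if (!s->in_use)
        continue;

    /*
     * Run the inactive-timeout check first so that, if the slot gets
     * invalidated, the change is persisted by the very save the
     * checkpoint performs anyway (no second write of the slot file).
     */
    InvalidateInactiveReplicationSlot(s);

    sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
    SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);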
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bertrand Drouvot
Date:
Hi, On Fri, Apr 05, 2024 at 11:21:43AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > Please find the attached v36 patch. Thanks! A few comments: 1 === + <para> + The timeout is measured from the time since the slot has become + inactive (known from its + <structfield>inactive_since</structfield> value) until it gets + used (i.e., its <structfield>active</structfield> is set to true). + </para> That's right except when it's invalidated during the checkpoint (as the slot is not acquired in CheckPointReplicationSlots()). So, what about adding: "or a checkpoint occurs"? That would also explain that the invalidation could occur during checkpoint. 2 === + /* If the slot has been invalidated, recalculate the resource limits */ + if (invalidated) + { /If the slot/If a slot/? 3 === + * NB - this function also runs as part of checkpoint, so avoid raising errors s/NB - this/NB: This function/? (that looks more consistent with other comments in the code) 4 === + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead" looks weird to me. Maybe it would make sense to reword this a bit. 5 === + * considered not active as they don't actually perform logical decoding. Not sure that's 100% accurate as we switched in fast forward logical in 2ec005b4e2. "as they perform only fast forward logical decoding (or not at all)", maybe? 6 === + if (RecoveryInProgress() && slot->data.synced) + return false; + + if (replication_slot_inactive_timeout == 0) + return false; What about just using one if? It's more a matter of taste but it also probably reduces the object file size a bit for non optimized build. 7 === + /* + * Do not invalidate the slots which are currently being synced from + * the primary to the standby. + */ + if (RecoveryInProgress() && slot->data.synced) + return false; I think we don't need this check as the exact same one is done just before. 8 === +sub check_for_slot_invalidation_in_server_log +{ + my ($node, $slot_name, $offset) = @_; + my $invalidated = 0; + + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) + { + $node->safe_psql('postgres', "CHECKPOINT"); Wouldn't be better to wait for the replication_slot_inactive_timeout time before instead of triggering all those checkpoints? (it could be passed as an extra arg to wait_for_slot_invalidation()). 9 === # Synced slot mustn't get invalidated on the standby, it must sync invalidation # from the primary. So, we must not see the slot's invalidation message in server # log. ok( !$standby1->log_contains( "invalidating obsolete replication slot \"lsub1_sync_slot\"", $standby1_logstart), 'check that syned slot has not been invalidated on the standby'); Would that make sense to trigger a checkpoint on the standby before this test? I mean I think that without a checkpoint on the standby we should not see the invalidation in the log anyway. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Fri, Apr 5, 2024 at 1:14 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Please find the attached v36 patch. > > A few comments: > > 1 === > > + <para> > + The timeout is measured from the time since the slot has become > + inactive (known from its > + <structfield>inactive_since</structfield> value) until it gets > + used (i.e., its <structfield>active</structfield> is set to true). > + </para> > > That's right except when it's invalidated during the checkpoint (as the slot > is not acquired in CheckPointReplicationSlots()). > > So, what about adding: "or a checkpoint occurs"? That would also explain that > the invalidation could occur during checkpoint. Reworded. > 2 === > > + /* If the slot has been invalidated, recalculate the resource limits */ > + if (invalidated) > + { > > /If the slot/If a slot/? Modified it to be like elsewhere. > 3 === > > + * NB - this function also runs as part of checkpoint, so avoid raising errors > > s/NB - this/NB: This function/? (that looks more consistent with other comments > in the code) Done. > 4 === > > + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead > > I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead" > looks weird to me. Maybe it would make sense to reword this a bit. Reworded. > 5 === > > + * considered not active as they don't actually perform logical decoding. > > Not sure that's 100% accurate as we switched in fast forward logical > in 2ec005b4e2. > > "as they perform only fast forward logical decoding (or not at all)", maybe? Changed it to "as they don't perform logical decoding to produce the changes". In fast_forward mode no changes are produced. > 6 === > > + if (RecoveryInProgress() && slot->data.synced) > + return false; > + > + if (replication_slot_inactive_timeout == 0) > + return false; > > What about just using one if? It's more a matter of taste but it also probably > reduces the object file size a bit for non optimized build. Changed. > 7 === > > + /* > + * Do not invalidate the slots which are currently being synced from > + * the primary to the standby. > + */ > + if (RecoveryInProgress() && slot->data.synced) > + return false; > > I think we don't need this check as the exact same one is done just before. Right. Removed. > 8 === > > +sub check_for_slot_invalidation_in_server_log > +{ > + my ($node, $slot_name, $offset) = @_; > + my $invalidated = 0; > + > + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) > + { > + $node->safe_psql('postgres', "CHECKPOINT"); > > Wouldn't be better to wait for the replication_slot_inactive_timeout time before > instead of triggering all those checkpoints? (it could be passed as an extra arg > to wait_for_slot_invalidation()). Done. > 9 === > > # Synced slot mustn't get invalidated on the standby, it must sync invalidation > # from the primary. So, we must not see the slot's invalidation message in server > # log. > ok( !$standby1->log_contains( > "invalidating obsolete replication slot \"lsub1_sync_slot\"", > $standby1_logstart), > 'check that syned slot has not been invalidated on the standby'); > > Would that make sense to trigger a checkpoint on the standby before this test? > I mean I think that without a checkpoint on the standby we should not see the > invalidation in the log anyway. Done. Please find the attached v37 patch for further review. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Apr 6, 2024 at 11:55 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Why is the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot() not similar to that in InvalidatePossiblyObsoleteSlot()? Won't we need to ensure that there is no other active slot user? Is it sufficient to check inactive_since for the same? If so, we need some comments to explain the same. Can we avoid introducing the new functions like SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we do the required work in the caller? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Apr 6, 2024 at 12:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Why the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot() > is not similar to InvalidatePossiblyObsoleteSlot(). Won't we need to > ensure that there is no other active slot user? Is it sufficient to > check inactive_since for the same? If so, we need some comments to > explain the same. I removed the separate functions and with minimal changes, I've now placed the RS_INVAL_INACTIVE_TIMEOUT logic into InvalidatePossiblyObsoleteSlot and use that even in CheckPointReplicationSlots. > Can we avoid introducing the new functions like > SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we > do the required work in the caller? Hm. Removed them now. Please see the attached v38 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
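For context, the RS_INVAL_INACTIVE_TIMEOUT cause ultimately reduces to a timestamp comparison against the GUC. A minimal sketch of that check, assuming the GUC is in seconds and the in-memory field is inactive_since as discussed upthread (the helper name itself is hypothetical), could be:

static bool
SlotInactiveTimeoutExceeded(ReplicationSlot *s)
{
    if (replication_slot_inactive_timeout == 0)
        return false;           /* feature disabled */

    /* Synced slots on a standby get invalidation via sync, not locally. */
    if (RecoveryInProgress() && s->data.synced)
        return false;

    if (s->inactive_since == 0)
        return false;           /* slot is currently in use */

    return TimestampDifferenceExceeds(s->inactive_since,
                                      GetCurrentTimestamp(),
                                      replication_slot_inactive_timeout * 1000);
}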
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Sat, Apr 6, 2024 at 5:10 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please see the attached v38 patch. Hi, thanks everyone for reviewing the design and patches so far. Here I'm with the v39 patches implementing inactive timeout based (0001) and XID age based (0002) invalidation mechanisms. I'm quoting the hackers who are okay with inactive timeout based invalidation mechanism: Bertrand Drouvot - https://www.postgresql.org/message-id/ZgL0N%2BxVJNkyqsKL%40ip-10-97-1-34.eu-west-3.compute.internal and https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal Amit Kapila - https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com Nathan Bossart - https://www.postgresql.org/message-id/20240325195443.GA2923888%40nathanxps13 Robert Haas - https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com I'm quoting the hackers who are okay with XID age based invalidation mechanism: Nathan Bossart - https://www.postgresql.org/message-id/20240326150918.GB3181099%40nathanxps13 and https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13 Alvaro Herrera - https://www.postgresql.org/message-id/202403261539.xcjfle7sksz7%40alvherre.pgsql Bertrand Drouvot - https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal Amit Kapila - https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com There was a point raised by Robert https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com for XID age based invalidation. An issue related to vacuum_defer_cleanup_age https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=be504a3e974d75be6f95c8f9b7367126034f2d12 led to the removal of the GUC https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1118cd37eb61e6a2428f457a8b2026a7bb3f801a. The same issue may not happen for the XID age based invaliation. This is because the XID age is not calculated using FullTransactionId but using TransactionId as the slot's xmin and catalog_xmin are tracked as TransactionId. There was a point raised by Amit https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com on when to do the XID age based invalidation - whether in checkpointer or when vacuum is being run or whenever ComputeXIDHorizons gets called or in autovacuum process. For now, I've chosen the design to do these new invalidation checks in two places - 1) whenever the slot is acquired and the slot acquisition errors out if invalidated, 2) during checkpoint. However, I'm open to suggestions on this. I've also verified the case whether the replication_slot_xid_age setting can help in case of server inching towards the XID wraparound. I've created a primary and streaming standby setup with hot_standby_feedback set to on (so that the slot gets an xmin). Then, I've set replication_slot_xid_age to 2 billion on the primary, and used xid_wraparound extension to reach XID wraparound on the primary. Once I start receiving the WARNINGs about VACUUM, I did a checkpoint after which the slot got invalidated enabling my VACUUM to freeze XIDs saving my database from XID wraparound problem. Thanks a lot Masahiko Sawada for an offlist chat about the XID age calculation logic. 
-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
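Since the XID age computation comes up repeatedly (plain TransactionId rather than FullTransactionId), a small illustrative helper may make the comparison concrete. The GUC name follows the thread; the function itself is hypothetical, not the posted patch:

static bool
SlotXidAgeExceeded(TransactionId slot_xid, int max_slot_xid_age)
{
    TransactionId next_xid;
    int64       age;

    if (max_slot_xid_age == 0 || !TransactionIdIsNormal(slot_xid))
        return false;

    /* Unsigned 32-bit subtraction wraps, giving a wraparound-aware age. */
    next_xid = ReadNextTransactionId();
    age = (int64) (uint32) (next_xid - slot_xid);

    return age > max_slot_xid_age;
}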
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
Hi, On Thu, Apr 4, 2024 at 9:23 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after promotion, > > once while restoring slots and then via ShutDownSlotSync(). The reason > > is that ShutDownSlotSync() will be invoked in normal startup on > > primary though it won't do anything apart from setting inactive_since > > if we have synced slots. I think you need to check 'StandbyMode' in > > update_synced_slots_inactive_since() and return if the same is not > > set. We can't use 'InRecovery' flag as that will be set even during > > crash recovery. > > > > Can you please test this once unless you don't agree with the above theory? > > Nice catch. I've verified that update_synced_slots_inactive_since is > called even for normal server startups/crash recovery. I've added a > check to exit if the StandbyMode isn't set. > > Please find the attached v35 patch. > The documentation says about both 'active' and 'inactive_since' columns of pg_replication_slots say: --- active bool True if this slot is currently actively being used inactive_since timestamptz The time since the slot has become inactive. NULL if the slot is currently being used. Note that for slots on the standby that are being synced from a primary server (whose synced field is true), the inactive_since indicates the last synchronization (see Section 47.2.3) time. --- When reading the description I thought if 'active' is true, 'inactive_since' is NULL, but it doesn't seem to apply for temporary slots. Since we don't reset the active_pid field of temporary slots when the release, the 'active' is still true in the view but 'inactive_since' is not NULL. Do you think we need to mention it in the documentation? As for the timeout-based slot invalidation feature, we could end up invalidating the temporary slots even if they are shown as active, which could confuse users. Do we want to somehow deal with it? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Please find the attached v35 patch. > > The documentation says about both 'active' and 'inactive_since' > columns of pg_replication_slots say: > > --- > active bool > True if this slot is currently actively being used > > inactive_since timestamptz > The time since the slot has become inactive. NULL if the slot is > currently being used. Note that for slots on the standby that are > being synced from a primary server (whose synced field is true), the > inactive_since indicates the last synchronization (see Section 47.2.3) > time. > --- > > When reading the description I thought if 'active' is true, > 'inactive_since' is NULL, but it doesn't seem to apply for temporary > slots. Right. > Since we don't reset the active_pid field of temporary slots > when the release, the 'active' is still true in the view but > 'inactive_since' is not NULL. Right. inactive_since is reset whenever the temporary slot is acquired again within the same backend that created the temporary slot. > Do you think we need to mention it in > the documentation? I think that's the reason we dropped "active" from the statement. It was earlier "NULL if the slot is currently actively being used.". But, per Bertrand's comment https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal changed it to ""NULL if the slot is currently being used.". Temporary slots retain the active = true and active_pid = <pid of the backend that created it> even when the slot is not being used until the lifetime of the backend process. We haven't tied active or active_pid flags to inactive_since, doing so now to represent the temporary slot behaviour for active and active_pid will confuse users more. As far as the inactive_since of a slot is concerned, it is set to 0 when the slot is being used (acquired) and set to current timestamp when the slot is not being used (released). > As for the timeout-based slot invalidation feature, we could end up > invalidating the temporary slots even if they are shown as active, > which could confuse users. Do we want to somehow deal with it? Yes. As long as the temporary slot is lying unused holding up resources for more than the specified replication_slot_inactive_timeout, it is bound to get invalidated. This keeps behaviour consistent and less-confusing to the users. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Apr 25, 2024 at 11:11 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > Please find the attached v35 patch. > > > > The documentation says about both 'active' and 'inactive_since' > > columns of pg_replication_slots say: > > > > --- > > active bool > > True if this slot is currently actively being used > > > > inactive_since timestamptz > > The time since the slot has become inactive. NULL if the slot is > > currently being used. Note that for slots on the standby that are > > being synced from a primary server (whose synced field is true), the > > inactive_since indicates the last synchronization (see Section 47.2.3) > > time. > > --- > > > > When reading the description I thought if 'active' is true, > > 'inactive_since' is NULL, but it doesn't seem to apply for temporary > > slots. > > Right. > > > Since we don't reset the active_pid field of temporary slots > > when the release, the 'active' is still true in the view but > > 'inactive_since' is not NULL. > > Right. inactive_since is reset whenever the temporary slot is acquired > again within the same backend that created the temporary slot. > > > Do you think we need to mention it in > > the documentation? > > I think that's the reason we dropped "active" from the statement. It > was earlier "NULL if the slot is currently actively being used.". But, > per Bertrand's comment > https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal > changed it to ""NULL if the slot is currently being used.". > > Temporary slots retain the active = true and active_pid = <pid of the > backend that created it> even when the slot is not being used until > the lifetime of the backend process. We haven't tied active or > active_pid flags to inactive_since, doing so now to represent the > temporary slot behaviour for active and active_pid will confuse users > more. > This is true and it's probably easy for us to understand as we developed this feature but the same may not be true for others. I wonder if we can be explicit about the difference of active/inactive_since by adding something like the following for inactive_since: Note that this field is not related to the active flag as temporary slots can remain active till the session ends even when they are not being used. Sawada-San, do you have any suggestions on the wording? > As far as the inactive_since of a slot is concerned, it is set > to 0 when the slot is being used (acquired) and set to current > timestamp when the slot is not being used (released). > > > As for the timeout-based slot invalidation feature, we could end up > > invalidating the temporary slots even if they are shown as active, > > which could confuse users. Do we want to somehow deal with it? > > Yes. As long as the temporary slot is lying unused holding up > resources for more than the specified > replication_slot_inactive_timeout, it is bound to get invalidated. > This keeps behaviour consistent and less-confusing to the users. > Agreed. We may want to add something in the docs for this to avoid confusion with the active flag. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Sat, Apr 13, 2024 at 9:36 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > There was a point raised by Amit > https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com > on when to do the XID age based invalidation - whether in checkpointer > or when vacuum is being run or whenever ComputeXIDHorizons gets called > or in autovacuum process. For now, I've chosen the design to do these > new invalidation checks in two places - 1) whenever the slot is > acquired and the slot acquisition errors out if invalidated, 2) during > checkpoint. However, I'm open to suggestions on this. Here are my thoughts on when to do the XID age invalidation. In all the patches sent so far, the XID age invalidation happens in two places - one during the slot acquisition, and another during the checkpoint. As the suggestion is to do it during the vacuum (manual and auto), so that even if the checkpoint isn't happening in the database for whatever reasons, a vacuum command or autovacuum can invalidate the slots whose XID is aged. An idea is to check for XID age based invalidation for all the slots in ComputeXidHorizons() before it reads replication_slot_xmin and replication_slot_catalog_xmin, and obviously before the proc array lock is acquired. A potential problem with this approach is that the invalidation check can become too aggressive as XID horizons are computed from many places. Another idea is to check for XID age based invalidation for all the slots in higher levels than ComputeXidHorizons(), for example in vacuum() which is an entry point for both vacuum command and autovacuum. This approach seems similar to vacuum_failsafe_age GUC which checks each relation for the failsafe age before vacuum gets triggered on it. Does anyone see any issues or risks with the above two approaches or have any other ideas? Thoughts? I attached v40 patches here. I reworded some of the ERROR messages, and did some code clean-up. Note that I haven't implemented any of the above approaches yet. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Jun 17, 2024 at 05:55:04PM +0530, Bharath Rupireddy wrote: > Here are my thoughts on when to do the XID age invalidation. In all > the patches sent so far, the XID age invalidation happens in two > places - one during the slot acquisition, and another during the > checkpoint. As the suggestion is to do it during the vacuum (manual > and auto), so that even if the checkpoint isn't happening in the > database for whatever reasons, a vacuum command or autovacuum can > invalidate the slots whose XID is aged. +1. IMHO this is a principled choice. The similar max_slot_wal_keep_size parameter is considered where it arguably matters most: when we are trying to remove/recycle WAL segments. Since this parameter is intended to prevent the server from running out of space, it makes sense that we'd apply it at the point where we are trying to free up space. The proposed max_slot_xid_age parameter is intended to prevent the server from running out of transaction IDs, so it follows that we'd apply it at the point where we reclaim them, which happens to be vacuum. > An idea is to check for XID age based invalidation for all the slots > in ComputeXidHorizons() before it reads replication_slot_xmin and > replication_slot_catalog_xmin, and obviously before the proc array > lock is acquired. A potential problem with this approach is that the > invalidation check can become too aggressive as XID horizons are > computed from many places. > > Another idea is to check for XID age based invalidation for all the > slots in higher levels than ComputeXidHorizons(), for example in > vacuum() which is an entry point for both vacuum command and > autovacuum. This approach seems similar to vacuum_failsafe_age GUC > which checks each relation for the failsafe age before vacuum gets > triggered on it. I don't presently have any strong opinion on where this logic should go, but in general, I think we should only invalidate slots if invalidating them would allow us to advance the vacuum cutoff. If the cutoff is held back by something else, I don't see a point in invalidating slots because we'll just be breaking replication in return for no additional reclaimed transaction IDs. -- nathan
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Here are my thoughts on when to do the XID age invalidation. In all > the patches sent so far, the XID age invalidation happens in two > places - one during the slot acquisition, and another during the > checkpoint. As the suggestion is to do it during the vacuum (manual > and auto), so that even if the checkpoint isn't happening in the > database for whatever reasons, a vacuum command or autovacuum can > invalidate the slots whose XID is aged. > > An idea is to check for XID age based invalidation for all the slots > in ComputeXidHorizons() before it reads replication_slot_xmin and > replication_slot_catalog_xmin, and obviously before the proc array > lock is acquired. A potential problem with this approach is that the > invalidation check can become too aggressive as XID horizons are > computed from many places. > > Another idea is to check for XID age based invalidation for all the > slots in higher levels than ComputeXidHorizons(), for example in > vacuum() which is an entry point for both vacuum command and > autovacuum. This approach seems similar to vacuum_failsafe_age GUC > which checks each relation for the failsafe age before vacuum gets > triggered on it. I am attaching the patches implementing the idea of invalidating replication slots during vacuum when current slot xmin limits (procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin) are aged as per the new XID age GUC. When either of these limits are aged, there must be at least one replication slot that is aged, because the xmin limits, after all, are the minimum of xmin or catalog_xmin of all replication slots. In this approach, the new XID age GUC will help vacuum when needed, because the current slot xmin limits are recalculated after invalidating replication slots that are holding xmins for longer than the age. The code is placed in vacuum() which is common for both vacuum command and autovacuum, and gets executed only once every vacuum cycle to not be too aggressive in invalidating. However, there might be some concerns with this approach like the following: 1) Adding more code to vacuum might not be acceptable 2) What if invalidation of replication slots emits an error, will it block vacuum forever? Currently, InvalidateObsoleteReplicationSlots() is also called as part of the checkpoint, and emitting ERRORs from within is avoided already. Therefore, there is no concern here for now. 3) What if there are more replication slots to be invalidated, will it delay the vacuum? If yes, by how much? <<TODO>> 4) Will the invalidation based on just current replication slot xmin limits suffice irrespective of vacuum cutoffs? IOW, if the replication slots are invalidated but vacuum isn't going to do any work because vacuum cutoffs are not yet met? Is the invalidation work wasteful here? 5) Is it okay to take just one more time the proc array lock to get current replication slot xmin limits via ProcArrayGetReplicationSlotXmin() once every vacuum cycle? <<TODO>> 6) Vacuum command can't be run on the standby in recovery. So, to help invalidate replication slots on the standby, I have for now let the checkpointer also do the XID age based invalidation. I know invalidating both in checkpointer and vacuum may not be a great idea, but I'm open to thoughts. 
Following are some of the alternative approaches which IMHO don't help vacuum when needed: a) Let the checkpointer do the XID age based invalidation, and call it out in the documentation that if the checkpoint doesn't happen, the new GUC doesn't help even if the vacuum is run. This has been the approach until v40 patch. b) Checkpointer and/or other backends add an autovacuum work item via AutoVacuumRequestWork(), and autovacuum when it gets to it will invalidate the replication slots. But, what to do for the vacuum command here? Please find the attached v41 patches implementing the idea of vacuum doing the invalidation. Thoughts? Thanks to Sawada-san for a detailed off-list discussion. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
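A condensed sketch of the vacuum()-time check being proposed is below. RS_INVAL_XID_AGE and max_slot_xid_age are the names used in this thread, the age helper is the hypothetical one sketched earlier, and where exactly this would sit inside vacuum() is deliberately left open:

static void
InvalidateAgedSlotsBeforeVacuum(void)
{
    TransactionId slot_xmin;
    TransactionId slot_catalog_xmin;

    /* Current xmin horizons held back by replication slots. */
    ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin);

    if (!SlotXidAgeExceeded(slot_xmin, max_slot_xid_age) &&
        !SlotXidAgeExceeded(slot_catalog_xmin, max_slot_xid_age))
        return;                 /* no slot is holding back the horizon */

    if (InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
                                           InvalidOid, InvalidTransactionId))
    {
        /* Horizons may have advanced; recompute before vacuum proceeds. */
        ReplicationSlotsComputeRequiredXmin(false);
        ReplicationSlotsComputeRequiredLSN();
    }
}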
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nathan Bossart
Date:
On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote: > 6) Vacuum command can't be run on the standby in recovery. So, to help > invalidate replication slots on the standby, I have for now let the > checkpointer also do the XID age based invalidation. I know > invalidating both in checkpointer and vacuum may not be a great idea, > but I'm open to thoughts. Hm. I hadn't considered this angle. > a) Let the checkpointer do the XID age based invalidation, and call it > out in the documentation that if the checkpoint doesn't happen, the > new GUC doesn't help even if the vacuum is run. This has been the > approach until v40 patch. My first reaction is that this is probably okay. I guess you might run into problems if you set max_slot_xid_age to 2B and checkpoint_timeout to 1 day, but even in that case your transaction ID usage rate would need to be pretty high for wraparound to occur. -- nathan
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> [...]
> Please find the attached v41 patches implementing the idea of vacuum
> doing the invalidation.
>
> Thoughts?
The patch no longer applies on HEAD, please rebase.
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Masahiko Sawada
Date:
On Tue, Jul 9, 2024 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote: > > 6) Vacuum command can't be run on the standby in recovery. So, to help > > invalidate replication slots on the standby, I have for now let the > > checkpointer also do the XID age based invalidation. I know > > invalidating both in checkpointer and vacuum may not be a great idea, > > but I'm open to thoughts. > > Hm. I hadn't considered this angle. Another idea would be to let the startup process do slot invalidation when replaying a RUNNING_XACTS record. Since a RUNNING_XACTS record has the latest XID on the primary, I think the startup process can compare it to the slot-xmin, and invalidate slots which are older than the age limit. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
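To visualise that idea: the startup process could drive the same check from RUNNING_XACTS replay, roughly as below. The record's nextXid field exists today; the invalidation cause, the GUC, and the overall helper are the thread's proposed/hypothetical names, so this is only a sketch:

/*
 * Called from the startup process while replaying a RUNNING_XACTS record;
 * next_xid is the primary's nextXid carried by the record.
 */
static void
InvalidateAgedSlotsDuringReplay(TransactionId next_xid)
{
    TransactionId slot_xmin;
    TransactionId slot_catalog_xmin;
    int64       xmin_age = 0;
    int64       catalog_xmin_age = 0;

    if (max_slot_xid_age == 0)
        return;                 /* feature disabled */

    ProcArrayGetReplicationSlotXmin(&slot_xmin, &slot_catalog_xmin);

    /* Age the slot horizons against the primary's nextXid from the record. */
    if (TransactionIdIsNormal(slot_xmin))
        xmin_age = (int64) (uint32) (next_xid - slot_xmin);
    if (TransactionIdIsNormal(slot_catalog_xmin))
        catalog_xmin_age = (int64) (uint32) (next_xid - slot_catalog_xmin);

    if (xmin_age > max_slot_xid_age || catalog_xmin_age > max_slot_xid_age)
        InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
                                           InvalidOid, InvalidTransactionId);
}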
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Ajin Cherian
Date:
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Hi,
> On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> Please find the attached v41 patches implementing the idea of vacuum
> doing the invalidation.
> Thoughts?
Some minor comments on the patch:
1.
+ /*
+ * Release the lock if it's not yet to keep the cleanup path on
+ * error happy.
+ */
I suggest rephrasing to: "Release the lock if it hasn't been already to ensure smooth cleanup on error."
2.
elog(DEBUG1, "performing replication slot invalidation");
Probably change it to "performing replication slot invalidation checks" as we might not actually invalidate any slot here.
3.
In CheckPointReplicationSlots()
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
Is this calculation of resource limits really required here when the same is already done inside InvalidateObsoleteReplicationSlots()?
regards,
Ajin Cherian
Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Aug 26, 2024 at 11:44 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Few comments on 0001: 1. @@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) " name slot \"%s\" already exists on the standby", remote_slot->name)); + /* + * Skip the sync if the local slot is already invalidated. We do this + * beforehand to avoid slot acquire and release. + */ + if (slot->data.invalidated != RS_INVAL_NONE) + return false; + /* * The slot has been synchronized before. I was wondering why you have added this new check as part of this patch. If you see the following comments in the related code, you will know why we haven't done this previously. /* * The slot has been synchronized before. * * It is important to acquire the slot here before checking * invalidation. If we don't acquire the slot first, there could be a * race condition that the local slot could be invalidated just after * checking the 'invalidated' flag here and we could end up * overwriting 'invalidated' flag to remote_slot's value. See * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly * if the slot is not acquired by other processes. * * XXX: If it ever turns out that slot acquire/release is costly for * cases when none of the slot properties is changed then we can do a * pre-check to ensure that at least one of the slot properties is * changed before acquiring the slot. */ ReplicationSlotAcquire(remote_slot->name, true); We need some modifications in these comments if you want to add a pre-check here. 2. @@ -1907,6 +2033,31 @@ CheckPointReplicationSlots(bool is_shutdown) SaveSlotToPath(s, path, LOG); } LWLockRelease(ReplicationSlotAllocationLock); + + elog(DEBUG1, "performing replication slot invalidation checks"); + + /* + * Note that we will make another pass over replication slots for + * invalidations to keep the code simple. The assumption here is that the + * traversal over replication slots isn't that costly even with hundreds + * of replication slots. If it ever turns out that this assumption is + * wrong, we might have to put the invalidation check logic in the above + * loop, for that we might have to do the following: + * + * - Acqure ControlLock lock once before the loop. + * + * - Call InvalidatePossiblyObsoleteSlot for each slot. + * + * - Handle the cases in which ControlLock gets released just like + * InvalidateObsoleteReplicationSlots does. + * + * - Avoid saving slot info to disk two times for each invalidated slot. + * + * XXX: Should we move inactive_timeout inavalidation check closer to + * wal_removed in CreateCheckPoint and CreateRestartPoint? + */ + InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, + 0, InvalidOid, InvalidTransactionId); Why do we want to call this for shutdown case (when is_shutdown is true)? I understand trying to invalidate slots during regular checkpoint but not sure if we need it at the time of shutdown. The other point is can we try to check the performance impact with 100s of slots as mentioned in the code comments? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Thu, Aug 29, 2024 at 11:31 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks for looking into this. > > On Mon, Aug 26, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Few comments on 0001: > > 1. > > @@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > > > > + /* > > + * Skip the sync if the local slot is already invalidated. We do this > > + * beforehand to avoid slot acquire and release. > > + */ > > > > I was wondering why you have added this new check as part of this > > patch. If you see the following comments in the related code, you will > > know why we haven't done this previously. > > Removed. Can deal with optimization separately. > > > 2. > > + */ > > + InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, > > + 0, InvalidOid, InvalidTransactionId); > > > > Why do we want to call this for shutdown case (when is_shutdown is > > true)? I understand trying to invalidate slots during regular > > checkpoint but not sure if we need it at the time of shutdown. > > Changed it to invalidate only for non-shutdown checkpoints. inactive_timeout invalidation isn't critical for shutdown unlike wal_removed which can help shutdown by freeing up some disk space. > > > The > > other point is can we try to check the performance impact with 100s of > > slots as mentioned in the code comments? > > I first checked how much does the wal_removed invalidation check add to the checkpoint (see 2nd and 3rd column). I then checked how much inactive_timeout invalidation check adds to the checkpoint (see 4th column), it is not more than wal_removed invalidation check. I then checked how much the wal_removed invalidation check adds for replication slots that have already been invalidated due to inactive_timeout (see 5th column), looks like not much.
> | # of slots | HEAD (no invalidation) ms | HEAD (wal_removed) ms | PATCHED (inactive_timeout) ms | PATCHED (inactive_timeout+wal_removed) ms |
> |------------|---------------------------|-----------------------|-------------------------------|-------------------------------------------|
> | 100        | 18.591                    | 370.586               | 359.299                       | 373.882                                   |
> | 1000       | 15.722                    | 4834.901              | 5081.751                      | 5072.128                                  |
> | 10000      | 19.261                    | 59801.062             | 61270.406                     | 60270.099                                 |
> Having said that, I'm okay to implement the optimization specified. Thoughts?
The other possibility is to try invalidating due to timeout along with wal_removed case during checkpoint. The idea is that if the slot can be invalidated due to WAL then fine, otherwise check if it can be invalidated due to timeout. This can avoid looping the slots and doing similar work multiple times during the checkpoint. -- With Regards, Amit Kapila.
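The single-pass idea in the last paragraph could be sketched as below; the function name is hypothetical and the structure only illustrates folding the timeout check into the existing wal_removed scan, using the cause names from this thread:

static ReplicationSlotInvalidationCause
DetermineSlotInvalidationCause(ReplicationSlot *s, XLogRecPtr oldest_needed_lsn)
{
    /* WAL removal wins if it applies; the slot is already unusable then. */
    if (!XLogRecPtrIsInvalid(s->data.restart_lsn) &&
        s->data.restart_lsn < oldest_needed_lsn)
        return RS_INVAL_WAL_REMOVED;

    /* Otherwise fall back to the inactive-timeout check in the same pass. */
    if (SlotInactiveTimeoutExceeded(s)) /* hypothetical helper, sketched earlier */
        return RS_INVAL_INACTIVE_TIMEOUT;

    return RS_INVAL_NONE;
}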
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v44 patch with the above changes. I will > include the 0002 xid_age based invalidation patch later. > It is better to get the 0001 reviewed and committed first. We can discuss 0002 afterwards as 0001 is in itself a complete and separate patch that can be committed. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi, my previous review posts did not cover the test code. Here are my review comments for the v44-0001 test code ====== TEST CASE #1 1. +# Wait for the inactive replication slot to be invalidated. +$standby1->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'lsub1_sync_slot' AND + invalidation_reason = 'inactive_timeout'; +]) + or die + "Timed out while waiting for lsub1_sync_slot invalidation to be synced on standby"; + Is that comment correct? IIUC the synced slot should *already* be invalidated from the primary, so here we are not really "waiting" for it to be invalidated; Instead, we are just "confirming" that the synchronized slot is already invalidated with the correct reason as expected. ~~~ 2. +# Synced slot mustn't get invalidated on the standby even after a checkpoint, +# it must sync invalidation from the primary. So, we must not see the slot's +# invalidation message in server log. +$standby1->safe_psql('postgres', "CHECKPOINT"); +ok( !$standby1->log_contains( + "invalidating obsolete replication slot \"lsub1_sync_slot\"", + $standby1_logstart), + 'check that syned lsub1_sync_slot has not been invalidated on the standby' +); + This test case seemed bogus, for a couple of reasons: 2a. IIUC this 'lsub1_sync_slot' is the same one that is already invalid (from the primary), so nobody should be surprised that an already invalid slot doesn't get flagged as invalid again. i.e. Shouldn't your test scenario here be done using a valid synced slot? 2b. AFAICT it was only moments above this CHECKPOINT where you assigned the standby inactivity timeout to 2s. So even if there was some bug invalidating synced slots I don't think you gave it enough time to happen -- e.g. I doubt 2s has elapsed yet. ~ 3. +# Stop standby to make the standby's replication slot on the primary inactive +$standby1->stop; + +# Wait for the standby's replication slot to become inactive +wait_for_slot_invalidation($primary, 'sb1_slot', $logstart, + $inactive_timeout); This seems a bit tricky. Both these (the stop and the wait) seem to belong together, so I think maybe a single bigger explanatory comment covering both parts would help for understanding. ====== TEST CASE #2 4. +# Stop subscriber to make the replication slot on publisher inactive +$subscriber->stop; + +# Wait for the replication slot to become inactive and then invalidated due to +# timeout. +wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart, + $inactive_timeout); IIUC, this is just like comment #3 above. Both these (the stop and the wait) seem to belong together, so I think maybe a single bigger explanatory comment covering both parts would help for understanding. ~~~ 5. +# Testcase end: Invalidate logical subscriber's slot due to +# replication_slot_inactive_timeout. +# ============================================================================= IMO the rest of the comment after "Testcase end" isn't very useful. ====== sub wait_for_slot_invalidation 6. +sub wait_for_slot_invalidation +{ An explanatory header comment for this subroutine would be helpful. ~~~ 7. 
+ # Wait for the replication slot to become inactive + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND active = 'f'; + ]) + or die + "Timed out while waiting for slot $slot_name to become inactive on node $name"; + + # Wait for the replication slot info to be updated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE inactive_since IS NOT NULL + AND slot_name = '$slot_name' AND active = 'f'; + ]) + or die + "Timed out while waiting for info of slot $slot_name to be updated on node $name"; + Why are there are 2 separate poll_query_until's here? Can't those be combined into just one? ~~~ 8. + # Sleep at least $inactive_timeout duration to avoid multiple checkpoints + # for the slot to get invalidated. + sleep($inactive_timeout); + Maybe this special sleep to prevent too many CHECKPOINTs should be moved to be inside the other subroutine, which is actually doing those CHECKPOINTs. ~~~ 9. + # Wait for the inactive replication slot to be invalidated + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot_name' AND + invalidation_reason = 'inactive_timeout'; + ]) + or die + "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name"; + The comment seems misleading. IIUC you are not "waiting" for the invalidation here, because it is the other subroutine doing the waiting for the invalidation message in the logs. Instead, here I think you are just confirming the 'invalidation_reason' got set correctly. The comment should say what it is really doing. ====== sub check_for_slot_invalidation_in_server_log 10. +# Check for invalidation of slot in server log +sub check_for_slot_invalidation_in_server_log +{ I think the main function of this subroutine is the CHECKPOINT and the waiting for the server log to say invalidation happened. It is doing a loop of a) CHECKPOINT then b) inspecting the server log for the slot invalidation, and c) waiting for a bit. Repeat 10 times. A comment describing the logic for this subroutine would be helpful. The most important side-effect of this function is the CHECKPOINT because without that nothing will ever get invalidated due to inactivity, but this key point is not obvious from the subroutine name. IMO it would be better to name this differently to reflect what it is really doing: e.g. "CHECKPOINT_and_wait_for_slot_invalidation_in_server_log" ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > > Please find the attached v44 patch with the above changes. I will > include the 0002 xid_age based invalidation patch later. > Thanks for the patch Bharath. My review and testing is WIP, but please find a few comments and queries: 1) I see that ReplicationSlotAlter() will error out if the slot is invalidated due to timeout. I have not tested it myself, but do you know if slot-alter errors out for other invalidation causes as well? Just wanted to confirm that the behaviour is consistent for all invalidation causes. 2) When a slot is invalidated, and we try to use that slot, it gives this msg: ERROR: can no longer get changes from replication slot "mysubnew1_2" DETAIL: The slot became invalid because it was inactive since 2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". Isn't HINT misleading? Even if we increase it now, the slot cannot be reused again. 3) When the slot is invalidated, the 'inactive_since' still keeps on changing when there is a subscriber trying to start replication continuously. I think ReplicationSlotAcquire() keeps on failing and thus Release keeps on setting it again and again. Shouldn't we stop setting/changing 'inactive_since' once the slot is already invalidated? Otherwise it will be misleading. postgres=# select failover,synced,inactive_since,invalidation_reason from pg_replication_slots; failover | synced | inactive_since | invalidation_reason ----------+--------+----------------------------------+--------------------- t | f | 2024-09-03 14:23:.. | inactive_timeout after some time: failover | synced | inactive_since | invalidation_reason ----------+--------+----------------------------------+--------------------- t | f | 2024-09-03 14:26:..| inactive_timeout 4) src/sgml/config.sgml: 4a) + A value of zero (which is default) disables the timeout mechanism. Better will be: A value of zero (which is default) disables the inactive timeout invalidation mechanism . or A value of zero (which is default) disables the slot invalidation due to the inactive timeout mechanism. i.e. rephrase to indicate that invalidation is disabled. 4b) 'synced' and inactive_since should point to pg_replication_slots: example: <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> 5) src/sgml/system-views.sgml: + ..the slot has been inactive for longer than the duration specified by replication_slot_inactive_timeout parameter. Better to have: ..the slot has been inactive for a time longer than the duration specified by the replication_slot_inactive_timeout parameter. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > 1) > I see that ReplicationSlotAlter() will error out if the slot is > invalidated due to timeout. I have not tested it myself, but do you > know if slot-alter errors out for other invalidation causes as well? > Just wanted to confirm that the behaviour is consistent for all > invalidation causes. I was able to test this and as anticipated behavior is different. When slot is invalidated due to say 'wal_removed', I am still able to do 'alter' of that slot. Please see: Pub: slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | wal_removed Sub: newdb1=# alter subscription mysubnew1_1 disable; ALTER SUBSCRIPTION newdb1=# alter subscription mysubnew1_1 set (failover=false); ALTER SUBSCRIPTION Pub: (failover altered) slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | wal_removed while when invalidation_reason is 'inactive_timeout', it fails: Pub: slot_name | failover | synced | inactive_since | invalidation_reason -------------+----------+--------+----------------------------------+--------------------- mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | inactive_timeout Sub: newdb1=# alter subscription mysubnew1_1 disable; ALTER SUBSCRIPTION newdb1=# alter subscription mysubnew1_1 set (failover=false); ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". I think the behavior should be same. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > 1) It is related to one of my previous comments (pt 3 in [1]) where I stated that inactive_since should not keep on changing once a slot is invalidated. Below is one side effect if inactive_since keeps on changing: postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', pg_current_wal_lsn()); ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago. HINT: You might need to increase "replication_slot_inactive_timeout.". postgres=# select now(); now --------------------------------- 2024-09-04 10:04:00.26564+05:30 'DETAIL' gives wrong information, we are not past 10-seconds. This is because inactive_since got updated even in ERROR scenario. 2) One more issue in this message is, once I set replication_slot_inactive_timeout to a bigger value, it becomes more misleading. This is because invalidation was done in the past using previous value while message starts showing new value: ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h'; --see 129600 secs in DETAIL and the current time. postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', pg_current_wal_lsn()); ERROR: can no longer get changes from replication slot "mysubnew1_1" DETAIL: The slot became invalid because it was inactive since 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds ago. postgres=# select now(); now ---------------------------------- 2024-09-04 10:07:35.201894+05:30 I feel we should change this message itself. ~~~~~ When invalidation is due to wal_removed, we get a way simpler message: newdb1=# SELECT * FROM pg_replication_slot_advance('mysubnew1_2', pg_current_wal_lsn()); ERROR: replication slot "mysubnew1_2" cannot be advanced DETAIL: This slot has never previously reserved WAL, or it has been invalidated. This message does not mention 'max_slot_wal_keep_size'. We should have a similar message for our case. Thoughts? [1]: https://www.postgresql.org/message-id/CAJpy0uC8Dg-0JS3NRUwVUemgz5Ar2v3_EQQFXyAigWSEQ8U47Q%40mail.gmail.com thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Sep 4, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > 1) > It is related to one of my previous comments (pt 3 in [1]) where I > stated that inactive_since should not keep on changing once a slot is > invalidated. > Agreed. Updating the inactive_since for a slot that is already invalid is misleading. > > > 2) > One more issue in this message is, once I set > replication_slot_inactive_timeout to a bigger value, it becomes more > misleading. This is because invalidation was done in the past using > previous value while message starts showing new value: > > ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h'; > > --see 129600 secs in DETAIL and the current time. > postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', > pg_current_wal_lsn()); > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds > ago. > postgres=# select now(); > now > ---------------------------------- > 2024-09-04 10:07:35.201894+05:30 > > I feel we should change this message itself. > +1. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > 1) > > I see that ReplicationSlotAlter() will error out if the slot is > > invalidated due to timeout. I have not tested it myself, but do you > > know if slot-alter errors out for other invalidation causes as well? > > Just wanted to confirm that the behaviour is consistent for all > > invalidation causes. > > I was able to test this and as anticipated behavior is different. When > slot is invalidated due to say 'wal_removed', I am still able to do > 'alter' of that slot. > Please see: > > Pub: > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | > wal_removed > > Sub: > newdb1=# alter subscription mysubnew1_1 disable; > ALTER SUBSCRIPTION > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > ALTER SUBSCRIPTION > > Pub: (failover altered) > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | > wal_removed > > > while when invalidation_reason is 'inactive_timeout', it fails: > > Pub: > slot_name | failover | synced | inactive_since | > invalidation_reason > -------------+----------+--------+----------------------------------+--------------------- > mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | > inactive_timeout > > Sub: > newdb1=# alter subscription mysubnew1_1 disable; > ALTER SUBSCRIPTION > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no > longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > I think the behavior should be same. > We should not allow the invalid replication slot to be altered irrespective of the reason unless there is any benefit. -- With Regards, Amit Kapila.
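A minimal sketch of the kind of check being discussed here, assuming it is placed in ReplicationSlotAlter() after the slot has been acquired; the error wording is illustrative only, not committed text:

    /* Reject ALTER on any invalidated slot, regardless of the invalidation cause. */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot alter invalid replication slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name))));

With such a check the behaviour becomes uniform: altering a slot invalidated due to wal_removed fails the same way as one invalidated due to inactive_timeout.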
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, Thanks for reviewing. On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > 1) > I see that ReplicationSlotAlter() will error out if the slot is > invalidated due to timeout. I have not tested it myself, but do you > know if slot-alter errors out for other invalidation causes as well? > Just wanted to confirm that the behaviour is consistent for all > invalidation causes. Will respond to Amit's comment soon. > 2) > When a slot is invalidated, and we try to use that slot, it gives this msg: > > ERROR: can no longer get changes from replication slot "mysubnew1_2" > DETAIL: The slot became invalid because it was inactive since > 2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > Isn't HINT misleading? Even if we increase it now, the slot can not be > reused again. > > Below is one side effect if inactive_since keeps on changing: > > postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1', > pg_current_wal_lsn()); > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago. > HINT: You might need to increase "replication_slot_inactive_timeout.". > > postgres=# select now(); > now > --------------------------------- > 2024-09-04 10:04:00.26564+05:30 > > 'DETAIL' gives wrong information, we are not past 10-seconds. This is > because inactive_since got updated even in ERROR scenario. > > ERROR: can no longer get changes from replication slot "mysubnew1_1" > DETAIL: The slot became invalid because it was inactive since > 2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds > ago. > postgres=# select now(); > now > ---------------------------------- > 2024-09-04 10:07:35.201894+05:30 > > I feel we should change this message itself. Removed the hint and corrected the detail message as following: errmsg("can no longer get changes from replication slot \"%s\"", NameStr(s->data.name)), errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", "replication_slot_inactive_timeout."))); > 3) > When the slot is invalidated, the' inactive_since' still keeps on > changing when there is a subscriber trying to start replication > continuously. I think ReplicationSlotAcquire() keeps on failing and > thus Release keeps on setting it again and again. Shouldn't we stop > setting/chnaging 'inactive_since' once the slot is invalidated > already, otherwise it will be misleading. > > postgres=# select failover,synced,inactive_since,invalidation_reason > from pg_replication_slots; > > failover | synced | inactive_since | invalidation_reason > ----------+--------+----------------------------------+--------------------- > t | f | 2024-09-03 14:23:.. | inactive_timeout > > after sometime: > failover | synced | inactive_since | invalidation_reason > ----------+--------+----------------------------------+--------------------- > t | f | 2024-09-03 14:26:..| inactive_timeout Changed it to not update inactive_since for slots invalidated due to inactive timeout. > 4) > src/sgml/config.sgml: > > 4a) > + A value of zero (which is default) disables the timeout mechanism. > > Better will be: > A value of zero (which is default) disables the inactive timeout > invalidation mechanism . Changed. 
> 4b) > 'synced' and inactive_since should point to pg_replication_slots: > > example: > <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> Modified. > 5) > src/sgml/system-views.sgml: > + ..the slot has been inactive for longer than the duration specified > by replication_slot_inactive_timeout parameter. > > Better to have: > ..the slot has been inactive for a time longer than the duration > specified by the replication_slot_inactive_timeout parameter. Changed it to the following to be consistent with the config.sgml. <literal>inactive_timeout</literal> means that the slot has been inactive for longer than the amount of time specified by the <xref linkend="guc-replication-slot-inactive-timeout"/> parameter. Please find the v45 patch posted upthread at https://www.postgresql.org/message-id/CALj2ACWXQT3_HY40ceqKf1DadjLQP6b1r%3D0sZRh-xhAOd-b0pA%40mail.gmail.com for the changes. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
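The change mentioned above, not refreshing inactive_since once a slot is invalidated, amounts to guarding the places that stamp the timestamp, roughly as in this sketch (field names as in the patch; the spinlock usage mirrors how ReplicationSlot fields are normally updated):

    TimestampTz now = GetCurrentTimestamp();

    SpinLockAcquire(&s->mutex);
    /* Keep the original invalidation time; do not overwrite it on every release. */
    if (s->data.invalidated == RS_INVAL_NONE)
        s->inactive_since = now;
    SpinLockRelease(&s->mutex);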
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Thu, Sep 5, 2024 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > 1) > > > I see that ReplicationSlotAlter() will error out if the slot is > > > invalidated due to timeout. I have not tested it myself, but do you > > > know if slot-alter errors out for other invalidation causes as well? > > > Just wanted to confirm that the behaviour is consistent for all > > > invalidation causes. > > > > I was able to test this and as anticipated behavior is different. When > > slot is invalidated due to say 'wal_removed', I am still able to do > > 'alter' of that slot. > > Please see: > > > > Pub: > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | t | f | 2024-09-04 08:58:12.802278+05:30 | > > wal_removed > > > > Sub: > > newdb1=# alter subscription mysubnew1_1 disable; > > ALTER SUBSCRIPTION > > > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > > ALTER SUBSCRIPTION > > > > Pub: (failover altered) > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | f | f | 2024-09-04 08:58:47.824471+05:30 | > > wal_removed > > > > > > while when invalidation_reason is 'inactive_timeout', it fails: > > > > Pub: > > slot_name | failover | synced | inactive_since | > > invalidation_reason > > -------------+----------+--------+----------------------------------+--------------------- > > mysubnew1_1 | t | f | 2024-09-03 14:30:57.532206+05:30 | > > inactive_timeout > > > > Sub: > > newdb1=# alter subscription mysubnew1_1 disable; > > ALTER SUBSCRIPTION > > > > newdb1=# alter subscription mysubnew1_1 set (failover=false); > > ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no > > longer get changes from replication slot "mysubnew1_1" > > DETAIL: The slot became invalid because it was inactive since > > 2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago. > > HINT: You might need to increase "replication_slot_inactive_timeout.". > > > > I think the behavior should be same. > > > > We should not allow the invalid replication slot to be altered > irrespective of the reason unless there is any benefit. > Okay, then I think we need to change the existing behaviour of the other invalidation causes which still allow alter-slot. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > We should not allow the invalid replication slot to be altered > > irrespective of the reason unless there is any benefit. > > Okay, then I think we need to change the existing behaviour of the > other invalidation causes which still allow alter-slot. +1. Perhaps, track it in a separate thread? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > We should not allow the invalid replication slot to be altered > > > irrespective of the reason unless there is any benefit. > > > > Okay, then I think we need to change the existing behaviour of the > > other invalidation causes which still allow alter-slot. > > +1. Perhaps, track it in a separate thread? I think so. It does not come under the scope of this thread. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Sun, Sep 8, 2024 at 5:25 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > > Please find the v45 patch. Addressed above and Shveta's review comments [1]. > Thanks for the patch. Please find my comments: 1) src/sgml/config.sgml: + Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes. It is better we avoid such a statement, as internally we use logical decoding to advance restart-lsn, see 'LogicalSlotAdvanceAndCheckSnapState' called form slotsync.c. <Also see related comment 6 below> 2) src/sgml/config.sgml: + disables the inactive timeout invalidation mechanism + Slot invalidation due to inactivity timeout occurs during checkpoint. Either have 'inactive' at both the places or 'inactivity'. 3) slot.c: +static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, + ReplicationSlot *s, + XLogRecPtr oldestLSN, + Oid dboid, + TransactionId snapshotConflictHorizon, + bool *invalidated); +static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s); I think, we do not need above 2 declarations. The code compile fine without these as the usage is later than the definition. 4) + /* + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. + */ + if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) The comment is generic while the 'if condition' is specific to one invalidation cause. Even though I feel it can be made generic test for all invalidation causes but that is not under scope of this thread and needs more testing/analysis. For the time being, we can make comment specific to the concerned invalidation cause. The header of function will also need the same change. 5) SlotInactiveTimeoutCheckAllowed(): + * Check if inactive timeout invalidation mechanism is disabled or slot is + * currently being used or server is in recovery mode or slot on standby is + * currently being synced from the primary. + * These comments say exact opposite of what we are checking in code. Since the function name has 'Allowed' in it, we should be putting comments which say what allows it instead of what disallows it. 6) + * Synced slots are always considered to be inactive because they don't + * perform logical decoding to produce changes. + */ +static inline bool +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) Perhaps we should avoid mentioning logical decoding here. When slots are synced, they are performing decoding and their inactive_since is changing continuously. A better way to make this statement will be: We want to ensure that the slots being synchronized are not invalidated, as they need to be preserved for future use when the standby server is promoted to the primary. This is necessary for resuming logical replication from the new primary server. <Rephrase if needed> 7) InvalidatePossiblyObsoleteSlot() we are calling SlotInactiveTimeoutCheckAllowed() twice in this function. We shall optimize. At the first usage place, shall we simply get timestamp when cause is RS_INVAL_INACTIVE_TIMEOUT without checking SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a performance critical section. Or if we retain check at first place, then at the second place we can avoid calling it again based on whether 'now' is NULL or not. thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 9, 2024 at 10:28 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > We should not allow the invalid replication slot to be altered > > > > irrespective of the reason unless there is any benefit. > > > > > > Okay, then I think we need to change the existing behaviour of the > > > other invalidation causes which still allow alter-slot. > > > > +1. Perhaps, track it in a separate thread? > > I think so. It does not come under the scope of this thread. > It makes sense to me as well. But let's go ahead and get that sorted out first. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > We should not allow the invalid replication slot to be altered > > > > > irrespective of the reason unless there is any benefit. > > > > > > > > Okay, then I think we need to change the existing behaviour of the > > > > other invalidation causes which still allow alter-slot. > > > > > > +1. Perhaps, track it in a separate thread? > > > > I think so. It does not come under the scope of this thread. > > It makes sense to me as well. But let's go ahead and get that sorted out first. Moved the discussion to new thread - https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3%3D2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ%40mail.gmail.com. Please have a look. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Tue, Sep 10, 2024 at 12:13 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > We should not allow the invalid replication slot to be altered > > > > > > irrespective of the reason unless there is any benefit. > > > > > > > > > > Okay, then I think we need to change the existing behaviour of the > > > > > other invalidation causes which still allow alter-slot. > > > > > > > > +1. Perhaps, track it in a separate thread? > > > > > > I think so. It does not come under the scope of this thread. > > > > It makes sense to me as well. But let's go ahead and get that sorted out first. > > Moved the discussion to new thread - > https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3%3D2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ%40mail.gmail.com. > Please have a look. > That is pushed now. Please send the rebased patch after addressing the pending comments. -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi, Thanks for reviewing. On Mon, Sep 9, 2024 at 10:54 AM shveta malik <shveta.malik@gmail.com> wrote: > > 2) > src/sgml/config.sgml: > > + disables the inactive timeout invalidation mechanism > > + Slot invalidation due to inactivity timeout occurs during checkpoint. > > Either have 'inactive' at both the places or 'inactivity'. Used "inactive timeout". > 3) > slot.c: > +static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause > cause, > + ReplicationSlot *s, > + XLogRecPtr oldestLSN, > + Oid dboid, > + TransactionId snapshotConflictHorizon, > + bool *invalidated); > +static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s); > > I think, we do not need above 2 declarations. The code compile fine > without these as the usage is later than the definition. Hm, it's a usual practice that I follow irrespective of the placement of function declarations. Since it was brought up, I removed the declarations. > 4) > + /* > + * An error is raised if error_if_invalid is true and the slot has been > + * invalidated previously. > + */ > + if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) > > The comment is generic while the 'if condition' is specific to one > invalidation cause. Even though I feel it can be made generic test for > all invalidation causes but that is not under scope of this thread and > needs more testing/analysis. Right. > For the time being, we can make comment > specific to the concerned invalidation cause. The header of function > will also need the same change. Adjusted the comment, but left the variable name error_if_invalid as is. Didn't want to make it long, one can look at the code to understand what it is used for. > 5) > SlotInactiveTimeoutCheckAllowed(): > > + * Check if inactive timeout invalidation mechanism is disabled or slot is > + * currently being used or server is in recovery mode or slot on standby is > + * currently being synced from the primary. > + * > > These comments say exact opposite of what we are checking in code. > Since the function name has 'Allowed' in it, we should be putting > comments which say what allows it instead of what disallows it. Modified. > 1) > src/sgml/config.sgml: > > + Synced slots are always considered to be inactive because they > don't perform logical decoding to produce changes. > > It is better we avoid such a statement, as internally we use logical > decoding to advance restart-lsn, see > 'LogicalSlotAdvanceAndCheckSnapState' called form slotsync.c. > <Also see related comment 6 below> > > 6) > > + * Synced slots are always considered to be inactive because they don't > + * perform logical decoding to produce changes. > + */ > +static inline bool > +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) > > Perhaps we should avoid mentioning logical decoding here. When slots > are synced, they are performing decoding and their inactive_since is > changing continuously. A better way to make this statement will be: > > We want to ensure that the slots being synchronized are not > invalidated, as they need to be preserved for future use when the > standby server is promoted to the primary. This is necessary for > resuming logical replication from the new primary server. > <Rephrase if needed> They are performing logical decoding, but not producing the changes for the clients to consume. So, IMO, the accompanying "to produce changes" next to the "logical decoding" is good here. 
> 7) > > InvalidatePossiblyObsoleteSlot() > > we are calling SlotInactiveTimeoutCheckAllowed() twice in this > function. We shall optimize. > > At the first usage place, shall we simply get timestamp when cause is > RS_INVAL_INACTIVE_TIMEOUT without checking > SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a > performance critical section. Or if we retain check at first place, > then at the second place we can avoid calling it again based on > whether 'now' is NULL or not. Getting a current timestamp can get costlier on platforms that use various clock sources, so assigning 'now' unconditionally isn't the way IMO. Using the inline function in two places improves the readability. Can optimize it if there's any performance impact of calling the inline function in two places. Will post the new patch version soon. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
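For context, the pattern being defended in point 7 looks roughly like this inside InvalidatePossiblyObsoleteSlot(); it is a sketch only, and SlotInactiveTimeoutCheckAllowed() and replication_slot_inactive_timeout are names from the patch under discussion, not committed code:

    /* Fetch the clock only when the inactive-timeout check can actually apply. */
    if (cause == RS_INVAL_INACTIVE_TIMEOUT && SlotInactiveTimeoutCheckAllowed(s))
        now = GetCurrentTimestamp();

    /* ... later, while examining the slot ... */
    if (SlotInactiveTimeoutCheckAllowed(s) &&
        TimestampDifferenceExceeds(s->inactive_since, now,
                                   replication_slot_inactive_timeout * 1000))
        invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;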
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please find the attached v46 patch having changes for the above review > comments and your test review comments and Shveta's review comments. > -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) { ReplicationSlot *s; int active_pid; @@ -615,6 +620,22 @@ retry: /* We made this slot active, so it's ours now. */ MyReplicationSlot = s; + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated due to inactive timeout. + */ + if (error_if_invalid && + s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) + { + Assert(s->inactive_since > 0); + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", + "replication_slot_inactive_timeout"))); + } Why raise the ERROR just for timeout invalidation here and why not if the slot is invalidated for other reasons? This raises the question of what happens before this patch if the invalid slot is used from places where we call ReplicationSlotAcquire(). I did a brief code analysis and found that for StartLogicalReplication(), even if the error won't occur in ReplicationSlotAcquire(), it would have been caught in CreateDecodingContext(). I think that is where we should also add this new error. Similarly, pg_logical_slot_get_changes_guts() and other logical replication functions should be calling CreateDecodingContext() which can raise the new ERROR. I am not sure about how the invalid slots are handled during physical replication, please check the behavior of that before this patch. -- With Regards, Amit Kapila.
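For reference, moving the new error into CreateDecodingContext(), as suggested above, might look something like the sketch below; the message text is illustrative. Since both the walsender's START_REPLICATION ... LOGICAL path and SQL functions such as pg_logical_slot_get_changes() go through CreateDecodingContext(), a check there would cover all logical decoding entry points:

    /* Refuse to decode from a slot that has already been invalidated. */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot read from replication slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name)),
                 errdetail("This slot has been invalidated and cannot be used for logical decoding.")));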
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Bharath Rupireddy
Date:
Hi,
Thanks for looking into this.
On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Why raise the ERROR just for timeout invalidation here and why not if
> the slot is invalidated for other reasons? This raises the question of
> what happens before this patch if the invalid slot is used from places
> where we call ReplicationSlotAcquire(). I did a brief code analysis
> and found that for StartLogicalReplication(), even if the error won't
> occur in ReplicationSlotAcquire(), it would have been caught in
> CreateDecodingContext(). I think that is where we should also add this
> new error. Similarly, pg_logical_slot_get_changes_guts() and other
> logical replication functions should be calling
> CreateDecodingContext() which can raise the new ERROR. I am not sure
> about how the invalid slots are handled during physical replication,
> please check the behavior of that before this patch.
When physical slots are invalidated due to wal_removed reason, the failure happens at a much later point for the streaming standbys while reading the requested WAL files like the following:
2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000005 has already been removed
2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become available at 0/5002000
At this point, despite the slot being invalidated, its wal_status can still come back to 'unreserved' even from 'lost', and the standby can catch up if removed WAL files are copied either manually or by a tool/script to the primary's pg_wal directory. IOW, the physical slots invalidated due to wal_removed are *somehow* recoverable unlike the logical slots.
IIUC, the invalidation of a slot implies that it is not guaranteed to hold any resources like WAL and XMINs. Does it also imply that the slot must be unusable?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Hi, > > > Please find the attached v46 patch having changes for the above review > comments and your test review comments and Shveta's review comments. > Thanks for addressing comments. Is there a reason that we don't support this invalidation on hot standby for non-synced slots? Shouldn't we support this time-based invalidation there too just like other invalidations? thanks Shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > > Thanks for addressing comments. > > Is there a reason that we don't support this invalidation on hot > standby for non-synced slots? Shouldn't we support this time-based > invalidation there too just like other invalidations? > Now since we are not changing inactive_since once it is invalidated, we are not even initializing it during restart; and thus later when someone tries to use the slot, it leads to an assertion failure in ReplicationSlotAcquire(): Assert(s->inactive_since > 0); Steps: --Disable the logical subscriber and let the slot on the publisher get invalidated due to inactive_timeout. --Enable the logical subscriber again. --Restart the publisher. a) We should initialize inactive_since when ReplicationSlotSetInactiveSince() is called from RestoreSlotFromDisk() even though it is invalidated. b) And shall we mention in the doc of 'inactive_since', that once the slot is invalidated, this value will remain unchanged until we shut down the server? On server restart, it is initialized to the start time. Thoughts? thanks Shveta
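A sketch of what (a) above could look like at the end of RestoreSlotFromDisk(); the helper name and its (slot, timestamp, acquire_lock) signature are taken from the patch under discussion and are assumptions here, not committed code:

    /*
     * Stamp inactive_since for every restored slot at startup, including
     * already-invalidated ones, so that the later
     * Assert(s->inactive_since > 0) in ReplicationSlotAcquire() holds; only
     * subsequent updates are suppressed for invalidated slots.
     */
    ReplicationSlotSetInactiveSince(slot, GetCurrentTimestamp(), false);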
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find the attached v46 patch having changes for the above review > > > comments and your test review comments and Shveta's review comments. > > > When the synced slot is marked as 'inactive_timeout' invalidated on the hot standby due to invalidation of the publisher's failover slot, the former starts showing a NULL 'inactive_since'. Is this intentional behaviour? I feel inactive_since should be non-NULL here too. Thoughts? physical standby: postgres=# select slot_name, inactive_since, invalidation_reason, failover, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | failover | synced -------------+----------------------------------+---------------------+----------+-------- sub2 | 2024-09-18 15:20:04.364998+05:30 | | t | t sub3 | 2024-09-18 15:20:04.364953+05:30 | | t | t After sync of invalidation_reason: slot_name | inactive_since | invalidation_reason | failover | synced -------------+----------------------------------+---------------------+----------+-------- sub2 | | inactive_timeout | t | t sub3 | | inactive_timeout | t | t thanks shveta
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Sep 16, 2024 at 10:41 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks for looking into this. > > On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Why raise the ERROR just for timeout invalidation here and why not if > > the slot is invalidated for other reasons? This raises the question of > > what happens before this patch if the invalid slot is used from places > > where we call ReplicationSlotAcquire(). I did a brief code analysis > > and found that for StartLogicalReplication(), even if the error won't > > occur in ReplicationSlotAcquire(), it would have been caught in > > CreateDecodingContext(). I think that is where we should also add this > > new error. Similarly, pg_logical_slot_get_changes_guts() and other > > logical replication functions should be calling > > CreateDecodingContext() which can raise the new ERROR. I am not sure > > about how the invalid slots are handled during physical replication, > > please check the behavior of that before this patch. > > When physical slots are invalidated due to wal_removed reason, the failure happens at a much later point for the streaming standbys while reading the requested WAL files like the following: > > 2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000005 has already been removed > 2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become available at 0/5002000 > > At this point, despite the slot being invalidated, its wal_status can still come back to 'unreserved' even from 'lost', and the standby can catch up if removed WAL files are copied either manually or by a tool/script to the primary's pg_wal directory. IOW, the physical slots invalidated due to wal_removed are *somehow* recoverable unlike the logical slots. > > IIUC, the invalidation of a slot implies that it is not guaranteed to hold any resources like WAL and XMINs. Does it also imply that the slot must be unusable? > If we can't hold the dead rows against xmin of the invalid slot, then how can we make it usable even after copying the required WAL? -- With Regards, Amit Kapila.
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
shveta malik
Date:
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find the attached v46 patch having changes for the above review > > > > comments and your test review comments and Shveta's review comments. > > > > > When we promote hot standby with synced logical slots to become new primary, the logical slots are never invalidated with 'inactive_timeout' on new primary. It seems the check in SlotInactiveTimeoutCheckAllowed() is wrong. We should allow invalidation of slots on primary even if they are marked as 'synced'. Please see [4]. I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. Once all these are addressed, I can continue reviewing further. [1]: https://www.postgresql.org/message-id/CAJpy0uAwxc49Dz6t%3D-y_-z-MU%2BA4RWX4BR3Zri_jj2qgGMq_8g%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAJpy0uC6nN3SLbEuCvz7-CpaPdNdXxH%3DfeW5MhYQch-JWV0tLg%40mail.gmail.com [3]: https://www.postgresql.org/message-id/CAJpy0uBXXJC6f04%2BFU1axKaU%2Bp78wN0SEhUNE9XoqbjXj%3Dhhgw%40mail.gmail.com [4]: -------------------- postgres=# select pg_is_in_recovery(); -------- f postgres=# show replication_slot_inactive_timeout; replication_slot_inactive_timeout ----------------------------------- 10s postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | synced -------------+----------------------------------+---------------------+----------+-------- mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 | | t postgres=# select now(); now ---------------------------------- 2024-09-19 09:06:28.871354+05:30 postgres=# checkpoint; CHECKPOINT postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots; slot_name | inactive_since | invalidation_reason | synced -------------+----------------------------------+---------------------+----------+-------- mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 | | t -------------------- thanks Shveta
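To illustrate the suggested fix, the allow-check might exempt synced slots only while the server is still in recovery, so that after promotion the lingering 'synced' flag no longer blocks invalidation. This is a sketch built from names in the v46 patch (SlotInactiveTimeoutCheckAllowed, replication_slot_inactive_timeout, inactive_since), not committed code:

    static inline bool
    SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
    {
        /*
         * Exempt synced slots only while the server is a standby; once it is
         * promoted, RecoveryInProgress() returns false and the slot becomes
         * subject to inactive-timeout invalidation like any other slot.
         */
        return replication_slot_inactive_timeout > 0 &&
               s->inactive_since > 0 &&
               !(RecoveryInProgress() && s->data.synced);
    }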
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find the attached v46 patch having changes for the above review > > > > comments and your test review comments and Shveta's review comments. > > > > > > When the synced slot is marked as 'inactive_timeout' invalidated on > hot standby due to invalidation of publisher 's failover slot, the > former starts showing NULL' inactive_since'. Is this intentional > behaviour? I feel inactive_since should be non-NULL here too? > Thoughts? > > physical standby: > postgres=# select slot_name, inactive_since, invalidation_reason, > failover, synced from pg_replication_slots; > slot_name | inactive_since | > invalidation_reason | failover | synced > -------------+----------------------------------+---------------------+----------+-------- > sub2 | 2024-09-18 15:20:04.364998+05:30 | | t | t > sub3 | 2024-09-18 15:20:04.364953+05:30 | | t | t > > After sync of invalidation_reason: > > slot_name | inactive_since | invalidation_reason | > failover | synced > -------------+----------------------------------+---------------------+----------+-------- > sub2 | | inactive_timeout | t | t > sub3 | | inactive_timeout | t | t > > For synced slots on the standby, inactive_since indicates the last synchronization time rather than the time the slot became inactive (see doc - https://www.postgresql.org/docs/devel/view-pg-replication-slots.html). In the reported case above, once a synced slot is invalidated we don't even keep the last synchronization time for it. This is because when a synced slot on the standby is marked invalid, inactive_since is reset to NULL each time the slot-sync worker acquires a lock on it. This lock acquisition before checking invalidation is done to avoid certain race conditions and will activate the slot temporarily, resetting inactive_since. Later, the slot-sync worker updates inactive_since for all synced slots to the current synchronization time. However, for invalid slots, this update is skipped, as per the patch’s design. If we want to preserve the inactive_since value for the invalid synced slots on standby, we need to clarify the time it should display. Here are three possible approaches: 1) Copy the primary's inactive_since upon invalidation: When a slot becomes invalid on the primary, the slot-sync worker could copy the primary slot’s inactive_since to the standby slot and retain it, by preventing future updates on the standby. 2) Use the current time of standby when the synced slot is marked invalid for the first time and do not update it in subsequent sync cycles if the slot is invalid. Approach (2) seems more reasonable to me, however, Both 1) & 2) approaches contradicts the purpose of inactive_since, as it no longer represents either the true "last sync time" or the "time slot became inactive" because the slot-sync worker acquires locks periodically for syncing, and keeps activating the slot. 3) Continuously update inactive_since for invalid synced slots as well: Treat invalid synced slots like valid ones by updating inactive_since with each sync cycle. This way, we can keep the "last sync time" in the inactive_since. However, this could confuse users when "invalidation_reason=inactive_timeout" is set for a synced slot on standby but inactive_since would reflect sync time rather than the time slot became inactive. 
IIUC, on the primary, when invalidation_reason=inactive_timeout for a slot, the inactive_since represents the actual time the slot became inactive before getting invalidated, unless the primary is restarted. Thoughts? -- Thanks, Nisha
On Thu, 7 Nov 2024 at 15:33, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > Hi, > > I’ve reviewed this thread and am interested in working on the > remaining tasks and comments, as well as the future review comments. > However, Bharath, please let me know if you'd prefer to continue with > it. > > Attached the rebased v47 patch, which also addresses Peter’s comments > #2, #3, and #4 at [1]. I will try addressing other comments as well in > next versions. The following crash occurs while upgrading: 2024-11-13 14:19:45.955 IST [44539] LOG: checkpoint starting: time TRAP: failed Assert("!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)"), File: "slot.c", Line: 1793, PID: 44539 postgres: checkpointer (ExceptionalCondition+0xbb)[0x555555e305bd] postgres: checkpointer (+0x63ab04)[0x555555b8eb04] postgres: checkpointer (InvalidateObsoleteReplicationSlots+0x149)[0x555555b8ee5f] postgres: checkpointer (CheckPointReplicationSlots+0x267)[0x555555b8f125] postgres: checkpointer (+0x1f3ee8)[0x555555747ee8] postgres: checkpointer (CreateCheckPoint+0x78f)[0x5555557475ee] postgres: checkpointer (CheckpointerMain+0x632)[0x555555b2f1e7] postgres: checkpointer (postmaster_child_launch+0x119)[0x555555b30892] postgres: checkpointer (+0x5e2dc8)[0x555555b36dc8] postgres: checkpointer (PostmasterMain+0x14bd)[0x555555b33647] postgres: checkpointer (+0x487f2e)[0x5555559dbf2e] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ffff6c29d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ffff6c29e40] postgres: checkpointer (_start+0x25)[0x555555634c25] 2024-11-13 14:19:45.967 IST [44538] LOG: checkpointer process (PID 44539) was terminated by signal 6: Aborted This can happen in the following case: 1) Setup a logical replication cluster with enough data so that it will take at least few minutes to upgrade 2) Stop the publisher node 3) Configure replication_slot_inactive_timeout and checkpoint_timeout to 30 seconds 4) Upgrade the publisher node. This is happening because logical replication slots are getting invalidated during upgrade and there is an assertion which checks that the slots are not invalidated. I feel this can be fixed by having a function similar to check_max_slot_wal_keep_size which will make sure that replication_slot_inactive_timeout is 0 during upgrade. Regards, Vignesh
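The guard proposed above could be a GUC check hook along the lines of the existing check_max_slot_wal_keep_size(); the sketch below is illustrative, and the hook name and error text are assumptions rather than code from the patch:

    bool
    check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
    {
        /*
         * pg_upgrade runs with IsBinaryUpgrade set; invalidating slots there
         * would break the upgrade, so accept only 0 (feature disabled).
         */
        if (IsBinaryUpgrade && *newval != 0)
        {
            GUC_check_errdetail("\"%s\" must be set to 0 during binary upgrade mode.",
                                "replication_slot_inactive_timeout");
            return false;
        }

        return true;
    }

Such a hook would then be wired up as the check_hook of the replication_slot_inactive_timeout entry in guc_tables.c.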
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Sep 18, 2024 at 12:22 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Hi, > > > > > > Please find the attached v46 patch having changes for the above review > > comments and your test review comments and Shveta's review comments. > > > > Thanks for addressing comments. > > Is there a reason that we don't support this invalidation on hot > standby for non-synced slots? Shouldn't we support this time-based > invalidation there too just like other invalidations? > I don’t see any reason to *not* support this invalidation on hot standby for non-synced slots. Therefore, I’ve added the same in v48. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Thanks for the recent patch updates. Here are my review comments for the latest patch v48-0001. ====== Commit message 1. Till now, postgres has the ability to invalidate inactive replication slots based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will be needed for the slots in case they become active. However, choosing a default value for this GUC is a bit tricky. Because the amount of WAL a database generates, and the allocated storage for instance will vary greatly in production, making it difficult to pin down a one-size-fits-all value. ~ What do the words "for instance" mean here? Did it mean "per instance" or "(for example)" or something else? ====== doc/src/sgml/system-views.sgml 2. <para> The time since the slot has become inactive. - <literal>NULL</literal> if the slot is currently being used. - Note that for slots on the standby that are being synced from a + <literal>NULL</literal> if the slot is currently being used. Once the + slot is invalidated, this value will remain unchanged until we shutdown + the server. Note that for slots on the standby that are being synced from a primary server (whose <structfield>synced</structfield> field is <literal>true</literal>), the Is this change related to the new inactivity timeout feature or are you just clarifying the existing behaviour of the 'active_since' field. Note there is already another thread [1] created to patch/clarify this same field. So if you are just clarifying existing behavior then IMO it would be better if you can to try and get your desired changes included there quickly before that other patch gets pushed. ~~~ 3. + <para> + <literal>inactive_timeout</literal> means that the slot has been + inactive for longer than the amount of time specified by the + <xref linkend="guc-replication-slot-inactive-timeout"/> parameter. + </para> Maybe there is a slightly shorter/simpler way to express this. For example, BEFORE inactive_timeout means that the slot has been inactive for longer than the amount of time specified by the replication_slot_inactive_timeout parameter. SUGGESTION inactive_timeout means that the slot has remained inactive beyond the duration specified by the replication_slot_inactive_timeout parameter. ====== src/backend/replication/slot.c 4. +int replication_slot_inactive_timeout = 0; IMO it would be more informative to give the units in the variable name (but not in the GUC name). e.g. 'replication_slot_inactive_timeout_secs'. ~~~ ReplicationSlotAcquire: 5. + * + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) This function comment makes it seem like "invalidated previously" might mean *any* kind of invalidation, but later in the body of the function we find the logic is really only used for inactive timeout. + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated due to inactive timeout. + */ So, I think a better name for that parameter might be 'error_if_inactive_timeout' OTOH, if it really is supposed to erro for *any* kind of invalidation then there needs to be more ereports. ~~~ 6. + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", This errdetail message seems quite long. 
I think it can be shortened like below and still retain exactly the same meaning: BEFORE: This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\". SUGGESTION: This slot has been invalidated due to inactivity exceeding the time limit set by "%s". ~~~ ReportSlotInvalidation: 7. + case RS_INVAL_INACTIVE_TIMEOUT: + Assert(inactive_since > 0); + appendStringInfo(&err_detail, + _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."), + timestamptz_to_str(inactive_since), + "replication_slot_inactive_timeout"); + break; Here also as in the above review comment #6 I think the message can be shorter and still say the same thing BEFORE: _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."), SUGGESTION: _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."), ~~~ SlotInactiveTimeoutCheckAllowed: 8. +/* + * Is this replication slot allowed for inactive timeout invalidation check? + * + * Inactive timeout invalidation is allowed only when: + * + * 1. Inactive timeout is set + * 2. Slot is inactive + * 3. Server is in recovery and slot is not being synced from the primary + * + * Note that the inactive timeout invalidation mechanism is not + * applicable for slots on the standby server that are being synced + * from the primary server (i.e., standby slots having 'synced' field 'true'). + * Synced slots are always considered to be inactive because they don't + * perform logical decoding to produce changes. + */ 8a. Somehow that first sentence seems strange. Would it be better to write it like: SUGGESTION Can this replication slot timeout due to inactivity? ~ 8b. AFAICT that reason 3 ("Server is in recovery and slot is not being synced from the primary") seems not quite worded right... Should it say more like: The slot is not being synced from the primary while the server is in recovery or maybe like: The slot is not currently being synced from the primary (e.g. not 'synced' is true when server is in recovery) ~ 8c. Similarly, I think something about that "Note that the inactive timeout invalidation mechanism is not applicable..." paragraph needs tweaking because IMO that should also now be saying something about 'RecoveryInProgress'. ~~~ 9. +static inline bool +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) Maybe the function name should be 'IsSlotInactiveTimeoutPossible' or something better. ~~~ InvalidatePossiblyObsoleteSlot: 10. break; + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ Since there are no other blank lines anywhere in this switch, the introduction of this one in v48 looks out of place to me. IMO it would be more readable if a blank line followed each/every of the breaks, but then that is not a necessary change for this patch so... ~~~ 11. + /* + * Invalidation due to inactive timeout implies that + * no one is using the slot. + */ + Assert(s->active_pid == 0); Given this assertion, does it mean that "(s->active_pid == 0)" should have been another condition done up-front in the function 'SlotInactiveTimeoutCheckAllowed'? ~~~ 12. /* - * If the slot can be acquired, do so and mark it invalidated - * immediately. Otherwise we'll signal the owning process, below, and - * retry. + * If the slot can be acquired, do so and mark it as invalidated. If + * the slot is already ours, mark it as invalidated. 
Otherwise, we'll + * signal the owning process below and retry. */ - if (active_pid == 0) + if (active_pid == 0 || + (MyReplicationSlot == s && + active_pid == MyProcPid)) I wasn't sure how this change belongs to this patch, because the logic of the previous review comment said for the case of invalidation due to inactivity that active_id must be 0. e.g. Assert(s->active_pid == 0); ~~~ RestoreSlotFromDisk: 13. - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; In v47 this assignment used to call the function 'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle difference between direct assignment and the function, because the function will skip assignment if the slot is already invalidated. Anyway, if you are *deliberately* not wanting to call ReplicationSlotSetInactiveSince here then I think this assignment should be commented to explain the reason why not, otherwise someone in the future might be tempted to think it was just an oversight and add the call back in that you don't want. ====== src/test/recovery/t/050_invalidate_slots.pl 14. +# Despite inactive timeout being set, the synced slot won't get invalidated on +# its own on the standby. So, we must not see invalidation message in server +# log. +$standby1->safe_psql('postgres', "CHECKPOINT"); +is( $standby1->safe_psql( + 'postgres', + q{SELECT count(*) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' + AND invalidation_reason IS NULL;} + ), + "t", + 'check that synced slot sync_slot1 has not been invalidated on standby'); + But, now, we are confirming this by another way -- not checking the logs here, so the comment "So, we must not see invalidation message in server log." is no longer appropriate here. ====== [1] https://www.postgresql.org/message-id/flat/CAA4eK1JQFdssaBBh-oQskpKM-UpG8jPyUdtmGWa_0qCDy%2BK7_A%40mail.gmail.com#ab98379f220288ed40d34f8c2a21cf96 Kind Regards, Peter Smith. Fujitsu Australia
On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Please find the v48 patch attached. > > On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > When we promote hot standby with synced logical slots to become new > > primary, the logical slots are never invalidated with > > 'inactive_timeout' on new primary. It seems the check in > > SlotInactiveTimeoutCheckAllowed() is wrong. We should allow > > invalidation of slots on primary even if they are marked as 'synced'. > > fixed. > > > I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. > > Once all these are addressed, I can continue reviewing further. > > > > Fixed issues reported in [1], [2]. Few comments: 1) Since we don't change the value of now in ReplicationSlotSetInactiveSince, the function parameter can be passed by value: +/* + * Set slot's inactive_since property unless it was previously invalidated. + */ +static inline void +ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now, + bool acquire_lock) +{ + if (s->data.invalidated != RS_INVAL_NONE) + return; + + if (acquire_lock) + SpinLockAcquire(&s->mutex); + + s->inactive_since = *now; 2) Currently it allows a minimum value of less than 1 second like in milliseconds, I feel we can have some minimum value at least something like checkpoint_timeout: diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 8a67f01200..367f510118 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the amount of time a replication slot can remain inactive before " + "it will be invalidated."), + NULL, + GUC_UNIT_S + }, + &replication_slot_inactive_timeout, + 0, 0, INT_MAX, + NULL, NULL, NULL + }, 3) Since SlotInactiveTimeoutCheckAllowed check is just done above and the current time has been retrieved can we used "now" variable instead of SlotInactiveTimeoutCheckAllowed again second time: @@ -1651,6 +1713,26 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, if (SlotIsLogical(s)) invalidation_cause = cause; break; + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (SlotInactiveTimeoutCheckAllowed(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout * 1000)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; 4) I'm not sure if this change required by this patch or is it a general optimization, if it is required for this patch we can detail the comments: @@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name) bool restored = false; int readBytes; pg_crc32c checksum; + TimestampTz now; /* no need to lock here, no concurrent access allowed yet */ @@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name) NameStr(cp.slotdata.name)), errhint("Change \"wal_level\" to be \"replica\" or higher."))); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + /* nothing can be active yet, don't lock anything */ for (i = 0; i < max_replication_slots; i++) { @@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name) * slot from the disk into memory. Whoever acquires the slot i.e. * makes the slot active will reset it. 
*/ - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; 5) Why should the slot invalidation be updated during shutdown, shouldn't the inactive_since value be intact during shutdown? - <literal>NULL</literal> if the slot is currently being used. - Note that for slots on the standby that are being synced from a + <literal>NULL</literal> if the slot is currently being used. Once the + slot is invalidated, this value will remain unchanged until we shutdown + the server. Note that for slots on the standby that are being synced from a 6) New Style of ereport does not need braces around errcode, it can be changed similarly: + if (error_if_invalid && + s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) + { + Assert(s->inactive_since > 0); + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", + "replication_slot_inactive_timeout"))); Regards, Vignesh
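On point 6 above, the brace-less (new-style) ereport would look roughly like the following -- same message text, just without the extra parentheses around the auxiliary calls:

ereport(ERROR,
        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
        errmsg("can no longer get changes from replication slot \"%s\"",
               NameStr(s->data.name)),
        errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
                  "replication_slot_inactive_timeout"));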
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. > > Thanks for the recent patch updates. Here are my review comments for > the latest patch v48-0001. > Thank you for the review. Comments are addressed in v49 version. Below is my response to comments that may require further discussion. > ====== > doc/src/sgml/system-views.sgml > > 2. > <para> > The time since the slot has become inactive. > - <literal>NULL</literal> if the slot is currently being used. > - Note that for slots on the standby that are being synced from a > + <literal>NULL</literal> if the slot is currently being used. Once the > + slot is invalidated, this value will remain unchanged until we shutdown > + the server. Note that for slots on the standby that are being > synced from a > primary server (whose <structfield>synced</structfield> field is > <literal>true</literal>), the > > Is this change related to the new inactivity timeout feature or are > you just clarifying the existing behaviour of the 'active_since' > field. > Yes, this patch introduces inactive_timeout invalidation and prevents updates to inactive_since for invalid slots. Only a node restart can modify it, so, I believe we should retain these lines in this patch. > Note there is already another thread [1] created to patch/clarify this > same field. So if you are just clarifying existing behavior then IMO > it would be better if you can to try and get your desired changes > included there quickly before that other patch gets pushed. > Thanks for the reference, I have posted my suggestion on the thread. > > ReplicationSlotAcquire: > > 5. > + * > + * An error is raised if error_if_invalid is true and the slot has been > + * invalidated previously. > */ > void > -ReplicationSlotAcquire(const char *name, bool nowait) > +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) > > This function comment makes it seem like "invalidated previously" > might mean *any* kind of invalidation, but later in the body of the > function we find the logic is really only used for inactive timeout. > > + /* > + * An error is raised if error_if_invalid is true and the slot has been > + * previously invalidated due to inactive timeout. > + */ > > So, I think a better name for that parameter might be > 'error_if_inactive_timeout' > > OTOH, if it really is supposed to erro for *any* kind of invalidation > then there needs to be more ereports. > +1 to the idea. I have created a separate patch v49-0001 adding more ereports for all kinds of invalidations. > ~~~ > SlotInactiveTimeoutCheckAllowed: > > 8. > +/* > + * Is this replication slot allowed for inactive timeout invalidation check? > + * > + * Inactive timeout invalidation is allowed only when: > + * > + * 1. Inactive timeout is set > + * 2. Slot is inactive > + * 3. Server is in recovery and slot is not being synced from the primary > + * > + * Note that the inactive timeout invalidation mechanism is not > + * applicable for slots on the standby server that are being synced > + * from the primary server (i.e., standby slots having 'synced' field 'true'). > + * Synced slots are always considered to be inactive because they don't > + * perform logical decoding to produce changes. > + */ > > 8a. > Somehow that first sentence seems strange. Would it be better to write it like: > > SUGGESTION > Can this replication slot timeout due to inactivity? 
> I feel the suggestion is not very clear on the purpose of the function, This function doesn't check inactivity or decide slot timeout invalidation. It only pre-checks if the slot qualifies for an inactivity check, which the caller will perform. As I have changed function name too as per commnet#9, I used the following - "Is inactive timeout invalidation possible for this replication slot?" Thoughts? > ~ > 8c. > Similarly, I think something about that "Note that the inactive > timeout invalidation mechanism is not applicable..." paragraph needs > tweaking because IMO that should also now be saying something about > 'RecoveryInProgress'. > 'RecoveryInProgress' check indicates that the server is a standby, and the mentioned paragraph uses the term "standby" to describe the condition. It seems unnecessary to mention RecoveryInProgress separately. > ~~~ > > InvalidatePossiblyObsoleteSlot: > > 10. > break; > + case RS_INVAL_INACTIVE_TIMEOUT: > + > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. > + */ > > Since there are no other blank lines anywhere in this switch, the > introduction of this one in v48 looks out of place to me. pgindent automatically added this blank line after 'case RS_INVAL_INACTIVE_TIMEOUT'. > IMO it would > be more readable if a blank line followed each/every of the breaks, > but then that is not a necessary change for this patch so... > Since it's not directly related to the patch, I feel it might be best to leave it as is for now. > ~~~ > > 11. > + /* > + * Invalidation due to inactive timeout implies that > + * no one is using the slot. > + */ > + Assert(s->active_pid == 0); > > Given this assertion, does it mean that "(s->active_pid == 0)" should > have been another condition done up-front in the function > 'SlotInactiveTimeoutCheckAllowed'? > I don't think it's a good idea to check (s->active_pid == 0) upfront, before the timeout-invalidation check. AFAIU, this assertion is meant to ensure active_pid = 0 only if the slot is going to be invalidated, i.e., when the following condition is true: TimestampDifferenceExceeds(s->inactive_since, now, replication_slot_inactive_timeout_sec * 1000) Thoughts? Open to others' opinions too. > ~~~ > > 12. > /* > - * If the slot can be acquired, do so and mark it invalidated > - * immediately. Otherwise we'll signal the owning process, below, and > - * retry. > + * If the slot can be acquired, do so and mark it as invalidated. If > + * the slot is already ours, mark it as invalidated. Otherwise, we'll > + * signal the owning process below and retry. > */ > - if (active_pid == 0) > + if (active_pid == 0 || > + (MyReplicationSlot == s && > + active_pid == MyProcPid)) > > I wasn't sure how this change belongs to this patch, because the logic > of the previous review comment said for the case of invalidation due > to inactivity that active_id must be 0. e.g. Assert(s->active_pid == > 0); > I don't fully understand the purpose of this change yet. I'll look into it further and get back. > ~~~ > > RestoreSlotFromDisk: > > 13. > - slot->inactive_since = GetCurrentTimestamp(); > + slot->inactive_since = now; > > In v47 this assignment used to call the function > 'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle > difference between direct assignment and the function, because the > function will skip assignment if the slot is already invalidated. 
> Anyway, if you are *deliberately* not wanting to call > ReplicationSlotSetInactiveSince here then I think this assignment > should be commented to explain the reason why not, otherwise someone > in the future might be tempted to think it was just an oversight and > add the call back in that you don't want. > Added comment saying avoid using ReplicationSlotSetInactiveSince() here as it will skip the invalid slots. ~~~~ -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > Please find the v48 patch attached. > > > > On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > When we promote hot standby with synced logical slots to become new > > > primary, the logical slots are never invalidated with > > > 'inactive_timeout' on new primary. It seems the check in > > > SlotInactiveTimeoutCheckAllowed() is wrong. We should allow > > > invalidation of slots on primary even if they are marked as 'synced'. > > > > fixed. > > > > > I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3]. > > > Once all these are addressed, I can continue reviewing further. > > > > > > > Fixed issues reported in [1], [2]. > > Few comments: Thanks for the review. > > 2) Currently it allows a minimum value of less than 1 second like in > milliseconds, I feel we can have some minimum value at least something > like checkpoint_timeout: > diff --git a/src/backend/utils/misc/guc_tables.c > b/src/backend/utils/misc/guc_tables.c > index 8a67f01200..367f510118 100644 > --- a/src/backend/utils/misc/guc_tables.c > +++ b/src/backend/utils/misc/guc_tables.c > @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = > NULL, NULL, NULL > }, > > + { > + {"replication_slot_inactive_timeout", PGC_SIGHUP, > REPLICATION_SENDING, > + gettext_noop("Sets the amount of time a > replication slot can remain inactive before " > + "it will be invalidated."), > + NULL, > + GUC_UNIT_S > + }, > + &replication_slot_inactive_timeout, > + 0, 0, INT_MAX, > + NULL, NULL, NULL > + }, > Currently, the feature is disabled by default when replication_slot_inactive_timeout = 0. However, if we set a minimum value, the default_val cannot be less than min_val, making it impossible to use 0 to disable the feature. Thoughts or any suggestions? > > 4) I'm not sure if this change required by this patch or is it a > general optimization, if it is required for this patch we can detail > the comments: > @@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name) > bool restored = false; > int readBytes; > pg_crc32c checksum; > + TimestampTz now; > > /* no need to lock here, no concurrent access allowed yet */ > > @@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name) > NameStr(cp.slotdata.name)), > errhint("Change \"wal_level\" to be > \"replica\" or higher."))); > > + /* Use same inactive_since time for all slots */ > + now = GetCurrentTimestamp(); > + > /* nothing can be active yet, don't lock anything */ > for (i = 0; i < max_replication_slots; i++) > { > @@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name) > * slot from the disk into memory. Whoever acquires > the slot i.e. > * makes the slot active will reset it. > */ > - slot->inactive_since = GetCurrentTimestamp(); > + slot->inactive_since = now; > After removing the "ReplicationSlotSetInactiveSince" from here, it became irrelevant to this patch. Now, it is a general optimization to set the same timestamp for all slots while restoring from disk. I have added a few comments as per Peter's suggestion. > 5) Why should the slot invalidation be updated during shutdown, > shouldn't the inactive_since value be intact during shutdown? > - <literal>NULL</literal> if the slot is currently being used. > - Note that for slots on the standby that are being synced from a > + <literal>NULL</literal> if the slot is currently being used. 
Once the > + slot is invalidated, this value will remain unchanged until we shutdown > + the server. Note that for slots on the standby that are being > synced from a > The "inactive_since" data of a slot is not stored on disk, so the older value cannot be restored after a restart. -- Thanks, Nisha
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Attached is the v49 patch set: > - Fixed the bug reported in [1]. > - Addressed comments in [2] and [3]. > > I've split the patch into two, implementing the suggested idea in > comment #5 of [2] separately in 001: > > Patch-001: Adds additional error reports (for all invalidation types) > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > true. > Patch-002: The original patch with comments addressed. Few comments: 1) I felt this check in wait_for_slot_invalidation is not required as there is a call to trigger_slot_invalidation which sleeps for inactive_timeout seconds and ensures checkpoint is triggered, also the test passes without this: + # Wait for slot to become inactive + $node->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = '$slot' AND active = 'f' AND + inactive_since IS NOT NULL; + ]) + or die + "Timed out while waiting for slot $slot to become inactive on node $node_name"; 2) Instead of calling this in a loop, won't it be enough to call checkpoint only once explicitly: + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++) + { + $node->safe_psql('postgres', "CHECKPOINT"); + if ($node->log_contains( + "invalidating obsolete replication slot \"$slot\"", $offset)) + { + $invalidated = 1; + last; + } + usleep(100_000); + } + ok($invalidated, + "check that slot $slot invalidation has been logged on node $node_name" + ); 3) Since pg_sync_replication_slots is a sync call, we can directly use "is( $standby1->safe_psql('postgres', SELECT COUNT(slot_name) = 1 FROM pg_replication_slots..." instead of poll_query_until: +$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();"); +$standby1->poll_query_until( + 'postgres', qq[ + SELECT COUNT(slot_name) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' AND + invalidation_reason = 'inactive_timeout'; +]) + or die + "Timed out while waiting for sync_slot1 invalidation to be synced on standby"; 4) Since this variable is being referred to at many places, how about changing it to inactive_timeout_1s so that it is easier while reviewing across many places: # Set timeout GUC on the standby to verify that the next checkpoint will not # invalidate synced slots. my $inactive_timeout = 1; 5) Since we have already tested invalidation of logical replication slot 'sync_slot1' above, this test might not be required: +# ============================================================================= +# Testcase start +# Invalidate logical subscriber slot due to inactive timeout. + +my $publisher = $primary; + +# Prepare for test +$publisher->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET replication_slot_inactive_timeout TO '0'; +]); +$publisher->reload; Regards, Vignesh
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Attached is the v49 patch set: > - Fixed the bug reported in [1]. > - Addressed comments in [2] and [3]. > > I've split the patch into two, implementing the suggested idea in > comment #5 of [2] separately in 001: > > Patch-001: Adds additional error reports (for all invalidation types) > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > true. > Patch-002: The original patch with comments addressed. This Assert can fail: + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (now && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; + + /* + * Invalidation due to inactive timeout implies that + * no one is using the slot. + */ + Assert(s->active_pid == 0); With the following scenario: Set replication_slot_inactive_timeout to 10 seconds -- Create a slot postgres=# select pg_create_logical_replication_slot ('test', 'pgoutput', true, true); pg_create_logical_replication_slot ------------------------------------ (test,0/1748068) (1 row) -- Wait for 10 seconds and execute checkpoint postgres=# checkpoint; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly The assert fails: #5 0x00005b074f0c922f in ExceptionalCondition (conditionName=0x5b074f2f0b4c "s->active_pid == 0", fileName=0x5b074f2f0010 "slot.c", lineNumber=1762) at assert.c:66 #6 0x00005b074ee26ead in InvalidatePossiblyObsoleteSlot (cause=RS_INVAL_INACTIVE_TIMEOUT, s=0x740925361780, oldestLSN=0, dboid=0, snapshotConflictHorizon=0, invalidated=0x7fffaee87e63) at slot.c:1762 #7 0x00005b074ee273b2 in InvalidateObsoleteReplicationSlots (cause=RS_INVAL_INACTIVE_TIMEOUT, oldestSegno=0, dboid=0, snapshotConflictHorizon=0) at slot.c:1952 #8 0x00005b074ee27678 in CheckPointReplicationSlots (is_shutdown=false) at slot.c:2061 #9 0x00005b074e9dfda7 in CheckPointGuts (checkPointRedo=24412528, flags=108) at xlog.c:7513 #10 0x00005b074e9df4ad in CreateCheckPoint (flags=108) at xlog.c:7179 #11 0x00005b074edc6bfc in CheckpointerMain (startup_data=0x0, startup_data_len=0) at checkpointer.c:463 Regards, Vignesh
On Tue, 19 Nov 2024 at 12:51, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > Please find the v48 patch attached. > > > > > 2) Currently it allows a minimum value of less than 1 second like in > > milliseconds, I feel we can have some minimum value at least something > > like checkpoint_timeout: > > diff --git a/src/backend/utils/misc/guc_tables.c > > b/src/backend/utils/misc/guc_tables.c > > index 8a67f01200..367f510118 100644 > > --- a/src/backend/utils/misc/guc_tables.c > > +++ b/src/backend/utils/misc/guc_tables.c > > @@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] = > > NULL, NULL, NULL > > }, > > > > + { > > + {"replication_slot_inactive_timeout", PGC_SIGHUP, > > REPLICATION_SENDING, > > + gettext_noop("Sets the amount of time a > > replication slot can remain inactive before " > > + "it will be invalidated."), > > + NULL, > > + GUC_UNIT_S > > + }, > > + &replication_slot_inactive_timeout, > > + 0, 0, INT_MAX, > > + NULL, NULL, NULL > > + }, > > > > Currently, the feature is disabled by default when > replication_slot_inactive_timeout = 0. However, if we set a minimum > value, the default_val cannot be less than min_val, making it > impossible to use 0 to disable the feature. > Thoughts or any suggestions? We could implement this similarly to how the vacuum_buffer_usage_limit GUC is handled. Setting the value to 0 would allow the operation to use any amount of shared_buffers. Otherwise, valid sizes would range from 128 kB to 16 GB. Similarly, we can modify check_replication_slot_inactive_timeout to behave in the same way as check_vacuum_buffer_usage_limit function. Regards, Vignesh
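For illustration, a check hook following the check_vacuum_buffer_usage_limit() pattern might look like the sketch below. It is untested, and the 30-second minimum is only a placeholder -- no particular minimum has been agreed on in this thread; such a check would also have to coexist with the binary-upgrade restriction discussed earlier:

#include "postgres.h"
#include "utils/guc.h"

/* Illustrative minimum; the actual value would need discussion. */
#define REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC 30

bool
check_replication_slot_inactive_timeout(int *newval, void **extra,
                                        GucSource source)
{
    /* 0 keeps timeout-based invalidation disabled */
    if (*newval == 0 ||
        *newval >= REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC)
        return true;

    GUC_check_errdetail("\"%s\" must be 0 or at least %d seconds.",
                        "replication_slot_inactive_timeout",
                        REPLICATION_SLOT_INACTIVE_TIMEOUT_MIN_SEC);
    return false;
}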
On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > Attached is the v49 patch set: > > > - Fixed the bug reported in [1]. > > > - Addressed comments in [2] and [3]. > > > > > > I've split the patch into two, implementing the suggested idea in > > > comment #5 of [2] separately in 001: > > > > > > Patch-001: Adds additional error reports (for all invalidation types) > > > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > > > true. > > > Patch-002: The original patch with comments addressed. > > > > This Assert can fail: > > > > Attached v50 patch-set addressing review comments in [1] and [2]. We are setting inactive_since when the replication slot is released. We are marking the slot as inactive only if it has been released. However, there's a scenario where the network connection between the publisher and subscriber may be lost where the replication slot is not released, but no changes are replicated due to the network problem. In this case, no updates would occur in the replication slot for a period exceeding the replication_slot_inactive_timeout. Should we invalidate these replication slots as well, or is it intentionally left out? Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are my review comments for the patch v50-0001. ====== Commit message 1. In ReplicationSlotAcquire(), raise an error for invalid slots if caller specify error_if_invalid=true. /caller/the caller/ /specify/specifies/ ====== src/backend/replication/slot.c ReplicationSlotAcquire: 2. + * + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. */ void -ReplicationSlotAcquire(const char *name, bool nowait) +ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid) The "has been invalidated previously." sounds a bit tricky. Do you just mean: "An error is raised if error_if_invalid is true and the slot is found to be invalid." ~ 3. + /* + * An error is raised if error_if_invalid is true and the slot has been + * previously invalidated. + */ (ditto previous comment) ~ 4. + appendStringInfo(&err_detail, _("This slot has been invalidated because ")); + + switch (s->data.invalidated) + { + case RS_INVAL_WAL_REMOVED: + appendStringInfo(&err_detail, _("the required WAL has been removed.")); + break; + + case RS_INVAL_HORIZON: + appendStringInfo(&err_detail, _("the required rows have been removed.")); + break; + + case RS_INVAL_WAL_LEVEL: + appendStringInfo(&err_detail, _("wal_level is insufficient for slot.")); + break; 4a. I suspect that building the errdetail in 2 parts like this will be troublesome for the translators of some languages. Probably it is safer to have the entire errdetail for each case. ~ 4b. By convention, I think the GUC "wal_level" should be double-quoted in the message. ====== Kind Regards, Peter Smith. Fujitsu Australia
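Regarding 4a/4b, one way to keep every string a complete, independently translatable sentence is to carry the full errdetail in each switch arm, roughly as sketched below (the wording is only indicative, not a proposal for the final text):

switch (s->data.invalidated)
{
    case RS_INVAL_WAL_REMOVED:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because the required WAL has been removed."));
        break;

    case RS_INVAL_HORIZON:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because the required rows have been removed."));
        break;

    case RS_INVAL_WAL_LEVEL:
        appendStringInfoString(&err_detail,
                               _("This slot has been invalidated because \"wal_level\" is insufficient for the slot."));
        break;

    case RS_INVAL_INACTIVE_TIMEOUT:
        appendStringInfo(&err_detail,
                         _("This slot has been invalidated because it was inactive for longer than \"%s\"."),
                         "replication_slot_inactive_timeout");
        break;

    case RS_INVAL_NONE:
        break;              /* slot is valid; no detail to build */
}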
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are some review comments for the patch v50-0002. ====== src/backend/replication/slot.c InvalidatePossiblyObsoleteSlot: 1. + if (now && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed: + if (SlotInactiveTimeoutCheckAllowed(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout * 1000)) Is it OK to skip that call? e.g. can the slot fields possibly change between assigning the 'now' and acquiring the mutex? If not, then the current code is fine. The only reason for asking is because it is slightly suspicious that it was not done this "easy" way in the first place. ~~~ check_replication_slot_inactive_timeout: 2. +/* + * GUC check_hook for replication_slot_inactive_timeout + * + * We don't allow the value of replication_slot_inactive_timeout other than 0 + * during the binary upgrade. + */ The "We don't allow..." sentence seems like a backward way of saying: The value of replication_slot_inactive_timeout must be set to 0 during the binary upgrade. ====== src/test/recovery/t/050_invalidate_slots.pl 3. +# Despite inactive timeout being set, the synced slot won't get invalidated on +# its own on the standby. What does "on its own" mean here? Do you mean it won't get invalidated unless the invalidation state is propagated from the primary? Maybe the comment can be clearer. ~ 4. +# Wait for slot to first become inactive and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot, $offset, $inactive_timeout_1s) = @_; + my $node_name = $node->name; + It was OK to change the variable name to 'inactive_timeout_1s' outside of here, but within the subroutine, I don't think it is appropriate because this is a parameter that potentially could have any value. ~ 5. +# Trigger slot invalidation and confirm it in the server log +sub trigger_slot_invalidation +{ + my ($node, $slot, $offset, $inactive_timeout_1s) = @_; + my $node_name = $node->name; + my $invalidated = 0; It was OK to change the variable name to 'inactive_timeout_1s' outside of here, but within the subroutine, I don't think it is appropriate because this is a parameter that potentially could have any value. ~ 6. + # Give enough time to avoid multiple checkpoints + sleep($inactive_timeout_1s + 1); + + # Run a checkpoint + $node->safe_psql('postgres', "CHECKPOINT"); Since you are not doing multiple checkpoints anymore, it looks like that "Give enough time..." comment needs updating. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, here are my review comments for the patch v51-0001. ====== src/backend/replication/slot.c ReplicationSlotAcquire: 1. + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail_internal("%s", err_detail.data)); + + pfree(err_detail.data); + } + Won't the 'pfree' be unreachable due to the prior ereport ERROR? ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Here are some review comments for patch v51-0002. ====== doc/src/sgml/system-views.sgml 1. The time when the slot became inactive. <literal>NULL</literal> if the - slot is currently being streamed. + slot is currently being streamed. Once the slot is invalidated, this + value will remain unchanged until we shutdown the server. . I think "Once the ..." kind of makes it sound like invalidation is inevitable. Also maybe it's better to remove the "we". SUGGESTION: If the slot becomes invalidated, this value will remain unchanged until server shutdown. ====== src/backend/replication/slot.c ReplicationSlotAcquire: 2. GENERAL. This just is a question/idea. It may not be feasible to change. It seems like there is a lot of overlap between the error messages in 'ReplicationSlotAcquire' which are saying "This slot has been invalidated because...", and with the other function 'ReportSlotInvalidation' which is kind of the same but called in different circumstances and with slightly different message text. I wondered if there is a way to use common code to unify these messages instead of having a nearly duplicate set of messages for all the invalidation causes? ~~~ 3. + case RS_INVAL_INACTIVE_TIMEOUT: + appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."), + "replication_slot_inactive_timeout"); + break; Should this err_detail also say "This slot has been invalidated because ..." like all the others? ~~~ InvalidatePossiblyObsoleteSlot: 4. + case RS_INVAL_INACTIVE_TIMEOUT: + + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (IsSlotInactiveTimeoutPossible(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_sec * 1000)) + { Maybe this code should have Assert(now > 0); before the condition just as a way to 'document' that it is assumed 'now' was already set this outside the mutex. ====== Kind Regards, Peter Smith. Fujitsu Australia
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Nisha, > > Attached v51 patch-set addressing all comments in [1] and [2]. > Thanks for working on the feature! I've started to review the patch. Here are my comments - sorry if there is something which has already been discussed. The thread is too long to follow correctly. Comments for 0001 ============= 01. binary_upgrade_logical_slot_has_caught_up ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise it is set to false. Regarding the binary_upgrade_logical_slot_has_caught_up, however, only valid slots will be passed to the function (see pg_upgrade/info.c) so I feel it is OK to set to true. Thoughts? 02. ReplicationSlotAcquire According to other functions, we are adding a note to the translator when parameters represent some common nouns, GUC names. I feel we should add a comment for RS_INVAL_WAL_LEVEL part based on it. Comments for 0002 ============= 03. check_replication_slot_inactive_timeout Can we overwrite replication_slot_inactive_timeout to zero when pg_upgrade (and also pg_createsubscriber?) starts a server process? Several parameters have already been specified via -c option at that time. This can avoid an error during the upgrade. Note that this part is still needed even if you accept the comment. Users can manually boot the server in upgrade mode. 04. ReplicationSlotAcquire Same comment as 02. 05. ReportSlotInvalidation Same comment as 02. 06. found bug While testing the patch, I found that slots can be invalidated too early when the GUC is quite large. I think this is because an overflow is caused in InvalidatePossiblyObsoleteSlot(). - Reproducer I set the replication_slot_inactive_timeout to INT_MAX and executed the below commands, and found that the slot is invalidated. ``` postgres=# SHOW replication_slot_inactive_timeout; replication_slot_inactive_timeout ----------------------------------- 2147483647s (1 row) postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding'); slot_name | lsn -----------+----------- test | 0/18B7F38 (1 row) postgres=# CHECKPOINT ; CHECKPOINT postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ; slot_name | inactive_since | invalidation_reason -----------+-------------------------------+--------------------- test | 2024-11-28 07:50:25.927594+00 | inactive_timeout (1 row) ``` - analysis In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000 is passed to the third argument of TimestampDifferenceExceeds(), which is also of integer datatype. This causes an overflow and the parameter is handled as a small value. - solution I think there are two possible solutions. You can choose one of them: a. Make the maximum INT_MAX/1000, or b. Change the unit to milliseconds. Best regards, Hayato Kuroda FUJITSU LIMITED
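To spell out the overflow in 06: TimestampDifferenceExceeds() takes its third argument as an int number of milliseconds, so replication_slot_inactive_timeout_sec * 1000 is computed in plain int arithmetic and wraps around once the GUC exceeds INT_MAX / 1000 (about 2147483 seconds). Besides options (a) and (b) above, another way out is to do the comparison in 64-bit microseconds so no narrow intermediate value is involved; a rough, untested sketch (the helper name is made up here):

#include "postgres.h"
#include "datatype/timestamp.h"     /* TimestampTz, USECS_PER_SEC */

/*
 * Sketch: has the slot been inactive for longer than timeout_sec?
 * TimestampTz is int64 microseconds, so promoting the timeout to int64
 * before multiplying avoids the int overflow hit by "timeout_sec * 1000".
 */
static bool
slot_inactive_timeout_exceeded(TimestampTz inactive_since, TimestampTz now,
                               int timeout_sec)
{
    return (now - inactive_since) >= (int64) timeout_sec * USECS_PER_SEC;
}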
On Fri, 22 Nov 2024 at 17:43, vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > Attached is the v49 patch set: > > > > - Fixed the bug reported in [1]. > > > > - Addressed comments in [2] and [3]. > > > > > > > > I've split the patch into two, implementing the suggested idea in > > > > comment #5 of [2] separately in 001: > > > > > > > > Patch-001: Adds additional error reports (for all invalidation types) > > > > in ReplicationSlotAcquire() for invalid slots when error_if_invalid = > > > > true. > > > > Patch-002: The original patch with comments addressed. > > > > > > This Assert can fail: > > > > > > > Attached v50 patch-set addressing review comments in [1] and [2]. > > We are setting inactive_since when the replication slot is released. > We are marking the slot as inactive only if it has been released. > However, there's a scenario where the network connection between the > publisher and subscriber may be lost where the replication slot is not > released, but no changes are replicated due to the network problem. In > this case, no updates would occur in the replication slot for a period > exceeding the replication_slot_inactive_timeout. > Should we invalidate these replication slots as well, or is it > intentionally left out? On further thinking, I felt we can keep the current implementation as is and simply add a brief comment in the code to address this. Additionally, we can mention it in the commit message for clarity. Regards, Vignesh
On Wed, 27 Nov 2024 at 16:25, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Wed, Nov 27, 2024 at 8:39 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Nisha, > > > > Here are some review comments for the patch v50-0002. > > > > ====== > > src/backend/replication/slot.c > > > > InvalidatePossiblyObsoleteSlot: > > > > 1. > > + if (now && > > + TimestampDifferenceExceeds(s->inactive_since, now, > > + replication_slot_inactive_timeout_sec * 1000)) > > > > Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed: > > > > + if (SlotInactiveTimeoutCheckAllowed(s) && > > + TimestampDifferenceExceeds(s->inactive_since, now, > > + replication_slot_inactive_timeout * 1000)) > > > > Is it OK to skip that call? e.g. can the slot fields possibly change > > between assigning the 'now' and acquiring the mutex? If not, then the > > current code is fine. The only reason for asking is because it is > > slightly suspicious that it was not done this "easy" way in the first > > place. > > > Good catch! While the mutex was being acquired right after the now > assignment, there was a rare chance of another process modifying the > slot in the meantime. So, I reverted the change in v51. To optimize > the SlotInactiveTimeoutCheckAllowed() call, it's sufficient to check > it here instead of during the 'now' assignment. > > Attached v51 patch-set addressing all comments in [1] and [2]. Few comments: 1) replication_slot_inactive_timeout can be mentioned in logical replication config, we could mention something like: Logical replication slot is also affected by replication_slot_inactive_timeout 2.a) Is this change applicable only for inactive timeout or it is applicable to others like wal removed, wal level etc also? If it is applicable to all of them we could move this to the first patch and update the commit message: + * If the slot can be acquired, do so and mark it as invalidated. If + * the slot is already ours, mark it as invalidated. Otherwise, we'll + * signal the owning process below and retry. */ - if (active_pid == 0) + if (active_pid == 0 || + (MyReplicationSlot == s && + active_pid == MyProcPid)) 2.b) Also this MyReplicationSlot and active_pid check can be in same line: + (MyReplicationSlot == s && + active_pid == MyProcPid)) 3) Error detail should start in upper case here similar to how others are done: + case RS_INVAL_INACTIVE_TIMEOUT: + appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."), + "replication_slot_inactive_timeout"); + break; 4) Since this change is not related to this patch, we can move this to the first patch and update the commit message: --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) static void update_synced_slots_inactive_since(void) { - TimestampTz now = 0; + TimestampTz now; /* * We need to update inactive_since only when we are promoting standby to @@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void) /* The slot sync worker or SQL function mustn't be running by now */ Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); 5) Since this change is not related to this patch, we can move this to the first patch. 
@@ -2250,6 +2350,7 @@ RestoreSlotFromDisk(const char *name) bool restored = false; int readBytes; pg_crc32c checksum; + TimestampTz now; /* no need to lock here, no concurrent access allowed yet */ @@ -2410,6 +2511,9 @@ RestoreSlotFromDisk(const char *name) NameStr(cp.slotdata.name)), errhint("Change \"wal_level\" to be \"replica\" or higher."))); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + /* nothing can be active yet, don't lock anything */ for (i = 0; i < max_replication_slots; i++) { @@ -2440,9 +2544,11 @@ RestoreSlotFromDisk(const char *name) /* * Set the time since the slot has become inactive after loading the * slot from the disk into memory. Whoever acquires the slot i.e. - * makes the slot active will reset it. + * makes the slot active will reset it. Avoid calling + * ReplicationSlotSetInactiveSince() here, as it will not set the time + * for invalid slots. */ - slot->inactive_since = GetCurrentTimestamp(); + slot->inactive_since = now; [1] - https://www.postgresql.org/docs/current/logical-replication-config.html Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Tue, Nov 19, 2024 at 12:47 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > 12. > > /* > > - * If the slot can be acquired, do so and mark it invalidated > > - * immediately. Otherwise we'll signal the owning process, below, and > > - * retry. > > + * If the slot can be acquired, do so and mark it as invalidated. If > > + * the slot is already ours, mark it as invalidated. Otherwise, we'll > > + * signal the owning process below and retry. > > */ > > - if (active_pid == 0) > > + if (active_pid == 0 || > > + (MyReplicationSlot == s && > > + active_pid == MyProcPid)) > > > > I wasn't sure how this change belongs to this patch, because the logic > > of the previous review comment said for the case of invalidation due > > to inactivity that active_id must be 0. e.g. Assert(s->active_pid == > > 0); > > > > I don't fully understand the purpose of this change yet. I'll look > into it further and get back. > This change applies to all types of invalidation, not just inactive_timeout case, so moved the change to patch-001. It’s a general optimization for the case when the current process is the active PID for the slot. Also, the Assert(s->active_pid == 0); has been removed (in v50) as it was unnecessary. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 28, 2024 at 1:29 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Nisha, > > > > > Attached v51 patch-set addressing all comments in [1] and [2]. > > > > Thanks for working on the feature! I've stated to review the patch. > Here are my comments - sorry if there are something which have already been discussed. > The thread is too long to follow correctly. > > Comments for 0001 > ============= > > 01. binary_upgrade_logical_slot_has_caught_up > > ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise > it is set to false. Regarding the binary_upgrade_logical_slot_has_caught_up, however, > only valid slots will be passed to the funciton (see pg_upgrade/info.c) so I feel > it is OK to set to true. Thought? > Right, corrected the call with error_if_invalid as true. > Comments for 0002 > ============= > > 03. check_replication_slot_inactive_timeout > > Can we overwrite replication_slot_inactive_timeout to zero when pg_uprade (and also > pg_createsubscriber?) starts a server process? Several parameters have already been > specified via -c option at that time. This can avoid an error while the upgrading. > Note that this part is still needed even if you accept the comment. Users can > manually boot with upgrade mode. > Done. > 06. found bug > > While testing the patch, I found that slots can be invalidated too early when when > the GUC is quite large. I think because an overflow is caused in InvalidatePossiblyObsoleteSlot(). > > - Reproducer > > I set the replication_slot_inactive_timeout to INT_MAX and executed below commands, > and found that the slot is invalidated. > > ``` > postgres=# SHOW replication_slot_inactive_timeout; > replication_slot_inactive_timeout > ----------------------------------- > 2147483647s > (1 row) > postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding'); > slot_name | lsn > -----------+----------- > test | 0/18B7F38 > (1 row) > postgres=# CHECKPOINT ; > CHECKPOINT > postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ; > slot_name | inactive_since | invalidation_reason > -----------+-------------------------------+--------------------- > test | 2024-11-28 07:50:25.927594+00 | inactive_timeout > (1 row) > ``` > > - analysis > > In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000 > is passed to the third argument of TimestampDifferenceExceeds(), which is also the > integer datatype. This causes an overflow and parameter is handled as the small > value. > > - solution > > I think there are two possible solutions. You can choose one of them: > > a. Make the maximum INT_MAX/1000, or > b. Change the unit to millisecond. > Fixed. It is reasonable to align with other timeout parameters by using milliseconds as the unit. -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Thu, Nov 28, 2024 at 5:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. Here are some review comments for patch v51-0002. > > ====== > src/backend/replication/slot.c > > ReplicationSlotAcquire: > > 2. > GENERAL. > > This just is a question/idea. It may not be feasible to change. It > seems like there is a lot of overlap between the error messages in > 'ReplicationSlotAcquire' which are saying "This slot has been > invalidated because...", and with the other function > 'ReportSlotInvalidation' which is kind of the same but called in > different circumstances and with slightly different message text. I > wondered if there is a way to use common code to unify these messages > instead of having a nearly duplicate set of messages for all the > invalidation causes? > The error handling could be moved to a new function; however, as you pointed out, the contexts in which these functions are called differ. IMO, a single error message may not suit both cases. For example, ReportSlotInvalidation provides additional details and a hint in its message, which isn’t necessary for ReplicationSlotAcquire. Thoughts? -- Thanks, Nisha
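One possible middle ground, purely as a sketch (the helper name is invented here): keep the cause-specific sentence in a single lookup function and let ReplicationSlotAcquire() and ReportSlotInvalidation() each wrap it with their own errmsg/errhint, so the per-cause wording at least lives in one place:

/*
 * Sketch only: map an invalidation cause to its translated, cause-specific
 * detail sentence.  Callers add their own surrounding message and hint.
 */
static const char *
SlotInvalidationCauseDetail(ReplicationSlotInvalidationCause cause)
{
    switch (cause)
    {
        case RS_INVAL_WAL_REMOVED:
            return _("The required WAL has been removed.");
        case RS_INVAL_HORIZON:
            return _("The required rows have been removed.");
        case RS_INVAL_WAL_LEVEL:
            return _("\"wal_level\" is insufficient for the slot.");
        case RS_INVAL_INACTIVE_TIMEOUT:
            return _("The slot has been inactive for longer than \"replication_slot_inactive_timeout\".");
        case RS_INVAL_NONE:
            break;
    }
    return NULL;                /* valid slot; nothing to report */
}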
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, here are a couple of review comments for patch v52-0001. ====== Commit Message Add check if slot is already acquired, then mark it invalidate directly. ~ /slot/the slot/ "mark it invalidate" ? Maybe you meant: "then invalidate it directly", or "then mark it 'invalidated' directly", or etc. ====== src/backend/replication/logical/slotsync.c 1. @@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) static void update_synced_slots_inactive_since(void) { - TimestampTz now = 0; + TimestampTz now; /* * We need to update inactive_since only when we are promoting standby to @@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void) /* The slot sync worker or SQL function mustn't be running by now */ Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing); + /* Use same inactive_since time for all slots */ + now = GetCurrentTimestamp(); + Something is broken with these changes. AFAICT, the result after applying patch 0001 still has code: /* Use the same inactive_since time for all the slots. */ if (now == 0) now = GetCurrentTimestamp(); So the end result has multiple/competing assignments to variable 'now'. ====== Kind Regards, Peter Smith. Fujitsu Australia
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Nisha, Thanks for updating the patch! > Fixed. It is reasonable to align with other timeout parameters by > using milliseconds as the unit. It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and postgresql.conf.sample have not been changed yet. They should follow the code. Anyway, here are other comments, mostly cosmetic. 01. slot.c ``` +int replication_slot_inactive_timeout_ms = 0; ``` According to other lines, we should add a short comment for the GUC. 02. 050_invalidate_slots.pl Do you have a reason why you use the number 050? I feel it can be 043. 03. 050_invalidate_slots.pl Also, I'm not sure the file name is correct. This file contains only slot invalidation due to the replication_slot_inactive_timeout. But I feel the current name is too general. 04. 050_invalidate_slots.pl ``` +use Time::HiRes qw(usleep); ``` This line is not needed because usleep() is not used in this file. Best regards, Hayato Kuroda FUJITSU LIMITED
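For comment 01, something like the following is presumably all that is being asked for (the comment wording here is only a suggestion):
```
/*
 * Invalidate replication slots that have remained inactive longer than
 * this duration, in milliseconds; 0 disables the mechanism.
 */
int			replication_slot_inactive_timeout_ms = 0;
```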
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Nisha, > > > > Thanks for updating the patch! > > > > > Fixed. It is reasonable to align with other timeout parameters by > > > using milliseconds as the unit. > > > > It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and > > postgresql.conf.sample have not been changed yet. They should follow the code. > > Anyway, here are other comments, mostly cosmetic. > > > > Here is v53 patch-set addressing all the comments in [1] and [2]. Currently, replication slots are invalidated based on the replication_slot_inactive_timeout only during a checkpoint. This means that if the checkpoint_timeout is set to a higher value than the replication_slot_inactive_timeout, slot invalidation will occur only when the checkpoint is triggered. Identifying the slots to invalidate might therefore be slightly delayed in this case. As an alternative, users can forcefully invalidate inactive slots that have exceeded the replication_slot_inactive_timeout by forcing a checkpoint. I was thinking we could suggest this in the documentation. + <para> + Slot invalidation due to inactive timeout occurs during checkpoint. + The duration of slot inactivity is calculated using the slot's + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> + value. + </para> + We could accurately invalidate the slots using the checkpointer process by calculating the invalidation time based on the inactive_since timestamp and the replication_slot_inactive_timeout, and then setting the checkpointer's main wait latch accordingly to trigger the next checkpoint. Ideally, a different process handling this task would be better, but there is currently no dedicated daemon capable of identifying and managing slots across streaming replication, logical replication, and other slots used by plugins. Additionally, overloading the checkpointer with this responsibility may not be ideal. As an alternative, we could document this delay in identifying such slots and mention that invalidation can be triggered sooner by a forced manual checkpoint. Regards, Vignesh
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Nisha, > > > > Thanks for updating the patch! > > > > > Fixed. It is reasonable to align with other timeout parameters by > > > using milliseconds as the unit. > > > > It looks like you only replaced the unit with GUC_UNIT_MS, but the documentation and > > postgresql.conf.sample have not been changed yet. They should follow the code. > > Anyway, here are other comments, mostly cosmetic. > > > > Here is v53 patch-set addressing all the comments in [1] and [2]. CFBot is failing at [1] because the file name has been changed to 043_invalidate_inactive_slots; the meson.build file should be updated accordingly: diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build index b1eb77b1ec..708a2a3798 100644 --- a/src/test/recovery/meson.build +++ b/src/test/recovery/meson.build @@ -51,6 +51,7 @@ tests += { 't/040_standby_failover_slots_sync.pl', 't/041_checkpoint_at_promote.pl', 't/042_low_level_backup.pl', + 't/050_invalidate_slots.pl', ], }, } [1] - https://cirrus-ci.com/task/6266479424831488 Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha, Here are my review comments for the v53* patch set ////////// Patch v53-0001. ====== src/backend/replication/slot.c 1. + if (error_if_invalid && + s->data.invalidated != RS_INVAL_NONE) Looks like some unnecessary wrapping here. I think this condition can be on one line. ////////// Patch v53-0002. ====== GENERAL - How about using the term "idle"? 1. I got to wondering why this new GUC was called "replication_slot_inactive_timeout", with invalidation_reason = "inactive_timeout". When I look at similar GUCs I don't see words like "inactivity" or "inactive" anywhere; Instead, they are using the term "idle" to refer to when something is inactive: e.g. #idle_in_transaction_session_timeout = 0 # in milliseconds, 0 is disabled #idle_session_timeout = 0 # in milliseconds, 0 is disabled I know the "inactive" term is used a bit in the slot code but that is (mostly) not exposed to the user. Therefore, I am beginning to feel it would be better (e.g. more consistent) to use "idle" for the user-facing stuff. e.g. New Slot GUC = "idle_replication_slot_timeout" Slot invalidation_reason = "idle_timeout" Of course, changing this will cascade to impact quite a lot of other things in the patch -- comments, error messages, some function names etc. ====== doc/src/sgml/logical-replication.sgml 2. + <para> + Logical replication slot is also affected by + <link linkend="guc-replication-slot-inactive-timeout"><varname>replication_slot_inactive_timeout</varname></link>. + </para> + /Logical replication slot is also affected by/Logical replication slots are also affected by/ ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote: > ... > > Currently, replication slots are invalidated based on the > replication_slot_inactive_timeout only during a checkpoint. This means > that if the checkpoint_timeout is set to a higher value than the > replication_slot_inactive_timeout, slot invalidation will occur only > when the checkpoint is triggered. Identifying the invalidation slots > might be slightly delayed in this case. As an alternative, users can > forcefully invalidate inactive slots that have exceeded the > replication_slot_inactive_timeout by forcing a checkpoint. I was > thinking we could suggest this in the documentation. > > + <para> > + Slot invalidation due to inactive timeout occurs during checkpoint. > + The duration of slot inactivity is calculated using the slot's > + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> > + value. > + </para> > + > > We could accurately invalidate the slots using the checkpointer > process by calculating the invalidation time based on the active_since > timestamp and the replication_slot_inactive_timeout, and then set the > checkpointer's main wait-latch accordingly for triggering the next > checkpoint. Ideally, a different process handling this task would be > better, but there is currently no dedicated daemon capable of > identifying and managing slots across streaming replication, logical > replication, and other slots used by plugins. Additionally, > overloading the checkpointer with this responsibility may not be > ideal. As an alternative, we could document about this delay in > identifying and mention that it could be triggered by forceful manual > checkpoint. > Hi Vignesh. I felt that manipulating the checkpoint timing behind the scenes without the user's consent might be a bit of an overreach. But there might still be something else we could do: 1. We can add the documentation note like you suggested ("we could document about this delay in identifying and mention that it could be triggered by forceful manual checkpoint"). 2. We can also detect such delays in the code. When the invalidation occurs (e.g. code fragment below) we could check if there was some excessive lag between the slot becoming idle and it being invalidated. If the lag is too much (whatever "too much" means) we can log a hint for the user to increase the checkpoint frequency (or whatever else we might advise them to do). + /* + * Check if the slot needs to be invalidated due to + * replication_slot_inactive_timeout GUC. + */ + if (IsSlotInactiveTimeoutPossible(s) && + TimestampDifferenceExceeds(s->inactive_since, now, + replication_slot_inactive_timeout_ms)) + { + invalidation_cause = cause; + inactive_since = s->inactive_since; pseudo-code: if (slot invalidation occurred much later after the replication_slot_inactive_timeout GUC elapsed) { elog(LOG, "This slot was inactive for a period of %s. Slot timeout invalidation only occurs at a checkpoint so if you want inactive slots to be invalidated in a more timely manner consider reducing the time between checkpoints or executing a manual checkpoint. (replication_slot_inactive_timeout = %s; checkpoint_timeout = %s, ....)" } + } ====== Kind Regards, Peter Smith. Fujitsu Australia
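To make the pseudo-code a little more concrete, here is a rough sketch of the kind of check being suggested, reusing the fields from the fragment above; the "2 *" threshold, the message wording and the placement are placeholders rather than a worked-out proposal:
```
	{
		long		idle_ms = TimestampDifferenceMilliseconds(s->inactive_since, now);

		/* "2 *" is an arbitrary stand-in for "much longer than the timeout" */
		if (idle_ms > 2L * replication_slot_inactive_timeout_ms)
			ereport(LOG,
					errmsg("replication slot \"%s\" remained idle much longer than \"%s\" before being invalidated",
						   NameStr(s->data.name),
						   "replication_slot_inactive_timeout"),
					errhint("Idle-timeout invalidation only happens at checkpoints; consider reducing checkpoint_timeout or running a manual CHECKPOINT."));
	}
```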
On Thu, 5 Dec 2024 at 06:44, Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote: > > > ... > > > > Currently, replication slots are invalidated based on the > > replication_slot_inactive_timeout only during a checkpoint. This means > > that if the checkpoint_timeout is set to a higher value than the > > replication_slot_inactive_timeout, slot invalidation will occur only > > when the checkpoint is triggered. Identifying the invalidation slots > > might be slightly delayed in this case. As an alternative, users can > > forcefully invalidate inactive slots that have exceeded the > > replication_slot_inactive_timeout by forcing a checkpoint. I was > > thinking we could suggest this in the documentation. > > > > + <para> > > + Slot invalidation due to inactive timeout occurs during checkpoint. > > + The duration of slot inactivity is calculated using the slot's > > + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> > > + value. > > + </para> > > + > > > > We could accurately invalidate the slots using the checkpointer > > process by calculating the invalidation time based on the active_since > > timestamp and the replication_slot_inactive_timeout, and then set the > > checkpointer's main wait-latch accordingly for triggering the next > > checkpoint. Ideally, a different process handling this task would be > > better, but there is currently no dedicated daemon capable of > > identifying and managing slots across streaming replication, logical > > replication, and other slots used by plugins. Additionally, > > overloading the checkpointer with this responsibility may not be > > ideal. As an alternative, we could document about this delay in > > identifying and mention that it could be triggered by forceful manual > > checkpoint. > > > > Hi Vignesh. > > I felt that manipulating the checkpoint timing behind the scenes > without the user's consent might be a bit of an overreach. Agree > But there might still be something else we could do: > > 1. We can add the documentation note like you suggested ("we could > document about this delay in identifying and mention that it could be > triggered by forceful manual checkpoint"). Yes, that makes sense > 2. We can also detect such delays in the code. When the invalidation > occurs (e.g. code fragment below) we could check if there was some > excessive lag between the slot becoming idle and it being invalidated. > If the lag is too much (whatever "too much" means) we can log a hint > for the user to increase the checkpoint frequency (or whatever else we > might advise them to do). > > + /* > + * Check if the slot needs to be invalidated due to > + * replication_slot_inactive_timeout GUC. > + */ > + if (IsSlotInactiveTimeoutPossible(s) && > + TimestampDifferenceExceeds(s->inactive_since, now, > + replication_slot_inactive_timeout_ms)) > + { > + invalidation_cause = cause; > + inactive_since = s->inactive_since; > > pseudo-code: > if (slot invalidation occurred much later after the > replication_slot_inactive_timeout GUC elapsed) > { > elog(LOG, "This slot was inactive for a period of %s. Slot timeout > invalidation only occurs at a checkpoint so if you want inactive slots > to be invalidated in a more timely manner consider reducing the time > between checkpoints or executing a manual checkpoint. 
> (replication_slot_inactive_timeout = %s; checkpoint_timeout = %s, > ....)" > } > > + } Determining the correct time may be challenging for users, as it depends on when the inactive_since value is set, as well as when the checkpoint_timeout occurs and the subsequent checkpoint is triggered. Even if the user sets it to an appropriate value, there is still a possibility of delayed identification due to the timing of when the slot's inactive_since is set. Including this information in the documentation should be sufficient. Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Here are some review comments for patch v54-0002. (I had also checked patch v54-0001, but have no further review comments for that one). ====== doc/src/sgml/config.sgml 1. + <para> + Slot invalidation due to idle timeout occurs during checkpoint. + If the <varname>checkpoint_timeout</varname> exceeds + <varname>idle_replication_slot_timeout</varname>, the slot + invalidation will be delayed until the next checkpoint is triggered. + To avoid delays, users can force a checkpoint to promptly invalidate + inactive slots. The duration of slot inactivity is calculated using the slot's + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield> + value. + </para> + The wording of "If the checkpoint_timeout exceeds idle_replication_slot_timeout, the slot invalidation will be delayed until the next checkpoint is triggered." seems slightly misleading, because AFAIK it is not conditional on the GUC value differences like that -- i.e. slot invalidation is *always* delayed until the next checkpoint occurs. SUGGESTION: Slot invalidation due to idle timeout occurs during checkpoint. Because checkpoints happen at checkpoint_timeout intervals, there can be some lag between when the idle_replication_slot_timeout was exceeded and when the slot invalidation is triggered at the next checkpoint. To avoid such lags, users can force... ======= src/backend/replication/slot.c 2. GENERAL +/* Invalidate replication slots idle beyond this time; '0' disables it */ +int idle_replication_slot_timeout_ms = 0; I noticed this patch is using a variety of ways of describing the same thing: * guc var: Invalidate replication slots idle beyond this time... * guc_tables: ... the amount of time a replication slot can remain idle before it will be invalidated. * docs: means that the slot has remained idle beyond the duration specified by the idle_replication_slot_timeout parameter * errmsg: ... slot has been invalidated because inactivity exceeded the time limit set by ... * etc.. They are all the same, but they are all worded slightly differently: * "idle" vs "inactivity" vs ... * "time" vs "amount of time" vs "duration" vs "time limit" vs ... There may not be a one-size-fits-all, but still, it might be better to try to search for all different phrasing and use common wording as much as possible. ~~~ CheckPointReplicationSlots: 3. + * XXX: Slot invalidation due to 'idle_timeout' occurs only for + * released slots, based on 'idle_replication_slot_timeout'. Active + * slots in use for replication are excluded, preventing accidental + * invalidation. Slots where communication between the publisher and + * subscriber is down are also excluded, as they are managed by the + * 'wal_sender_timeout'. Maybe a slight rewording like below is better. Maybe not. YMMV. SUGGESTION: XXX: Slot invalidation due to 'idle_timeout' applies only to released slots, and is based on the 'idle_replication_slot_timeout' GUC. Active slots currently in use for replication are excluded to prevent accidental invalidation. Slots... ====== src/bin/pg_upgrade/server.c 4. + /* + * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to + * inactive_timeout by checkpointer process during upgrade. + */ + if (GET_MAJOR_VERSION(cluster->major_version) >= 1800) + appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0"); + /inactive_timeout/idle_timeout/ ====== src/test/recovery/t/043_invalidate_inactive_slots.pl 5. 
+# Wait for slot to first become idle and then get invalidated +sub wait_for_slot_invalidation +{ + my ($node, $slot, $offset, $idle_timeout) = @_; + my $node_name = $node->name; AFAICT this 'idle_timeout' parameter is passed units of "seconds", so it would be better to call it something like 'idle_timeout_s' to make the units clear. ~~~ 6. +# Trigger slot invalidation and confirm it in the server log +sub trigger_slot_invalidation +{ + my ($node, $slot, $offset, $idle_timeout) = @_; + my $node_name = $node->name; + my $invalidated = 0; Ditto above review comment #5 -- better to call it something like 'idle_timeout_s' to make the units clear. ====== Kind Regards, Peter Smith. Fujitsu Australia
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Determining the correct time may be challenging for users, as it > > depends on when the inactive_since value is set, as well as when the > > checkpoint_timeout occurs and the subsequent checkpoint is triggered. > > Even if the user sets it to an appropriate value, there is still a > > possibility of delayed identification due to the timing of when the > > slot's inactive_since is set. Including this information in the > > documentation should be sufficient. > > > > +1 > v54 documents this information as suggested. > > Attached the v54 patch-set addressing all the comments till now in A few comments on the test added: 1) Can we remove this and instead set idle_replication_slot_timeout via append_conf when the standby node itself is created: +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. +my $idle_timeout_1s = 1; +$standby1->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s'; +]); +$standby1->reload; 2) You can move these statements before the standby node is created: +# Create sync slot on the primary +$primary->psql('postgres', + q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);} +); + +# Create standby slot on the primary +$primary->safe_psql( + 'postgres', qq[ + SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true); +]); 3) Do we need autovacuum set to off for these tests? Is there any probability of a test failure without it? I felt it should not impact these tests; if not, we can remove this: +# Avoid unpredictability +$primary->append_conf( + 'postgresql.conf', qq{ +checkpoint_timeout = 1h +autovacuum = off +}); 4) Generally we write a single character in single quotes, so we can update "t" to 't': + ), + "t", + 'logical slot sync_slot1 is synced to standby'); + 5) Similarly here too: + WHERE slot_name = 'sync_slot1' + AND invalidation_reason IS NULL;} + ), + "t", + 'check that synced slot sync_slot1 has not been invalidated on standby'); 6) This standby offset is not used anywhere; it can be removed: +my $logstart = -s $standby1->logfile; + +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. Regards, Vignesh
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Determining the correct time may be challenging for users, as it > > depends on when the inactive_since value is set, as well as when the > > checkpoint_timeout occurs and the subsequent checkpoint is triggered. > > Even if the user sets it to an appropriate value, there is still a > > possibility of delayed identification due to the timing of when the > > slot's inactive_since is set. Including this information in the > > documentation should be sufficient. > > > > +1 > v54 documents this information as suggested. > > Attached the v54 patch-set addressing all the comments till now in > [1], [2] and [3]. Now that we support idle_replication_slot_timeout in milliseconds, we can reduce this value from 1s to 1ms or 10 milliseconds and change sleep to usleep; this will bring down the test execution time significantly: +# Set timeout GUC on the standby to verify that the next checkpoint will not +# invalidate synced slots. +my $idle_timeout_1s = 1; +$standby1->safe_psql( + 'postgres', qq[ + ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s'; +]); +$standby1->reload; + +# Sync the primary slots to the standby +$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();"); + +# Confirm that the logical failover slot is created on the standby +is( $standby1->safe_psql( + 'postgres', + q{SELECT count(*) = 1 FROM pg_replication_slots + WHERE slot_name = 'sync_slot1' AND synced + AND NOT temporary + AND invalidation_reason IS NULL;} + ), + "t", + 'logical slot sync_slot1 is synced to standby'); + +# Give enough time for inactive_since to exceed the timeout +sleep($idle_timeout_1s + 1); Regards, Vignesh
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Nisha Moond
Date:
On Wed, Dec 11, 2024 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Nisha. > > Here are some review comments for patch v54-0002. > ====== > src/test/recovery/t/043_invalidate_inactive_slots.pl > > 5. > +# Wait for slot to first become idle and then get invalidated > +sub wait_for_slot_invalidation > +{ > + my ($node, $slot, $offset, $idle_timeout) = @_; > + my $node_name = $node->name; > > AFAICT this 'idle_timeout' parameter is passed units of "seconds", so > it would be better to call it something like 'idle_timeout_s' to make > the units clear. > As per the suggestion in [1], the test has been updated to use idle_timeout=1ms. Since the parameter uses the default unit of "milliseconds," keeping it as 'idle_timeout' seems reasonable to me. > ~~~ > > 6. > +# Trigger slot invalidation and confirm it in the server log > +sub trigger_slot_invalidation > +{ > + my ($node, $slot, $offset, $idle_timeout) = @_; > + my $node_name = $node->name; > + my $invalidated = 0; > > Ditto above review comment #5 -- better to call it something like > 'idle_timeout_s' to make the units clear. > The 'idle_timeout' parameter name remains unchanged as explained above. [1] https://www.postgresql.org/message-id/CALDaNm1FQS04aG0C0gCRpvi-o-OTdq91y6Az34YKN-dVc9r5Ng%40mail.gmail.com -- Thanks, Nisha
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
Hi Nisha. Thanks for the v55* patches. I have no comments for patch v55-0001. I have only 1 comment for patch v55-0002 regarding some remaining nitpicks (below) about the consistency of phrases. ====== I scanned again over all the phrases for consistency: CURRENT PATCH: Docs (idle_replication_slot_timeout): Invalidate replication slots that are idle for longer than this amount of time Docs (idle_timeout): means that the slot has remained idle longer than the duration specified by the idle_replication_slot_timeout parameter. Code (guc var comment): Invalidate replication slots idle longer than this time Code (guc_tables): Sets the time limit for how long a replication slot can remain idle before it is invalidated. Msg (errdetail): This slot has been invalidated because it has remained idle longer than the configured \"%s\" time. Msg (errdetail): The slot has been inactive since %s and has remained idle longer than the configured \"%s\" time. ~ NITPICKS: nit -- There are still some variations "amount of time" versus "time" versus "duration". I think the term "duration" best describes the meaning, so we can use that everywhere. nit - Should consistently say "remained idle" instead of just "idle" or "are idle". nit - The last errdetail is also rearranged a bit because IMO we don't need to say inactive and idle in the same sentence. nit - Just say "longer than" instead of sometimes saying "for longer than". ~ SUGGESTIONS: Docs (idle_replication_slot_timeout): Invalidate replication slots that have remained idle longer than this duration. Docs (idle_timeout): means that the slot has remained idle longer than the configured idle_replication_slot_timeout duration. Code (guc var comment): Invalidate replication slots that have remained idle longer than this duration. Code (guc_tables): Sets the duration a replication slot can remain idle before it is invalidated. Msg (errdetail): This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration. Msg (errdetail): The slot has remained idle since %s, which is longer than the configured \"%s\" duration. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Peter Smith
Date:
On Mon, Dec 16, 2024 at 9:40 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Mon, Dec 16, 2024 at 9:58 AM Peter Smith <smithpb2250@gmail.com> wrote: > > ... > > SUGGESTIONS: > > > > Docs (idle_replication_slot_timeout): Invalidate replication slots > > that have remained idle longer than this duration. > > Docs (idle_timeout): means that the slot has remained idle longer than > > the configured idle_replication_slot_timeout duration. > > > > Code (guc var comment): Invalidate replication slots that have > > remained idle longer than this duration. > > Code (guc_tables): Sets the duration a replication slot can remain > > idle before it is invalidated. > > > > Msg (errdetail): This slot has been invalidated because it has > > remained idle longer than the configured \"%s\" duration. > > Msg (errdetail): The slot has remained idle since %s, which is longer > > than the configured \"%s\" duration. > > > > Here is the v56 patch set with the above comments incorporated. > Hi Nisha. Thanks for the updates. - Both patches could be applied cleanly. - Tests (make check, TAP subscriber, TAP recovery) are all passing. - The rendering of the documentation changes from patch 0002 looked good. - I have no more review comments. So, the v56* patchset LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Introduce XID age and inactive timeout based replication slot invalidation
From
Amit Kapila
Date:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > Here is the v56 patch set with the above comments incorporated. > Review comments: =============== 1. + { + {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the duration a replication slot can remain idle before " + "it is invalidated."), + NULL, + GUC_UNIT_MS + }, + &idle_replication_slot_timeout_ms, I think users are going to keep the idle_slot timeout at least in hours. So, milliseconds seem the wrong choice to me. I suggest keeping the unit in minutes. I understand that writing a test would be challenging as spending a minute or more on one test is not advisable. But I don't see any test testing the other GUCs that are in minutes (wal_summary_keep_time and log_rotation_age). The default value should be one day. 2. + /* + * An error is raised if error_if_invalid is true and the slot is found to + * be invalid. + */ + if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE) + { + StringInfoData err_detail; + + initStringInfo(&err_detail); + + switch (s->data.invalidated) + { + case RS_INVAL_WAL_REMOVED: + appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed.")); + break; + + case RS_INVAL_HORIZON: + appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed.")); + break; + + case RS_INVAL_WAL_LEVEL: + /* translator: %s is a GUC variable name */ + appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."), + "wal_level"); + break; + + case RS_INVAL_NONE: + pg_unreachable(); + } + + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail_internal("%s", err_detail.data)); + } + This should be moved to a separate function. 3. +static inline bool +IsSlotIdleTimeoutPossible(ReplicationSlot *s) Would it be better to name this function as CanInvalidateIdleSlot()? The current name doesn't seem to match other similar functionality. -- With Regards, Amit Kapila.
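For comment 2, a sketch of how the quoted block could be pulled out into a helper; the function name and the exact split are only suggestions, and the body is lifted from the fragment above:
```
/* Raise an ERROR when an already-invalidated slot is acquired. */
static void
RaiseSlotInvalidationError(ReplicationSlot *s)
{
	StringInfoData err_detail;

	Assert(s->data.invalidated != RS_INVAL_NONE);

	initStringInfo(&err_detail);

	switch (s->data.invalidated)
	{
		case RS_INVAL_WAL_REMOVED:
			appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
			break;

		case RS_INVAL_HORIZON:
			appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
			break;

		case RS_INVAL_WAL_LEVEL:
			/* translator: %s is a GUC variable name */
			appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
							 "wal_level");
			break;

		case RS_INVAL_NONE:
			pg_unreachable();
	}

	ereport(ERROR,
			errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			errmsg("can no longer get changes from replication slot \"%s\"",
				   NameStr(s->data.name)),
			errdetail_internal("%s", err_detail.data));
}
```
The caller in ReplicationSlotAcquire() would then reduce to: if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE) RaiseSlotInvalidationError(s);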
RE: Introduce XID age and inactive timeout based replication slot invalidation
From
"Zhijie Hou (Fujitsu)"
Date:
On Tuesday, December 24, 2024 8:57 PM Michail Nikolaev <michail.nikolaev@gmail.com> wrote: Hi, > Yesterday I got a strange set of test errors, probably somehow related to > that patch. It happened on changed master branch (based on > d96d1d5152f30d15678e08e75b42756101b7cab6) but I don't think my changes were > affecting it. > > My setup is a little bit tricky: Windows 11 running WSL2 with Ubuntu, meson. > > So, the `recovery` suite started failing on: > > 1) at /src/test/recovery/t/019_replslot_limit.pl line 530. > 2) at /src/test/recovery/t/040_standby_failover_slots_sync.pl line > 198. > > It was failing almost every run, one test or another. I was lurking around > for about 10 min, and..... it just stopped failing. And I can't reproduce it > anymore. > > But I have logs of two fails. I am not sure if it is helpful, but decided to > mail them here just in case. Thanks for reporting the issue. After checking the log, I think the failure is caused by the unexpected behavior of the local system clock. It's clear from the '019_replslot_limit_primary4.log'[1] that the clock went backwards, which makes the slot's inactive_since go backwards as well. That's why the last testcase didn't pass. And for 040_standby_failover_slots_sync, we can see that the clock of the standby lags behind that of the primary, which caused the inactive_since of the newly synced slot on the standby to be earlier than the one on the primary. So, I think it's not a bug in the committed patch but an issue in the testing environment. Besides, since we have not seen such failures on BF, I think it may not be necessary to improve the testcases. [1] 2024-12-24 01:37:19.967 CET [161409] sub STATEMENT: START_REPLICATION SLOT "lsub4_slot" LOGICAL 0/0 (proto_version '4',streaming 'parallel', origin 'any', publication_names '"pub"') ... 2024-12-24 01:37:20.025 CET [161447] 019_replslot_limit.pl LOG: statement: SELECT '0/30003D8' <= replay_lsn AND state ='streaming' ... 2024-12-24 01:37:19.388 CET [161097] LOG: received fast shutdown request Best Regards, Hou zj