Thread: START_REPLICATION SLOT causing a crash in an assert build
Hi, I'm not sure what is causing this, but I have seen this twice. The second time without activity after changing the set of tables in a PUBLICATION. gdb says that debug_query_string contains: """ START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATIONSLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"') """ attached the backtrace. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Attachment
Hi, On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote: > I'm not sure what is causing this, but I have seen this twice. The > second time without activity after changing the set of tables in a > PUBLICATION. Can you describe the steps to reproduce? Which git commit does this happen on? > gdb says that debug_query_string contains: > > """ > START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATIONSLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"') > """ > > attached the backtrace. > > #2 0x00005559bfd4f0ed in ExceptionalCondition ( > conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d"FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c", > lineNumber=89) at assert.c:69 what are statent->slotname and slot->data.name? Greetings, Andres Freund
On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote: > Hi, > > On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote: > > I'm not sure what is causing this, but I have seen this twice. The > > second time without activity after changing the set of tables in a > > PUBLICATION. > > Can you describe the steps to reproduce? > I'm still trying to determine that > Which git commit does this happen on? > 6e55ea79faa56db85a2b6c5bf94cee8acf8bfdb8 (Stamp 15beta4) > > > gdb says that debug_query_string contains: > > > > """ > > START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATIONSLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"') > > """ > > > > attached the backtrace. > > > > > #2 0x00005559bfd4f0ed in ExceptionalCondition ( > > conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d"FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c", > > lineNumber=89) at assert.c:69 > > what are statent->slotname and slot->data.name? > slot->data.name seems to be the replication_slot record, and statent->slotname comes from the in shared memory stats for that slot. And the assert happens when &statent->slotname.data comes empty, which is not frequent but it happens from time to time btw, while I'm looking at this I found that we can drop a publication while there are active subscriptions pointing to it, is that something we should allow? anyway, that is not the cause of this because the replication slot actually exists. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote: > Hi, > > On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote: > > I'm not sure what is causing this, but I have seen this twice. The > > second time without activity after changing the set of tables in a > > PUBLICATION. This crash happens after a reset of statistics for a slot replication > Can you describe the steps to reproduce? > bin/pg_ctl -D data1 initdb bin/pg_ctl -D data1 -l logfile1 -o "-c port=54315 -c wal_level=logical" start bin/psql -p 54315 postgres <<EOF create table t1 (i int primary key); create publication pub1 for table t1; EOF bin/pg_ctl -D data2 initdb bin/pg_ctl -D data2 -l logfile2 -o "-c port=54316" start bin/psql -p 54316 postgres <<EOF create table t1 (i int primary key); create subscription sub1 connection 'host=/tmp port=54315 dbname=postgres' publication pub1; EOF bin/psql -p 54315 postgres <<EOF select pg_stat_reset_replication_slot('sub1'); insert into t1 values(1); EOF > Which git commit does this happen on? > just tested again on f5047c1293acce3c6c3802b06825aa3a9f9aa55a > > > gdb says that debug_query_string contains: > > > > """ > > START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATIONSLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"') > > """ > > > > attached the backtrace. > > > > > #2 0x00005559bfd4f0ed in ExceptionalCondition ( > > conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d"FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c", > > lineNumber=89) at assert.c:69 > > what are statent->slotname and slot->data.name? > and the problem seems to be that after zero'ing the stats that includes the name of the replication slot, this simple patch fixes it... not sure if it's the right fix though... -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Attachment
Nice finding. At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > and the problem seems to be that after zero'ing the stats that includes > the name of the replication slot, this simple patch fixes it... not sure > if it's the right fix though... That doesn't work. since what that function clears is not the name in the slot struct but that in stats entry. The cause is what pg_stat_reset_replslot wants to do does not match what pgstat feature thinks as reset. Unfortunately, we don't have a function to leave a portion alone after a reset. However, fetching the stats entry in pgstat_reset_replslot is ugly.. I'm not sure this is less uglier but it works if pgstat_report_replslot sets the name if it is found to be wiped out... But that hafly nullify the protction by the assertion just after. --- a/src/backend/utils/activity/pgstat_replslot.c +++ b/src/backend/utils/activity/pgstat_replslot.c @@ -83,9 +83,11 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re statent = &shstatent->stats; /* - * Any mismatch should have been fixed in pgstat_create_replslot() or - * pgstat_acquire_replslot(). + * pgstat_create_replslot() and pgstat_acquire_replslot() enters the name, + * but pgstat_reset_replslot() may clear it. */ + if (statent->slotname.data[0] == 0) + namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name)); Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0); Another measure would be to add the region to wipe-out on reset to PgStat_KindInfo, but it seems too much.. (attached) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote: > Nice finding. > > At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > > and the problem seems to be that after zero'ing the stats that includes > > the name of the replication slot, this simple patch fixes it... not sure > > if it's the right fix though... > > That doesn't work. since what that function clears is not the name in > the slot struct but that in stats entry. > you're right... the curious thing is that I tested it and it worked, but now it doesn't... maybe it was too late for me... > The cause is what pg_stat_reset_replslot wants to do does not match > what pgstat feature thinks as reset. > [...] > > Another measure would be to add the region to wipe-out on reset to > PgStat_KindInfo, but it seems too much.. (attached) > This patch solves the problem, i didn't like the other solution because as you say it partly nullify the protection of the assertion. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote: > On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote: > > > > Another measure would be to add the region to wipe-out on reset to > > PgStat_KindInfo, but it seems too much.. (attached) > > > > This patch solves the problem, i didn't like the other solution because > as you say it partly nullify the protection of the assertion. > I talked too fast, while it solves the immediate problem the patch as is causes other crashes. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote: > > On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote: > > > > > > Another measure would be to add the region to wipe-out on reset to > > > PgStat_KindInfo, but it seems too much.. (attached) > > > > > > > This patch solves the problem, i didn't like the other solution because > > as you say it partly nullify the protection of the assertion. > > > > I talked too fast, while it solves the immediate problem the patch as is > causes other crashes. Where did the crash happen? Is it a bug introduced by it? Or does it root to other cause? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Sep 15, 2022 at 05:30:11PM +0900, Kyotaro Horiguchi wrote: > At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > > On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote: > > > On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote: > > > > > > > > Another measure would be to add the region to wipe-out on reset to > > > > PgStat_KindInfo, but it seems too much.. (attached) > > > > > > > > > > This patch solves the problem, i didn't like the other solution because > > > as you say it partly nullify the protection of the assertion. > > > > > > > I talked too fast, while it solves the immediate problem the patch as is > > causes other crashes. > > Where did the crash happen? Is it a bug introduced by it? Or does it > root to other cause? > Just compile and run the installcheck tests. It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside pgstat_release_entry_ref() because it expects a "deadbeef", it seems to be a magic variable but cannot find what its use is. Attached a backtrace. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Attachment
At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside Thanks for the info. I reproduced by make check.. stupid.. It's the thinko about the base address of reset_off. So the attached doesn't crash.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote: > At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > > It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside > > Thanks for the info. I reproduced by make check.. stupid.. > > It's the thinko about the base address of reset_off. > > So the attached doesn't crash.. > Hi, Just confirming there have been no crash since this last patch. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
At Mon, 19 Sep 2022 11:04:03 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote: > > At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in > > > It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside > > > > Thanks for the info. I reproduced by make check.. stupid.. > > > > It's the thinko about the base address of reset_off. > > > > So the attached doesn't crash.. > > > > Hi, > > Just confirming there have been no crash since this last patch. Thanks for confirmation. Althouh I'm not sure whether this is the right direction, this seems to be an open item of 15? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi, I wonder if the correct fix here wouldn't be to move the slotname out of PgStat_StatReplSlotEntry? On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote: > diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c > index 6224c498c2..ed3f3af4d9 100644 > --- a/src/backend/utils/activity/pgstat.c > +++ b/src/backend/utils/activity/pgstat.c > @@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { > .shared_size = sizeof(PgStatShared_Database), > .shared_data_off = offsetof(PgStatShared_Database, stats), > .shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats), > + .reset_off = 0, > + .reset_len = sizeof(((PgStatShared_Database *) 0)->stats), > .pending_size = sizeof(PgStat_StatDBEntry), > > .flush_pending_cb = pgstat_database_flush_cb, > @@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { > .shared_size = sizeof(PgStatShared_Relation), > .shared_data_off = offsetof(PgStatShared_Relation, stats), > .shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats), > + .reset_off = 0, > + .reset_len = sizeof(((PgStatShared_Relation *) 0)->stats), > .pending_size = sizeof(PgStat_TableStatus), > > .flush_pending_cb = pgstat_relation_flush_cb, > @@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { > .shared_size = sizeof(PgStatShared_Function), > .shared_data_off = offsetof(PgStatShared_Function, stats), > .shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats), > + .reset_off = 0, > + .reset_len = sizeof(((PgStatShared_Function *) 0)->stats), > .pending_size = sizeof(PgStat_BackendFunctionEntry), > > .flush_pending_cb = pgstat_function_flush_cb, > @@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { > .shared_size = sizeof(PgStatShared_ReplSlot), > .shared_data_off = offsetof(PgStatShared_ReplSlot, stats), > .shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), > + /* reset doesn't wipe off slot name */ > + .reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns), > + .reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), > + offsetof(PgStat_StatReplSlotEntry, spill_txns), I'm confused what this offsetof does here? It's not even assigned to a specific field? Am I missing something? Also, wouldn't we need to subtract something of the size? > diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c > index ac98918688..09a8c3873c 100644 > --- a/src/backend/utils/activity/pgstat_shmem.c > +++ b/src/backend/utils/activity/pgstat_shmem.c > @@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header, > { > const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind); > > - memset(pgstat_get_entry_data(kind, header), 0, > - pgstat_get_entry_len(kind)); > + memset((char *)pgstat_get_entry_data(kind, header) + > + kind_info->reset_off, 0, > + kind_info->reset_len); > > if (kind_info->reset_timestamp_cb) > kind_info->reset_timestamp_cb(header, ts); This likely doesn't quite conform to what pgindent wants... Greetings, Andres Freund
Thanks! At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in > I wonder if the correct fix here wouldn't be to move the slotname out of > PgStat_StatReplSlotEntry? Ugh. Right. I thought its outer struct as purely the part for the common header. But we can freely place anything after the header part. I moved it to the outer struct. I didn't clear that part in pgstat_create_relation() because it is filled in immediately. The attached is that. > On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote: ... > > @@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { > > .shared_size = sizeof(PgStatShared_ReplSlot), > > .shared_data_off = offsetof(PgStatShared_ReplSlot, stats), > > .shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), > > + /* reset doesn't wipe off slot name */ > > + .reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns), > > + .reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), > > + offsetof(PgStat_StatReplSlotEntry, spill_txns), > > I'm confused what this offsetof does here? It's not even assigned to a > specific field? Am I missing something? > > Also, wouldn't we need to subtract something of the size? Yeah, I felt it confusing. The last line above is offset from just after the header part (it is PgStat_, not PgStatShared_). I first wrote that as you suggested but rewrote to shorter representation. > > > diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c > > index ac98918688..09a8c3873c 100644 > > --- a/src/backend/utils/activity/pgstat_shmem.c > > +++ b/src/backend/utils/activity/pgstat_shmem.c > > @@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header, > > { > > const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind); > > > > - memset(pgstat_get_entry_data(kind, header), 0, > > - pgstat_get_entry_len(kind)); > > + memset((char *)pgstat_get_entry_data(kind, header) + > > + kind_info->reset_off, 0, > > + kind_info->reset_len); > > > > if (kind_info->reset_timestamp_cb) > > kind_info->reset_timestamp_cb(header, ts); > > This likely doesn't quite conform to what pgindent wants... In the first place, it's ugly... regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote: > Thanks! > > At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in >> I wonder if the correct fix here wouldn't be to move the slotname out of >> PgStat_StatReplSlotEntry? > > Ugh. Right. I thought its outer struct as purely the part for the > common header. But we can freely place anything after the header > part. I moved it to the outer struct. I didn't clear that part in > pgstat_create_relation() because it is filled in immediately. > > The attached is that. This is still listed as an open item[1] for v15. Does this fix proposed address the issue? Thanks, Jonathan [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
Attachment
Hi, On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote: > On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote: > > Thanks! > > > > At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in > > > I wonder if the correct fix here wouldn't be to move the slotname out of > > > PgStat_StatReplSlotEntry? > > > > Ugh. Right. I thought its outer struct as purely the part for the > > common header. But we can freely place anything after the header > > part. I moved it to the outer struct. I didn't clear that part in > > pgstat_create_relation() because it is filled in immediately. > > > > The attached is that. > > This is still listed as an open item[1] for v15. Does this fix proposed > address the issue? Unfortunately not - it doesn't even pass the existing tests (test_decoding/001_repl_stats fails) :(. The reason for that is that with the patch nothing restores the slotname when reading stats from disk. That turns out not to cause immediate issues, but at the next shutdown the name won't be set, and we'll serialize the stats data with an empty string as the name. I have two ideas how to fix it. As a design constraint, I'd be interested in the RMTs opinion on the following: Is a cleaner fix that changes the stats format (i.e. existing stats will be thrown away when upgrading) or one that doesn't change the stats format preferrable? Greetings, Andres Freund
On 10/5/22 8:44 PM, Andres Freund wrote: > Hi, > > On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote: >> On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote: >>> Thanks! >>> >>> At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in >>>> I wonder if the correct fix here wouldn't be to move the slotname out of >>>> PgStat_StatReplSlotEntry? >>> >>> Ugh. Right. I thought its outer struct as purely the part for the >>> common header. But we can freely place anything after the header >>> part. I moved it to the outer struct. I didn't clear that part in >>> pgstat_create_relation() because it is filled in immediately. >>> >>> The attached is that. >> >> This is still listed as an open item[1] for v15. Does this fix proposed >> address the issue? > > Unfortunately not - it doesn't even pass the existing tests > (test_decoding/001_repl_stats fails) :(. Thanks for checking. > The reason for that is that with the patch nothing restores the slotname when > reading stats from disk. That turns out not to cause immediate issues, but at > the next shutdown the name won't be set, and we'll serialize the stats data > with an empty string as the name. Ah. > I have two ideas how to fix it. As a design constraint, I'd be interested in > the RMTs opinion on the following: > Is a cleaner fix that changes the stats format (i.e. existing stats will be > thrown away when upgrading) or one that doesn't change the stats format > preferrable? [My opinion, will let Michael/John chime in] Ideally we would keep the stats on upgrade -- I think that's the better user experience. Thanks, Jonathan
Attachment
On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote: > On 10/5/22 8:44 PM, Andres Freund wrote: >> I have two ideas how to fix it. As a design constraint, I'd be interested in >> the RMTs opinion on the following: >> Is a cleaner fix that changes the stats format (i.e. existing stats will be >> thrown away when upgrading) or one that doesn't change the stats format >> preferrable? > > [My opinion, will let Michael/John chime in] > > Ideally we would keep the stats on upgrade -- I think that's the better user > experience. The release has not happened yet, so I would be fine to remain flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the cleanest approach in place for the release and the future. At the end, we would throw automatically the data of a file that's marked with a version that does not match with what we expect at load time, so there's a limited impact on the user experience, except, well, losing these stats, of course. -- Michael
Attachment
At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote: > > On 10/5/22 8:44 PM, Andres Freund wrote: > >> I have two ideas how to fix it. As a design constraint, I'd be interested in > >> the RMTs opinion on the following: > >> Is a cleaner fix that changes the stats format (i.e. existing stats will be > >> thrown away when upgrading) or one that doesn't change the stats format > >> preferrable? > > > > [My opinion, will let Michael/John chime in] > > > > Ideally we would keep the stats on upgrade -- I think that's the better user > > experience. > > The release has not happened yet, so I would be fine to remain > flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the > cleanest approach in place for the release and the future. At the > end, we would throw automatically the data of a file that's marked > with a version that does not match with what we expect at load time, > so there's a limited impact on the user experience, except, well, > losing these stats, of course. +1. FWIW, the atttached is an example of what it looks like if we avoid file format change. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On 10/6/22 1:10 AM, Kyotaro Horiguchi wrote: > At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in >> On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote: >>> On 10/5/22 8:44 PM, Andres Freund wrote: >>>> I have two ideas how to fix it. As a design constraint, I'd be interested in >>>> the RMTs opinion on the following: >>>> Is a cleaner fix that changes the stats format (i.e. existing stats will be >>>> thrown away when upgrading) or one that doesn't change the stats format >>>> preferrable? >>> >>> [My opinion, will let Michael/John chime in] >>> >>> Ideally we would keep the stats on upgrade -- I think that's the better user >>> experience. >> >> The release has not happened yet, so I would be fine to remain >> flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the >> cleanest approach in place for the release and the future. Yes, I agree with this. > At the >> end, we would throw automatically the data of a file that's marked >> with a version that does not match with what we expect at load time, >> so there's a limited impact on the user experience, except, well, >> losing these stats, of course. I'm fine with this. > +1. FWIW, the atttached is an example of what it looks like if we > avoid file format change. Thanks for the quick turnaround. I'll let others chime in on the review. Thanks, Jonathan
Attachment
Hi, On 2022-10-06 14:10:46 +0900, Kyotaro Horiguchi wrote: > +1. FWIW, the atttached is an example of what it looks like if we > avoid file format change. What about if we go the other direction - simply remove the name from the stats entry at all. I don't actually think we need it anymore. Unless I am missing something right now - entirely possible! - the danger that pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a crash we throw away the old stats data and if a slot is dropped while shut down, we'll not load the slot data at startup. The attached is a rough prototype, but should be enough for Jaime to test and Horiguchi to test / review / think about. Amit, I CCed you, since you've thought a bunch about the dangers in this area too. Greetings, Andres Freund
Attachment
On Fri, Oct 7, 2022 at 8:00 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2022-10-06 14:10:46 +0900, Kyotaro Horiguchi wrote: > > +1. FWIW, the atttached is an example of what it looks like if we > > avoid file format change. > > What about if we go the other direction - simply remove the name from the > stats entry at all. I don't actually think we need it anymore. Unless I am > missing something right now - entirely possible! - the danger that > pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a > crash we throw away the old stats data and if a slot is dropped while shut > down, we'll not load the slot data at startup. +1. I think it works. Since the replication slot index doesn't change during server running we can fetch the name from ReplicationSlotCtl->replication_slots. If we don't need the name in stats entry, pgstat_acquire_replslot() is no longer necessary? Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Fri, 7 Oct 2022 12:14:40 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in > > What about if we go the other direction - simply remove the name from the > > stats entry at all. I don't actually think we need it anymore. Unless I am > > missing something right now - entirely possible! - the danger that > > pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a > > crash we throw away the old stats data and if a slot is dropped while shut > > down, we'll not load the slot data at startup. The key point of this is this: + * XXX: I think there cannot actually be data from an older slot + * here. After a crash we throw away the old stats data and if a slot is + * dropped while shut down, we'll not load the slot data at startup. I think this is true. Assuming that we don't recreate or rename objects that have stats after writing out stats, we won't have stats for a different object with the same name. If we can rely on that fact, the existing check in pgstat_acquire_replslot() becomes useless. Thus we don't need to store object name in stats entry. I agree to that. > +1. I think it works. Since the replication slot index doesn't change > during server running we can fetch the name from > ReplicationSlotCtl->replication_slots. That access seems safe in a bit different aspect, too. Both checkpointer (and walsender) properly initialize ReplicationSlotCtl. > If we don't need the name in stats entry, pgstat_acquire_replslot() is > no longer necessary? I think so. The entry will be created at the first report. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi, On 2022-10-07 15:30:43 +0900, Kyotaro Horiguchi wrote: > At Fri, 7 Oct 2022 12:14:40 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in > > > What about if we go the other direction - simply remove the name from the > > > stats entry at all. I don't actually think we need it anymore. Unless I am > > > missing something right now - entirely possible! - the danger that > > > pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a > > > crash we throw away the old stats data and if a slot is dropped while shut > > > down, we'll not load the slot data at startup. > > The key point of this is this: > > + * XXX: I think there cannot actually be data from an older slot > + * here. After a crash we throw away the old stats data and if a slot is > + * dropped while shut down, we'll not load the slot data at startup. > > I think this is true. Assuming that we don't recreate or rename > objects that have stats after writing out stats, we won't have stats > for a different object with the same name. Thanks for thinking through this! > If we can rely on that fact, the existing check in pgstat_acquire_replslot() > becomes useless. Thus we don't need to store object name in stats entry. I > agree to that. I don't agree with this aspect. I think it's better to ensure that the stats object exists when acquiring a slot, rather than later, when reporting. It's a lot simpler to think through the lock nesting etc that way. I'd also eventually like to remove the stats that are currently kept separately in ReorderBuffer, and that will be easier/cheaper if we can rely on the stats objects to have already been created. Greetings, Andres Freund
Hi, On 2022-10-07 12:00:56 -0700, Andres Freund wrote: > On 2022-10-07 15:30:43 +0900, Kyotaro Horiguchi wrote: > > The key point of this is this: > > > > + * XXX: I think there cannot actually be data from an older slot > > + * here. After a crash we throw away the old stats data and if a slot is > > + * dropped while shut down, we'll not load the slot data at startup. > > > > I think this is true. Assuming that we don't recreate or rename > > objects that have stats after writing out stats, we won't have stats > > for a different object with the same name. > > Thanks for thinking through this! > > If we can rely on that fact, the existing check in pgstat_acquire_replslot() > > becomes useless. Thus we don't need to store object name in stats entry. I > > agree to that. > > I don't agree with this aspect. I think it's better to ensure that the stats > object exists when acquiring a slot, rather than later, when reporting. It's a > lot simpler to think through the lock nesting etc that way. > > I'd also eventually like to remove the stats that are currently kept > separately in ReorderBuffer, and that will be easier/cheaper if we can rely on > the stats objects to have already been created. Here's a cleaned up version of my earlier prototype. - I wrapped the access to ReplicationSlotCtl->replication_slots[i].data.name in a new function bool ReplicationSlotName(index, name). I'm not entirely happy with that name, as it sounds like a more general accessor than it is - I toyed with ReplicationSlotNameForIndex(), but that seemed somewhat unnecessary, I don't forsee a need for another name accessor. Anyone wants to weigh in? - Substantial comment and commit message polishing - I'm planning to drop PgStat_StatReplSlotEntry.slotname and a PGSTAT_FILE_FORMAT_ID bump in master and to rename slotname to slotname_unused in 15. - Self-review found a bug, I copy-pasted create=false in the call to pgstat_get_entry_ref() in pgstat_acquire_replslot(). This did *NOT* cause any test failures - clearly our test coverage in this area is woefully inadequate. - This patch does not contain a test for the fix. I think this can only be tested by a tap test starting pg_recvlogical in the background and checking pg_recvlogical's output. That type of test is notoriously hard to be reliable, so committing it shortly before the release is wrapped seems like a bad idea. I manually verified that: - a continually active slot reporting stats after pgstat_reset_replslot() works correctly (this is what crashed before) - replslot stats reporting works correctly after stats were removed - upgrading from pre-fix to post-fix preserves stats (when keeping slotname / not increasing the stats version, of course) I'm planning to push this either later tonight (if I feel up to it after cooking dinner) or tomorrow morning PST, due to the release wrap deadline. Greetings, Andres Freund
Attachment
Hi, On 2022-10-07 19:56:33 -0700, Andres Freund wrote: > I'm planning to push this either later tonight (if I feel up to it after > cooking dinner) or tomorrow morning PST, due to the release wrap deadline. I looked this over again, tested a bit more, and pushed the adjusted 15 and master versions to github to get a CI run. Once that passes, as I expect, I'll push them for real. Greetings, Andres Freund
On 2022-10-08 09:53:50 -0700, Andres Freund wrote: > On 2022-10-07 19:56:33 -0700, Andres Freund wrote: > > I'm planning to push this either later tonight (if I feel up to it after > > cooking dinner) or tomorrow morning PST, due to the release wrap deadline. > > I looked this over again, tested a bit more, and pushed the adjusted 15 and > master versions to github to get a CI run. Once that passes, as I expect, I'll > push them for real. Those passed and thus pushed. Thanks for the report, debugging and review everyone! I think we need at least the following tests for replslots: - a reset while decoding is ongoing works correctly - replslot stats continue to be accumulated after stats have been removed I wonder how much it'd take to teach isolationtester to handle the replication protocol...
On 10/8/22 1:40 PM, Andres Freund wrote: > On 2022-10-08 09:53:50 -0700, Andres Freund wrote: >> On 2022-10-07 19:56:33 -0700, Andres Freund wrote: >>> I'm planning to push this either later tonight (if I feel up to it after >>> cooking dinner) or tomorrow morning PST, due to the release wrap deadline. >> >> I looked this over again, tested a bit more, and pushed the adjusted 15 and >> master versions to github to get a CI run. Once that passes, as I expect, I'll >> push them for real. > > Those passed and thus pushed. > > Thanks for the report, debugging and review everyone! Thanks for the quick turnaround! I've closed the open item. Thanks, Jonathan
Attachment
On Sun, Oct 9, 2022 at 2:42 AM Andres Freund <andres@anarazel.de> wrote: > > On 2022-10-08 09:53:50 -0700, Andres Freund wrote: > > On 2022-10-07 19:56:33 -0700, Andres Freund wrote: > > > I'm planning to push this either later tonight (if I feel up to it after > > > cooking dinner) or tomorrow morning PST, due to the release wrap deadline. > > > > I looked this over again, tested a bit more, and pushed the adjusted 15 and > > master versions to github to get a CI run. Once that passes, as I expect, I'll > > push them for real. > > Those passed and thus pushed. > > Thanks for the report, debugging and review everyone! Thanks! > > > I think we need at least the following tests for replslots: > - a reset while decoding is ongoing works correctly > - replslot stats continue to be accumulated after stats have been removed > > > I wonder how much it'd take to teach isolationtester to handle the replication > protocol... I think we can do these tests by using pg_recvlogical in TAP tests. I've attached a patch to do that. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Hi, On 2022-10-11 17:10:52 +0900, Masahiko Sawada wrote: > +# Reset the replication slot statistics. > +$node->safe_psql('postgres', > + "SELECT pg_stat_reset_replication_slot('regression_slot');"); > +my $result = $node->safe_psql('postgres', > + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'" > +); Typo in the slot name "regrssion_slot" instead of "regression_slot". We can't use * here, because that'll include the reset timestamp. > +# Teardown the node so the statistics is removed. > +$pg_recvlogical->kill_kill; > +$node->teardown_node; > +$node->start; ISTM that removing the file instead of shutting down the cluster with force would make it a more targeted test. > +# Check if the replication slot statistics have been removed. > +$result = $node->safe_psql('postgres', > + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'" > +); > +is($result, "", "replication slot statistics are removed"); Same typo as above. We can't assert a specific result here either, because recvlogical will have processed a bunch of changes. Perhaps we could check at least that the reset time is NULL? > +# Test if the replication slot staistics continue to be accumulated even after s/staistics/statistics/ Greetings, Andres Freund
On Thu, Oct 13, 2022 at 1:21 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2022-10-11 17:10:52 +0900, Masahiko Sawada wrote: > > +# Reset the replication slot statistics. > > +$node->safe_psql('postgres', > > + "SELECT pg_stat_reset_replication_slot('regression_slot');"); > > +my $result = $node->safe_psql('postgres', > > + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'" > > +); > > Typo in the slot name "regrssion_slot" instead of "regression_slot". We can't > use * here, because that'll include the reset timestamp. Fixed. > > > > +# Teardown the node so the statistics is removed. > > +$pg_recvlogical->kill_kill; > > +$node->teardown_node; > > +$node->start; > > ISTM that removing the file instead of shutting down the cluster with force > would make it a more targeted test. Agreed. > > > > +# Check if the replication slot statistics have been removed. > > +$result = $node->safe_psql('postgres', > > + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'" > > +); > > +is($result, "", "replication slot statistics are removed"); > > Same typo as above. We can't assert a specific result here either, because > recvlogical will have processed a bunch of changes. Perhaps we could check at > least that the reset time is NULL? Agreed. > > > > +# Test if the replication slot staistics continue to be accumulated even after > > s/staistics/statistics/ Fixed. I've attached an updated patch. I've added the common function to start pg_recvlogical and wait for it to become active. Please review it. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Hi, On 2022-10-13 15:57:28 +0900, Masahiko Sawada wrote: > I've attached an updated patch. I've added the common function to > start pg_recvlogical and wait for it to become active. Please review > it. > +# Start pg_recvlogical process and wait for it to become active. > +sub start_pg_recvlogical > +{ > + my ($node, $slot_name, $create_slot) = @_; > + > + my @cmd = ( > + 'pg_recvlogical', '-S', "$slot_name", '-d', > + $node->connstr('postgres'), > + '--start', '--no-loop', '-f', '-'); > + push @cmd, '--create-slot' if $create_slot; > + > + # start pg_recvlogical process. > + my $pg_recvlogical = IPC::Run::start(@cmd); > + > + # Wait for the replication slot to become active. > + $node->poll_query_until('postgres', > + "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = '$slot_name' AND active_pid IS NOT NULL)" > + ) or die "slot never became active"; > + > + return $pg_recvlogical; > +} Because you never process the output from pg_recvlogical I think this test will fail if the pipe buffer is small (or more changes are made). I think either it needs to output to a file, or we need to process the output. A file seems the easier solution in this case, we don't really care about what changes are streamed to the client, right? > +$node = PostgreSQL::Test::Cluster->new('test2'); > +$node->init(allows_streaming => 'logical'); > +$node->start; > +$node->safe_psql('postgres', "CREATE TABLE test(i int)"); Why are we creating a new cluster? Initdbs takes a fair bit of time on some platforms, so this seems unnecessary? Greetings, Andres Freund
On Thu, Oct 20, 2022 at 6:54 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2022-10-13 15:57:28 +0900, Masahiko Sawada wrote: > > I've attached an updated patch. I've added the common function to > > start pg_recvlogical and wait for it to become active. Please review > > it. > > > +# Start pg_recvlogical process and wait for it to become active. > > +sub start_pg_recvlogical > > +{ > > + my ($node, $slot_name, $create_slot) = @_; > > + > > + my @cmd = ( > > + 'pg_recvlogical', '-S', "$slot_name", '-d', > > + $node->connstr('postgres'), > > + '--start', '--no-loop', '-f', '-'); > > + push @cmd, '--create-slot' if $create_slot; > > + > > + # start pg_recvlogical process. > > + my $pg_recvlogical = IPC::Run::start(@cmd); > > + > > + # Wait for the replication slot to become active. > > + $node->poll_query_until('postgres', > > + "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = '$slot_name' AND active_pid IS NOTNULL)" > > + ) or die "slot never became active"; > > + > > + return $pg_recvlogical; > > +} > > Because you never process the output from pg_recvlogical I think this test > will fail if the pipe buffer is small (or more changes are made). I think > either it needs to output to a file, or we need to process the output. Okay, but how can we test this situation? As far as I tested, if we don't specify the redirection of pg_recvlogical's output as above, pg_recvlogical's stdout and stderr are output to the log file. So I could not reproduce the issue you're concerned about. Which pipe do you refer to? > > A file seems the easier solution in this case, we don't really care about what > changes are streamed to the client, right? > > > > +$node = PostgreSQL::Test::Cluster->new('test2'); > > +$node->init(allows_streaming => 'logical'); > > +$node->start; > > +$node->safe_psql('postgres', "CREATE TABLE test(i int)"); > > Why are we creating a new cluster? Initdbs takes a fair bit of time on some > platforms, so this seems unnecessary? Agreed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com