On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote:
> Hi,
>
> On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:
> > I'm not sure what is causing this, but I have seen this twice. The
> > second time without activity after changing the set of tables in a
> > PUBLICATION.
This crash happens after resetting the statistics of a replication slot.
> Can you describe the steps to reproduce?
>
bin/pg_ctl -D data1 initdb
bin/pg_ctl -D data1 -l logfile1 -o "-c port=54315 -c wal_level=logical" start
bin/psql -p 54315 postgres <<EOF
create table t1 (i int primary key);
create publication pub1 for table t1;
EOF
bin/pg_ctl -D data2 initdb
bin/pg_ctl -D data2 -l logfile2 -o "-c port=54316" start
bin/psql -p 54316 postgres <<EOF
create table t1 (i int primary key);
create subscription sub1 connection 'host=/tmp port=54315 dbname=postgres' publication pub1;
EOF
bin/psql -p 54315 postgres <<EOF
select pg_stat_reset_replication_slot('sub1');
insert into t1 values(1);
EOF
> Which git commit does this happen on?
>
just tested again on f5047c1293acce3c6c3802b06825aa3a9f9aa55a
>
> > gdb says that debug_query_string contains:
> >
> > """
> > START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
> > """
> >
> > attached the backtrace.
> >
>
> > #2 0x00005559bfd4f0ed in ExceptionalCondition (
> > conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0",
> > errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
> > lineNumber=89) at assert.c:69
>
> what are statent->slotname and slot->data.name?
>
The problem seems to be that the reset zeroes the whole stats entry, and
that entry includes the name of the replication slot. This simple patch
fixes it, though I'm not sure it's the right fix.
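
To make the suspected failure mode concrete, here is a minimal standalone sketch. It is not the attached patch and not the actual pgstat code: the struct, its fields, and every sketch_* name are hypothetical stand-ins. It only assumes that the reset zeroes an entry that also carries the slot name, which would make a later name comparison such as the one in the failed assertion come out false.

```c
#include <string.h>

#define SKETCH_NAMELEN 64		/* hypothetical; stands in for the real name length */

/* Hypothetical stand-in for a per-slot stats entry that embeds the slot name. */
typedef struct SketchSlotStats
{
	char		slotname[SKETCH_NAMELEN];
	long		spill_txns;
	long		total_bytes;
} SketchSlotStats;

/* Naive reset: zeroes the counters and, as a side effect, the name too. */
void
sketch_reset_naive(SketchSlotStats *s)
{
	memset(s, 0, sizeof(*s));
}

/* Reset that zeroes the counters but restores the name afterwards. */
void
sketch_reset_keep_name(SketchSlotStats *s)
{
	char		saved[SKETCH_NAMELEN];

	memcpy(saved, s->slotname, sizeof(saved));
	memset(s, 0, sizeof(*s));
	memcpy(s->slotname, saved, sizeof(saved));
}

/* Models the check that failed: does the entry's name match the slot's? */
int
sketch_name_matches(const SketchSlotStats *s, const char *slot_name)
{
	return strncmp(s->slotname, slot_name, SKETCH_NAMELEN) == 0;
}
```

With an entry named "sub1", sketch_name_matches() returns 0 after sketch_reset_naive() and 1 after sketch_reset_keep_name(). Whether preserving the name during the reset or relaxing the check is the better fix in the real code is a separate question.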
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL