[Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident== REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident ==REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX" - Mailing list pgsql-hackers

From Neha Sharma
Subject [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident== REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident ==REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX"
Date
Msg-id CANiYTQsL1Hb8_Km08qd32svrqNumXLJeoGo014O7VZymgOhZEA@mail.gmail.com
Whole thread Raw
Responses Re: [Logical Replication] TRAP:FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX"  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,
I am getting a server crash on publication server on HEAD for the below test case.

Commit: b9c130a1fdf16cd99afb390c186d19acaea7d132

Data setup:
Publication server:
wal_level = logical
max_wal_senders = 10
max_replication_slots = 15
wal_log_hints = on
hot_standby_feedback = on
wal_receiver_status_interval = 1
listen_addresses='*'
log_min_messages=debug1
wal_sender_timeout = 0
logical_decoding_work_mem=64kB

Subscription server:
wal_level = logical
wal_log_hints = on
hot_standby_feedback = on
wal_receiver_status_interval = 1
log_min_messages=debug1
port=5433
logical_decoding_work_mem=64kB

Test case:
Publication server:
create table test(a int);
create publication test_pub for all tables;
alter table test replica identity NOTHING ;

Subscription server:
create table test(a int);
create subscription test_sub CONNECTION 'host=172.16.208.32 port=5432 dbname=postgres user=centos' PUBLICATION test_pub WITH ( slot_name = test_slot_sub);

Publication server:
insert into test values(generate_series(1,5),'aa');

After executing the DML in publication server ,it crashed with the mentioned assert.

Publication Server log File snippet:
2020-01-07 11:54:00.476 UTC [17417] DETAIL:  Streaming transactions committing after 0/163CC30, reading WAL from 0/163CC30.
2020-01-07 11:54:00.476 UTC [17417] LOG:  logical decoding found consistent point at 0/163CC30
2020-01-07 11:54:00.476 UTC [17417] DETAIL:  There are no running transactions.
TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX", File: "proto.c", Line: 148)
postgres: walsender centos 172.16.208.32(40292) idle(ExceptionalCondition+0x53)[0x8ca453]
postgres: walsender centos 172.16.208.32(40292) idle[0x74c515]
/home/centos/PG_master/postgresql/inst/lib/pgoutput.so(+0x2114)[0x7f3bb463d114]
postgres: walsender centos 172.16.208.32(40292) idle[0x747fa8]
postgres: walsender centos 172.16.208.32(40292) idle(ReorderBufferCommit+0x12ee)[0x75187e]
postgres: walsender centos 172.16.208.32(40292) idle[0x7455a8]
postgres: walsender centos 172.16.208.32(40292) idle(LogicalDecodingProcessRecord+0x2ea)[0x74593a]
postgres: walsender centos 172.16.208.32(40292) idle[0x766c24]
postgres: walsender centos 172.16.208.32(40292) idle[0x7693a2]
postgres: walsender centos 172.16.208.32(40292) idle(exec_replication_command+0xbb1)[0x76a091]
postgres: walsender centos 172.16.208.32(40292) idle(PostgresMain+0x4b9)[0x7b1099]
postgres: walsender centos 172.16.208.32(40292) idle[0x482bc7]
postgres: walsender centos 172.16.208.32(40292) idle(PostmasterMain+0xdbf)[0x73339f]
postgres: walsender centos 172.16.208.32(40292) idle(main+0x44f)[0x48403f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f3bc53f23d5]
postgres: walsender centos 172.16.208.32(40292) idle[0x4840a6]
2020-01-07 11:54:00.802 UTC [17359] LOG:  server process (PID 17417) was terminated by signal 6: Aborted
2020-01-07 11:54:00.802 UTC [17359] LOG:  terminating any other active server processes
2020-01-07 11:54:00.802 UTC [17413] WARNING:  terminating connection because of crash of another server process
2020-01-07 11:54:00.802 UTC [17413] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.


Stack Trace:
Core was generated by `postgres: walsender centos 172.16.208.32(40286) idle      '.
Program terminated with signal 6, Aborted.
#0  0x00007f3bc5406207 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007f3bc5406207 in raise () from /lib64/libc.so.6
#1  0x00007f3bc54078f8 in abort () from /lib64/libc.so.6
#2  0x00000000008ca472 in ExceptionalCondition (
    conditionName=conditionName@entry=0xa5b9c8 "rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX", errorType=errorType@entry=0x918fe9 "FailedAssertion", fileName=fileName@entry=0xa5b903 "proto.c", lineNumber=lineNumber@entry=148) at assert.c:67
#3  0x000000000074c515 in logicalrep_write_insert (out=0xe8c768, rel=rel@entry=0x7f3bc69548d8, newtuple=0x7f3bb3e3a068) at proto.c:146
#4  0x00007f3bb463d114 in pgoutput_change (ctx=0xe8a8b0, txn=<optimized out>, relation=0x7f3bc69548d8, change=0xf3e018) at pgoutput.c:345
#5  0x0000000000747fa8 in change_cb_wrapper (cache=<optimized out>, txn=<optimized out>, relation=<optimized out>, change=<optimized out>) at logical.c:754
#6  0x000000000075187e in ReorderBufferCommit (rb=0xf2e090, xid=xid@entry=489, commit_lsn=23318344, end_lsn=<optimized out>, commit_time=commit_time@entry=631713235036663, origin_id=origin_id@entry=0,
    origin_lsn=origin_lsn@entry=0) at reorderbuffer.c:1661
#7  0x00000000007455a8 in DecodeCommit (xid=489, parsed=0x7ffec53910a0, buf=0x7ffec5391260, ctx=0xe8a8b0) at decode.c:637
#8  DecodeXactOp (ctx=0xe8a8b0, buf=buf@entry=0x7ffec5391260) at decode.c:245
#9  0x000000000074593a in LogicalDecodingProcessRecord (ctx=0xe8a8b0, record=0xe8ab70) at decode.c:114
#10 0x0000000000766c24 in XLogSendLogical () at walsender.c:2806
#11 0x00000000007693a2 in WalSndLoop (send_data=send_data@entry=0x766bc0 <XLogSendLogical>) at walsender.c:2230
#12 0x000000000076a091 in StartLogicalReplication (cmd=0xee9d68) at walsender.c:1153
#13 exec_replication_command (cmd_string=cmd_string@entry=0xe652d0 "START_REPLICATION SLOT \"test_slot_sub\" LOGICAL 0/0 (proto_version '1', publication_names '\"test_pub\"')") at walsender.c:1576
#14 0x00000000007b1099 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xe90a58, dbname=0xe90960 "postgres", username=<optimized out>) at postgres.c:4287
#15 0x0000000000482bc7 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4498
#16 BackendStartup (port=0xe88920) at postmaster.c:4189
#17 ServerLoop () at postmaster.c:1727
#18 0x000000000073339f in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0xe5fe20) at postmaster.c:1400
#19 0x000000000048403f in main (argc=3, argv=0xe5fe20) at main.c:210

Thanks.
--
Regards,
Neha Sharma

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Greatest Common Divisor
Next
From: Tom Lane
Date:
Subject: Re: Greatest Common Divisor