Thread: Wal sender segfault
Hi, I'm trying to test logical decoding on a server under load.
I launched pg_recvlogical with the 'test_decoding' plugin, and the WAL sender crashed with a segfault after several minutes of work.
pg_recvlogical --start --slot test_slot --no-loop -d dbname -h 127.0.0.1 -p5432 -U dbuser -w -f /tmp/test_logical.xlog
postgres=# select version();
version
-----------------------------------------------------------------------------------------------
PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit
(1 row)

I have a core dump file (size 66G).
I launched gdb with the core file and got an incomplete backtrace:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fa9248af742 in ReorderBufferGetTupleBuf ()
(gdb) bt
#0 0x00007fa9248af742 in ReorderBufferGetTupleBuf ()
#1 0x00007fa9248acde2 in LogicalDecodingProcessRecord ()
#2 0x00007fa9248b53b4 in ?? ()
#3 0x00007fa9248b6ed3 in ?? ()
#4 0x00007fa9248b7d8a in exec_replication_command ()
#5 0x00007fa9248f39fe in PostgresMain ()
#6 0x00007fa9246bb92e in ?? ()
#7 0x00007fa92489e58b in PostmasterMain ()
#8 0x00007fa9246bcac2 in main ()
What can I do to get more information about the cause of this segfault?
--
Best regards,
Dmitriy Sarafannikov
Hi,

Thanks for the report!

On 2016-01-22 15:45:27 +0300, Dmitriy Sarafannikov wrote:
> Hi, i'm trying to test logical decoding on server under load.
> I launched pg_recvlogical with 'test_decoding' plugin and wal sender was crashed with segfault after several minutes of work.
>
> pg_recvlogical --start --slot test_slot --no-loop -d dbname -h 127.0.0.1 -p5432 -U dbuser -w -f /tmp/test_logical.xlog
>
> postgres=# select version();
> version
> -----------------------------------------------------------------------------------------------
> PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit
> (1 row)
>
> I have core dump file (size 66G)
>
> I launch gdb with core file and getting incomplete backtrace:
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x00007fa9248af742 in ReorderBufferGetTupleBuf ()
> (gdb) bt
> #0 0x00007fa9248af742 in ReorderBufferGetTupleBuf ()
> #1 0x00007fa9248acde2 in LogicalDecodingProcessRecord ()
> #2 0x00007fa9248b53b4 in ?? ()
> #3 0x00007fa9248b6ed3 in ?? ()
> #4 0x00007fa9248b7d8a in exec_replication_command ()
> #5 0x00007fa9248f39fe in PostgresMain ()
> #6 0x00007fa9246bb92e in ?? ()
> #7 0x00007fa92489e58b in PostmasterMain ()
> #8 0x00007fa9246bcac2 in main ()

Any chance that one of the modified tables in question uses REPLICA IDENTITY FULL? There's an open bug about too large rows produced by that, which we don't currently handle correctly. I'm working on fixing that bug.

> What can i do to get more info about the reason of this segfault?

You could post a reproducible example... Other than that, it's usually helpful to build postgres with debugging symbols enabled; that'd already give more context.

Regards,

Andres Freund
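One way to answer the REPLICA IDENTITY FULL question is to look at the relreplident column of pg_class, which is 'f' for tables set to FULL. The following is a minimal sketch, not something posted in the thread: the function name list_replica_identity_full is hypothetical, and the host, port, user and default database name are assumptions copied from the pg_recvlogical command above.

```shell
# List ordinary tables whose replica identity is FULL (relreplident = 'f').
# Assumption: connection settings are copied from the pg_recvlogical command
# in the original report; adjust them for your own server.
list_replica_identity_full() {
    # $1: database name (defaults to the placeholder "dbname")
    # -A: unaligned output, -t: rows only, so each line is schema.table
    psql -h 127.0.0.1 -p 5432 -U dbuser -d "${1:-dbname}" -At -c "
        SELECT n.nspname || '.' || c.relname
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'r' AND c.relreplident = 'f'
        ORDER BY 1;"
}

# Usage (prints one schema-qualified table name per line):
#   list_replica_identity_full dbname
```

An empty result would suggest that no ordinary table in that database uses REPLICA IDENTITY FULL.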
> Any chance that one of the modified tables in question uses REPLICA IDENTITY
> FULL? There's an open bug about too large rows produced by that, which
> we don't currently handle correctly. I'm working on fixing that bug.

Thanks,

Yes, I have 3 tables with REPLICA IDENTITY FULL. Two of these tables have many fields and large text fields.
--
Best regards,
Dmitriy Sarafannikov
On Fri, Jan 22, 2016 at 11:19 PM, Dmitriy Sarafannikov <dimon99901@mail.ru> wrote:
> Any chance that one of the modified tables in question uses REPLICA IDENTITY
> FULL? There's an open bug about too large rows produced by that, which
> we don't currently handle correctly. I'm working on fixing that bug.
>
> Thanks,
>
> Yes, i have 3 tables with REPLICA IDENTITY FULL. 2 of this tables have many
> fields and big text fields.

The original bug is here:
http://www.postgresql.org/message-id/CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com
--
Michael
(gdb) bt
#0 slist_pop_head_node (head=0x7fa92656bc10) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/include/lib/ilist.h:649
#1 ReorderBufferGetTupleBuf (rb=0x7fa92656bb90) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/logical/reorderbuffer.c:456
#2 0x00007fa9248acde2 in DecodeUpdate (ctx=<optimized out>, ctx=<optimized out>, buf=0x7fffecd7e300) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/logical/decode.c:651
#3 DecodeHeapOp (buf=0x7fffecd7e300, ctx=0x7fa92655db60) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/logical/decode.c:430
#4 LogicalDecodingProcessRecord (ctx=0x7fa92655db60, record=<optimized out>) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/logical/decode.c:115
#5 0x00007fa9248b53b4 in XLogSendLogical () at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/walsender.c:2428
#6 0x00007fa9248b6ed3 in WalSndLoop (send_data=send_data@entry=0x7fa9248b5350 <XLogSendLogical>) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/walsender.c:1829
#7 0x00007fa9248b7d8a in StartLogicalReplication (cmd=<optimized out>) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/walsender.c:996
#8 exec_replication_command (cmd_string=cmd_string@entry=0x7fa926494820 "START_REPLICATION SLOT \"test_slot\" LOGICAL 0/0")
at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/replication/walsender.c:1321
#9 0x00007fa9248f39fe in PostgresMain (argc=<optimized out>, argv=argv@entry=0x7fa92647b448, dbname=0x7fa92647b370 "dbname", username=<optimized out>)
at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/tcop/postgres.c:4077
#10 0x00007fa9246bb92e in BackendRun (port=0x7fa9264bc3d0) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/postmaster/postmaster.c:4252
#11 BackendStartup (port=0x7fa9264bc3d0) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/postmaster/postmaster.c:3917
#12 ServerLoop () at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/postmaster/postmaster.c:1678
#13 0x00007fa92489e58b in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/postmaster/postmaster.c:1287
#14 0x00007fa9246bcac2 in main (argc=5, argv=0x7fa92647a570) at /build/postgresql-9.4-MZhK6O/postgresql-9.4-9.4.5/build/../src/backend/main/main.c:228
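A symbolized backtrace like the one above can also be captured without an interactive gdb session. This is a minimal sketch, assuming debugging symbols for the server binary are already installed; the function name core_backtrace is hypothetical, and both paths in the usage line are placeholders rather than values from this thread.

```shell
# Print a backtrace from a core file non-interactively.
# Assumption: debugging symbols matching the crashed binary are installed,
# otherwise frames show up as "?? ()" as in the first report.
core_backtrace() {
    # $1: path to the postgres executable that crashed
    # $2: path to the core file
    # -batch exits after running the commands given with -ex
    gdb -batch -ex 'bt' "$1" "$2"
}

# Usage (placeholder paths):
#   core_backtrace /path/to/postgres /path/to/core > backtrace.txt
```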
Sunday, 24 January 2016, 0:08 +09:00 from Michael Paquier <michael.paquier@gmail.com>:
> On Fri, Jan 22, 2016 at 11:19 PM, Dmitriy Sarafannikov <dimon99901@mail.ru> wrote:
> > Any chance that one of the modified tables in question uses REPLICA IDENTITY
> > FULL? There's an open bug about too large rows produced by that, which
> > we don't currently handle correctly. I'm working on fixing that bug.
> >
> > Thanks,
> >
> > Yes, i have 3 tables with REPLICA IDENTITY FULL. 2 of this tables have many
> > fields and big text fields.
> The original bug is here:
> http://www.postgresql.org/message-id/CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com
> --
> Michael
--
Best regards,
Dmitriy Sarafannikov