Re: subscription/015_stream sometimes breaks - Mailing list pgsql-hackers

From vignesh C
Subject Re: subscription/015_stream sometimes breaks
Date
Msg-id CALDaNm1vFH=K+N7KyVw1eg0z7tj9YuuNjuN4oRSr4VN6pvCK7A@mail.gmail.com
In response to Re: subscription/015_stream sometimes breaks  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Wed, 23 Aug 2023 at 02:25, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Aug 23, 2023 at 8:21 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > I didn't study it closely but it looks like there might be a second
> > deadlock, after the one that is expected by the test?  Examples from
> > the past couple of weeks:
>
> I should add, it's not correlated with the patches that cfbot is
> testing, and it's the most frequent failure for which that is the
> case.
>
>     suite     |    name    | distinct_patches | errors
> --------------+------------+------------------+--------
>  subscription | 015_stream |               47 |     61

I noticed that the test is failing because of a segmentation fault in the apply worker:
2023-08-22 19:07:22.403 UTC [3823023][logical replication parallel
worker][4/44:767] FATAL:  terminating logical replication worker due
to administrator command
2023-08-22 19:07:22.403 UTC [3823023][logical replication parallel
worker][4/44:767] CONTEXT:  processing remote data for replication
origin "pg_16397" during message type "STREAM STOP" in transaction 748
2023-08-22 19:07:22.404 UTC [3819892][postmaster][:0] DEBUG:
unregistering background worker "logical replication parallel apply
worker for subscription 16397"
2023-08-22 19:07:22.404 UTC [3819892][postmaster][:0] LOG:  background
worker "logical replication parallel worker" (PID 3823455) exited with
exit code 1
2023-08-22 19:07:22.404 UTC [3819892][postmaster][:0] DEBUG:
unregistering background worker "logical replication parallel apply
worker for subscription 16397"
2023-08-22 19:07:22.404 UTC [3819892][postmaster][:0] LOG:  background
worker "logical replication parallel worker" (PID 3823023) exited with
exit code 1
2023-08-22 19:07:22.419 UTC [3819892][postmaster][:0] LOG:  background
worker "logical replication apply worker" (PID 3822876) was terminated
by signal 11: Segmentation fault

The stack trace for this crash, taken from the buildfarm log at [1], is:
Core was generated by `postgres: subscriber: logical replication apply
worker for subscription 16397 '.
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Section `.reg-xstate/3822876' in core file too small.
#0  0x00000000007b461e in logicalrep_worker_stop_internal
(worker=<optimized out>, signo=<optimized out>) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/launcher.c:583
583 kill(worker->proc->pid, signo);
#0  0x00000000007b461e in logicalrep_worker_stop_internal
(worker=<optimized out>, signo=<optimized out>) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/launcher.c:583
#1  0x00000000007b565a in logicalrep_worker_detach () at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/launcher.c:774
#2  0x00000000007b49ff in logicalrep_worker_onexit (code=<optimized
out>, arg=<optimized out>) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/launcher.c:829
#3  0x00000000008034c5 in shmem_exit (code=<optimized out>) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/ipc.c:239
#4  0x00000000008033dc in proc_exit_prepare (code=1) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/ipc.c:194
#5  0x000000000080333d in proc_exit (code=1) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/ipc.c:107
#6  0x0000000000797068 in StartBackgroundWorker () at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:827
#7  0x000000000079f257 in do_start_bgworker (rw=0x284e750) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:5734
#8  0x000000000079b541 in maybe_start_bgworkers () at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:5958
#9  0x000000000079cb51 in process_pm_pmsignal () at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:5121
#10 0x000000000079b6bb in ServerLoop () at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1769
#11 0x000000000079aaa5 in PostmasterMain (argc=4, argv=<optimized
out>) at /home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1462
#12 0x00000000006d82a0 in main (argc=4, argv=0x27e3fd0) at
/home/bf/bf-build/dragonet/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:198
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad =
{64, 0 <repeats 27 times>}, _kill = {si_pid = 64, si_uid = 0}, _timer
= {si_tid = 64, si_overrun = 0, si_sigval = {sival_int = 0, sival_ptr
= 0x0}}, _rt = {si_pid = 64, si_uid = 0, si_sigval = {sival_int = 0,
sival_ptr = 0x0}}, _sigchld = {si_pid = 64, si_uid = 0, si_status = 0,
si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 0x40, _addr_lsb =
0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {si_band =
64, si_fd = 0}, _sigsys = {_call_addr = 0x40, _syscall = 0, _arch =
0}}}
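
One possible reading of the siginfo above, offered as an assumption rather than something verified against the core file: si_code = 1 is SEGV_MAPERR on Linux, and si_addr = 0x40 is a very small address, which would be consistent with worker->proc being NULL at the kill(worker->proc->pid, signo) call if the pid field happened to sit at offset 0x40. The sketch below only illustrates that arithmetic. struct fake_proc, struct fake_worker and both helper functions are hypothetical stand-ins, not PostgreSQL's PGPROC or LogicalRepWorker, and the 0x40 padding is chosen purely to match the reported fault address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for a per-process struct. This is NOT PGPROC;
 * the padding exists only so that pid lands at offset 0x40. */
struct fake_proc {
    char  pad[0x40];
    pid_t pid;
};

/* Hypothetical stand-in for a worker slot whose proc pointer may
 * already have been cleared by the time someone tries to signal it. */
struct fake_worker {
    struct fake_proc *proc;
};

static_assert(offsetof(struct fake_proc, pid) == 0x40,
              "pid is assumed to sit at offset 0x40 for this illustration");

/* Address that reading w->proc->pid would touch, computed without
 * actually dereferencing the pointer. */
static uintptr_t pid_access_address(const struct fake_worker *w)
{
    return (uintptr_t) w->proc + offsetof(struct fake_proc, pid);
}

/* Guarded read: returns 0 when no process is attached, so a caller
 * can skip the kill() instead of dereferencing NULL. */
static pid_t worker_pid_or_zero(const struct fake_worker *w)
{
    return w->proc != NULL ? w->proc->pid : 0;
}
```

With proc set to NULL, pid_access_address() yields 0x40, the same value as si_addr in the siginfo, and worker_pid_or_zero() returns 0 instead of faulting. Whether the real structure layout matches would need to be checked in the core.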

[1] - https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dragonet&dt=2023-08-22%2018%3A56%3A04&stg=subscription-check

Regards,
Vignesh


