RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Msg-id TYAPR01MB586607E3786DC241054DA7F2F53B9@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
Dear Hou,

Thank you for updating the patch!
While testing it, I found that the leader apply worker crashes in the following case.
I will dig into the failure further, but I am reporting it here for the record.


1. Change the macros to force changes to be written to a temporary file.

```
-#define CHANGES_THRESHOLD      1000
-#define SHM_SEND_TIMEOUT_MS    10000
+#define CHANGES_THRESHOLD      10
+#define SHM_SEND_TIMEOUT_MS    100
```

2. Set logical_decoding_work_mem to 64kB on the publisher.
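
For reference, a minimal sketch of one way to apply this setting. This is an assumption for illustration; the attached script may set it differently, for example directly in postgresql.conf.

```
-- Run on the publisher. 64kB is the minimum allowed value, so large
-- transactions exceed the limit quickly and get streamed.
ALTER SYSTEM SET logical_decoding_work_mem = '64kB';
-- Reload the configuration so that the new value takes effect.
SELECT pg_reload_conf();
```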

3. Insert a large amount of data on the publisher.

```
publisher=# \d tbl 
                Table "public.tbl"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 c      | integer |           |          | 
Publications:
    "pub"


publisher=# BEGIN;
BEGIN
publisher=*# INSERT INTO tbl SELECT i FROM generate_series(1, 5000000) s(i);
INSERT 0 5000000
publisher=*# COMMIT;
```

-> The leader apply worker (LA) crashes on the subscriber! The backtrace is as follows.


```
(gdb) bt
#0  0x00007f2663ae4387 in raise () from /lib64/libc.so.6
#1  0x00007f2663ae5a78 in abort () from /lib64/libc.so.6
#2  0x0000000000ad0a95 in ExceptionalCondition (conditionName=0xcabdd0 "mqh->mqh_partial_bytes <= nbytes", 
    fileName=0xcabc30 "../src/backend/storage/ipc/shm_mq.c", lineNumber=420) at ../src/backend/utils/error/assert.c:66
#3  0x00000000008eaeb7 in shm_mq_sendv (mqh=0x271ebd8, iov=0x7ffc664a2690, iovcnt=1, nowait=false, force_flush=true)
    at ../src/backend/storage/ipc/shm_mq.c:420
#4  0x00000000008eac5a in shm_mq_send (mqh=0x271ebd8, nbytes=1, data=0x271f3c0, nowait=false, force_flush=true)
    at ../src/backend/storage/ipc/shm_mq.c:338
#5  0x0000000000880e18 in parallel_apply_free_worker (winfo=0x271f270, xid=735, stop_worker=true)
    at ../src/backend/replication/logical/applyparallelworker.c:368
#6  0x00000000008a3638 in apply_handle_stream_commit (s=0x7ffc664a2790)
    at ../src/backend/replication/logical/worker.c:2081
#7  0x00000000008a54da in apply_dispatch (s=0x7ffc664a2790) at ../src/backend/replication/logical/worker.c:3195
#8  0x00000000008a5a76 in LogicalRepApplyLoop (last_received=378674872)
    at ../src/backend/replication/logical/worker.c:3431
#9  0x00000000008a72ac in start_apply (origin_startpos=0) at ../src/backend/replication/logical/worker.c:4245
#10 0x00000000008a7d77 in ApplyWorkerMain (main_arg=0) at ../src/backend/replication/logical/worker.c:4555
#11 0x000000000084983c in StartBackgroundWorker () at ../src/backend/postmaster/bgworker.c:861
#12 0x0000000000854192 in do_start_bgworker (rw=0x26c0d20) at ../src/backend/postmaster/postmaster.c:5801
#13 0x000000000085457c in maybe_start_bgworkers () at ../src/backend/postmaster/postmaster.c:6025
#14 0x000000000085350b in sigusr1_handler (postgres_signal_arg=10) at ../src/backend/postmaster/postmaster.c:5182
#15 <signal handler called>
#16 0x00007f2663ba3b23 in __select_nocancel () from /lib64/libc.so.6
#17 0x000000000084edbc in ServerLoop () at ../src/backend/postmaster/postmaster.c:1768
#18 0x000000000084e737 in PostmasterMain (argc=3, argv=0x2690f60) at ../src/backend/postmaster/postmaster.c:1476
#19 0x000000000074adfb in main (argc=3, argv=0x2690f60) at ../src/backend/main/main.c:197
``` 

Please see the attached script, which reproduces the failure in my environment.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


