Does anyone have a stress test for parallel workers?
On a customer's new VM, I got this several times while (trying to) migrate their DB:
< 2019-07-23 10:33:51.552 CDT postgres >FATAL: postmaster exited during a parallel transaction
< 2019-07-23 10:33:51.552 CDT postgres >STATEMENT: CREATE UNIQUE INDEX
unused0_huawei_umts_nodeb_locell_201907_unique_idx ON child.unused0_huawei_umts_nodeb_locell_201907 USING btree ...
There's nothing in dmesg or in the postgres logs.
At first I thought it might be due to a network disconnection, then I
thought it was because we ran out of space (WAL), and then they had a faulty blade.
After that, I tried and failed to reproduce it a number of times, but it has
just recurred during what was intended to be their final restore. I've set
max_parallel_workers_per_gather=0, but I'm planning to try to diagnose the issue
in another instance. Ideally I'd use a minimal test, since I'm apparently going
to have to run under gdb to see how it's dying, or even which process is failing.
DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
CentOS release 6.9 (Final)
Linux alextelsasrv01 2.6.32-754.17.1.el6.x86_64 #1 SMP Tue Jul 2 12:42:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
version | PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
Justin