Current master hangs under the debugger after Parallel Seq Scan (Linux, MacOS) - Mailing list pgsql-hackers

From Vladlen Popolitov
Subject Current master hangs under the debugger after Parallel Seq Scan (Linux, MacOS)
Date
Msg-id f28f3b45ec84bf9dc99fe129023a2d6b@postgrespro.ru
Whole thread Raw
Responses Re: Current master hangs under the debugger after Parallel Seq Scan (Linux, MacOS)
Re: Current master hangs under the debugger after Parallel Seq Scan (Linux, MacOS)
List pgsql-hackers
Hi!

During debug session I found, that queries with Parallel Seq Scan hang
in the current master - the leader worker waits indefinitely the signal
from parallel workers. A query is not possible to break, the leader
does not check interrupt status in the waiting loop.

1. How to reproduce:
a) Create table:

CREATE DATABASE expr;
\c expr
CREATE TABLE testexpr(
id INT,
val INT
);
INSERT INTO testexpr (id, val)
SELECT serie as id , MOD(serie, 10) as val
FROM generate_series(1,1000000) as serie;
EXPLAIN (ANALYZE) SELECT * FROM testexpr
WHERE val=1 AND id<30;

b) start debugger for this connection

c) Run command (parallel workers should be enabled as it is by default
configuration)
EXPLAIN (ANALYZE) SELECT * FROM testexpr
WHERE val=1 AND id<30;

d) Above query will start parallel worker(s). When worker(s) finish(es),
it/they send SIGUSR1 that is caught by debugger. When you dimiss
the signal message, you find that query continues to run, but really it
waits (in latch.c or in waiteventset.c depending on commit version).

2. Original commit with reproducible behaviour.
I tracked this behaviour down to commit
> commit 7202d72787d3b93b692feae62ee963238580c877
> Date:   Fri Feb 21 08:03:33 2025 +0100
> backend launchers void * arguments for binary data
> Change backend launcher functions to take void * for binary data
> instead of char *.  This removes the need for numerous casts.
> Discussion: 
> https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org


It could be, that this patch activated the side problem, that already 
was
in the system before. I looked for first commit with this problem from 6 
Jan 2025,
and 2 commits hanged the same way, but both did not reproduce it after 
repeat.
Starting from the patch above, the hang is reproduced on Linux and 
MacOS.

Also I afraid, the same behaviour will be for other types of parallel
workers under debugger (Parallel Hash etc).

-- 
Best regards,

Vladlen Popolitov.



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: vacuum_truncate configuration parameter and isset_offset
Next
From: Nikolay Shaplov
Date:
Subject: Re: vacuum_truncate configuration parameter and isset_offset