Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date	January 27, 2023 09:23:58
Msg-id	CA+hUKG+YkAnOLrKKcy-FLjoVUV3r=L+c28gzMSL58Cv9jC4nvg@mail.gmail.com
In response to	Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) (Thomas Munro <thomas.munro@gmail.com>)
Responses	Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
List	pgsql-hackers

After 1000 make check loops, and 1000 make -C src/test/modules/test_shm_mq
check loops, on the same FBSD 13.1 machine as elver which has failed
like this once before, I haven't been able to reproduce this on
REL_12_STABLE.  Not really sure how to chase this, but if you see this
situation again, I'd be interested to see the output of fstat -p PID
(shows bytes in pipes) and procstat -j PID (shows pending signals) for
all PIDs involved (before connecting a debugger or doing anything else
that might make it return with EINTR, after which we know it continues
happily because it then sees latch->is_set next time around the loop).
If poll() is not returning when there are bytes ready to read from the
self-pipe, which fstat can show, I think that'd indicate a kernel bug.
The same goes if procstat -j shows signals pending but it's somehow still
blocked in the syscall.  Otherwise, it might indicate a compiler or postgres bug,
but I don't have any particular theories.
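
For anyone following along who hasn't looked at the latch code, here is a
rough sketch of the self-pipe pattern described above: a signal handler
sets a flag and writes a byte to a pipe, and the waiter rechecks the flag
each time poll() returns.  This is not PostgreSQL's implementation; the
SelfPipeLatch class and its method names are made up for illustration, and
it assumes a Unix-like system where Python's select.poll is available.

```python
import os
import select
import signal


class SelfPipeLatch:
    """Toy latch using the self-pipe trick (hypothetical, not PostgreSQL code)."""

    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()
        # Non-blocking on both ends: set() must never block, and draining
        # must stop as soon as the pipe is empty.
        os.set_blocking(self.read_fd, False)
        os.set_blocking(self.write_fd, False)
        self.is_set = False

    def set(self):
        # Intended to be called from a signal handler: set the flag first,
        # then make the pipe readable so a blocked poll() wakes up.
        self.is_set = True
        try:
            os.write(self.write_fd, b"\0")
        except BlockingIOError:
            pass  # pipe already full, so a wakeup is already pending

    def reset(self):
        self.is_set = False

    def _drain(self):
        # Discard wakeup bytes; the flag, not the byte count, carries the state.
        try:
            while os.read(self.read_fd, 4096):
                pass
        except BlockingIOError:
            pass

    def wait(self, timeout_ms):
        """Return True if the latch is set, False if poll() timed out."""
        poller = select.poll()
        poller.register(self.read_fd, select.POLLIN)
        while True:
            if self.is_set:  # rechecked on every trip around the loop
                return True
            if not poller.poll(timeout_ms):
                return False
            self._drain()


if __name__ == "__main__":
    latch = SelfPipeLatch()
    # Assumption for the demo: SIGALRM stands in for the wakeup signal.
    signal.signal(signal.SIGALRM, lambda signum, frame: latch.set())
    signal.setitimer(signal.ITIMER_REAL, 0.1)
    print("latch set:", latch.wait(5000))
```

In this sketch there are two independent wakeup paths, the byte in the pipe
and the interrupted syscall, which is why a waiter that stays blocked while
bytes are readable or signals are pending would point below user space.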
