On Sat, Apr 8, 2023 at 12:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I committed the main patch.
>
> BTW, it was easy to miss in all the buildfarm noise from
> last-possible-minute patches, but chimaera just showed something
> that looks like a bug in this code [1]:
>
> 2023-04-08 12:25:28.709 UTC [18027:321] pg_regress/join_hash LOG: statement: savepoint settings;
> 2023-04-08 12:25:28.709 UTC [18027:322] pg_regress/join_hash LOG: statement: set local max_parallel_workers_per_gather= 2;
> 2023-04-08 12:25:28.710 UTC [18027:323] pg_regress/join_hash LOG: statement: explain (costs off)
> select count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
> 2023-04-08 12:25:28.710 UTC [18027:324] pg_regress/join_hash LOG: statement: select count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
> TRAP: failed Assert("BarrierParticipants(&batch->batch_barrier) == 1"), File: "nodeHash.c", Line: 2118, PID: 19147
> postgres: parallel worker for PID 18027 (ExceptionalCondition+0x84)[0x10ae2bfa4]
> postgres: parallel worker for PID 18027 (ExecParallelPrepHashTableForUnmatched+0x224)[0x10aa67544]
> postgres: parallel worker for PID 18027 (+0x3db868)[0x10aa6b868]
> postgres: parallel worker for PID 18027 (+0x3c4204)[0x10aa54204]
> postgres: parallel worker for PID 18027 (+0x3c81b8)[0x10aa581b8]
> postgres: parallel worker for PID 18027 (+0x3b3d28)[0x10aa43d28]
> postgres: parallel worker for PID 18027 (standard_ExecutorRun+0x208)[0x10aa39768]
> postgres: parallel worker for PID 18027 (ParallelQueryMain+0x2bc)[0x10aa4092c]
> postgres: parallel worker for PID 18027 (ParallelWorkerMain+0x660)[0x10a874870]
> postgres: parallel worker for PID 18027 (StartBackgroundWorker+0x2a8)[0x10ab8abf8]
> postgres: parallel worker for PID 18027 (+0x50290c)[0x10ab9290c]
> postgres: parallel worker for PID 18027 (+0x5035e4)[0x10ab935e4]
> postgres: parallel worker for PID 18027 (PostmasterMain+0x1304)[0x10ab96334]
> postgres: parallel worker for PID 18027 (main+0x86c)[0x10a79daec]

I haven't done much debugging on buildfarm animals before. I don't
suppose there is any way to get access to the core file itself? I'd like
to see how many participants the batch barrier had at the time of the
assertion failure. I assume it was 2, but I want to make sure I
understand the race.

- Melanie