Re: Parallel Full Hash Join - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Parallel Full Hash Join
Msg-id CAAKRu_ak4DBfT9BCkjD-UZLcXEwewM6sWa3HWf5=prxhbKghXA@mail.gmail.com
In response to Re: Parallel Full Hash Join  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Parallel Full Hash Join
List pgsql-hackers
On Sat, Nov 6, 2021 at 11:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> > Rebased patches attached. I will change status back to "Ready for Committer"
>
> The CI showed a crash on freebsd, which I reproduced.
> https://cirrus-ci.com/task/5203060415791104
>
> The crash is evidenced in 0001 - but only ~15% of the time.
>
> I think it's the same thing which was committed and then reverted here, so
> maybe I'm not saying anything new.
>
> https://commitfest.postgresql.org/33/3031/
> https://www.postgresql.org/message-id/flat/20200929061142.GA29096@paquier.xyz
>
> (gdb) p pstate->build_barrier->phase
> Cannot access memory at address 0x7f82e0fa42f4
>
> #1  0x00007f13de34f801 in __GI_abort () at abort.c:79
> #2  0x00005638e6a16d28 in ExceptionalCondition (conditionName=conditionName@entry=0x5638e6b62850 "!pstate || BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUN",
>     errorType=errorType@entry=0x5638e6a6f00b "FailedAssertion", fileName=fileName@entry=0x5638e6b625be "nodeHash.c", lineNumber=lineNumber@entry=3305) at assert.c:69
> #3  0x00005638e678085b in ExecHashTableDetach (hashtable=0x5638e8e6ca88) at nodeHash.c:3305
> #4  0x00005638e6784656 in ExecShutdownHashJoin (node=node@entry=0x5638e8e57cb8) at nodeHashjoin.c:1400
> #5  0x00005638e67666d8 in ExecShutdownNode (node=0x5638e8e57cb8) at execProcnode.c:812
> #6  ExecShutdownNode (node=0x5638e8e57cb8) at execProcnode.c:772
> #7  0x00005638e67cd5b1 in planstate_tree_walker (planstate=planstate@entry=0x5638e8e58580, walker=walker@entry=0x5638e6766680 <ExecShutdownNode>, context=context@entry=0x0) at nodeFuncs.c:4009
> #8  0x00005638e67666b2 in ExecShutdownNode (node=0x5638e8e58580) at execProcnode.c:792
> #9  ExecShutdownNode (node=0x5638e8e58580) at execProcnode.c:772
> #10 0x00005638e67cd5b1 in planstate_tree_walker (planstate=planstate@entry=0x5638e8e58418, walker=walker@entry=0x5638e6766680 <ExecShutdownNode>, context=context@entry=0x0) at nodeFuncs.c:4009
> #11 0x00005638e67666b2 in ExecShutdownNode (node=0x5638e8e58418) at execProcnode.c:792
> #12 ExecShutdownNode (node=node@entry=0x5638e8e58418) at execProcnode.c:772
> #13 0x00005638e675f518 in ExecutePlan (execute_once=<optimized out>, dest=0x5638e8df0058, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT,
>     use_parallel_mode=<optimized out>, planstate=0x5638e8e58418, estate=0x5638e8e57a10) at execMain.c:1658
> #14 standard_ExecutorRun () at execMain.c:410
> #15 0x00005638e6763e0a in ParallelQueryMain (seg=0x5638e8d823d8, toc=0x7f13df4e9000) at execParallel.c:1493
> #16 0x00005638e663f6c7 in ParallelWorkerMain () at parallel.c:1495
> #17 0x00005638e68542e4 in StartBackgroundWorker () at bgworker.c:858
> #18 0x00005638e6860f53 in do_start_bgworker (rw=<optimized out>) at postmaster.c:5883
> #19 maybe_start_bgworkers () at postmaster.c:6108
> #20 0x00005638e68619e5 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5272
> #21 <signal handler called>
> #22 0x00007f13de425ff7 in __GI___select (nfds=nfds@entry=7, readfds=readfds@entry=0x7ffef03b8400, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffef03b8360)
>     at ../sysdeps/unix/sysv/linux/select.c:41
> #23 0x00005638e68620ce in ServerLoop () at postmaster.c:1765
> #24 0x00005638e6863bcc in PostmasterMain () at postmaster.c:1473
> #25 0x00005638e658fd00 in main (argc=8, argv=0x5638e8d54730) at main.c:198

Yes, this looks like that issue.

I've attached a v8 set with the fix I suggested in [1] included.
(I added it to 0001).

- Melanie

[1] https://www.postgresql.org/message-id/flat/20200929061142.GA29096%40paquier.xyz
