Re: PHJ file leak. - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: PHJ file leak. |
Date | |
Msg-id | CA+hUKGJ-Vuin4KPsWyd2Yi0xpvE+YK+O+oKL7+CWYKU2zYjPHA@mail.gmail.com Whole thread Raw |
In response to | Re: PHJ file leak. (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: PHJ file leak.
|
List | pgsql-hackers |
On Tue, Nov 12, 2019 at 4:20 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Mon, 11 Nov 2019 17:24:45 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in > > Although the patch seems unobjectionable as far as it goes, I'd like > > to understand why we didn't see the need for it long since. Is there > > another call to ExecParallelHashCloseBatchAccessors somewhere, and > > if so, is that one wrongly placed? > > The previous patch would be wrong. The root cause is a open batch so > the right thing to be done at scan end is > ExecHashTableDeatchBatch. And the real issue here seems to be in > ExecutePlan, not in PHJ. You are right. Here is the email I just wrote that says the same thing, but with less efficiency: On Tue, Nov 12, 2019 at 11:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > On Tue, Nov 12, 2019 at 1:24 AM Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote: > >> Hello. While looking a patch, I found that PHJ sometimes complains for > >> file leaks if accompanied by LIMIT. > > > Thanks for the patch! Yeah, this seems correct, but I'd like to think > > about it some more before committing. I'm going to be a bit tied up > > with travel so that might be next week. > > At this point we've missed the window for this week's releases, so > there's no great hurry (and it'd be best not to push any noncritical > patches into the back branches anyway, for the next 24 hours). > > Although the patch seems unobjectionable as far as it goes, I'd like > to understand why we didn't see the need for it long since. Is there > another call to ExecParallelHashCloseBatchAccessors somewhere, and > if so, is that one wrongly placed? I'll need to look into this some more in a few days, but here's a partial analysis: The usual way that these files get closed is by sts_end_parallel_scan(). For the particular file in question here -- an outer batch file that is open for reading while we probe -- that usually happens at the end of the per-batch probe phase before we try to move to another batch or reach end of data. In case of an early end, it also happens in ExecShutdownHashJoin(), which detaches from and cleans up shared resources, which includes closing these files. There are a few ways that ExecShutdownNode() can be reached: 1. ExecLimit() (which we now suspect to be bogus -- see nearby unfinished business[1]), though that only happens in the leader in this case 2. ExecutePlan() on end-of-data. 3. ExecutePlan() on reaching the requested tuple count. Unfortunately those aren't the only ways out of ExecutePlan()'s loop, and that may be a problem. I think what's happening in this case is that there is a race where dest->receiveSlot(slot, dest) returns false because the leader has stopped receiving tuples (having received enough tuples to satisfy the LIMIT) so we exit early, but that path out of the loop doesn't run ExecShutdownNode(). EXEC_FLAG_BACKWARD would also inhibit it, but that shouldn't be set in a parallel plan. So, after thinking about it, I'm not sure the proposed patch is conceptually sound (even if it seems to work), because ExecHashTableDestroy() runs at 'end' time and that's after shared memory has disappeared, so it shouldn't be doing shared resource-related cleanup work, whereas ExecParallelHashCloseBatchAccessors() is relates to shared resources (for example, it calls sts_end_write(), which should never actually do anything at this time but it is potentially a shm-updating routine, which seems wrong to me). Recommending a change is going to require more brainpower than I have spare today due to other commitments. ExecShutdownNode() is certainly a bit tricky. [1] https://www.postgresql.org/message-id/flat/87ims2amh6.fsf%40jsievers.enova.com
pgsql-hackers by date: