Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather) - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather) |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8012A4A2F@BPXM15GP.gisp.nec.co.jp |
In response to | Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather) (Claudio Freire <klaussfreire@gmail.com>) |
Responses | Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather) |
List | pgsql-hackers |
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Claudio Freire
> Sent: Saturday, February 04, 2017 8:47 AM
> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Robert Haas
> <robertmhaas@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
> ExecEndGather)
>
> On Mon, Oct 31, 2016 at 11:33 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
> wrote:
> > Hello,
> >
> > The attached patch implements the suggestion by Amit before.
> >
> > What I'm motivated is to collect extra run-time statistics specific to
> > a particular ForeignScan/CustomScan, not only the standard
> > Instrumentation; like DMA transfer rate or execution time of GPU
> > kernels in my case.
> >
> > Per-node DSM toc is one of the best way to return run-time statistics
> > to the master backend, because FDW/CSP can assign arbitrary length of
> > the region according to its needs. It is quite easy to require.
> > However, one problem is, the per-node DSM toc is already released when
> > ExecEndNode() is called on the child node of Gather.
> >
> > This patch allows extensions to get control on the master backend's
> > context when all the worker node gets finished but prior to release of
> > the DSM segment. If FDW/CSP has its special statistics on the segment,
> > it can move to the private memory area for EXPLAIN output or something
> > other purpose.
> >
> > One design consideration is whether the hook shall be called from
> > ExecParallelRetrieveInstrumentation() or ExecParallelFinish().
> > The former is a function to retrieve the standard Instrumentation
> > information, thus, it is valid only if EXPLAIN ANALYZE.
> > On the other hands, if we put entrypoint at ExecParallelFinish(),
> > extension can get control regardless of EXPLAIN ANALYZE, however, it
> > also needs an extra planstate_tree_walker().
>
> If the use case for this is to gather instrumentation, I'd suggest calling
> the hook in RetrieveInstrumentation, and calling it appropriately. It would
> make the intended use far clearer than it is now.
>
> And if it saves some work, all the better.
>
> Until there's a use case for a non-instrumentation hook in that place, I
> wouldn't add it. This level of generality sounds like a solution waiting
> for a problem to solve.
>
The use cases I'd like to add are extension-specific but significant for
performance analysis. These statistics are not included in Instrumentation.
In my case, for example, they are GPU execution time, data transfer rate
over DMA, synchronization time for GPU task completion, and so on.
Only the extension knows which attributes are collected during execution,
and in what data format. I don't think Instrumentation fits these
requirements.
This is a problem I ran into with the v9.6-based interface design, which is
how I noticed it.

Thanks,
----
PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei <kaigai@ak.jp.nec.com>
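
The idea in the quoted patch description (get control on the leader after all workers have finished but before the DSM segment is released, and copy extension-specific statistics into private memory) can be sketched in plain C. This is only an illustration under stated assumptions: GpuScanStats, SharedRegion, ScanNodeState, parallel_finish and finish_parallel_scan are hypothetical names, not PostgreSQL APIs, and a malloc'd buffer stands in for the per-node DSM region.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical extension-specific statistics; not a PostgreSQL struct. */
typedef struct GpuScanStats {
    double kernel_time_ms;  /* time spent executing GPU kernels */
    double dma_bytes;       /* bytes transferred over DMA */
    double sync_time_ms;    /* time waiting for GPU task completion */
} GpuScanStats;

/* Stand-in for a per-node region in the DSM segment: one slot per worker. */
typedef struct SharedRegion {
    int          nworkers;
    GpuScanStats slots[];   /* flexible array member, sized by the extension */
} SharedRegion;

/* Hypothetical node state living in the leader's private memory. */
typedef struct ScanNodeState {
    SharedRegion *shared;   /* valid only until the region is released */
    GpuScanStats  total;    /* private copy that survives the release */
    void        (*parallel_finish)(struct ScanNodeState *node);
} ScanNodeState;

/* Extension callback: fold per-worker statistics into private memory. */
void gpuscan_parallel_finish(ScanNodeState *node)
{
    memset(&node->total, 0, sizeof(node->total));
    for (int i = 0; i < node->shared->nworkers; i++) {
        node->total.kernel_time_ms += node->shared->slots[i].kernel_time_ms;
        node->total.dma_bytes      += node->shared->slots[i].dma_bytes;
        node->total.sync_time_ms   += node->shared->slots[i].sync_time_ms;
    }
}

/* Executor side (assumed): call the hook first, then release the region. */
void finish_parallel_scan(ScanNodeState *node)
{
    if (node->parallel_finish != NULL)
        node->parallel_finish(node);
    free(node->shared);     /* models releasing the DSM segment */
    node->shared = NULL;
}

/* Small driver: two workers report statistics, the leader collects them. */
GpuScanStats run_demo(void)
{
    int nworkers = 2;
    SharedRegion *region =
        malloc(sizeof(SharedRegion) + nworkers * sizeof(GpuScanStats));
    assert(region != NULL);
    region->nworkers = nworkers;
    region->slots[0] = (GpuScanStats){10.0, 1024.0, 1.5};
    region->slots[1] = (GpuScanStats){20.0, 2048.0, 2.5};

    ScanNodeState node = {
        .shared = region,
        .parallel_finish = gpuscan_parallel_finish,
    };
    finish_parallel_scan(&node);
    return node.total;      /* still readable after the region is gone */
}
```

The ordering inside finish_parallel_scan is the point of the sketch: the callback must run while the shared region is still mapped, and whatever the extension wants to show later (for example in EXPLAIN output) has to be copied out before the release.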