Thread: [WIP] Patches to enable extraction state of query execution from external session
Hi, hackers!
Now I complete extension that provides facility to see the current state of query execution working on external session in form of EXPLAIN ANALYZE output. This extension works on 9.5 version, for 9.6 and later it doesn't support detailed statistics for parallel nodes yet.
I want to present patches to the latest version of PostgreSQL core to enable this extension.
1. Patch that provides facility to send user signal to external backend (custom_signals.patch).
This patch transparently extends process signal interface to enable sending user defined signals (I call them - custom signals) and defining callbacks to them. Formally it will appear in following manner:
/* Register custom signal and define signal callback */
Reason = RegisterCustomProcSignalHandler(sighandler);
void
sighandler(void)
{
...
}
/* On other session we can send process signal to the first backend */
SendProcSignal(pid, Reason, backendId)
The InterruptPending flag is set under process signal handling and sighandler is called in CHECK_FOR_INTERRUPTS.
I use this patch to notice other backend to send its state to caller.
2. Patch that enables to interrupt the query executor (executor_hooks.patch).
This patch enables to hang up hooks on executor function of each node (ExecProcNode). I define hooks before any node execution and after execution.
I use this patch to add possibility of query tracing by emitted rows from any node. I interrupt query executor after any node delivers one or zero rows to upper node. And after execution of specific number trace steps I can get the query state of traceable backend which will be somewhat deterministic. I use this possibility for regression tests of my extension.
3. Patch that enables to output runtime explain statistics (runtime_explain.patch).
This patch extends the regular explain functionality. The problem is in the point that regular explain call makes result output after query execution performing InstrEndLoop on nodes where necessary. My patch introduces specific flag *runtime* that indicates whether we explain running query and does some insertions in source code dedicated to output the statistics of running query.
I want to notice that this patch is completed only for 9.5 postgres version. For later versions there is not implemented detailed statistics for parellel nodes. Now, I'm working at this feature.
That's all. CC welcome!
-- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Attachment
Re: [WIP] Patches to enable extraction state of query execution from external session
Hi, hackers!
Now I complete extension that provides facility to see the current state of query execution working on external session in form of EXPLAIN ANALYZE output. This extension works on 9.5 version, for 9.6 and later it doesn't support detailed statistics for parallel nodes yet.I want to present patches to the latest version of PostgreSQL core to enable this extension.
Alex
Re: [WIP] Patches to enable extraction state of query execution from external session
Hi, On 2016-08-29 18:22:56 +0300, maksim wrote: > Now I complete extension that provides facility to see the current state of > query execution working on external session in form of EXPLAIN ANALYZE > output. This extension works on 9.5 version, for 9.6 and later it doesn't > support detailed statistics for parallel nodes yet. Could you expand a bit on what you want this for exactly? > 2. Patch that enables to interrupt the query executor > (executor_hooks.patch). > This patch enables to hang up hooks on executor function of each node > (ExecProcNode). I define hooks before any node execution and after > execution. > I use this patch to add possibility of query tracing by emitted rows from > any node. I interrupt query executor after any node delivers one or zero > rows to upper node. And after execution of specific number trace steps I can > get the query state of traceable backend which will be somewhat > deterministic. I use this possibility for regression tests of my extension. This will increase executor overhead. I think we'll need to find a way to hide this behind the existing if (instrument) branches. > 3. Patch that enables to output runtime explain statistics > (runtime_explain.patch). > This patch extends the regular explain functionality. The problem is in the > point that regular explain call makes result output after query execution > performing InstrEndLoop on nodes where necessary. My patch introduces > specific flag *runtime* that indicates whether we explain running query and > does some insertions in source code dedicated to output the statistics of > running query. Unless I'm missing something this doesn't really expose a user of this functionality? Greetings, Andres Freund
Re: [WIP] Patches to enable extraction state of query execution from external session
> On Mon, Aug 29, 2016 at 5:22 PM, maksim <m.milyutin@postgrespro.ru > <mailto:m.milyutin@postgrespro.ru>> wrote: > > Hi, hackers! > > Now I complete extension that provides facility to see the current > state of query execution working on external session in form of > EXPLAIN ANALYZE output. This extension works on 9.5 version, for 9.6 > and later it doesn't support detailed statistics for parallel nodes yet. > > I want to present patches to the latest version of PostgreSQL core > to enable this extension. > > Hello, > > Did you publish the extension itself yet? > Hello, extension for version 9.5 is available in repository https://github.com/postgrespro/pg_query_state/tree/master. > Last year (actually, exactly one year ago) I was trying to do something > very similar, and it quickly turned out that signals are not the best > way to make this sort of inspection. You can find the discussion > here: https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com Thanks for link! My patch *custom_signal.patch* resolves the problem of «heavy» signal handlers. In essence, I follow the course offered in *procsignal.c* file. They define *ProcSignalReason* values on which the SUGUSR1 is multiplexed. Signal recent causes setting flags for *ProcessInterrupt* actuating, i.e. procsignal_sigusr1_handler() only sets specific flags. When CHECK_FOR_INTERRUPTS appears later on query execution *ProcessInterrupt* is called. Then triggered user defined signal handler is executed. As a result, we have a deferred signal handling. Patch *runtime_explain.patch* releases the problem with error from InstrEndLoop(). I catch all places where this unlucky function is called and wrap in checks on *runtime* flag. This flag indicates whether *ExplainQuery* is called for running query. Also I complement explain output, you can see details in README.md in repository. -- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [WIP] Patches to enable extraction state of query execution from external session
> Hi, > > On 2016-08-29 18:22:56 +0300, maksim wrote: >> Now I complete extension that provides facility to see the current state of >> query execution working on external session in form of EXPLAIN ANALYZE >> output. This extension works on 9.5 version, for 9.6 and later it doesn't >> support detailed statistics for parallel nodes yet. > > Could you expand a bit on what you want this for exactly? Max goal - to push my extension to postgres core. But now it's ready only for 9.5. Prerequisites of this extension are patches presented here. >> 2. Patch that enables to interrupt the query executor >> (executor_hooks.patch). >> This patch enables to hang up hooks on executor function of each node >> (ExecProcNode). I define hooks before any node execution and after >> execution. >> I use this patch to add possibility of query tracing by emitted rows from >> any node. I interrupt query executor after any node delivers one or zero >> rows to upper node. And after execution of specific number trace steps I can >> get the query state of traceable backend which will be somewhat >> deterministic. I use this possibility for regression tests of my extension. > > This will increase executor overhead. In simple case we have checks on existence of hooks. We may suppose to use only not heavy processing on hooks under regular execution of query. In my case (query trace), I set up hooks under trace mode and throw off otherwise. > I think we'll need to find a way> to hide this behind the existing if (instrument) branches. And so can be. It doesn't matter for trace mode. But I think instrument branch is intended only for collecting statistics by nodes. >> 3. Patch that enables to output runtime explain statistics >> (runtime_explain.patch). >> This patch extends the regular explain functionality. The problem is in the >> point that regular explain call makes result output after query execution >> performing InstrEndLoop on nodes where necessary. My patch introduces >> specific flag *runtime* that indicates whether we explain running query and >> does some insertions in source code dedicated to output the statistics of >> running query. > > Unless I'm missing something this doesn't really expose a user of this > functionality? This patch is only for extension. As described in https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com regular ExplainNode() prints only statistics after query execution and asserts under InstEndLoop(). My patch releases this problem and rewrite formulas for statistic parameters appropriate to running queries without affecting regular EXPLAIN outputs. -- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [WIP] Patches to enable extraction state of query execution from external session
On 2016-08-30 11:22:43 +0300, Maksim Milyutin wrote: > > Hi, > > > > On 2016-08-29 18:22:56 +0300, maksim wrote: > > > Now I complete extension that provides facility to see the current state of > > > query execution working on external session in form of EXPLAIN ANALYZE > > > output. This extension works on 9.5 version, for 9.6 and later it doesn't > > > support detailed statistics for parallel nodes yet. > > > > Could you expand a bit on what you want this for exactly? > > Max goal - to push my extension to postgres core. But now it's ready only > for 9.5. Prerequisites of this extension are patches presented here. I'm asking what you want this for. "An extension" isn't a detailed description... > > > 2. Patch that enables to interrupt the query executor > > > (executor_hooks.patch). > > > This patch enables to hang up hooks on executor function of each node > > > (ExecProcNode). I define hooks before any node execution and after > > > execution. > > > I use this patch to add possibility of query tracing by emitted rows from > > > any node. I interrupt query executor after any node delivers one or zero > > > rows to upper node. And after execution of specific number trace steps I can > > > get the query state of traceable backend which will be somewhat > > > deterministic. I use this possibility for regression tests of my extension. > > > > This will increase executor overhead. > > In simple case we have checks on existence of hooks. That *is* noticeable. > > I think we'll need to find a way > > to hide this behind the existing if (instrument) branches. > > And so can be. It doesn't matter for trace mode. But I think instrument > branch is intended only for collecting statistics by nodes. I can't follow here. That's all what analyze is about? Andres
Re: [WIP] Patches to enable extraction state of query execution from external session
> On 2016-08-30 11:22:43 +0300, Maksim Milyutin wrote: >>> Hi, >>> >>> On 2016-08-29 18:22:56 +0300, maksim wrote: >>>> Now I complete extension that provides facility to see the current state of >>>> query execution working on external session in form of EXPLAIN ANALYZE >>>> output. This extension works on 9.5 version, for 9.6 and later it doesn't >>>> support detailed statistics for parallel nodes yet. >>> >>> Could you expand a bit on what you want this for exactly? >> >> Max goal - to push my extension to postgres core. But now it's ready only >> for 9.5. Prerequisites of this extension are patches presented here. > > I'm asking what you want this for. "An extension" isn't a detailed > description... > I want to provide the facility to fetch state of query on some other backend running on the same server. In essence, it's going to be a microlevel monitoring tool. A typical use case looks like this: 1) assume 1st backend executes a simple query: select * from foo join bar on foo.c1=bar.c1 2) somebody tries to fetch state of that backend, so he addresses it through pid: select * from pg_query_state(pid := <1st_backend_pid>) 3) he'll get detailed description of state - something like this: Hash Join (Current loop: actual rows=0, loop number=1) Hash Cond: (foo.c1 = bar.c1) -> Seq Scan on foo (Currentloop: actual rows=1, loop number=1) -> Hash (Current loop: actual rows=0, loop number=1) Buckets:131072 Batches: 8 Memory Usage: 1kB -> Seq Scan on bar (Current loop: actual rows=49, loop number=1) Note that I've added *Current loop* records with mumber of emitted rows (*actual rows*) and *loop number* attached to each node. We could also add a timing info. For parallel nodes I want to print statistics for each worker separately (it's not finished yet). You could also watch my screencast (it's short enough) to get the idea: https://asciinema.org/a/981bed2lu7r8sx60u5lsjei30 > >>>> 2. Patch that enables to interrupt the query executor >>>> (executor_hooks.patch). >>>> This patch enables to hang up hooks on executor function of each node >>>> (ExecProcNode). I define hooks before any node execution and after >>>> execution. >>>> I use this patch to add possibility of query tracing by emitted rows from >>>> any node. I interrupt query executor after any node delivers one or zero >>>> rows to upper node. And after execution of specific number trace steps I can >>>> get the query state of traceable backend which will be somewhat >>>> deterministic. I use this possibility for regression tests of my extension. >>> >>> This will increase executor overhead. >> >> In simple case we have checks on existence of hooks. > > That *is* noticeable. > Then I'll really consider the case with hiding hook checking inside the "if (instrument)" statement, thanks! >>> I think we'll need to find a way >>> to hide this behind the existing if (instrument) branches. >> >> And so can be.It doesn't matter for trace mode. But I think instrument >> branch is intended only for collecting statistics by nodes.> > I can't follow here. That's all what analyze is about? > I meant that hiding hooks is not universal solution. If 'instrument' variable is empty (e.g. query without analyze) hooks become disabled. But in my case 'instrument' is initialized anyway and I don't care about it. >> 3. Patch that enables to output runtime explain statistics >> (runtime_explain.patch). >> This patch extends the regularexplain functionality. The problem is in the >> point that regular explain call makes result output after query execution >> performing InstrEndLoop on nodes where necessary. My patch introduces >> specific flag *runtime* that indicateswhether we explain running query and >> does some insertions in source code dedicated to output the statistics of >> running query. > > Unless I'm missing something this doesn't really expose a user of this > functionality?> Probably I could exclude *runtime_explain.patch* from the Postgres core through copying *explain.c* to module's directory and its further customization for my purposes. But in that case I'd have to maintain 'local explain', fixing bugs and coping with other issues from time to time (i.e. in case of major upgrade). -- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [WIP] Patches to enable extraction state of query execution from external session
On Mon, Aug 29, 2016 at 5:22 PM, maksim <m.milyutin@postgrespro.ru
<mailto:m.milyutin@postgrespro.ru>> wrote:
Hi, hackers!
Now I complete extension that provides facility to see the current
state of query execution working on external session in form of
EXPLAIN ANALYZE output. This extension works on 9.5 version, for 9.6
and later it doesn't support detailed statistics for parallel nodes yet.
I want to present patches to the latest version of PostgreSQL core
to enable this extension.
Hello,
Did you publish the extension itself yet?
Hello, extension for version 9.5 is available in repository https://github.com/postgrespro/pg_query_state/tree/master. Last year (actually, exactly one year ago) I was trying to do something
very similar, and it quickly turned out that signals are not the best
way to make this sort of inspection. You can find the discussion
here: https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM =XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@ mail.gmail.com
Thanks for link!
My patch *custom_signal.patch* resolves the problem of «heavy» signal handlers. In essence, I follow the course offered in *procsignal.c* file. They define *ProcSignalReason* values on which the SUGUSR1 is multiplexed. Signal recent causes setting flags for *ProcessInterrupt* actuating, i.e. procsignal_sigusr1_handler() only sets specific flags. When CHECK_FOR_INTERRUPTS appears later on query execution *ProcessInterrupt* is called. Then triggered user defined signal handler is executed.
As a result, we have a deferred signal handling.
Re: [WIP] Patches to enable extraction state of query execution from external session
> On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin > <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru>> wrote: > > On Mon, Aug 29, 2016 at 5:22 PM, maksim > <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru> > <mailto:m.milyutin@postgrespro.ru > <mailto:m.milyutin@postgrespro.ru>>> wrote: > > Hi, hackers! > > Now I complete extension that provides facility to see the > current > state of query execution working on external session in form of > EXPLAIN ANALYZE output. This extension works on 9.5 version, > for 9.6 > and later it doesn't support detailed statistics for > parallel nodes yet. > > I want to present patches to the latest version of > PostgreSQL core > to enable this extension. > > Hello, > > Did you publish the extension itself yet? > > > Hello, extension for version 9.5 is available in repository > https://github.com/postgrespro/pg_query_state/tree/master > <https://github.com/postgrespro/pg_query_state/tree/master>. > > Last year (actually, exactly one year ago) I was trying to do > something > very similar, and it quickly turned out that signals are not the > best > way to make this sort of inspection. You can find the discussion > here: > https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com > <https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com> > > > Thanks for link! > > My patch *custom_signal.patch* resolves the problem of «heavy» > signal handlers. In essence, I follow the course offered in > *procsignal.c* file. They define *ProcSignalReason* values on which > the SUGUSR1 is multiplexed. Signal recent causes setting flags for > *ProcessInterrupt* actuating, i.e. procsignal_sigusr1_handler() only > sets specific flags. When CHECK_FOR_INTERRUPTS appears later on > query execution *ProcessInterrupt* is called. Then triggered user > defined signal handler is executed. > As a result, we have a deferred signal handling. > > > Yes, but the problem is that nothing gives you the guarantee that at the > moment you decide to handle the interrupt, the QueryDesc structures > you're looking at are in a good shape for Explain* functions to run on > them. Even if that appears to be the case in your testing now, there's > no way to tell if that will be the case at any future point in time. CHECK_FOR_INTERRUPTS are located in places where query state (QueryDesc structure) is more or less consistent. In these macro calls I pass QueryDesc to Explain* functions. I exactly know that elementary statistics updating functions (e.g. InstrStartNode, InstrStopNode, etc) don't contain CHECK_FOR_INTERRUPTS within itself therefore statistics at least on node level is consistent when backend will be ready to transfer its state. The problem may be in interpretation of collected statistics in Explain* functions. In my practice there was the case when wrong number of inserted rows is shown under INSERT ON CONFLICT request. That request consisted of two parts: SELECT from table and INSERT with check on predicate. And I was interrupted between these parts. Formula for inserted rows was the number of extracting rows from SELECT minus rejected rows from INSERT. And I got imaginary inserted row. I removed the printing number of inserted rows under explain of running query because I don't know whether INSERT node has processed that last row. But the remaining number of rejected rows was deterministic and I showed it. > Another problem is use if shm_mq facility, because it manipulates the > state of process latch. This is not supposed to happen to a backend > happily performing its tasks, at random due to external factors, and > this is what the proposed approach introduces In Postgres source code the most WaitLatch() call on process latch is surrounded by loop and forms the pattern like this: for (;;) { if (desired_state_has_occured) break; WaitLatch(MyLatch); CHECK_FOR_INTERRUPTS(); ResetLatch(MyLatch) } The motivation of this decision is pretty clear illustrated by the extract from comment in Postgres core: usage of "the generic process latch has to be robust against unrelated wakeups: Always check that the desired state has occurred, and wait again if not"[1]. I mean that random setting of process latch at the time of query executing don't affect on another usage of that latch later in code. 1. src/backend/storage/lmgr/proc.c:1724 for ProcWaitForSignal function -- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [WIP] Patches to enable extraction state of query execution from external session
Maksim Milyutin <m.milyutin@postgrespro.ru> writes: >> On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin >> <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru>> wrote: >> Yes, but the problem is that nothing gives you the guarantee that at the >> moment you decide to handle the interrupt, the QueryDesc structures >> you're looking at are in a good shape for Explain* functions to run on >> them. Even if that appears to be the case in your testing now, there's >> no way to tell if that will be the case at any future point in time. > CHECK_FOR_INTERRUPTS are located in places where query state (QueryDesc > structure) is more or less consistent. Really? Even if that's 100% true today, which I wouldn't bet very much on, it seems like a really dangerous property to insist on system-wide. The only restriction we have ever placed on CHECK_FOR_INTERRUPTS is that it occur at places where it'd be safe to throw elog(ERROR), and in general we don't assume that the active query is still in a usable state after an error. What you propose would amount to a new restriction that nothing can ever call any nontrivial subroutine while the active query tree is less than fully valid (because the subroutine might contain a CHECK_FOR_INTERRUPTS somewhere). That sounds impractical and unenforceable. regards, tom lane
Re: [WIP] Patches to enable extraction state of query execution from external session
01.09.2016 18:58, Tom Lane пишет: > Maksim Milyutin <m.milyutin@postgrespro.ru> writes: >>> On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin >>> <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru>> wrote: >>> Yes, but the problem is that nothing gives you the guarantee that at the >>> moment you decide to handle the interrupt, the QueryDesc structures >>> you're looking at are in a good shape for Explain* functions to run on >>> them. Even if that appears to be the case in your testing now, there's >>> no way to tell if that will be the case at any future point in time. > >> CHECK_FOR_INTERRUPTS are located in places where query state (QueryDesc >> structure) is more or less consistent. > > Really? Even if that's 100% true today, which I wouldn't bet very much > on, it seems like a really dangerous property to insist on system-wide. > The only restriction we have ever placed on CHECK_FOR_INTERRUPTS is that > it occur at places where it'd be safe to throw elog(ERROR), and in general > we don't assume that the active query is still in a usable state after > an error. What you propose would amount to a new restriction that nothing > can ever call any nontrivial subroutine while the active query tree is > less than fully valid (because the subroutine might contain a > CHECK_FOR_INTERRUPTS somewhere). That sounds impractical and > unenforceable. Ok, thanks! I could propose a different approach: when CHECK_FOR_INTERRUPTS occurs, set a hook to be executed by the following ExecProcNode(). The hook is going to be provided by my extension. It is expected to send current query state to some recipient backend (the one who asked for it). I think the active query is consistent after any node have worked off one or zero rows. After it has sent all necessary data, the hook will disable itself. -- Maksim Milyutin Postgres Professional: http://www.postgrespro.com Russian Postgres Company