Re: Sampling-based timing for EXPLAIN ANALYZE - Mailing list pgsql-hackers
From | Tomas Vondra
---|---
Subject | Re: Sampling-based timing for EXPLAIN ANALYZE
Date |
Msg-id | 9b29c49f-70a4-ee20-b565-6a4cfa0d3c2f@enterprisedb.com
In response to | Re: Sampling-based timing for EXPLAIN ANALYZE (Andres Freund <andres@anarazel.de>)
Responses | Re: Sampling-based timing for EXPLAIN ANALYZE
List | pgsql-hackers
On 1/17/23 19:46, Andres Freund wrote:
> Hi,
>
> On 2023-01-17 19:00:02 +0100, Tomas Vondra wrote:
>> On 1/17/23 18:02, Andres Freund wrote:
>>> On 2023-01-17 15:52:07 +0100, Tomas Vondra wrote:
>>>> That also does not have issues with timestamp "rounding" - considering
>>>> e.g. sample rate 1000Hz, that's 1ms between samples. And it's quite
>>>> possible the node completes within 1ms, in which case
>>>>
>>>> (now - last_sample)
>>>>
>>>> ends up being 0 (assuming I correctly understand the code).
>>>
>>> That part should be counting in nanoseconds, I think? Unless I misunderstand
>>> something?
>>
>> The higher precision does not help, because both values come from the
>> *sampled* timestamp (i.e. the one updated from the signal handler). So
>> if the node happens to execute between two signals, the values are going
>> to be the same, and the difference is 0.
>
> In that case there simply wasn't any sample for the node, and a non-timestamp
> based sample counter wouldn't do anything different?

Yeah, you're right.

> If you're worried about the case where a timer does fire during execution of
> the node, but exactly once, that should provide a difference between the last
> sampled timestamp and the current time. It'll attribute a bit too much to the
> in-progress nodes, but well, that's sampling for you.
>
> I think a "hybrid" explain mode might be worth thinking about. Use the
> "current" sampling method for the first execution of a node, and for the first
> few milliseconds of a query (or perhaps the first few timestamp
> acquisitions). That provides an accurate explain analyze for short queries,
> without a significant slowdown. Then switch to sampling, which provides decent
> attribution for a bit longer running queries.

Yeah, this is essentially the sampling I imagined when I first read the
subject of this thread. It samples which node executions to measure (and
then measures those accurately), while these patches sample timestamps.
>>>>> I've been thinking that we should consider making more of the instrumentation
>>>>> code work like that. The amount of work we're doing in InstrStart/StopNode()
>>>>> has steadily risen. When buffer usage and WAL usage are enabled, we're
>>>>> executing over 1000 instructions! And each single Instrumentation node is ~450
>>>>> bytes, to a good degree due to having 2 BufUsage and 2 WalUsage structs
>>>>> embedded.
>>>>>
>>>>> If we instead have InstrStartNode() set up a global pointer to the
>>>>> Instrumentation node, we can make the instrumentation code modify both the
>>>>> "global" counters (pgBufferUsage, pgWalUsage) and, if set,
>>>>> current_instr->{pgBufferUsage, pgWalUsage}. That'll require some larger
>>>>> changes - right now nodes "automatically" include the IO/WAL incurred in child
>>>>> nodes, but that's just a small bit of additional summing-up to be done during
>>>>> EXPLAIN.
>>>>
>>>> That's certainly one way to implement that. I wonder if we could make
>>>> that work without the global pointer, but I can't think of any.
>>>
>>> I don't see a realistic way at least. We could pass down an
>>> "InstrumentationContext" through everything that needs to do IO and WAL. But
>>> that seems infeasible at this point.
>>
>> Why infeasible?
>
> Primarily the scale of the change. We'd have to pass down the context into all
> table/index AM functions. And into a lot of bufmgr.c, xlog.c functions, which'd
> require their callers to have access to the context. That's hundreds if not
> thousands of places.
>
> Adding that many function parameters might turn out to be noticeable runtime
> wise, due to increased register movement. I think for a number of the places
> where we currently don't, we ought to use a by-reference struct for the
> not-always-used parameters, that then also could contain this context.

OK, I hadn't realized we'd have to pass it to that many places.
regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company