Re: making EXPLAIN extensible - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: making EXPLAIN extensible |
Date | |
Msg-id | CA+TgmoYViWseEenZv27tC2yaYY89iKgTTa+wyj53GTSmbkR5dQ@mail.gmail.com |
In response to | Re: making EXPLAIN extensible (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
List | pgsql-hackers |
On Mon, Mar 3, 2025 at 9:14 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> I think you meant "some time prior to PostgreSQL 10".
> PostgreSQL 9.0 had 5 options, of which COSTS, BUFFERS, and FORMAT were
> newly added, so only before 9.0 we had 2 options.
> PostgreSQL 9.2 then added TIMING on top of that, for a total of 6
> options prior to PostgreSQL 10.

Probably I meant 9 rather than 10, then.

> +1, Neon would greatly appreciate infrastructure to allow extending EXPLAIN.

Cool.

> Does this work with parallel workers' stats?
> I can't seem to figure out whether or where parallel workers would
> pass through their extended explain statistics, mostly because of the
> per-backend nature of ID generation making the pointers of
> ExplainState->extension_state unshareable.

I don't fully understand what your question is. I think there are a couple of separate points to consider here.

First, I don't think we ever store an ExplainState in DSM. If we do, then the per-backend nature of ID generation is a fundamental design issue and needs to be rethought. Otherwise, I don't see why it matters.

Second, I did not add a hook to allow an extension to add data to a "Worker N" section. I'm open to suggestions.

Third, regardless of parallel query, there is a general problem with this infrastructure if what you want to do is print out some instrumentation data. Sure, the hooks might allow you to get control at a point where you can print some stuff, but how are you supposed to get the stuff you want to print? planduration, bufusage, and mem_counters are passed down to ExplainOnePlan(); and there's other stuff in struct Instrumentation that is used in ExplainNode(), but those approaches don't seem to scale nicely to arbitrary new things that somebody might want to measure. While I welcome ideas about how to fix that, my current view is that it's a job for a separate patch set.

In general, it's expected that each parallel-aware node may register a shm_toc entry using the plan_node_id as the key. So if you wanted per-worker instrumentation of any sort for some particular node, you could possibly add it to that chunk of memory. This would work well, for example, for a custom scan, or any other case where the node is under the control of the same code that is trying to instrument stuff. A patch to core could extend both the node's DSM footprint and the explain.c code that prints data from it.

However, if you want to do something like "for every executor node, count the number of flying spaghetti monster tendrils that pass through the computer during the execution of that node," there's not really any great way of doing that today, with or without this patch, and with or without parallel query. I mean, you can patch core, but that's it; there's no extensibility here.

I'm not sure if any of this is responsive to your actual question; if not, please help me get on the right track.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
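
Editorial sketch, not part of the original message: the idea of a per-node chunk keyed by plan_node_id, with one slot per worker, can be modeled roughly as below. This is a toy under stated assumptions. It uses ordinary process memory rather than a real DSM segment, includes no PostgreSQL headers, and every Toy*/toy_* name is hypothetical. In PostgreSQL itself the directory role is played by shm_toc (shm_toc_insert() and shm_toc_lookup() in storage/shm_toc.h); the counter rows_seen is an invented example of something a node might want to measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 8
#define TOY_MAX_WORKERS 4

/* Hypothetical per-worker counters for one plan node. */
typedef struct ToyWorkerStats
{
    uint64_t    rows_seen;
} ToyWorkerStats;

/* Hypothetical chunk that a node would place in shared memory. */
typedef struct ToyNodeShared
{
    int             nworkers;
    ToyWorkerStats  worker[TOY_MAX_WORKERS];
} ToyNodeShared;

/* Toy stand-in for a table of contents: a small key -> address directory. */
typedef struct ToyToc
{
    int         nentries;
    uint64_t    keys[TOY_MAX_ENTRIES];
    void       *addrs[TOY_MAX_ENTRIES];
} ToyToc;

static void
toy_toc_insert(ToyToc *toc, uint64_t key, void *addr)
{
    assert(toc->nentries < TOY_MAX_ENTRIES);
    toc->keys[toc->nentries] = key;
    toc->addrs[toc->nentries] = addr;
    toc->nentries++;
}

/* Returns NULL when no entry has the given key. */
static void *
toy_toc_lookup(const ToyToc *toc, uint64_t key)
{
    for (int i = 0; i < toc->nentries; i++)
    {
        if (toc->keys[i] == key)
            return toc->addrs[i];
    }
    return NULL;
}

/* Leader side: zero the chunk and register it under the node's plan_node_id. */
static void
toy_node_register(ToyToc *toc, int plan_node_id,
                  ToyNodeShared *shared, int nworkers)
{
    assert(nworkers >= 0 && nworkers <= TOY_MAX_WORKERS);
    shared->nworkers = nworkers;
    for (int i = 0; i < TOY_MAX_WORKERS; i++)
        shared->worker[i].rows_seen = 0;
    toy_toc_insert(toc, (uint64_t) plan_node_id, shared);
}

/*
 * Worker side: find the chunk by plan_node_id and bump this worker's slot.
 * Returns 0 on success, -1 if the node or the worker slot does not exist.
 */
static int
toy_worker_count_rows(const ToyToc *toc, int plan_node_id,
                      int worker_number, uint64_t nrows)
{
    ToyNodeShared *shared = toy_toc_lookup(toc, (uint64_t) plan_node_id);

    if (shared == NULL ||
        worker_number < 0 || worker_number >= shared->nworkers)
        return -1;
    shared->worker[worker_number].rows_seen += nrows;
    return 0;
}

/* Leader side, at EXPLAIN time: sum the per-worker slots for one node. */
static uint64_t
toy_node_total_rows(const ToyToc *toc, int plan_node_id)
{
    const ToyNodeShared *shared = toy_toc_lookup(toc, (uint64_t) plan_node_id);
    uint64_t    total = 0;

    if (shared == NULL)
        return 0;
    for (int i = 0; i < shared->nworkers; i++)
        total += shared->worker[i].rows_seen;
    return total;
}
```

The sketch only covers the case the message calls easy: the code that owns the node also owns the chunk and knows how to read it back. It does nothing for the "every executor node" case, where some other piece of code would need a slot in each node's chunk.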