Re: Duplicate Workers entries in some EXPLAIN plans - Mailing list pgsql-hackers
From | Maciek Sakrejda |
---|---|
Subject | Re: Duplicate Workers entries in some EXPLAIN plans |
Date | |
Msg-id | CAOtHd0Cux3rNRVNpWiSxLGWCZk=Uzb_n=5g1Q0LODqnin-dsaQ@mail.gmail.com Whole thread Raw |
In response to | Re: Duplicate Workers entries in some EXPLAIN plans (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Duplicate Workers entries in some EXPLAIN plans
|
List | pgsql-hackers |
On Thu, Oct 24, 2019 at 6:48 PM Andres Freund <andres@anarazel.de> wrote: > Unfortunately I think the fix isn't all that trivial, due to the way we > output the per-worker information at the end of ExplainNode(), by just > dumping things into a string. It seems to me that a step in the right > direction would be for ExplainNode() to create > planstate->worker_instrument StringInfos, which can be handed to > routines like show_sort_info(), which would print the per-node > information into that, rather than directly dumping into > es->output. Most of the current "Show worker detail" would be done > earlier in ExplainNode(), at the place where we current display the > "actual rows" bit. > > ISTM that should include removing the duplication fo the the contents of > show_sort_info(), and probably also for the Gather, GatherMerge blocks > (I've apparently skipped adding the JIT information to the latter, not > sure if we ought to fix that in the stable branches). > > Any chance you want to take a stab at that? It took me a while, but I did take a stab at it (thanks for your off-list help). Attached is my patch that changes the structured formats to merge sort worker output in with costs/timing/buffers worker output. I have not touched any other worker output yet, since it's not under a Workers group as far as I can tell (so it does not exhibit the problem I originally reported). I can try to make further changes here if the approach is deemed sound. I also have not touched text output; above you had proposed > Sort Method: external merge Disk: 4920kB > Buffers: shared hit=682 read=10188, temp read=1415 written=2101 > Worker 0: actual time=130.058..130.324 rows=1324 loops=1 > Sort Method: external merge Disk: 5880kB > Buffers: shared hit=337 read=3489, temp read=505 written=739 > Worker 1: actual time=130.273..130.512 rows=1297 loops=1 > Buffers: shared hit=345 read=3507, temp read=505 written=744 > Sort Method: external merge Disk: 5920kB which makes sense to me, but I'd like to confirm this is the approach we want before I add it to the patch. This is my first C in close to a decade (and I was never much of a C programmer to begin with), so please be gentle. As Andres suggested off-list, I also changed the worker output to order fields that also occur in the parent node in the same way as the parent node. I've also added a test for the patch, and because this is really an EXPLAIN issue rather than a query feature issue, I added a src/test/regress/sql/explain.sql for the test. I added a couple of utility functions for munging json-formatted EXPLAIN plans into something we can repeatably verify in regression tests (the functions use json rather than jsonb to preserve field order). I have not added this for YAML or XML (even though they should behave the same way), since I'm not familiar with the the functions to manipulate those data types in a similar way (if they exist). My hunch is due to the similarity of structured formats, just testing JSON is enough, but I can expand/adjust tests as necessary.
Attachment
pgsql-hackers by date: