Thread: machine-readable explain output v2
OK, here's the second version of my machine-readable explain output patches. Previous version here: http://archives.postgresql.org/pgsql-hackers/2009-06/msg00867.php To apply the current version, you will first need to apply explain_refactor-v4.patch and explain_options-v3.patch. http://archives.postgresql.org/pgsql-hackers/2009-06/msg00865.php http://archives.postgresql.org/pgsql-hackers/2009-06/msg01094.php As before, the infrastructure patch applies first and is separated only for ease of reviewing. Based on feedback from the list, and especially from Peter Eisentraut, I've made a number of changes and improvements since the previous version of the patch: - The XML format now uses <explain xmlns="..."> rather than <pgexplain>. - The JSON format no longer quotes numeric arguments. I'm actually not really that happy with this change; it seems likely that when presented with an unquoted floating point number, most JSON parsers will convert it to a native floating point value, throwing away the information on how many significant digits were present in the original. I think we should rethink this one. I don't know of a programming language where there isn't an easy equivalent of JavaScript's parseFloat() or C's atof(). - In XML or JSON formats, return the entire result as a single row, rather than a series of rows that the caller is responsible for concatenating. Also, in XML format, return the result as type xml, rather than type text. If you want this to work even without --with-libxml, you'll need to also apply do_tup_output_datum-v2.patch, which you can find here: http://archives.postgresql.org/pgsql-hackers/2009-06/msg00973.php - Add a very simple regression test for EXPLAIN. - Add documentation. - Add a new GUC to contrib/auto_explain so that you can specify xml or json output when using that module, as requested by Dave Page. - Avoid using the "Trigger" tag to mean two different things; when displaying trigger statistics, use "Trigger-Name" and "Constraint-Name" instead of just "Trigger" and "Constraint", as suggested by Andres Freund. Along the way, fix a bug in the JSON format where {} was being used in place of []. - When VERBOSE is specified, schema-qualify all tables and functions scanned and alias-qualify all variable names appearing within quals. - I also changed the behavior with regard to displaying trigger and constraint names. Previously, the code displayed the trigger name only if there was no constraint name. Now, the XML and JSON formats always display both, and the text format displays both if VERBOSE is specified. Hope you all like it. Thanks, ...Robert
Attachment
Hi Robert, Hi all, explain_format_infrastructure-v2.patch: Applies cleanly, code changes look sensible, no behaviour changes. explain_format-v2.patch Applies after trivial conflicts (updated version attached). Once more the code changes look sensible to me, except maybe that I don't like the codeflow at some parts. That's partially related due to having to generate xml/json output by hand... One part where I find the code flow ugly is 'did_boilerplate' in report_triggers/its callsites. I can see why it is done that way, but its not exactly obvious to read when you want to find out how the format looks. Another, minor, issue is that the patch changes the FORMAT TEXT/default output if VERBOSE is specified (schema is added). I don't see that as a real problem because the format for VERBOSE is new anyway, but I thought I would mention it. XML Format: As discussed previously I would like the format to look a bit different - but everyone wants it to look different anyway, so I think its ok as one of multiple possible lowest common determinators... With the big advantage of already being implemented. I think it would be nice in the future to add some sort of 'category={planner,timing,..}' attribute, but I think that should be discussed/implemented separately. Documentation: I think it would be nice to add some more documentation about the xml format for application writers, but I think this should be a separate patch anyway. I did not yet look at the contrib/autoexplain portions. Will do that on Monday or so. Andres
Hi Robert, Hi All, On Sunday 19 July 2009 04:29:42 Andres Freund wrote: > I did not yet look at the contrib/autoexplain portions. Will do that on > Monday or so. Early Monday: Looks fine, except that the new "auto_explain.log_format" parameter is not documented. Andres
On Sun, Jul 19, 2009 at 6:54 AM, Andres Freund<andres@anarazel.de> wrote: > On Sunday 19 July 2009 04:29:42 Andres Freund wrote: >> I did not yet look at the contrib/autoexplain portions. Will do that on >> Monday or so. > Early Monday: > Looks fine, except that the new "auto_explain.log_format" parameter is not > documented. Fixed in the attached version, which is also updated based on changes to the precursor "explain options" patch. This will probably need some further adjustment after I finish implementing Tom's latest suggestions on that patch, but here it is for now. Note that these require explain-refactor v4 and explain-options v5. http://archives.postgresql.org/pgsql-hackers/2009-06/msg00865.php http://archives.postgresql.org/pgsql-hackers/2009-07/msg01389.php ...Robert
Attachment
On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote: > One part where I find the code flow ugly is 'did_boilerplate' in > report_triggers/its callsites. > I can see why it is done that way, but its not exactly obvious to read when > you want to find out how the format looks. Suggestions? > Another, minor, issue is that the patch changes the FORMAT TEXT/default output > if VERBOSE is specified (schema is added). I don't see that as a real problem > because the format for VERBOSE is new anyway, but I thought I would mention > it. Verbose isn't new, but in 8.4 all it does is displays the output list for each node. I think there's room for verbose to include some other types of verbosity that don't merit their own options. I think that the choice of what information to display shouldn't depend on what format you use to display it. The funny thing about me being the one to implement XML and JSON output is that I don't actually want to use them for anything - and especially not XML. I implemented them because they were a good (and popular) test case for the options facility, but they're not actually the options I want. And if I or others add functionality in the future to gather more information via EXPLAIN, I don't want to have to use XML or JSON to get at them. > XML Format: > As discussed previously I would like the format to look a bit different - but > everyone wants it to look different anyway, so I think its ok as one of > multiple possible lowest common determinators... With the big advantage of > already being implemented. Yes, I think there's not much advantage to changing this around. There are lots of people with lots of ideas on this, but they're all different and AFAICS there's no clear pattern. I think it's good to try to match the text output as much as possible. > I think it would be nice in the future to add some sort of > 'category={planner,timing,..}' attribute, but I think that should be > discussed/implemented separately. Agree that there are more things to be added. But I haven't taken the time to figure out exactly what. One of things I would really like to be able to get is the number of buckets and batches (expected and actual) for a hash join. Other things I've wished for: - number of tuples discarded by a filter condition on a particular node - amount of time spent constructing the output list of a node as opposed to the rest of what the node does > Documentation: > I think it would be nice to add some more documentation about the xml format > for application writers, but I think this should be a separate patch anyway. Suggestions? I have posted a new version of this patch on a separate thread; do you have time to re-review? ...Robert
Hi Robert, On Friday 31 July 2009 23:13:54 Robert Haas wrote: > On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote: > I have posted a new version of this patch on a separate thread; do you > have time to re-review? Yes, I have seen it. I plan to spent some time on it tonight and/or tomorrow. You havent pushed out to your git mirror? Andres
On Friday 31 July 2009 23:13:54 Robert Haas wrote: > On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote: > > One part where I find the code flow ugly is 'did_boilerplate' in > > report_triggers/its callsites. > > I can see why it is done that way, but its not exactly obvious to read > > when you want to find out how the format looks. > Suggestions? Will take a look when looking at the whole patch again. > > Another, minor, issue is that the patch changes the FORMAT TEXT/default > > output if VERBOSE is specified (schema is added). I don't see that as a > > real problem because the format for VERBOSE is new anyway, but I thought > > I would mention it. > Verbose isn't new, but in 8.4 all it does is displays the output list > for each node. I think there's room for verbose to include some other > types of verbosity that don't merit their own options. Well, before 8.4 it was something entirely different... So its kinda "new". > I think that the choice of what information to display shouldn't > depend on what format you use to display it. The funny thing about me > being the one to implement XML and JSON output is that I don't > actually want to use them for anything - and especially not XML. I > implemented them because they were a good (and popular) test case for > the options facility, but they're not actually the options I want. > And if I or others add functionality in the future to gather more > information via EXPLAIN, I don't want to have to use XML or JSON to > get at them. I am not particularly interested in XML itself as well. Just as yours my main interest is having the possibility to add information without causing wreckage all over the place.... Although for some of the more complex queries (And as you know I have some rather ugly/complex ones...) a graphical viewer of the plans might be nice. I am quite happy that the annoyance over a patch of mine "helped" you starting to work on this ;-) Thanks for all the work. > > I think it would be nice in the future to add some sort of > > 'category={planner,timing,..}' attribute, but I think that should be > > discussed/implemented separately. > Agree that there are more things to be added. But I haven't taken the > time to figure out exactly what. One of things I would really like to > be able to get is the number of buckets and batches (expected and > actual) for a hash join. Other things I've wished for: I think after the patch is committed there should be a big collection of wishes so we can see what further infrastructure work is going to be needed... Depending on the amount and kind of different options it might not be sufficient to simply extent struct Instrumentation/the current instrumentation infrastructure... > > Documentation: > > I think it would be nice to add some more documentation about the xml > > format for application writers, but I think this should be a separate > > patch anyway. > Suggestions? I think extending, correcting and commenting a schema like the one I provided sometime ago would be a good start. Anybody wanting to use the output should be familiar enough with that... I can try to do some of that if somebody goes over my english afterwards... Andres
On Fri, Jul 31, 2009 at 6:04 PM, Andres Freund<andres@anarazel.de> wrote: > I am quite happy that the annoyance over a patch of mine "helped" you starting > to work on this ;-) > Thanks for all the work. You're welcome, thanks for all your reviewing. For the record, I wasn't annoyed BY the patch; I was annoyed by the inability of the patch to be applied. >> > I think it would be nice in the future to add some sort of >> > 'category={planner,timing,..}' attribute, but I think that should be >> > discussed/implemented separately. >> Agree that there are more things to be added. But I haven't taken the >> time to figure out exactly what. One of things I would really like to >> be able to get is the number of buckets and batches (expected and >> actual) for a hash join. Other things I've wished for: > I think after the patch is committed there should be a big collection of > wishes so we can see what further infrastructure work is going to be needed... > Depending on the amount and kind of different options it might not be sufficient > to simply extent struct Instrumentation/the current instrumentation > infrastructure... We'll have to see. The basic options framework is already in, but I think a more far-ranging discussion should wait until post-CommitFest, whether explain (format ...) ... is committed by then or not. >> > Documentation: >> > I think it would be nice to add some more documentation about the xml >> > format for application writers, but I think this should be a separate >> > patch anyway. >> Suggestions? > I think extending, correcting and commenting a schema like the one I provided > sometime ago would be a good start. Anybody wanting to use the output should > be familiar enough with that... > I can try to do some of that if somebody goes over my english afterwards... Happy to copy-edit. ...Robert
On Fri, Jul 31, 2009 at 5:36 PM, Andres Freund<andres@anarazel.de> wrote: > Hi Robert, > > On Friday 31 July 2009 23:13:54 Robert Haas wrote: >> On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote: >> I have posted a new version of this patch on a separate thread; do you >> have time to re-review? > Yes, I have seen it. I plan to spent some time on it tonight and/or tomorrow. > > You havent pushed out to your git mirror? Ah, fixed. Sorry, forgot I made a new local branch. ...Robert