Thread: generic options for explain
Well, here we are! Yet another thread about some piece of information that's omitted from EXPLAIN and can't easily be added because there are a zillion things we want to add to EXPLAIN and it's not OK to bury the user[1]! I've long been of the opinion that the right way to fix this problem is to extend the syntax with some sort of extensible options syntax[2]. The current "EXPLAIN [ANALYZE] [VERBOSE] <query>" syntax does not scale to large numbers of options - it requires that the options occur in a fixed order, and that the option names all be keywords. Having gotten throughly fed up with having this conversation for the ump-teenth time, I wrote a patch to introduce just such a syntax. See attached. What I did is borrowed the generic options stuff that Peter Eisentraut introduced for FOREIGN DATA WRAPPER et. al, so you can write: EXPLAIN (option_name1 "option_value1", option_name2 "option_value2") query e.g. EXPLAIN (ANALYZE "on") query As written, this patch doesn't introduce any actual new functionality, but I think it's pretty easy to see how we could build on the syntax to add things like different types of output formats, different types of instrumentation, etc. A few other random notes: - This currently lacks documentation. If we have any consensus that this is a reasonable approach, I'll add some. - I noticed that we currently accept as a top-level SQL command an arbitrarily parenthesized SELECT statement, like ((SELECT 3)). But you can't put parentheses around any other type of statement. Even more oddly, we also accept things like (SELECT 3) ORDER BY 1, which to me makes no sense at all. But that's neither here nor there as far as this patch is concerned, except that it required some minor grammar hackery and a long comment explaining the hackery. Thoughts? ...Robert [1] http://archives.postgresql.org/message-id/4A16A8AF.2080508@anarazel.de [2] http://archives.postgresql.org/message-id/603c8f070904151758w6af25641xac831b4cb71c4184@mail.gmail.com
Attachment
Hello why we develop a new syntax? we should have a secondary function explain_query(query_string, option) that returns setof some. Next function should be explain_query_xml. I thing so for typical use EXPLAIN statement is enough. And for machine procession some new function should be perfect. regards Pavel Stehule 2009/5/24 Robert Haas <robertmhaas@gmail.com>: > Well, here we are! Yet another thread about some piece of information > that's omitted from EXPLAIN and can't easily be added because there > are a zillion things we want to add to EXPLAIN and it's not OK to bury > the user[1]! I've long been of the opinion that the right way to fix > this problem is to extend the syntax with some sort of extensible > options syntax[2]. The current "EXPLAIN [ANALYZE] [VERBOSE] <query>" > syntax does not scale to large numbers of options - it requires that > the options occur in a fixed order, and that the option names all be > keywords. Having gotten throughly fed up with having this > conversation for the ump-teenth time, I wrote a patch to introduce > just such a syntax. See attached. > > What I did is borrowed the generic options stuff that Peter Eisentraut > introduced for FOREIGN DATA WRAPPER et. al, so you can write: > > EXPLAIN (option_name1 "option_value1", option_name2 "option_value2") query > e.g. EXPLAIN (ANALYZE "on") query > > As written, this patch doesn't introduce any actual new functionality, > but I think it's pretty easy to see how we could build on the syntax > to add things like different types of output formats, different types > of instrumentation, etc. A few other random notes: > > - This currently lacks documentation. If we have any consensus that > this is a reasonable approach, I'll add some. > - I noticed that we currently accept as a top-level SQL command an > arbitrarily parenthesized SELECT statement, like ((SELECT 3)). But > you can't put parentheses around any other type of statement. Even > more oddly, we also accept things like (SELECT 3) ORDER BY 1, which to > me makes no sense at all. But that's neither here nor there as far as > this patch is concerned, except that it required some minor grammar > hackery and a long comment explaining the hackery. > > Thoughts? > > ...Robert > > [1] http://archives.postgresql.org/message-id/4A16A8AF.2080508@anarazel.de > [2] http://archives.postgresql.org/message-id/603c8f070904151758w6af25641xac831b4cb71c4184@mail.gmail.com > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
Hi Robert, On 05/24/2009 02:47 AM, Robert Haas wrote: > Well, here we are! Yet another thread about some piece of information > that's omitted from EXPLAIN and can't easily be added because there > are a zillion things we want to add to EXPLAIN and it's not OK to bury > the user[1]! I've long been of the opinion that the right way to fix > this problem is to extend the syntax with some sort of extensible > options syntax[2]. The current "EXPLAIN [ANALYZE] [VERBOSE]<query>" > syntax does not scale to large numbers of options - it requires that > the options occur in a fixed order, and that the option names all be > keywords. Having gotten throughly fed up with having this > conversation for the ump-teenth time, I wrote a patch to introduce > just such a syntax. See attached. > What I did is borrowed the generic options stuff that Peter Eisentraut > introduced for FOREIGN DATA WRAPPER et. al, so you can write: > EXPLAIN (option_name1 "option_value1", option_name2 "option_value2") query > e.g. EXPLAIN (ANALYZE "on") query Beeing the latest cause for the frustration leading to this patch I obviously would like something like that - and I would gladly implement some additional stats suggested by others(if implementable in a reasonable timeframe) if this approach is agreed uppon. > - I noticed that we currently acce pt as a top-level SQL command an > arbitrarily parenthesized SELECT statement, like ((SELECT 3)). But > you can't put parentheses around any other type of statement. Even > more oddly, we also accept things like (SELECT 3) ORDER BY 1, which to > me makes no sense at all. I would guess that stems from supporting syntax like: (SELECT 1) UNION (SELECT 2) ORDER BY and not wanting to introduce a special path for (SELECT 1) ORDER BY For additional stats to be kept another discussion about appropriate, extensible representation suitable for different output formats probably would be needed - but thats a discussion for another day. Andres
On Sun, May 24, 2009 at 12:31 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Hello > > why we develop a new syntax? > > we should have a secondary function explain_query(query_string, > option) that returns setof some. Next function should be > explain_query_xml. I thing so for typical use EXPLAIN statement is > enough. And for machine procession some new function should be > perfect. I agree. We shouldn't be overloading EXPLAIN with a new option every time someone thinks of something new they'd like to see. XML EXPLAIN output would obviously be extensible and machine readable. We could easily produce a library of XSLT stylesheets to format the output in different ways without cluttering the server with extra code. The benefits to applications that want to read the output are also pretty obvious. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sun, May 24, 2009 at 8:44 AM, Dave Page <dpage@pgadmin.org> wrote: > On Sun, May 24, 2009 at 12:31 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> Hello >> >> why we develop a new syntax? >> >> we should have a secondary function explain_query(query_string, >> option) that returns setof some. Next function should be >> explain_query_xml. I thing so for typical use EXPLAIN statement is >> enough. And for machine procession some new function should be >> perfect. I don't understand - why do we want to switch from a command interface to a functional interface? I was thinking: EXPLAIN (format 'xml') query... EXPLAIN (format 'json') query... What you're proposing will certainly be harder to code as well as more different from what we have now. My approach has the advantage of being able to leverage the main parser to parse both the query and the options; with this approach, you'll need to pass the query text back to the main parser and then write separate code of some sort to parse the options. Seems like extra work for no gain. The only possible advantage I can see of a functional interface is that the current implementation of EXPLAIN is treated as a utility command, so you can't do something like this: INSERT INTO foo (EXPLAIN SELECT 1); In my mind, however, fixing that would be preferable to (and probably easier than) changing the whole syntax. > I agree. We shouldn't be overloading EXPLAIN with a new option every > time someone thinks of something new they'd like to see. XML EXPLAIN > output would obviously be extensible and machine readable. We could > easily produce a library of XSLT stylesheets to format the output in > different ways without cluttering the server with extra code. The > benefits to applications that want to read the output are also pretty > obvious. Well, the major benefit of the existing EXPLAIN output is that you don't need an XSLT stylesheet to read it. You can just be sitting there in psql and do an EXPLAIN, and look at the results, and solve your problem. As we add options to EXPLAIN, I would like to see us support those in both the plaintext output and the XML output (and hopefully the JSON output) using the same syntax. If, for example, I need more information on what my hash joins are doing, I'd like to be able to do: EXPLAIN ('hash_detail', 'on') query... ...and have it tell me the # of buckets and batches for each hash join and whatever other information might be useful in that context. I *don't* want to have the solution to that problem be: run explain_xml() on your query, then install an XSLT parser, then use this pregenerated XSLT stylesheet to extract your data from the XML document you got back from step #1, then view the resulting HTML in your web browser. That may be a good solution for someone, but it certainly doesn't sound convenient for anyone who works primarily at the command line. It's also worth noting that there are some kinds of statistics (for example, I/O statistics) which can't be gathered without significantly impacting the performance of the query. So you can't just take the approach of turning on every possible instrumentation when XML output is requested. I do agree with your point that even with an extensible options syntax, we can't just add an option for any old thing that someone wants. We should try to come up with a somewhat comprehensive list of types of instrumentation that someone might want and then group them into categories, with one toggle per category. ...Robert
2009/5/24 Robert Haas <robertmhaas@gmail.com>: > On Sun, May 24, 2009 at 8:44 AM, Dave Page <dpage@pgadmin.org> wrote: >> On Sun, May 24, 2009 at 12:31 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> Hello >>> >>> why we develop a new syntax? >>> >>> we should have a secondary function explain_query(query_string, >>> option) that returns setof some. Next function should be >>> explain_query_xml. I thing so for typical use EXPLAIN statement is >>> enough. And for machine procession some new function should be >>> perfect. > > I don't understand - why do we want to switch from a command interface > to a functional interface? I was thinking: > > EXPLAIN (format 'xml') query... > EXPLAIN (format 'json') query... Personally, I thing, so this format is strange - and is some MySQLism. There are statements like SHOW TABLES and similar. My proposal doesn't change of current EXPLAIN. The sence is don't more complicate current syntax and do things simply. Function has more advantage then one: a) don't need changes in parser b) simply add new variants c) simply add new parameters > > What you're proposing will certainly be harder to code as well as more > different from what we have now. My approach has the advantage of > being able to leverage the main parser to parse both the query and the > options; with this approach, you'll need to pass the query text back > to the main parser and then write separate code of some sort to parse > the options. Seems like extra work for no gain. > > The only possible advantage I can see of a functional interface is > that the current implementation of EXPLAIN is treated as a utility > command, so you can't do something like this: > > INSERT INTO foo (EXPLAIN SELECT 1); > > In my mind, however, fixing that would be preferable to (and probably > easier than) changing the whole syntax. > >> I agree. We shouldn't be overloading EXPLAIN with a new option every >> time someone thinks of something new they'd like to see. XML EXPLAIN >> output would obviously be extensible and machine readable. We could >> easily produce a library of XSLT stylesheets to format the output in >> different ways without cluttering the server with extra code. The >> benefits to applications that want to read the output are also pretty >> obvious. > > Well, the major benefit of the existing EXPLAIN output is that you > don't need an XSLT stylesheet to read it. You can just be sitting > there in psql and do an EXPLAIN, and look at the results, and solve > your problem. As we add options to EXPLAIN, I would like to see us > support those in both the plaintext output and the XML output (and > hopefully the JSON output) using the same syntax. > > If, for example, I need more information on what my hash joins are > doing, I'd like to be able to do: > > EXPLAIN ('hash_detail', 'on') query... > I am sorry - this is really strange syntax . Who will use this syntax? For some parser is little bit better function call, than parametrized statement. Some dificulties with options should be fixed with named param (we are speaking about 8.5). select explain_xml("select ...", true as hash_detail, ...) Pavel > ...and have it tell me the # of buckets and batches for each hash join > and whatever other information might be useful in that context. I > *don't* want to have the solution to that problem be: run > explain_xml() on your query, then install an XSLT parser, then use > this pregenerated XSLT stylesheet to extract your data from the XML > document you got back from step #1, then view the resulting HTML in > your web browser. That may be a good solution for someone, but it > certainly doesn't sound convenient for anyone who works primarily at > the command line. > > It's also worth noting that there are some kinds of statistics (for > example, I/O statistics) which can't be gathered without significantly > impacting the performance of the query. So you can't just take the > approach of turning on every possible instrumentation when XML output > is requested. > > I do agree with your point that even with an extensible options > syntax, we can't just add an option for any old thing that someone > wants. We should try to come up with a somewhat comprehensive list of > types of instrumentation that someone might want and then group them > into categories, with one toggle per category. > > ...Robert >
>> EXPLAIN ('hash_detail', 'on') query... Oops, I should have written EXPLAIN (hash_detail 'on') query... can't follow my own syntax. > I am sorry - this is really strange syntax . Who will use this syntax? > For some parser is little bit better function call, than parametrized > statement. Some dificulties with options should be fixed with named > param (we are speaking about 8.5). > > select explain_xml("select ...", true as hash_detail, ...) See to me THAT is a really strange syntax, so I guess we need some more votes. ...Robert
2009/5/24 Robert Haas <robertmhaas@gmail.com>: >>> EXPLAIN ('hash_detail', 'on') query... > > Oops, I should have written EXPLAIN (hash_detail 'on') query... can't > follow my own syntax. > >> I am sorry - this is really strange syntax . Who will use this syntax? >> For some parser is little bit better function call, than parametrized >> statement. Some dificulties with options should be fixed with named >> param (we are speaking about 8.5). >> >> select explain_xml("select ...", true as hash_detail, ...) > I prefere little bit different syntax for named params like param = value, or param => value, but syntax with AS is one confirmed. > See to me THAT is a really strange syntax, so I guess we need some more votes. ok > > ...Robert >
Robert Haas wrote: >>> EXPLAIN ('hash_detail', 'on') query... >>> > > Oops, I should have written EXPLAIN (hash_detail 'on') query... can't > follow my own syntax. > > >> I am sorry - this is really strange syntax . Who will use this syntax? >> For some parser is little bit better function call, than parametrized >> statement. Some dificulties with options should be fixed with named >> param (we are speaking about 8.5). >> >> select explain_xml("select ...", true as hash_detail, ...) >> > > See to me THAT is a really strange syntax, so I guess we need some more votes. > > > Both of these seem both odd an unnecessary. Why not just have a setting called, say, explain_format which governs the output? set explain_format = 'xml, verbose'; explain select * from foo; No new function or syntax would be required. cheers andrew
On Sun, May 24, 2009 at 11:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >> Oops, I should have written EXPLAIN (hash_detail 'on') query... can't >> follow my own syntax. >>> I am sorry - this is really strange syntax . Who will use this syntax? >>> For some parser is little bit better function call, than parametrized >>> statement. Some dificulties with options should be fixed with named >>> param (we are speaking about 8.5). >>> select explain_xml("select ...", true as hash_detail, ...) >> See to me THAT is a really strange syntax, so I guess we need some more >> votes. > Both of these seem both odd an unnecessary. Why not just have a setting > called, say, explain_format which governs the output? > > set explain_format = 'xml, verbose'; > explain select * from foo; > > No new function or syntax would be required. Well, then you have to issue two commands to do one thing. I don't much like the practice of using GUCs to control behavior that you may only want for the duration of one statement. The fundamental problem here, at least as it seems to me, is that we want more options for EXPLAIN, but the current syntax won't support it, because it requires making everything a keyword (and fixing the order). So we can either add enough punctuation to de-confuse the parser (which is what I tried to do) or we can switch to a totally different method of controlling EXPLAIN behavior (as you and Pavel are advocating). I wouldn't mind having a GUC to set the *default* explain behavior - but I'd still like to be able to override it for a particular command if I so choose. And that's not going to be possible with your syntax: if explain_format is set to 'xml, verbose' and I want plain text output for one command, how do I get it? Presumably I have to change explain_format, run my EXPLAIN, and then change it back again. Blech! My initial thought was to just use a list of key words for the parameters, like this: EXPLAIN (XML, VERBOSE) query... ...but I decided that wasn't a great idea, because it means that every parameter has to be a boolean, which is probably more limiting than we want to be. It also gets confusing because some parameters (like "XML" and "JSON") are incompatible while others (like "HASH_DETAIL" and "MERGE_DETAIL") can be used together, but there's not a lot of clue in the syntax itself that this is the case. On the other hand, if you write: EXPLAIN (FORMAT 'XML', FORMAT 'JSON') ... EXPLAIN (HASH_DETAIL 'ON', MERGE_DETAIL 'ON') ... ...it's obvious that the first case is qualitatively different from the second one. I think this makes it easier to code the behavior in a way that is consistent and correct, and also easier to document for the benefit of our users. Now, we can do this part of it even if we ultimately end up with a different overall syntax, for example: explain_query(...query..., 'xml' as format, true as hash_detail) I do suspect this will end up being a more complicated patch, and as I said before, I don't really see the point. As a point of historical interest, way back when (early 1997), we used to support this syntax: EXPLAIN WITH [COST|PLAN|FULL] query... ...but of course that's now impossible because WITH can be the first word of a select query. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > I wouldn't mind having a GUC to set the *default* explain behavior - > but I'd still like to be able to override it for a particular command > if I so choose. And that's not going to be possible with your syntax: > if explain_format is set to 'xml, verbose' and I want plain text > output for one command, how do I get it? Presumably I have to change > explain_format, run my EXPLAIN, and then change it back again. Blech! You know about SET LOCAL, no? I don't think this argument is very convincing. On the other side of the coin, I'm strongly against inventing more than one new output format for EXPLAIN, and so any argument that depends on examples such as "xml vs json" is falling on deaf ears here. I think that we only need an XML option, and so EXPLAIN ANALYZE XML ... doesn't seem untenable. What other options than those do you really need? Not ones to add or remove output fields; we'd expect the client to ignore fields it doesn't care about. regards, tom lane
On Sun, May 24, 2009 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I wouldn't mind having a GUC to set the *default* explain behavior - >> but I'd still like to be able to override it for a particular command >> if I so choose. And that's not going to be possible with your syntax: >> if explain_format is set to 'xml, verbose' and I want plain text >> output for one command, how do I get it? Presumably I have to change >> explain_format, run my EXPLAIN, and then change it back again. Blech! > > You know about SET LOCAL, no? I don't think this argument is very > convincing. I completely fail to see how that helps me. It's not faster or fewer commands to type: begin transaction; set local whatever; explain blah; commit transaction; than it is to type: set whatever; explain blah; set oldwhatever; > On the other side of the coin, I'm strongly against inventing more than > one new output format for EXPLAIN, and so any argument that depends on > examples such as "xml vs json" is falling on deaf ears here. I think > that we only need an XML option, and so EXPLAIN ANALYZE XML ... doesn't > seem untenable. What other options than those do you really need? > Not ones to add or remove output fields; we'd expect the client to > ignore fields it doesn't care about. It's not just about me; we've had many requests for new EXPLAIN features. Personally, I'd like to see the number of buckets and batches that a hash join uses (predicted and actual), and maybe (with ANALYZE) the number of hash collisions. I'd like to see memory utilization statistics (peak memory utilization of hash join, for example). Other people have requested I/O statistics (which you objected to on the grounds that it would be too much overhead, so clearly if we're ever going to do it it will have to be optional), and most recently number of tuples discarded by the filter condition. We've also had requests to suppress some information (like costs) for planner regression testing. I really don't see the point in restricting the syntax of EXPLAIN in this way. I don't know exactly what sorts of useful options people will come up with, but I do know that as long as we have an extremely limiting options syntax, they can all be shot down on the grounds that including them in the default output is too cumbersome for regular use (which is absolutely true). On the other hand, I think it's incredibly naive to suppose that EXPLAIN XML is going to make anyone very happy. There are only two ways this can work out: 1. We'll make EXPLAIN XML output everything and the kitchen sink. In this case, we'll soon get complaints along the lines of: "I can't use regular EXPLAIN because it doesn't include the field that I need, but the output of EXPLAIN XML is so voluminous that I can't read through it by hand." -or- 2. We'll be very restrictive about adding fields to EXPLAIN XML just as we are now for plain EXPLAIN, in which case we haven't solved anything. I think XML output format is a complete distraction from the real issue here, which is that there are certain pieces of information that are sometimes useful but are not useful enough to justify including them in the EXPLAIN output 100% of the time. By just punting all that stuff to EXPLAIN XML, we're just saying that we're not interested in creating a workable set of options to allow users to pick and choose the information they care about - so instead we're going to dump a huge chunk of unreadable XML and then make it the user's problem to find a tool that will extract the details that they care about. Boo, hiss. Anyway, I'm suprised by the reaction to this patch, but I'll drop it. I would like to make the EXPLAIN syntax more powerful for command-line use, and I'd implement XML format and JSON along the way just for completeness. But I don't have much interest in creating an XML output format that is the ONLY way of getting more information, because I'm a command-line user and it does me no good at all. :-( ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, May 24, 2009 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You know about SET LOCAL, no? I don't think this argument is very >> convincing. > I completely fail to see how that helps me. Mainly, what it does is avoid having to know exactly what the old setting was. regards, tom lane
On Sun, May 24, 2009 at 11:57:13AM -0400, Andrew Dunstan wrote: > > > Robert Haas wrote: >>>> EXPLAIN ('hash_detail', 'on') query... >>>> >> >> Oops, I should have written EXPLAIN (hash_detail 'on') query... can't >> follow my own syntax. >> >> >>> I am sorry - this is really strange syntax . Who will use this syntax? >>> For some parser is little bit better function call, than parametrized >>> statement. Some dificulties with options should be fixed with named >>> param (we are speaking about 8.5). >>> >>> select explain_xml("select ...", true as hash_detail, ...) >>> >> >> See to me THAT is a really strange syntax, so I guess we need some more votes. >> >> >> > > Both of these seem both odd an unnecessary. Why not just have a setting > called, say, explain_format which governs the output? > > set explain_format = 'xml, verbose'; > explain select * from foo; > > No new function or syntax would be required. A further possibility: Oracle's equivalent of EXPLAIN doesn't actually output anything to the screen, but rather fills in a (temporary?) table somewhere with details of the query plan. I mostly found this irritating when working with Oracle, because each time I used it I had to look up an example query to generate output like PostgreSQL's EXPLAIN, which is generally what I really wanted. But since we'd still have the old EXPLAIN behavior available, perhaps something such as an Oracle-like table filler would be useful. Such a proposal doesn't answer the need to allow users to specify, for performance and other reasons, the precise subset of statistics they're interested in; for whatever it's worth, my current favorite contender in that field is EXPLAIN (a, b, c) <query>. - Josh / eggyknap
On Sun, 24 May 2009, Pavel Stehule wrote: > we should have a secondary function explain_query(query_string, > option) that returns setof some. +1. The incremental approach here should first be adding functions that actually do the work required. Then, if there's a set of those that look to be extremely useful, maybe at that point it's worth talking about how to integrate them into the parser. Starting with the parser changes rather than the parts that actually do the work is backwards. If you do it the other way around, at all times you have a patch that actually provides immediate useful value were it to be committed. Something that returns a setof can also be easily used to implement the "dump EXPLAIN to a table" feature Josh Tolley brought up (which is another common request in this area). -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith <gsmith@gregsmith.com> writes: > On Sun, 24 May 2009, Pavel Stehule wrote: >> we should have a secondary function explain_query(query_string, >> option) that returns setof some. > +1. The incremental approach here should first be adding functions that > actually do the work required. Then, if there's a set of those that look > to be extremely useful, maybe at that point it's worth talking about how > to integrate them into the parser. Starting with the parser changes > rather than the parts that actually do the work is backwards. If you do > it the other way around, at all times you have a patch that actually > provides immediate useful value were it to be committed. > Something that returns a setof can also be easily used to implement the > "dump EXPLAIN to a table" feature Josh Tolley brought up (which is another > common request in this area). A serious problem with EXPLAIN via a function returning set, or with putting the result into a table, is that set results are logically unordered, just as table contents are. So from a strict point of view this only makes sense when the output format is designed to not depend on row ordering to convey information. We could certainly invent such a format, but I think it's a mistake to go in this direction for EXPLAIN output that is similar to the current output. regards, tom lane
On Sun, May 24, 2009 at 4:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, May 24, 2009 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> You know about SET LOCAL, no? I don't think this argument is very >>> convincing. > >> I completely fail to see how that helps me. > > Mainly, what it does is avoid having to know exactly what the old > setting was. Ah, OK, I see what you're going for. That's not really what I'm complaining about with that syntax, though.... ...Robert
On Sun, May 24, 2009 at 6:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Smith <gsmith@gregsmith.com> writes: >> On Sun, 24 May 2009, Pavel Stehule wrote: >>> we should have a secondary function explain_query(query_string, >>> option) that returns setof some. > >> +1. The incremental approach here should first be adding functions that >> actually do the work required. Then, if there's a set of those that look >> to be extremely useful, maybe at that point it's worth talking about how >> to integrate them into the parser. Starting with the parser changes >> rather than the parts that actually do the work is backwards. If you do >> it the other way around, at all times you have a patch that actually >> provides immediate useful value were it to be committed. > >> Something that returns a setof can also be easily used to implement the >> "dump EXPLAIN to a table" feature Josh Tolley brought up (which is another >> common request in this area). > > A serious problem with EXPLAIN via a function returning set, or with > putting the result into a table, is that set results are logically > unordered, just as table contents are. So from a strict point of view > this only makes sense when the output format is designed to not depend > on row ordering to convey information. We could certainly invent such > a format, but I think it's a mistake to go in this direction for > EXPLAIN output that is similar to the current output. The current output format basically prints out the node type and then a collection of properties that are associated with that node, where applicable: join type, scan direction, strategy, relation, alias, startup cost, total cost, rows, width, loops, filter, join filter, various types of condition (index/recheck/tid/merge/hash), sort key. However, we tend to omit certain fields (particularly scan direction and alias) when they contain information that isn't sufficiently interesting. That's probably not a good idea for anything that's intended for machine-parsing; I think for XML or JSON or output-to-a-table we'd want to include those fields whenever possible. With that minor complaint (and the difficulty of figuring out how to avoid duplicating large portions of explain.c), XML or JSON output doesn't seem that difficult. In JSON there aren't too many ways of doing this that make sense. I would guess we'd want something like this: { "node" : "Hash Join", "jointype" : "Inner", "startup_cost" : "11.49", "total_cost" : "92.59", "rows" : "1877", "width" : "325", 'outer' : { "node" : "Seq Scan", <more stuff> } 'inner': { "node" : "Hash", <more stuff> } } XML, being XML, has 10 ways of doing something when 1 is more than sufficient, so there are a couple of ways we could go. I think this is my favorite. <HashJoin jointype="inner" startup_cost="11.49" total_cost="92.59" rows="1877" width="325"> <SeqScan ... /> <Hash ...> <SeqScan ... /> </Hash> </HashJoin> or you could just use one node type: <node type="HashJoin" jointype="inner" startup_cost="11.49" total_cost="92.59" rows="1877" width="325"> <node type="SeqScan" ... /> <node type="Hash" ...> <node type="SeqScan"... /> </node> </node> ...and then there's this style: <HashJoin> <jointype>inner</jointype> <startup_cost>11.49</startup_cost> ... <outer/> ... </outer> <inner> ... </inner> </HashJoin> ...which is incredibly long and redundant, but some people who use XML like such things. I'm sure there are other possibilities as well. With respect to table output things are a little bit less straightforward. There are two issues. With XML or JSON, any properties that aren't relevant to a particular node can be omitted altogether, whereas for a table the column list has to be consistent throughout. We can just leave the unused columns as NULL, but it will be a fairly wide table. Also, with apologies to Josh Berkus, we'll need to introduce some sort of non-natural primary key to allow children to be related to parents, because unlike XML and JSON, there's no built-in way to make one object the child of another. All that having been said, making EXPLAIN into a query (so that you can do INSERT INTO foo (EXPLAIN SELECT query...) might be useful to some people even without changing the output format at all. I think you could throw a windowing function in there to at least tag each line with its original position in the output, and some people might want just that. ...Robert
On Sun, May 24, 2009 at 6:05 PM, Greg Smith <gsmith@gregsmith.com> wrote: > On Sun, 24 May 2009, Pavel Stehule wrote: > >> we should have a secondary function explain_query(query_string, >> option) that returns setof some. > > +1. The incremental approach here should first be adding functions that > actually do the work required. Then, if there's a set of those that look to > be extremely useful, maybe at that point it's worth talking about how to > integrate them into the parser. Starting with the parser changes rather > than the parts that actually do the work is backwards. If you do it the > other way around, at all times you have a patch that actually provides > immediate useful value were it to be committed. Well, perhaps I ought to be asking what sort of features people would like to see, other than variant output formats? Maybe if we can develop some kind of wish list for EXPLAIN, it will become more obvious what the best option syntax is. ...Robert
On Sun, 24 May 2009, Tom Lane wrote: > A serious problem with EXPLAIN via a function returning set, or with > putting the result into a table, is that set results are logically > unordered, just as table contents are. Fair enough; I think Pavel and myself were presuming an implied "line number" in the output there that, as you point out, doesn't actually exist. The actual implementation detail there doesn't really change the spirit of what I was trying to suggest though: that this feature should get designed starting with the expected output and how to generate it, then work backwards towards how you pass it parameters. On that topic, I though Tom Raney was the latest to update code for the XML output at least. Code and a presentation going over everything is at http://web.cecs.pdx.edu/~raneyt/gsoc/ and there's a video at http://www.postgresqlconference.org/2008/west/talks/ A note about that got sent to this list at one point but I don't see any follow-up: http://archives.postgresql.org/pgsql-hackers/2008-06/msg00802.php Anybody have a better idea of what happened with that project than me? -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Sun, May 24, 2009 at 06:53:29PM -0400, Tom Lane wrote: > Greg Smith <gsmith@gregsmith.com> writes: > > On Sun, 24 May 2009, Pavel Stehule wrote: > >> we should have a secondary function explain_query(query_string, > >> option) that returns setof some. > > > +1. The incremental approach here should first be adding functions that > > actually do the work required. Then, if there's a set of those that look > > to be extremely useful, maybe at that point it's worth talking about how > > to integrate them into the parser. Starting with the parser changes > > rather than the parts that actually do the work is backwards. If you do > > it the other way around, at all times you have a patch that actually > > provides immediate useful value were it to be committed. > > > Something that returns a setof can also be easily used to implement the > > "dump EXPLAIN to a table" feature Josh Tolley brought up (which is another > > common request in this area). > > A serious problem with EXPLAIN via a function returning set, or with > putting the result into a table, is that set results are logically > unordered, just as table contents are. So from a strict point of view > this only makes sense when the output format is designed to not depend > on row ordering to convey information. We could certainly invent such > a format, but I think it's a mistake to go in this direction for > EXPLAIN output that is similar to the current output. The Oracle version, as it fills the table of explain results, gives each number an id and the id of its parent row, which behavior we could presumably copy. I'm definitely keen to keep a human-readable EXPLAIN such as we have now, to augment the table-based proposal, but a table would provide the more flexible output we'd need for more detailed reporting, a simple interface for applications to consume the EXPLAIN data without human intervention, and a convenient platform from whence the data can be transformed to XML, JSON, etc. for those that are so inclined. - Josh / eggyknap
2009/5/25 Tom Lane <tgl@sss.pgh.pa.us>: > Greg Smith <gsmith@gregsmith.com> writes: >> On Sun, 24 May 2009, Pavel Stehule wrote: >>> we should have a secondary function explain_query(query_string, >>> option) that returns setof some. > >> +1. The incremental approach here should first be adding functions that >> actually do the work required. Then, if there's a set of those that look >> to be extremely useful, maybe at that point it's worth talking about how >> to integrate them into the parser. Starting with the parser changes >> rather than the parts that actually do the work is backwards. If you do >> it the other way around, at all times you have a patch that actually >> provides immediate useful value were it to be committed. > >> Something that returns a setof can also be easily used to implement the >> "dump EXPLAIN to a table" feature Josh Tolley brought up (which is another >> common request in this area). > > A serious problem with EXPLAIN via a function returning set, or with > putting the result into a table, is that set results are logically > unordered, just as table contents are. So from a strict point of view > this only makes sense when the output format is designed to not depend > on row ordering to convey information. We could certainly invent such > a format, but I think it's a mistake to go in this direction for > EXPLAIN output that is similar to the current output. I don't expect so functional EXPLAIN will be used by users directly. It' data source for some "GUI". And currently with CTE, there are not problem transform query to similar output like current EXPLAIN. I am able to understand some new parameters for explain statement (when result will be directly read by user), but some output options (or formating options) I would to see in some other functions. regards Pavel Stehule > > regards, tom lane >
Hi, After having read all the followups I already received, I prefer to answer to this particular message. Robert Haas <robertmhaas@gmail.com> writes: > On Sun, May 24, 2009 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> On the other side of the coin, I'm strongly against inventing more than >> one new output format for EXPLAIN, and so any argument that depends on >> examples such as "xml vs json" is falling on deaf ears here. I think >> that we only need an XML option, and so EXPLAIN ANALYZE XML ... doesn't >> seem untenable. What other options than those do you really need? >> Not ones to add or remove output fields; we'd expect the client to >> ignore fields it doesn't care about. > > It's not just about me; we've had many requests for new EXPLAIN > features. [...] > I think XML output format is a complete distraction from the real > issue here, which is that there are certain pieces of information that > are sometimes useful but are not useful enough to justify including > them in the EXPLAIN output 100% of the time. I think the summary here is to say that we want two modes of operations:- the current one, which continues to get refinements - a new one conveying all possible information in machine readable formats, possibly with some tools to handle it easily:XML output and maybe XSLT stylesheets > Anyway, I'm suprised by the reaction to this patch, but I'll drop it. > I would like to make the EXPLAIN syntax more powerful for command-line > use, and I'd implement XML format and JSON along the way just for > completeness. But I don't have much interest in creating an XML > output format that is the ONLY way of getting more information, > because I'm a command-line user and it does me no good at all. :-( That's only because you seem to be thinking that having core PostgreSQL do the first half of the work means you as a user will have to do the second part. I assume pgadmin and phppgadmin developers will offer their users some graphical approach to the output reading, with dynamic filtering, eg. I don't see anything stopping you to provide a simple way to have the same facility into psql. You can already have the query output filtered by any script you want this way:=# \o |my_presentation_script <style name> | <stylesheet full path>=# explain XML ...=# \o Now, we talked about a better filtering integration into psql more than once in the past, so there's a path for you to have both complete EXPLAIN and command line tool suite integration, I'd say. Oh and by the way, even if I don't like XML that much, this battled is lost already. The way I understand it, -core will accept only two EXPLAIN outputs: fit on screen and complete. The second one will have to be easy for machine consumption, and so use a format anyone can work with. The choice has been made, it's XML. If you want to develop analysis tools which will parse JSON rather than XML, nothing stops you writing a stylesheet to provide JSON output from XML. Of course then you want this to happen server-side, so you want the EXPLAIN output to be processed from within the database, all the more if it's XML and we have the ability to apply XSLT to XML content already in the server. But being able to apply SQL level function calls to EXPLAIN output seems a different subject alltogether... Regards, -- dim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 [Sent by mistake to Robert Haas only at first try. No cure for fat fingers, I guess] On Sun, May 24, 2009 at 04:05:18PM -0400, Robert Haas wrote: [...] > I think XML output format is a complete distraction from the real > issue here, which is that there are certain pieces of information that > are sometimes useful but are not useful enough to justify including > them in the EXPLAIN output 100% of the time. By just punting all that > stuff to EXPLAIN XML, we're just saying that we're not interested in > creating a workable set of options to allow users to pick and choose > the information they care about - so instead we're going to dump a > huge chunk of unreadable XML and then make it the user's problem to > find a tool that will extract the details that they care about. Boo, > hiss. +1 In my experience, this happens with "XML the data description language" many times, but I haven' seen the problem as well-stated as in your mail. Hard for humans to read, hard for machines to read, and often trying to solve a problem it can't (in this case, selecting the needed information _beforehand_). Regards - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFKGntjBcgs9XrR2kYRAk1/AJ4rnZFnU4PFM8AJkaYYYLRInYHJDQCbBbt2 lTwxydBBnXP1MgDxz+vcpM8= =o2qW -----END PGP SIGNATURE-----
On Mon, May 25, 2009 at 6:24 AM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > I think the summary here is to say that we want two modes of operations: > - the current one, which continues to get refinements > > - a new one conveying all possible information in machine readable > formats, possibly with some tools to handle it easily: XML output and > maybe XSLT stylesheets I don't agree with that summary. Many people who responded to this thread were fine with the idea of some sort of options syntax, but we had at least four different proposals for how to implement it: Robert Haas: EXPLAIN (foo 'bar', baz 'bletch', ...) query Pavel Stehule: explain_query(query, options...) [exact format of options not specified] Andrew Dunstan: SET explain_format = 'foo, baz'; EXPLAIN query Josh Tolley: EXPLAIN (foo, baz, ...) query [also suggested by me as an idea I rejected] Tom Lane was the only person to suggest that we only ever need one more option to EXPLAIN and that it should be called XML. Even though I prefer my format to the other options suggested (which I would probably rank in order of descending preference Josh-Andrew-Pavel), I am actually someone encouraged that we might have some kind of fragile consensus that an extensible options syntax is useful (a point that Andres Freund and Greg Smith also seemed to agree with). >> Anyway, I'm suprised by the reaction to this patch, but I'll drop it. >> I would like to make the EXPLAIN syntax more powerful for command-line >> use, and I'd implement XML format and JSON along the way just for >> completeness. But I don't have much interest in creating an XML >> output format that is the ONLY way of getting more information, >> because I'm a command-line user and it does me no good at all. :-( > > That's only because you seem to be thinking that having core PostgreSQL > do the first half of the work means you as a user will have to do the > second part. I assume pgadmin and phppgadmin developers will offer their > users some graphical approach to the output reading, with dynamic > filtering, eg. > > I don't see anything stopping you to provide a simple way to have the > same facility into psql. You can already have the query output filtered > by any script you want this way: > =# \o |my_presentation_script <style name> | <stylesheet full path> > =# explain XML ... > =# \o This is all much more complicated than what I proposed, and I fail to see what it buys us. I'd say that you're just reinforcing the point I made upthread, which is that insisting that XML is the only way to get more detailed information will just create a cottage industry of beating that XML output format into submission. ...Robert
On Mon, May 25, 2009 at 07:14:56AM -0400, Robert Haas wrote: > Many people who responded to this > thread were fine with the idea of some sort of options syntax, but we > had at least four different proposals for how to implement it: > > Robert Haas: EXPLAIN (foo 'bar', baz 'bletch', ...) query > Pavel Stehule: explain_query(query, options...) [exact format of > options not specified] > Andrew Dunstan: SET explain_format = 'foo, baz'; EXPLAIN query > Josh Tolley: EXPLAIN (foo, baz, ...) query [also suggested by me as an > idea I rejected] I hadn't actually caught that there were two ideas on the table with syntax similar to "EXPLAIN (...) <query>", and don't mean to champion either of the two specifically (at least until I've read closely enough to note the difference). I just kinda liked the "EXPLAIN (some options of some sort) <query>" syntax better than other proposals. That said, I think I'm changing my vote in favor of Pavel. It's my guess that some variant of his version would be the easiest to make compliant with the bit I'm most interested in, which is not being limited to, say, XML/JSON/YAML/etc. in the output. Applications that use PostgreSQL will of necessity need to know how to handle data presented in tables, so let's present our explain results as a table. As has been said in other branches of this thread, that way we don't force applications also to support XML/JSON/YAML/etc. We might consider providing functions to convert the tabular result to one or more of those formats, but at its inception, the data should live as tuples in a relation. In other messages, I've advocated actually inserting the data into a table. I think that was a mistake. Who makes the table? What's it called? What schema is it in? Who cleans it up when we're done with it? ...etc. I'd much rather see a bunch of rows returned as a set, which I can then insert into a table, pass into a function for reformatting, or just consume in an application. All of which leads me to this variant of the functional approach as my answer: SELECT * FROM pg_explain_query("<query>", <options in a still-unspecified format>); I could then do things like this: CREATE TABLE explain_results AS SELECT * FROM pg_explain_query(...); and this: SELECT xmlify_a_record(pg_explain_query(...)); - Josh / eggyknap
Joshua Tolley <eggyknap@gmail.com> writes: > The Oracle version, as it fills the table of explain results, gives > each number an id and the id of its parent row, which behavior we > could presumably copy. I'm definitely keen to keep a human-readable > EXPLAIN such as we have now, to augment the table-based proposal, but > a table would provide the more flexible output we'd need for more > detailed reporting, a simple interface for applications to consume the > EXPLAIN data without human intervention, and a convenient platform > from whence the data can be transformed to XML, JSON, etc. for those > that are so inclined. I would think a table would be considerably *less* flexible --- you could not easily change the output column set. Unless you're imagining just dumping something equivalent to the current output into a text column. Which would be flexible, but it'd hardly have any of the other desirable properties you list. regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: > This is all much more complicated than what I proposed, and I fail to > see what it buys us. I'd say that you're just reinforcing the point I > made upthread, which is that insisting that XML is the only way to get > more detailed information will just create a cottage industry of > beating that XML output format into submission. The impression I have is that (to misquote Churchill) XML is the worst option available, except for all the others. We need something that can represent a fairly complex data structure, easily supports addition or removal of particular fields in the structure (including fields not foreseen in the original design), is not hard for programs to parse, and is widely supported --- ie, "not hard" includes "you don't have to write your own parser, in most languages". How many realistic alternatives are there? regards, tom lane
On Mon, May 25, 2009 at 10:55:48AM -0400, Tom Lane wrote: > Joshua Tolley <eggyknap@gmail.com> writes: > > The Oracle version, as it fills the table of explain results, gives > > each number an id and the id of its parent row, which behavior we > > could presumably copy. I'm definitely keen to keep a human-readable > > EXPLAIN such as we have now, to augment the table-based proposal, but > > a table would provide the more flexible output we'd need for more > > detailed reporting, a simple interface for applications to consume the > > EXPLAIN data without human intervention, and a convenient platform > > from whence the data can be transformed to XML, JSON, etc. for those > > that are so inclined. > > I would think a table would be considerably *less* flexible --- you > could not easily change the output column set. Unless you're imagining > just dumping something equivalent to the current output into a text > column. Which would be flexible, but it'd hardly have any of the > other desirable properties you list. I'm not sure I see why it would be less flexible. I'm imagining we define some record type, and a function that returns a set of those records. The fields in the record would include data element this version of explain could possibly return. Then you call a function to explain a query, passing it some options to select which of those data elements you're interested in. The function returns the same data type at each call, filling with NULLs the fields you've told it you're uninterested in. Changes between versions would modify this data type, but provided consumer applications have specified the columns they're interested in (rather than, say, SELECT *) that shouldn't bother anyone. The functions to change this into some other format would probably need to be more intelligent than just that. That seems a fair price to pay. - Josh / eggyknap
<br /><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin:0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> The impression I have is that (to misquote Churchill) XML is the worst<br/> option available, except for all the others. We need something that can<br /> represent a fairly complex datastructure, easily supports addition or<br /> removal of particular fields in the structure (including fields not<br />foreseen in the original design), is not hard for programs to parse,<br /> and is widely supported --- ie, "not hard" includes"you don't have to<br /> write your own parser, in most languages". How many realistic<br /> alternatives are there?<br/><br /></blockquote></div>One realistic alternative may be JSON: parsers for nearly all languages are freely available;everything web-affected speeks it perfectly; it's easier on the eye then XML, less bytes per information, additionand removal of fields as well as complex structures are possible.<br /><br />Harald<br /><br /><br /><br clear="all"/><br />-- <br />GHUM Harald Massa<br />persuadere et programmare<br />Harald Armin Massa<br />Spielberger Straße49<br />70435 Stuttgart<br />0173/9409607<br />no fx, no carrier pigeon <br />-<br />LASIK good, steroids bad?<br />
Joshua Tolley <eggyknap@gmail.com> writes: > I'm not sure I see why it would be less flexible. I'm imagining we define some > record type, and a function that returns a set of those records. I'm unimpressed by the various proposals to change EXPLAIN into a function. Quoting the command-to-explain is going to be a pain in the neck. And can you really imagine using it manually, especially if it returns so many fields that you *have to* write out the list of fields you actually want, else the result is unreadable? It's going to be just as much of something you can only use through a helper application as the XML way would be. regards, tom lane
On Mon, May 25, 2009 at 11:02:53AM -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > This is all much more complicated than what I proposed, and I fail > > to see what it buys us. I'd say that you're just reinforcing the > > point I made upthread, which is that insisting that XML is the > > only way to get more detailed information will just create a > > cottage industry of beating that XML output format into > > submission. > > The impression I have is that (to misquote Churchill) XML is the > worst option available, except for all the others. We need > something that can represent a fairly complex data structure, easily > supports addition or removal of particular fields in the structure > (including fields not foreseen in the original design), is not hard > for programs to parse, and is widely supported --- ie, "not hard" > includes "you don't have to write your own parser, in most > languages". How many realistic alternatives are there? JSON for one, and it's *much* lighter in just about every way. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote: > On Mon, May 25, 2009 at 11:02:53AM -0400, Tom Lane wrote: > >> Robert Haas <robertmhaas@gmail.com> writes: >> >>> This is all much more complicated than what I proposed, and I fail >>> to see what it buys us. I'd say that you're just reinforcing the >>> point I made upthread, which is that insisting that XML is the >>> only way to get more detailed information will just create a >>> cottage industry of beating that XML output format into >>> submission. >>> >> The impression I have is that (to misquote Churchill) XML is the >> worst option available, except for all the others. We need >> something that can represent a fairly complex data structure, easily >> supports addition or removal of particular fields in the structure >> (including fields not foreseen in the original design), is not hard >> for programs to parse, and is widely supported --- ie, "not hard" >> includes "you don't have to write your own parser, in most >> languages". How many realistic alternatives are there? >> > > JSON for one, and it's *much* lighter in just about every way. > > Cheers, > David. > For what it's worth, if this revised output form is destined for consumption by a machine, it really doesn't matter what protocol is used and how 'readable' it is by humans, as long as the protocol can express all the present and anticipated variations of the data without breaking parsers along the way. While building the Visual Planner tool, I selected XML output for no other reason than it was easy to parse on the receiving end and was hierarchical, making it perfect for representing a plan tree - or thousands. I'm sure other alternatives would have been fine as well. But, once that decision was made, I never had any reason again to look at the XML stream. If we're worried about the excess 'weight' of XML, I found this to be a non-issue in practice. The output generated by the Visual-Planner-Enabled Postgres server contains MUCH more information that one would typically see with standard EXPLAIN. The tool returns not only the most-optimal plan, but all discarded plans as well. A four way join results in output of 24k lines of XML. While it parses nearly instantly, the biggest delay is in the network. And, even this is minimal. So, why not put ALL interesting data in the EXPLAIN XML feed? I'm not suggesting for this discussion that we include discarded plans, but that we include every piece of data that may be of interest to folks building connecting tools. The parsers can pick and choose what they use easily and, because the feed isn't positional, won't break when addition data is added. A GUC parameter could govern the data included in this variant of EXPLAIN, but even that seems unnecessary. This approach will allow the standard EXPLAIN to evolve in whatever way pleases the humans without interfering with the machines. Regards, Tom Raney
On May 25, 2009, at 0:47 , Joshua Tolley wrote: > On Sun, May 24, 2009 at 06:53:29PM -0400, Tom Lane wrote: >> Greg Smith <gsmith@gregsmith.com> writes: >>> On Sun, 24 May 2009, Pavel Stehule wrote: >>>> we should have a secondary function explain_query(query_string, >>>> option) that returns setof some. >> >>> +1. The incremental approach here should first be adding >>> functions that >>> actually do the work required. Then, if there's a set of those >>> that look >>> to be extremely useful, maybe at that point it's worth talking >>> about how >>> to integrate them into the parser. Starting with the parser changes >>> rather than the parts that actually do the work is backwards. If >>> you do >>> it the other way around, at all times you have a patch that actually >>> provides immediate useful value were it to be committed. >> >>> Something that returns a setof can also be easily used to >>> implement the >>> "dump EXPLAIN to a table" feature Josh Tolley brought up (which is >>> another >>> common request in this area). >> >> A serious problem with EXPLAIN via a function returning set, or with >> putting the result into a table, is that set results are logically >> unordered, just as table contents are. So from a strict point of >> view >> this only makes sense when the output format is designed to not >> depend >> on row ordering to convey information. We could certainly invent >> such >> a format, but I think it's a mistake to go in this direction for >> EXPLAIN output that is similar to the current output. > > The Oracle version, as it fills the table of explain results, gives > each number > an id and the id of its parent row, which behavior we could > presumably copy. Or some other schema that allows us to preserve the tree. Michael Glaesemann grzm seespotcode net
On Mon, May 25, 2009 at 11:22:24AM -0400, Tom Lane wrote: > Joshua Tolley <eggyknap@gmail.com> writes: > > I'm not sure I see why it would be less flexible. I'm imagining we define some > > record type, and a function that returns a set of those records. > > I'm unimpressed by the various proposals to change EXPLAIN into a > function. Quoting the command-to-explain is going to be a pain in the > neck. Yeah, that's been bugging me, despite my recent support of that plan. > And can you really imagine using it manually, especially if it > returns so many fields that you *have to* write out the list of fields > you actually want, else the result is unreadable? It's going to be just > as much of something you can only use through a helper application as > the XML way would be. Good point. The reason, as I remember it, that we wanted to be able to specify what fields are returned was so that fields that are expensive to calculate are calculated only when the user wants them. If that's the only consideration, perhaps we should have a standard version and a "FULL" version, e.g. EXPLAIN [ANALYZE] [FULL] <query> ...where FULL would indicate the user wanted the all available statistics, not just the cheap ones. Somewhere in there we'd also need an indicator to say we wanted an output format other than the usual text version (such as the "WITH XML" clause I think someone suggested); I maintain it's all just a table of data, and should be represented the same way we represent any other table of data. - Josh / eggyknap
Hi Tom, On 05/25/2009 08:04 PM, Tom Raney wrote: > So, why not put ALL interesting data in the EXPLAIN XML feed? I'm not > suggesting for this discussion that we include discarded plans, but that > we include every piece of data that may be of interest to folks building > connecting tools. The parsers can pick and choose what they use easily > and, because the feed isn't positional, won't break when addition data > is added. A GUC parameter could govern the data included in this variant > of EXPLAIN, but even that seems unnecessary. This approach will allow > the standard EXPLAIN to evolve in whatever way pleases the humans > without interfering with the machines. Well, there is the problem Robert Haas described - some stats may be too expensive to gather (like the io-stats) for regular use, but still be quite usefull. Andres
On Mon, May 25, 2009 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Joshua Tolley <eggyknap@gmail.com> writes: >> I'm not sure I see why it would be less flexible. I'm imagining we define some >> record type, and a function that returns a set of those records. > > I'm unimpressed by the various proposals to change EXPLAIN into a > function. Quoting the command-to-explain is going to be a pain in the > neck. I agree with this, but there is a lot of sentiment (which I share) that it should be possible to capture EXPLAIN output using subselect or CTAS syntax, regardless of exactly what that output ends up being. That seems to require that EXPLAIN be a fully-reserved keyword, so I wonder what we think about that. (The way I tested this quickly is to make '(' ExplainStmt ')' a third production for select_with_parens. I'm not 100% sure that's the right place for it, but a couple of other reasonable-looking places produced non-obvious parsing conflicts.) > And can you really imagine using it manually, especially if it > returns so many fields that you *have to* write out the list of fields > you actually want, else the result is unreadable? It's going to be just > as much of something you can only use through a helper application as > the XML way would be. Nothing could possibly be as bad as XML. I'm with Josh: if we produce table-formatted output, someone can always turn it into XML or JSON or whatever they want. The reverse figures to be a whole lot more difficult. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > I agree with this, but there is a lot of sentiment (which I share) > that it should be possible to capture EXPLAIN output using subselect > or CTAS syntax, regardless of exactly what that output ends up being. Well, it should be possible to capture the output, but let's not prejudge the syntax. > That seems to require that EXPLAIN be a fully-reserved keyword, so I > wonder what we think about that. Nonstarter, especially when it's so obviously possible to do it without that. The threshold for reserving words that aren't reserved by SQL spec has to be really high, because you will break innocent applications that way. Before anyone gets overly excited about having special syntax for this, I should point out that you can do it today, for instance like so: regression=# create function expl(q text) returns setof text as $$ regression$# declare r record; regression$# begin regression$# for r in execute 'explain ' || q loop regression$# return next r."QUERY PLAN"; regression$# end loop; regression$# end$$ language plpgsql; CREATE FUNCTION regression=# select * from expl('select * from tenk1'); expl -------------------------------------------------------------Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (1 row) Admittedly this is a bit inconvenient, but the point is that the functionality does exist. There is no need to have a built-in version of this function unless we get significant advantages from having it built-in, and right now I'm not seeing those. regards, tom lane
On Mon, May 25, 2009 at 6:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I agree with this, but there is a lot of sentiment (which I share) >> that it should be possible to capture EXPLAIN output using subselect >> or CTAS syntax, regardless of exactly what that output ends up being. > > Well, it should be possible to capture the output, but let's not > prejudge the syntax. > >> That seems to require that EXPLAIN be a fully-reserved keyword, so I >> wonder what we think about that. > > Nonstarter, especially when it's so obviously possible to do it without > that. The threshold for reserving words that aren't reserved by SQL > spec has to be really high, because you will break innocent applications > that way. > > Before anyone gets overly excited about having special syntax for this, > I should point out that you can do it today, for instance like so: > > regression=# create function expl(q text) returns setof text as $$ > regression$# declare r record; > regression$# begin > regression$# for r in execute 'explain ' || q loop > regression$# return next r."QUERY PLAN"; > regression$# end loop; > regression$# end$$ language plpgsql; > CREATE FUNCTION > > regression=# select * from expl('select * from tenk1'); > expl > ------------------------------------------------------------- > Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) > (1 row) > > Admittedly this is a bit inconvenient, but the point is that the > functionality does exist. There is no need to have a built-in > version of this function unless we get significant advantages > from having it built-in, and right now I'm not seeing those. The only problem I see with this is that there's no convenient way of specifying the options you want. Granted, it wouldn't be all that difficult to add a couple of boolean options to specify the state of EXPLAIN and ANALYZE, but if we have more options, it starts to get a bit complicated, especially if they do things like change the set of output columns. I'm still liking that idea, but even if we don't end up implementing that particular thing, I'm really doubtful that the need to make EXPLAIN do more things is likely to go away. ...Robert
On Mon, May 25, 2009 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Admittedly this is a bit inconvenient, but the point is that the > functionality does exist. There is no need to have a built-in > version of this function unless we get significant advantages > from having it built-in, and right now I'm not seeing those. I assume people don't want the *text* of the current output format but the actual values in separate columns. So you could do things like accumulate the data in a table and later use sql to search for queries using specific indexes or where estimates are off etc. -- greg
Greg Stark <stark@enterprisedb.com> writes: > On Mon, May 25, 2009 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Admittedly this is a bit inconvenient, but the point is that the >> functionality does exist. �There is no need to have a built-in >> version of this function unless we get significant advantages >> from having it built-in, and right now I'm not seeing those. > I assume people don't want the *text* of the current output format but > the actual values in separate columns. Well, I notice that everyone is carefully dodging the subject of exactly what columns they want, but my example would clearly scale easily to any specific set of output columns that EXPLAIN might return instead of one text column. Since we were previously told that any particular release of PG need only offer one set of possible output columns, I figured the problem was solved ;-) regards, tom lane
On Mon, May 25, 2009 at 8:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Stark <stark@enterprisedb.com> writes: >> On Mon, May 25, 2009 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Admittedly this is a bit inconvenient, but the point is that the >>> functionality does exist. There is no need to have a built-in >>> version of this function unless we get significant advantages >>> from having it built-in, and right now I'm not seeing those. > >> I assume people don't want the *text* of the current output format but >> the actual values in separate columns. > > Well, I notice that everyone is carefully dodging the subject of exactly > what columns they want, I had a try at this upthread, actually, but it's not a real easy problem. > but my example would clearly scale easily to any > specific set of output columns that EXPLAIN might return instead of one > text column. Since we were previously told that any particular release > of PG need only offer one set of possible output columns, I figured the > problem was solved ;-) I was totally unconvinced by that argument. I actually think that the best data structure for this would be something like hstore. It would sure be nice to be able to manipulate this data using SQL: I am sure there are people on this mailing list who hate XML and maybe a few who hate JSON, but if they hate SQL then they're off my list of people I care about making happy. :-) At the same time, the variable number of output columns is problematic for a flat table representation. It may not be so problematic that we can't work around it, but it's definitely not great. It's really the pits to think that our data model is so impoverished that it can't in a reasonable way handle the output of our EXPLAIN command. It would be awfully sweet to be able to do things like: show me all the plan nodes where the expected and actual row counts differed by more than a factor of 10. And why should I need an external XML/JSON parser to do that, rather than just a WHERE clause? ...Robert
On Monday 25 May 2009 18:02:53 Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > This is all much more complicated than what I proposed, and I fail to > > see what it buys us. I'd say that you're just reinforcing the point I > > made upthread, which is that insisting that XML is the only way to get > > more detailed information will just create a cottage industry of > > beating that XML output format into submission. > > The impression I have is that (to misquote Churchill) XML is the worst > option available, except for all the others. We need something that can > represent a fairly complex data structure, easily supports addition or > removal of particular fields in the structure (including fields not > foreseen in the original design), is not hard for programs to parse, > and is widely supported --- ie, "not hard" includes "you don't have to > write your own parser, in most languages". How many realistic > alternatives are there? I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format. OK, there are historically about three people that want one, but they have already solved the problem of parsing the current format. And without having writtens such a parser myself I think that the current format is not inherently hard to parse. What people really want is optional additional information in the human- readable format. Giving them a machine readable format does not solve the problem. Giving them a machine readable format with all-or-none of the optional information and saying "figure it out yourself" does not solve anything either. The same people who currently complain will continue to complain.
Peter Eisentraut wrote: > On Monday 25 May 2009 18:02:53 Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> This is all much more complicated than what I proposed, and I fail to >>> see what it buys us. I'd say that you're just reinforcing the point I >>> made upthread, which is that insisting that XML is the only way to get >>> more detailed information will just create a cottage industry of >>> beating that XML output format into submission. >> The impression I have is that (to misquote Churchill) XML is the worst >> option available, except for all the others. We need something that can >> represent a fairly complex data structure, easily supports addition or >> removal of particular fields in the structure (including fields not >> foreseen in the original design), is not hard for programs to parse, >> and is widely supported --- ie, "not hard" includes "you don't have to >> write your own parser, in most languages". How many realistic >> alternatives are there? > > I think we are going in the wrong direction. No one has said that they want a > machine-readable EXPLAIN format. That is not true. Tool developers like pgAdmin (I know that one for sure), phpPgAdmin (I think they have said it too) and third party tools have asked for this. Right now we parse the EXPLAIN output. Which doesn't get easier with each new thing we add to it :-) It would be very nice to have it tool parseable. I'm also fairly certain that people using auto_explain would have use for a format that's easier to parse. > What people really want is optional additional information in the human- > readable format. Giving them a machine readable format does not solve the > problem. Giving them a machine readable format with all-or-none of the > optional information and saying "figure it out yourself" does not solve > anything either. The same people who currently complain will continue to > complain. I agree that this is a separate issue. But that doesn't mean they don't both exist. //Magnus
On Tue, May 26, 2009 at 8:15 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > I think we are going in the wrong direction. No one has said that they want a > machine-readable EXPLAIN format. OK, there are historically about three > people that want one, but they have already solved the problem of parsing the > current format. Pretty sure I've said I want one. And whilst it's true, we already parse the current output in pgAdmin, it's a PITA whenever the format changes. I also want a format in which Tom is not going to refuse to include additional data (such as the schema a relation is in) because it clutters the output. A machine readable format would seem to the idea way to include all data we may need, without making human-readable output an unreadable mess. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On May 26, 2009, at 8:15 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > On Monday 25 May 2009 18:02:53 Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> This is all much more complicated than what I proposed, and I fail >>> to >>> see what it buys us. I'd say that you're just reinforcing the >>> point I >>> made upthread, which is that insisting that XML is the only way to >>> get >>> more detailed information will just create a cottage industry of >>> beating that XML output format into submission. >> >> The impression I have is that (to misquote Churchill) XML is the >> worst >> option available, except for all the others. We need something >> that can >> represent a fairly complex data structure, easily supports addition >> or >> removal of particular fields in the structure (including fields not >> foreseen in the original design), is not hard for programs to parse, >> and is widely supported --- ie, "not hard" includes "you don't have >> to >> write your own parser, in most languages". How many realistic >> alternatives are there? > > I think we are going in the wrong direction. No one has said that > they want a > machine-readable EXPLAIN format. OK, there are historically about > three > people that want one, but they have already solved the problem of > parsing the > current format. And without having writtens such a parser myself I > think that > the current format is not inherently hard to parse. > > What people really want is optional additional information in the > human- > readable format. Giving them a machine readable format does not > solve the > problem. Giving them a machine readable format with all-or-none of > the > optional information and saying "figure it out yourself" does not > solve > anything either. The same people who currently complain will > continue to > complain. Peter, The check is in the mail. :-) In all seriousness, I have no problem at all with providing machine- readable formats, but the problem you're describing here is definitely my primary pain point. ...Robert
Well I want an SQL query-able format. I also want a way to retrieve the data for a query run from within an application without disturbing the application i.e. while still returning the regular result set. But I also like being able to conveniently run explain and get the results formatted to fit on the screen in a single step. I don't see anything wrong with Robert's direction to pass options to explain. It doesn't solve every problem but it doesn't make any of the other things we need harder either. On a bike-shedding note I would rather have the rhs of the option be optional and default to true for boolean options. Actually if we make a set of explain_* guc options we could make the options just locally set those options. -- Greg On 26 May 2009, at 13:15, Peter Eisentraut <peter_e@gmx.net> wrote: > On Monday 25 May 2009 18:02:53 Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> This is all much more complicated than what I proposed, and I fail >>> to >>> see what it buys us. I'd say that you're just reinforcing the >>> point I >>> made upthread, which is that insisting that XML is the only way to >>> get >>> more detailed information will just create a cottage industry of >>> beating that XML output format into submission. >> >> The impression I have is that (to misquote Churchill) XML is the >> worst >> option available, except for all the others. We need something >> that can >> represent a fairly complex data structure, easily supports addition >> or >> removal of particular fields in the structure (including fields not >> foreseen in the original design), is not hard for programs to parse, >> and is widely supported --- ie, "not hard" includes "you don't have >> to >> write your own parser, in most languages". How many realistic >> alternatives are there? > > I think we are going in the wrong direction. No one has said that > they want a > machine-readable EXPLAIN format. OK, there are historically about > three > people that want one, but they have already solved the problem of > parsing the > current format. And without having writtens such a parser myself I > think that > the current format is not inherently hard to parse. > > What people really want is optional additional information in the > human- > readable format. Giving them a machine readable format does not > solve the > problem. Giving them a machine readable format with all-or-none of > the > optional information and saying "figure it out yourself" does not > solve > anything either. The same people who currently complain will > continue to > complain. > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On May 26, 2009, at 8:46 AM, Greg Stark <greg.stark@enterprisedb.com> wrote: > Well I want an SQL query-able format. I also want a way to retrieve > the data for a query run from within an application without > disturbing the application i.e. while still returning the regular > result set. > > But I also like being able to conveniently run explain and get the > results formatted to fit on the screen in a single step. I don't see > anything wrong with Robert's direction to pass options to explain. > It doesn't solve every problem but it doesn't make any of the other > things we need harder either. Your check is in the mail, too. > On a bike-shedding note I would rather have the rhs of the option be > optional and default to true for boolean options. I was thinking about that, too, so +1. > Actually if we make a set of explain_* guc options we could make the > options just locally set those options. I think that's probably over-complicated, but that's just MHO. ...Robert > > > -- > Greg > > > On 26 May 2009, at 13:15, Peter Eisentraut <peter_e@gmx.net> wrote: > >> On Monday 25 May 2009 18:02:53 Tom Lane wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> This is all much more complicated than what I proposed, and I >>>> fail to >>>> see what it buys us. I'd say that you're just reinforcing the >>>> point I >>>> made upthread, which is that insisting that XML is the only way >>>> to get >>>> more detailed information will just create a cottage industry of >>>> beating that XML output format into submission. >>> >>> The impression I have is that (to misquote Churchill) XML is the >>> worst >>> option available, except for all the others. We need something >>> that can >>> represent a fairly complex data structure, easily supports >>> addition or >>> removal of particular fields in the structure (including fields not >>> foreseen in the original design), is not hard for programs to parse, >>> and is widely supported --- ie, "not hard" includes "you don't >>> have to >>> write your own parser, in most languages". How many realistic >>> alternatives are there? >> >> I think we are going in the wrong direction. No one has said that >> they want a >> machine-readable EXPLAIN format. OK, there are historically about >> three >> people that want one, but they have already solved the problem of >> parsing the >> current format. And without having writtens such a parser myself I >> think that >> the current format is not inherently hard to parse. >> >> What people really want is optional additional information in the >> human- >> readable format. Giving them a machine readable format does not >> solve the >> problem. Giving them a machine readable format with all-or-none of >> the >> optional information and saying "figure it out yourself" does not >> solve >> anything either. The same people who currently complain will >> continue to >> complain. >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers
Peter Eisentraut <peter_e@gmx.net> writes: > I think we are going in the wrong direction. No one has said that > they want a machine-readable EXPLAIN format. OK, there are > historically about three people that want one, but they have already > solved the problem of parsing the current format. Well, obviously the set of tool designers is smaller than the set of casual users of EXPLAIN, but their problems are none the less real and very important. > What people really want is optional additional information in the human- > readable format. Giving them a machine readable format does not solve the > problem. Actually, the exact problem is this: those two goals are in conflict. There'd be little objection to adding any random set of optional stuff to EXPLAIN's textual output, if it weren't for the fact that it would make machine parsing of that output even harder than it is already. So my feeling is that we need a machine-readable format containing all the data in order to satisfy the needs of tool designers. Once they are freed from having to parse EXPLAIN's textual output, we can whack the textual output around all we want. (Which kills my previous argument that we only need one new option, but such is life.) Now there is a third set of desires having to do with being able to do simple SQL-based analysis of EXPLAIN output. That's the piece I think we don't have a good handle on. In particular, it's not clear whether a SQL-friendly output format can be the same as either of the other two. (I don't personally find this goal very compelling --- there is no natural law saying that SQL is a good tool for analyzing EXPLAIN output --- but I'm willing to look at it to see if it's feasible.) regards, tom lane
Tom Lane wrote: > Now there is a third set of desires having to do with being able to do > simple SQL-based analysis of EXPLAIN output. That's the piece I think > we don't have a good handle on. In particular, it's not clear whether > a SQL-friendly output format can be the same as either of the other > two. (I don't personally find this goal very compelling --- there is > no natural law saying that SQL is a good tool for analyzing EXPLAIN > output --- but I'm willing to look at it to see if it's feasible.) > > > In libxml-enabled builds at least, this could presumably be done fairly easily via the XML functions, especially if we get XSLT processing into the core XML functionality as I hope we can do this release. In fact, the ability to leverage existing XML functionality to munge the output is the thing that swings me in favor of XML as the machine readable output format instead of JSON, since we don't have and aren't terribly likely to get an inbuilt JSON parser. It means we wouldn't need some external tool at all. cheers andrew
On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > > In libxml-enabled builds at least, this could presumably be done fairly > easily via the XML functions, especially if we get XSLT processing into the > core XML functionality as I hope we can do this release. In fact, the > ability to leverage existing XML functionality to munge the output is the > thing that swings me in favor of XML as the machine readable output format > instead of JSON, since we don't have and aren't terribly likely to get an > inbuilt JSON parser. It means we wouldn't need some external tool at all. I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Dave Page wrote: > On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >> In libxml-enabled builds at least, this could presumably be done fairly >> easily via the XML functions, especially if we get XSLT processing into the >> core XML functionality as I hope we can do this release. In fact, the >> ability to leverage existing XML functionality to munge the output is the >> thing that swings me in favor of XML as the machine readable output format >> instead of JSON, since we don't have and aren't terribly likely to get an >> inbuilt JSON parser. It means we wouldn't need some external tool at all. Actually, I think a number of users would be *very* happy if we had a builtin JSON parser. I'm unsure on how feasible that is though. > I was thinking something similar, but from the pgAdmin perspective. We > already use libxml2, but JSON would introduce another dependency for > us. Yeah, but probably not a huge one. There is one for wx, but I don't think it's included by default. -- Magnus HaganderSelf: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Tuesday 26 May 2009 16:55:55 Dave Page wrote: > I was thinking something similar, but from the pgAdmin perspective. We > already use libxml2, but JSON would introduce another dependency for > us. I was actually looking for a C library for JSON (json type for PostgreSQL; you know it is coming :-) ), but only found a library tied to glib, which, considering the experience with libxml, did not excite me. If someone knows of a different, small, and independent JSON library for C, I would like to hear about it.
Peter Eisentraut wrote: > On Tuesday 26 May 2009 16:55:55 Dave Page wrote: >> I was thinking something similar, but from the pgAdmin perspective. We >> already use libxml2, but JSON would introduce another dependency for >> us. > > I was actually looking for a C library for JSON (json type for PostgreSQL; you > know it is coming :-) ), but only found a library tied to glib, which, > considering the experience with libxml, did not excite me. If someone knows > of a different, small, and independent JSON library for C, I would like to > hear about it. The JSon page (http://json.org/) lists for example http://fara.cs.uni-potsdam.de/~jsg/json_parser/ which appears to not need it. But it seems very simple - though I haven't actually looked into the details. -- Magnus HaganderSelf: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Peter Eisentraut wrote: > On Tuesday 26 May 2009 16:55:55 Dave Page wrote: > >> I was thinking something similar, but from the pgAdmin perspective. We >> already use libxml2, but JSON would introduce another dependency for >> us. >> > > I was actually looking for a C library for JSON (json type for PostgreSQL; you > know it is coming :-) ), but only found a library tied to glib, which, > considering the experience with libxml, did not excite me. If someone knows > of a different, small, and independent JSON library for C, I would like to > hear about it. > > There are several listed at <http://www.json.org/> cheers andrew
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, May 26, 2009 at 04:36:56PM +0200, Magnus Hagander wrote: > > I was thinking something similar, but from the pgAdmin perspective. We > > already use libxml2, but JSON would introduce another dependency for > > us. > > Yeah, but probably not a huge one. There is one for wx, but I don't > think it's included by default. ...and to put things into perspective: tomas@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep "^Size" Size: 814356 Size: 33538 (not that I would recommend this one, since that's the one tied to glib, but seems that XML parsing is nearly one and a half orders of magnitude more complex than JSON). - -- tomás who thinks that XML-as-a-data-description-language is a denial of service attack on us all -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFKHATbBcgs9XrR2kYRArJmAJ4wJlvbnuWKYTvIDrSoLJccCyMTLwCbBM39 NCVSrDaEVad3NfogJrwRtiY= =Volp -----END PGP SIGNATURE-----
* tomas@tuxteam.de <tomas@tuxteam.de> [090526 11:03]: > ...and to put things into perspective: > > tomas@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep "^Size" > Size: 814356 > Size: 33538 And including glib, which does all the work for libjson-glib: mountie@pumpkin:~/projects/postgresql/PostgreSQL$ apt-cache show libxml2 libjson-glib-1.0-0 libglib2.0-0 | grep ^Size Size: 870188 Size: 36132 Size: 845166 glib also pulls in libpcre: Size: 214650 So:- XML: 870188(libxml) + 76038 (zlib1g) = 946226- JSON: 36132 (json) + 845166 (glib) + 214650 (pcre) = 1095948 ;-) -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Hi, Peter Eisentraut <peter_e@gmx.net> writes: > I was actually looking for a C library for JSON (json type for PostgreSQL; you > know it is coming :-) ), but only found a library tied to glib, which, > considering the experience with libxml, did not excite me. If someone knows > of a different, small, and independent JSON library for C, I would like to > hear about it. Looking at http://json.org/, it seems this particular project could fit: http://lloyd.github.com/yajl/ Yet Another JSON Library. YAJL is a small event-driven (SAX-style) JSON parser written in ANSI C, and a small validatingJSON generator. YAJL is released under the BSD license. ... It's all ANSI C. It's been successfully compiled on debian linux, OSX 10.4 i386 & ppc, OSX 10.5 i386, winXP, FreeBSD 4.10,FreeBSD 6.1 amd64, FreeBSD 7 i386, and windows vista. More platforms and binaries as time permits. ... A second motivation for writing YAJL, was that many available free JSON parsers fall over on large or complex inputs. YAJLis careful to minimize memory copying and input re-scanning when possible. The result is a parser that should be fastenough for most applications or tunable for any application. On my mac pro (2.66 ghz) it takes 1s to verify a 60meg jsonfile. Minimizing that same file with json_reformat takes 4s. Largely because YAJL deals with streams, it's possible to parse JSON in low memory environments. Oftentimes with other parsersan application must hold both the input text and the memory representation of the tree in memory at one time. WithYAJL you can incrementally read the input stream and hold only the in memory representation. Or for filtering or validationtasks, it's not required to hold the entire input text in memory. Hope this helps, regards, -- dim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, May 26, 2009 at 11:15:21AM -0400, Aidan Van Dyk wrote: > * tomas@tuxteam.de <tomas@tuxteam.de> [090526 11:03]: > > > ...and to put things into perspective: > > > > tomas@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep "^Size" > > Size: 814356 > > Size: 33538 > > And including glib, which does all the work for libjson-glib: > > mountie@pumpkin:~/projects/postgresql/PostgreSQL$ apt-cache show libxml2 > libjson-glib-1.0-0 libglib2.0-0 | grep ^Size > Size: 870188 > Size: 36132 > Size: 845166 > > glib also pulls in libpcre: > Size: 214650 > > So: > - XML: 870188(libxml) + 76038 (zlib1g) = 946226 > - JSON: 36132 (json) + 845166 (glib) + 214650 (pcre) = 1095948 > > ;-) OK, OK, you win (darn: should have known those bloatophile gnomies. Surprise that they don't pull in Mono :-( But json-c (just downloaded & compiled) is more in the ballpark of 100K, if I count all produced *.o And it's BSD. Regards - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFKHAvaBcgs9XrR2kYRAkasAJwPzzw3Os8e7QA2HvMSkQ0iRGWz+ACfYlp+ Y/v3EO+8sRiPzJNumADatdM= =EjCU -----END PGP SIGNATURE-----
On Tue, May 26, 2009 at 09:55:55AM -0400, Dave Page wrote: > from the pgAdmin perspective. We > already use libxml2, but JSON would introduce another dependency for > us. ...and using XML introduces a dependency for those that apps that don't already use some XML parser. I realize that since the pool of apps that care to mechanically parse EXPLAIN output is small, it wouldn't necessarily be a big deal to hand each of them a new dependency in the form of a parser for XML, JSON, etc. But we know the least common denominator is to return a set of tuples; let's make sure that really is unworkable before forcing even that dependency. - Josh / eggyknap
On Tue, May 26, 2009 at 10:36 AM, Magnus Hagander <magnus@hagander.net> wrote: > Dave Page wrote: >> On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >>> In libxml-enabled builds at least, this could presumably be done fairly >>> easily via the XML functions, especially if we get XSLT processing into the >>> core XML functionality as I hope we can do this release. In fact, the >>> ability to leverage existing XML functionality to munge the output is the >>> thing that swings me in favor of XML as the machine readable output format >>> instead of JSON, since we don't have and aren't terribly likely to get an >>> inbuilt JSON parser. It means we wouldn't need some external tool at all. > > Actually, I think a number of users would be *very* happy if we had a > builtin JSON parser. I'm unsure on how feasible that is though. I think it's likely that with proper design the amount of extra code that is required to support both XML and JSON is likely to be very small. I don't think we're going to get away without supporting XML because there are so many people already using XML-based tools, and I find Andrew's argument that we already have some built-in XML support that could possibly be used to smooth the road here as well pretty compelling. On the other hand, XML can be a really difficult technology to work with because it doesn't map cleanly to the data structures that most modern scripting languages (Perl, Python, Ruby, and probably Java and others) use. As a simple example, if you have a hash like { a => 1, b => 2 } (using the Perl syntax) you can map it to <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the reverse transformation is full of error-handling cases, like <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. I'm sure experienced XML hackers have ways to work around these problems, but the XML libraries I've worked with basically don't even try to turn the thing into any sort of general-purpose data structure.They just let you ask questions like "What is the rootelement? OK, now what elements does it contain? OK, there's an <a> tag there, what does that have inside it? Any more-deeply-nested tags?". On the other hand, JSON is explicitly designed to serialize and deserialize data structures of this type, and it pretty much just works, even between completely different programming languages. So to summarize that - if we're only going to support one machine-readable output format, it's probably got to be XML. But if the additional effort to also support JSON is small, which I believe to be the case, then I think it's worth doing because it's actually better technology for this type of application. Maybe someone will feel inspired to work up a contrib/json. ...Robert
Robert Haas wrote: > On the other hand, XML can be a really difficult technology to work > with because it doesn't map cleanly to the data structures that most > modern scripting languages (Perl, Python, Ruby, and probably Java and > others) use. As a simple example, if you have a hash like { a => 1, b > => 2 } (using the Perl syntax) you can map it to > <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the > reverse transformation is full of error-handling cases, like > <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. > I'm sure experienced XML hackers have ways to work around these > problems, but the XML libraries I've worked with basically don't even > try to turn the thing into any sort of general-purpose data structure. > They just let you ask questions like "What is the root element? OK, > now what elements does it contain? OK, there's an <a> tag there, what > does that have inside it? Any more-deeply-nested tags?". On the > other hand, JSON is explicitly designed to serialize and deserialize > data structures of this type, and it pretty much just works, even > between completely different programming languages. > > > Since we will be controlling the XML output, we can restrict it to a form that is equivalent to what JSON and similar serialisation languages use. We can even produce an XSD schema specifying what is allowed, if anyone is so minded, and a validating parser could be told to validate the XML against that schema. And XSLT processing is a very powerful transformation tool. We could even provide a stylesheet that would turn the XML into JSON. :-) Anyway, I think we're getting closer to consensus here. I think there's a good case for being able to stash the EXPLAIN output in a table as XML - that way we could slice and dice it several ways without having to rerun the EXPLAIN. cheers andrew
On Tue, May 26, 2009 at 1:48 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > Robert Haas wrote: >> On the other hand, XML can be a really difficult technology to work >> with because it doesn't map cleanly to the data structures that most >> modern scripting languages (Perl, Python, Ruby, and probably Java and >> others) use. As a simple example, if you have a hash like { a => 1, b >> => 2 } (using the Perl syntax) you can map it to >> <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the >> reverse transformation is full of error-handling cases, like >> <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. >> I'm sure experienced XML hackers have ways to work around these >> problems, but the XML libraries I've worked with basically don't even >> try to turn the thing into any sort of general-purpose data structure. >> They just let you ask questions like "What is the root element? OK, >> now what elements does it contain? OK, there's an <a> tag there, what >> does that have inside it? Any more-deeply-nested tags?". On the >> other hand, JSON is explicitly designed to serialize and deserialize >> data structures of this type, and it pretty much just works, even >> between completely different programming languages. > > Since we will be controlling the XML output, we can restrict it to a form > that is equivalent to what JSON and similar serialisation languages use. We > can even produce an XSD schema specifying what is allowed, if anyone is so > minded, and a validating parser could be told to validate the XML against > that schema. And XSLT processing is a very powerful transformation tool. We > could even provide a stylesheet that would turn the XML into JSON. :-) Yeah, that's fine. I think we should target 4/1/2010 as the submission date for that stylesheet. :-) > Anyway, I think we're getting closer to consensus here. > > I think there's a good case for being able to stash the EXPLAIN output in a > table as XML - that way we could slice and dice it several ways without > having to rerun the EXPLAIN. Yes, I think there is an excellent case for being able to stash any output format into a table. ...Robert
(sorry for top posting - stupid apple) So the real elephant in the room is that the existing explain code is not really designed to be extensible, configurable, or to be printed in different formats. The current code is basically just gobs of text printed by different routines all over the code base. There are no data structures which represent what explain prints. The closest thing is the instrumentation objects which obtain the timing and counts but not the planner expectations or any associated data. If we're going to support multiple output formats or options to turn off and on sections I think we need to build a data structure independent of the format, have code to include or exclude stats as requested and then pass that to the requested formatter. -- Greg On 26 May 2009, at 18:53, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, May 26, 2009 at 1:48 PM, Andrew Dunstan > <andrew@dunslane.net> wrote: >> Robert Haas wrote: >>> On the other hand, XML can be a really difficult technology to work >>> with because it doesn't map cleanly to the data structures that most >>> modern scripting languages (Perl, Python, Ruby, and probably Java >>> and >>> others) use. As a simple example, if you have a hash like { a => >>> 1, b >>> => 2 } (using the Perl syntax) you can map it to >>> <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the >>> reverse transformation is full of error-handling cases, like >>> <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. >>> I'm sure experienced XML hackers have ways to work around these >>> problems, but the XML libraries I've worked with basically don't >>> even >>> try to turn the thing into any sort of general-purpose data >>> structure. >>> They just let you ask questions like "What is the root element? >>> OK, >>> now what elements does it contain? OK, there's an <a> tag there, >>> what >>> does that have inside it? Any more-deeply-nested tags?". On the >>> other hand, JSON is explicitly designed to serialize and deserialize >>> data structures of this type, and it pretty much just works, even >>> between completely different programming languages. >> >> Since we will be controlling the XML output, we can restrict it to >> a form >> that is equivalent to what JSON and similar serialisation languages >> use. We >> can even produce an XSD schema specifying what is allowed, if >> anyone is so >> minded, and a validating parser could be told to validate the XML >> against >> that schema. And XSLT processing is a very powerful transformation >> tool. We >> could even provide a stylesheet that would turn the XML into >> JSON. :-) > > Yeah, that's fine. I think we should target 4/1/2010 as the > submission date for that stylesheet. :-) > >> Anyway, I think we're getting closer to consensus here. >> >> I think there's a good case for being able to stash the EXPLAIN >> output in a >> table as XML - that way we could slice and dice it several ways >> without >> having to rerun the EXPLAIN. > > Yes, I think there is an excellent case for being able to stash any > output format into a table. > > ...Robert > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Greg Stark <greg.stark@enterprisedb.com> writes: > So the real elephant in the room is that the existing explain code is > not really designed to be extensible, configurable, or to be printed > in different formats. These are implementation details ;-). Let's get a definition that everyone can sign off on, and then worry about what has to be done to the code to make it happen. Even if we end up throwing away and rewriting all of explain.c, that's not *that* much code. regards, tom lane
On Tue, May 26, 2009 at 3:04 PM, Greg Stark <greg.stark@enterprisedb.com> wrote: > (sorry for top posting - stupid apple) > > So the real elephant in the room is that the existing explain code is not > really designed to be extensible, configurable, or to be printed in > different formats. > > The current code is basically just gobs of text printed by different > routines all over the code base. There are no data structures which All over the code base? It looks to me like most of it is in explain.c, specifically explain_outNode(). (On an unrelated point, it's difficult to imagine why someone thought that was a good way of capitalizing & punctuating that function name.) > represent what explain prints. The closest thing is the instrumentation > objects which obtain the timing and counts but not the planner expectations > or any associated data. > > If we're going to support multiple output formats or options to turn off and > on sections I think we need to build a data structure independent of the > format, have code to include or exclude stats as requested and then pass > that to the requested formatter. That sounds about right to me. I think that representation can be pretty thin, though, maybe just a big struct with all the attributes that are applicable to any node type and pointers to its left and right children. ...Robert
On Tue, May 26, 2009 at 3:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Stark <greg.stark@enterprisedb.com> writes: >> So the real elephant in the room is that the existing explain code is >> not really designed to be extensible, configurable, or to be printed >> in different formats. > > These are implementation details ;-). Let's get a definition that > everyone can sign off on, and then worry about what has to be done > to the code to make it happen. Even if we end up throwing away and > rewriting all of explain.c, that's not *that* much code. I'm actually not sure there's a whole lot to hash out... I was going to take a crack at writing some code. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, May 26, 2009 at 3:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> These are implementation details ;-). �Let's get a definition that >> everyone can sign off on, and then worry about what has to be done >> to the code to make it happen. > I'm actually not sure there's a whole lot to hash out... I was going > to take a crack at writing some code. I still haven't seen anything but formless handwaving as far as the "SQL table" output format goes. For that matter, there's not much more than handwaving behind the "XML" meme either. Show us a spec for the output format, then think about code. (This was somewhere around slide ten here: http://momjian.us/main/writings/pgsql/patch.pdf ;-)) regards, tom lane
On Tue, May 26, 2009 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, May 26, 2009 at 3:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> These are implementation details ;-). Let's get a definition that >>> everyone can sign off on, and then worry about what has to be done >>> to the code to make it happen. > >> I'm actually not sure there's a whole lot to hash out... I was going >> to take a crack at writing some code. > > I still haven't seen anything but formless handwaving as far as the "SQL > table" output format goes. For that matter, there's not much more than > handwaving behind the "XML" meme either. Show us a spec for the output > format, then think about code. (This was somewhere around slide ten > here: http://momjian.us/main/writings/pgsql/patch.pdf ;-)) OK, how about this: http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8356@mail.gmail.com I note in passing that there have been 51 messages posted to this thread since I wrote that email, and none of the were responses to it. At any rate, that email might not be as detailed as what you're looking for, but it's certainly a start. I don't really know how the table-format output is going to work out; I have to look at the code more to get a feeling for that. But I think with respect to XML or JSON, there really aren't too many options for how it can look, modulo minor syntax tweaks like arguing about whether the join type should be labelled "JoinType" or "jointype" or "join_type". Still, if you have comments or think I'm overlooking something important, I definitely would like to know about that now before I put more time into it. I recognize that we haven't come to a consensus on the best possible syntax for EXPLAIN options, but it seems to me that threshold issue for improving EXPLAIN is everyone agreeing that we're going to allow for some kind of extendable syntax that doesn't rely on all options being keywords (presented in a fixed order, no less!). You caved in on that point upthread and I don't think we have any other holdouts. Now, of course, my syntax is the best possible one in the entire universe, but if by chance there is a technically feasible alternative syntax on which more than one person can agree (note: this has not happened yet), adjusting my patch to use that syntax rather than the one I stole from Peter shouldn't be too hard. A second issue on which we don't have consensus is a method to capture explain output. I am 100% of the opinion that there are only two sensible things to do here: (1) make EXPLAIN a fully reserved keyword so that we can use it just like a SELECT, or (2) provide a built-in function like pg_explain() that calls EXPLAIN with a user-specified set of arguments, and which third-party tools can count on to be installed. Since you labelled (1) as a non-starter and AFAICS you're the only holdout on making (2) a built-in rather than something everyone has to define for themselves, I'm hopeful that we'll bring you around. :-) The final issue on which we don't have a clear consensus is what OTHER new options we want for EXPLAIN aside from choice of output format. I posted a few ideas that I have and solicited some others upthread, but I think that the volume of email on other aspects of this patch has deprived people of the necessary time and space to think about how they might like to use an extensible options syntax once we have it - not to mention that the original patch was only posted 3 days ago and on a day when many of us were on airplanes, about to get on airplanes, or still jet-lagged. Personally, I think that that's the most interesting aspect of this whole project so I hope it gets some attention going forward, but I'm not too concerned about the exact timing of that attention. The point is that people not-infrequently come up with more stuff they'd like to see in EXPLAIN output, and those ideas get shot down because we don't have the syntax. If we fix the syntax, those ideas will come back around again in due course, and we'll be able to consider them on their merits rather than peremptorily shooting them down. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, May 26, 2009 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I still haven't seen anything but formless handwaving as far as the "SQL >> table" output format goes. �For that matter, there's not much more than >> handwaving behind the "XML" meme either. > OK, how about this: > http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8356@mail.gmail.com > I note in passing that there have been 51 messages posted to this > thread since I wrote that email, and none of the were responses to it. Well, we were having too much fun arguing about trivia ;-). And I suspect a number of people were too jet-lagged to keep track of what they'd read and what not. Anyway, good, we have a starting point. Some issues that I see here: 1. You seem to be assuming that each table row will represent exactly one plan node, no more, no less. That's okay as a first approximation but it breaks down on closer examination. In particular, where will you hang the information that's already available about trigger execution costs? Those are not associated with any particular plan node, as they occur atop the whole plan. The same goes for the total execution time of course, and I can foresee other types of stats that we might gather someday that would be hard to tie to any specific plan node. In XML this is soluble by having a toplevel node <ExplainResults> that contains not only the plan tree but other children. I'm not seeing how to translate that into a SQL table, though. Or at least not just one SQL table. 2. You didn't say anything about how any but simple scalar fields will be represented. Filter conditions and sort keys are particularly interesting here. I'm not really happy with just plopping down the same textual output we have now --- that is just as human-friendly-and-not- machine-friendly as before, only with a slightly smaller scope. I can foresee for example that someone might wish to extract the second or third sort key expression from a Sort node's sort key list. Or what about problems such as "find which nodes this field is used in"? 3. You left us with a handwave about how the tree structure will be represented in a table. Needs to be explicit. And it's not just simple child relationships that should be represented ... tell us about initplans and subplans, too. 4. The point about having lots of NULL columns is an annoyance that could escalate to the point of near unusability. To get a feeling for how workable that is, we need a pretty exact list of the set of output columns, not just a rough list of the kinds of things that will be there. regards, tom lane
On Tue, May 26, 2009 at 5:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, May 26, 2009 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I still haven't seen anything but formless handwaving as far as the "SQL >>> table" output format goes. For that matter, there's not much more than >>> handwaving behind the "XML" meme either. > >> OK, how about this: >> http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8356@mail.gmail.com > >> I note in passing that there have been 51 messages posted to this >> thread since I wrote that email, and none of the were responses to it. > > Well, we were having too much fun arguing about trivia ;-). And I > suspect a number of people were too jet-lagged to keep track of what > they'd read and what not. Anyway, good, we have a starting point. > > Some issues that I see here: > > 1. You seem to be assuming that each table row will represent exactly > one plan node, no more, no less. That's okay as a first approximation > but it breaks down on closer examination. In particular, where will you > hang the information that's already available about trigger execution > costs? Those are not associated with any particular plan node, as they > occur atop the whole plan. The same goes for the total execution time > of course, and I can foresee other types of stats that we might gather > someday that would be hard to tie to any specific plan node. > > In XML this is soluble by having a toplevel node <ExplainResults> that > contains not only the plan tree but other children. I'm not seeing how > to translate that into a SQL table, though. Or at least not just one > SQL table. > > 2. You didn't say anything about how any but simple scalar fields will > be represented. Filter conditions and sort keys are particularly > interesting here. I'm not really happy with just plopping down the same > textual output we have now --- that is just as human-friendly-and-not- > machine-friendly as before, only with a slightly smaller scope. I can > foresee for example that someone might wish to extract the second or > third sort key expression from a Sort node's sort key list. Or what > about problems such as "find which nodes this field is used in"? > > 3. You left us with a handwave about how the tree structure will be > represented in a table. Needs to be explicit. And it's not just > simple child relationships that should be represented ... tell us > about initplans and subplans, too. > > 4. The point about having lots of NULL columns is an annoyance that > could escalate to the point of near unusability. To get a feeling for > how workable that is, we need a pretty exact list of the set of output > columns, not just a rough list of the kinds of things that will be > there. Responding to these in bulk, I think that 1, 3, and 4 are pretty convincing arguments that the SQL-based output format is underspecified. I hereby promise not to do anything about that without further discussion, which is an easy promise to make considering that in light of those comments I have no idea what it should look like. I think (1) is the most damning point. However, as far as I can see, none of these will affect XML or JSON. With respect to (2), I think we should output the same text format that we have now, for starters. I agree that's not the only thing that someone might want, but I think there's a pretty good argument that it's ONE thing that someone might reasonably want, depending on the application. If someone cares to build a better mousetrap in this area, it can be added on once we figure out the design, and without breaking anything! - that's sort of the whole point of this exercise. ...Robert
Sorry to come in on this discussion so late. Just catching up.... Robert Haas <robertmhaas@gmail.com> wrote: > Responding to these in bulk, I think that 1, 3, and 4 are pretty > convincing arguments that the SQL-based output format is > underspecified. I hereby promise not to do anything about that > without further discussion, which is an easy promise to make > considering that in light of those comments I have no idea what it > should look like. I think (1) is the most damning point. However, > as far as I can see, none of these will affect XML or JSON. Personally, I find XML to be very hard to read; however, I can see the value of writing to that and having someone who can tolerate XSLT turn XML into anything else we want. (That could include morphing it into SELECT statements with the literals to present it as a tuple set, I should think.) As long as nobody considers this issue "done" until there are useful and convenient ways to display and use the data within psql without having to look at the XML, that seems a reasonable approach. The big plus of the current technique is that it is so convenient to Ctrl+C something which is running too long, arrow up, hit Home, and put the EXPLAIN word in front. Turning the query into a character string literal and feeding it to a function would be a big step backward. A big down side of the current technique is that you can't get both the results of a SELECT and its plan. I haven't seen any discussion here about emitting the EXPLAIN output through some INFO messages or some such, and letting the query return its normal results, but I feel that would be a significant improvement, if it that be done. Also, something I miss from previous database products is a way to control the verbosity of the output when planning. I do think that needs to be some sort of option up front, not a filter phase, because of the high cost it can have. If there was a way to show all the candidate plans and their cost estimates in a run time environment, without any special build or configuration needed, I'd use it every now and then. -Kevin
Magnus Hagander a écrit : > Dave Page wrote: > >> I was thinking something similar, but from the pgAdmin perspective. We >> already use libxml2, but JSON would introduce another dependency for >> us. >> > > Yeah, but probably not a huge one. There is one for wx, but I don't > think it's included by default. > > +1 for the machine readable explain. FWIW, I have an early patch for phpPgAdmin about a graphical explain. IIRC when I wrote it, I told myself the parser might actually be broken with multi-level sub-queries or something. But I ended with the same parsing code than pgAdmin anyway. About the format, JSON would be the best here, as it is a one function call in PHP to retrieve an associative array from the JSON code. -- Guillaume (ioguix) de Rorthais
On Wed, May 27, 2009 at 1:30 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > The big plus of the current technique is that it is so convenient to > Ctrl+C something which is running too long, arrow up, hit Home, and > put the EXPLAIN word in front. Turning the query into a character > string literal and feeding it to a function would be a big step > backward. > > A big down side of the current technique is that you can't get both > the results of a SELECT and its plan. I haven't seen any discussion > here about emitting the EXPLAIN output through some INFO messages or > some such, and letting the query return its normal results, but I feel > that would be a significant improvement, if it that be done. Would something like this address both of your issues? http://article.gmane.org/gmane.comp.db.postgresql.devel.patches/21614/match=siginfo It let you hit a control character while the query was running to view the explain analyze for the results so far. The query kept running and you could request further updates whenever you wanted. -- greg
Greg Stark <stark@enterprisedb.com> wrote: > http://article.gmane.org/gmane.comp.db.postgresql.devel.patches/21614/match=siginfo > > It let you hit a control character while the query was running to view > the explain analyze for the results so far. The query kept running and > you could request further updates whenever you wanted. I'll have to check that out. It almost seems too easy.... :-) -Kevin