Thread: explain output infelicity in psql
I have just noticed while checking the EXPLAIN YAML patch that the non-text explain formats are output as a single line with embedded line feeds, while the text format is delivered as a set of text records, one per line. The practical effect of this is that psql decorates the non-text format output with continuation characters: andrew=# explain select count(*) from pg_class where relname ~ 'pg_'; QUERY PLAN ---------------------------------------------------------------- Aggregate (cost=9.67..9.68 rows=1width=0) -> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0) Filter: (relname ~ 'pg_'::text) (3 rows) Time: 5.813 ms andrew=# explain (format yaml) select count(*) from pg_class where relname ~ 'pg_'; QUERY PLAN ----------------------------------------- - Plan: + Node Type: Aggregate + Strategy: Plain + Startup Cost: 9.67 + Total Cost: 9.68 + Plan Rows: 1 + Plan Width: 0 + Plans: + - Node Type: Seq Scan + ParentRelationship: Outer + Relation Name: pg_class + Alias: pg_class + Startup Cost: 0.00 + Total Cost: 9.16 + Plan Rows: 204 + Plan Width: 0 + Filter: (relname ~ 'pg_'::text) (1 row) Those + chars at the end of the line are ugly, to say the least, and they make the supposedly machine-readable formats not so machine readable if anyone wanted to c&p the output into a parser. (I'm mildly surprised this hasn't been noticed before). Maybe we need to teach psql not to do this formatting for EXPLAIN output? cheers andrew
On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > I have just noticed while checking the EXPLAIN YAML patch that the non-text > explain formats are output as a single line with embedded line feeds, while > the text format is delivered as a set of text records, one per line. The > practical effect of this is that psql decorates the non-text format output > with continuation characters: > > andrew=# explain select count(*) from pg_class where relname ~ 'pg_'; > QUERY PLAN > ---------------------------------------------------------------- > Aggregate (cost=9.67..9.68 rows=1 width=0) > -> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0) > Filter: (relname ~ 'pg_'::text) > (3 rows) > > Time: 5.813 ms > andrew=# explain (format yaml) select count(*) from pg_class where > relname ~ 'pg_'; > QUERY PLAN > ----------------------------------------- > - Plan: + > Node Type: Aggregate + > Strategy: Plain + > Startup Cost: 9.67 + > Total Cost: 9.68 + > Plan Rows: 1 + > Plan Width: 0 + > Plans: + > - Node Type: Seq Scan + > Parent Relationship: Outer + > Relation Name: pg_class + > Alias: pg_class + > Startup Cost: 0.00 + > Total Cost: 9.16 + > Plan Rows: 204 + > Plan Width: 0 + > Filter: (relname ~ 'pg_'::text) > (1 row) > > Those + chars at the end of the line are ugly, to say the least, and they > make the supposedly machine-readable formats not so machine readable if > anyone wanted to c&p the output into a parser. (I'm mildly surprised this > hasn't been noticed before). > > Maybe we need to teach psql not to do this formatting for EXPLAIN output? Oh, dear. I think that line continuation syntax was recently added - subsequent to the machine-readable EXPLAIN patch. The reason why it's coded to emit everything as a single row is because that will be most convenient for programs that are sucking down this data programatically. Otherwise, they'll have to concatenate all the lines that are returned. And in fact for XML format, it's even worse: the data is returned as type xml, but that obviously won't fly if we return each line as a separate tuple. On first blush, I'm inclined to suggest that the addition of + signs to mark continuation lines is a misfeature. ...Robert
Robert Haas wrote: > On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > >> I have just noticed while checking the EXPLAIN YAML patch that the non-text >> explain formats are output as a single line with embedded line feeds, while >> the text format is delivered as a set of text records, one per line. The >> practical effect of this is that psql decorates the non-text format output >> with continuation characters: >> >> [snip] >> Those + chars at the end of the line are ugly, to say the least, and they >> make the supposedly machine-readable formats not so machine readable if >> anyone wanted to c&p the output into a parser. (I'm mildly surprised this >> hasn't been noticed before). >> >> Maybe we need to teach psql not to do this formatting for EXPLAIN output? >> > > Oh, dear. I think that line continuation syntax was recently added - > subsequent to the machine-readable EXPLAIN patch. The reason why it's > coded to emit everything as a single row is because that will be most > convenient for programs that are sucking down this data > programatically. Otherwise, they'll have to concatenate all the lines > that are returned. > > And in fact for XML format, it's even worse: the data is returned as > type xml, but that obviously won't fly if we return each line as a > separate tuple. > > On first blush, I'm inclined to suggest that the addition of + signs > to mark continuation lines is a misfeature. > > > I certainly agree we don't want to break up the non-text formats. A simple if ugly hack would make psql use old-ascii print style (which doesn't use these contionuation chars) if the first attribute in the resultset was named 'QUERY PLAN' If someone has a better suggestion I'm all ears. cheers andrew
2009/12/9 Robert Haas <robertmhaas@gmail.com>: > On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> >> I have just noticed while checking the EXPLAIN YAML patch that the non-text >> explain formats are output as a single line with embedded line feeds, while >> the text format is delivered as a set of text records, one per line. The >> practical effect of this is that psql decorates the non-text format output >> with continuation characters: >> >> andrew=# explain select count(*) from pg_class where relname ~ 'pg_'; >> QUERY PLAN >> ---------------------------------------------------------------- >> Aggregate (cost=9.67..9.68 rows=1 width=0) >> -> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0) >> Filter: (relname ~ 'pg_'::text) >> (3 rows) >> >> Time: 5.813 ms >> andrew=# explain (format yaml) select count(*) from pg_class where >> relname ~ 'pg_'; >> QUERY PLAN >> ----------------------------------------- >> - Plan: + >> Node Type: Aggregate + >> Strategy: Plain + >> Startup Cost: 9.67 + >> Total Cost: 9.68 + >> Plan Rows: 1 + >> Plan Width: 0 + >> Plans: + >> - Node Type: Seq Scan + >> Parent Relationship: Outer + >> Relation Name: pg_class + >> Alias: pg_class + >> Startup Cost: 0.00 + >> Total Cost: 9.16 + >> Plan Rows: 204 + >> Plan Width: 0 + >> Filter: (relname ~ 'pg_'::text) >> (1 row) >> >> Those + chars at the end of the line are ugly, to say the least, and they >> make the supposedly machine-readable formats not so machine readable if >> anyone wanted to c&p the output into a parser. (I'm mildly surprised this >> hasn't been noticed before). >> >> Maybe we need to teach psql not to do this formatting for EXPLAIN output? > > Oh, dear. I think that line continuation syntax was recently added - > subsequent to the machine-readable EXPLAIN patch. The reason why it's > coded to emit everything as a single row is because that will be most > convenient for programs that are sucking down this data > programatically. Otherwise, they'll have to concatenate all the lines > that are returned. > > And in fact for XML format, it's even worse: the data is returned as > type xml, but that obviously won't fly if we return each line as a > separate tuple. > > On first blush, I'm inclined to suggest that the addition of + signs > to mark continuation lines is a misfeature. +1 Pavel > > ...Robert > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Robert Haas escribió: > On first blush, I'm inclined to suggest that the addition of + signs > to mark continuation lines is a misfeature. -1 EXPLAIN is a special case. IMHO it should be dealt with accordingly. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Dec 10, 2009 at 10:12:32AM -0300, Alvaro Herrera wrote: > Robert Haas escribió: > > > On first blush, I'm inclined to suggest that the addition of + signs > > to mark continuation lines is a misfeature. > > -1 > > EXPLAIN is a special case. IMHO it should be dealt with accordingly. If the formatting code can be taught that it's outputting for explain, we can skip the wrap/newline markup easily. I don't think we necessarily need to fall back to the old-ascii format, we just conditionally disable that specific part. Alternatively, would it make more sense just to add a boolean pset parameter to enable/disable the use of wrap/newline marks? It may be that people may wish to disable it for use cases in addition to EXPLAIN. Regards, Roger -- .''`. Roger Leigh: :' : Debian GNU/Linux http://people.debian.org/~rleigh/`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
Roger Leigh wrote: > On Thu, Dec 10, 2009 at 10:12:32AM -0300, Alvaro Herrera wrote: > >> Robert Haas escribió: >> >> >>> On first blush, I'm inclined to suggest that the addition of + signs >>> to mark continuation lines is a misfeature. >>> >> -1 >> >> EXPLAIN is a special case. IMHO it should be dealt with accordingly. >> > > If the formatting code can be taught that it's outputting for explain, > we can skip the wrap/newline markup easily. I don't think we > necessarily need to fall back to the old-ascii format, we just > conditionally disable that specific part. > > Alternatively, would it make more sense just to add a boolean pset > parameter to enable/disable the use of wrap/newline marks? It may > be that people may wish to disable it for use cases in addition > to EXPLAIN. > > > > We have already added a lot of knobs to twist, and I don't want to add any more. cheers andrew
On Wed, Dec 9, 2009 at 6:41 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> Oh, dear. I think that line continuation syntax was recently added - >> subsequent to the machine-readable EXPLAIN patch. The reason why it's >> coded to emit everything as a single row is because that will be most >> convenient for programs that are sucking down this data >> programatically. Otherwise, they'll have to concatenate all the lines >> that are returned. >> >> And in fact for XML format, it's even worse: the data is returned as >> type xml, but that obviously won't fly if we return each line as a >> separate tuple. >> >> On first blush, I'm inclined to suggest that the addition of + signs >> to mark continuation lines is a misfeature. > > I certainly agree we don't want to break up the non-text formats. > > A simple if ugly hack would make psql use old-ascii print style (which > doesn't use these contionuation chars) if the first attribute in the > resultset was named 'QUERY PLAN' > > If someone has a better suggestion I'm all ears. I don't believe that machine-readable EXPLAIN output is the only multi-line output value that anyone would ever wish to cut and paste into an editor without picking up a lot of stray garbage, so I don't think this is a solution. ...Robert
Robert Haas wrote: > I don't believe that machine-readable EXPLAIN output is the only > multi-line output value that anyone would ever wish to cut and paste > into an editor without picking up a lot of stray garbage, so I don't > think this is a solution. > > > Well, yes, another example that has just occurred to me is pg_proc.prosrc. So maybe this really is a misfeature. cheers andrew
Alvaro Herrera wrote: > Robert Haas escribió: >> On first blush, I'm inclined to suggest that the addition of + signs >> to mark continuation lines is a misfeature. > > EXPLAIN is a special case. IMHO it should be dealt with accordingly. > Is it? Wouldn't it affect anyone who stuck XML in a txt column and wanted to copy and paste it into a parser? Perhaps single column output usually won't want the + signs (because it's copy&pasteable) but multi-column output could?
Ron Mayer wrote: > Alvaro Herrera wrote: > >> Robert Haas escribió: >> >>> On first blush, I'm inclined to suggest that the addition of + signs >>> to mark continuation lines is a misfeature. >>> >> EXPLAIN is a special case. IMHO it should be dealt with accordingly. >> >> > > Is it? > > Wouldn't it affect anyone who stuck XML in a txt column and wanted > to copy and paste it into a parser? > > Perhaps single column output usually won't want the + signs (because > it's copy&pasteable) but multi-column output could? > Yeah, I'm thinking we should probably suppress output of format.nl_right (no matter what the format) where there is only one column. (This is even uglier with unicode linestyle, btw). That's a sane rule and it's not an ugly hack. cheers andrew
On Thu, Dec 10, 2009 at 11:44 AM, Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote: > Alvaro Herrera wrote: >> Robert Haas escribió: >>> On first blush, I'm inclined to suggest that the addition of + signs >>> to mark continuation lines is a misfeature. >> >> EXPLAIN is a special case. IMHO it should be dealt with accordingly. >> > > Is it? > > Wouldn't it affect anyone who stuck XML in a txt column and wanted > to copy and paste it into a parser? > > Perhaps single column output usually won't want the + signs (because > it's copy&pasteable) but multi-column output could? I don't think inconsistency is a good thing here. Apparently the old format is available via \pset linestyle old-ascii. We can either decide that people should use that format if they want that behavior, or we can decide that changing the default was a mistake and revert it. I don't think a half-way-in-between solution is a good option. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > Andrew Dunstan <andrew@dunslane.net> wrote: >> A simple if ugly hack would make psql use old-ascii print style >> (which doesn't use these contionuation chars) if the first >> attribute in the resultset was named 'QUERY PLAN' > I don't believe that machine-readable EXPLAIN output is the only > multi-line output value that anyone would ever wish to cut and > paste into an editor without picking up a lot of stray garbage, so > I don't think this is a solution. Agreed. This would be a significant annoyance for me on a regular basis. If I can't turn it off, it would probably cause me to create my own locally patched version of psql. Another alternative would be to use some other tool to run queries where I wanted long values without this, but psql has so many nice features that I'd be switching back and forth, so the patch would probably be easier. -Kevin
Andrew Dunstan <andrew@dunslane.net> writes: > Yeah, I'm thinking we should probably suppress output of format.nl_right > (no matter what the format) where there is only one column. (This is > even uglier with unicode linestyle, btw). That's a sane rule and it's > not an ugly hack. Yes it is. The real problem here is expecting the tabular format to be copy and pasteable, which is not a design goal it's ever had, and not one that we can start imposing on it at this late date. Why don't you just do "\pset format unaligned" (or "\a" if you're lazy)? regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Yeah, I'm thinking we should probably suppress output of format.nl_right >> (no matter what the format) where there is only one column. (This is >> even uglier with unicode linestyle, btw). That's a sane rule and it's >> not an ugly hack. >> > > Yes it is. I don't see much virtue of having these characters when there is only one column. > The real problem here is expecting the tabular format to be > copy and pasteable, which is not a design goal it's ever had, and not > one that we can start imposing on it at this late date. Why don't you > just do "\pset format unaligned" (or "\a" if you're lazy)? > > > Well, I haven't had to up to now. I'm not sure what we have is exactly an advance. I guess I can set the linestyle to old-ascii in my .psqlrc, but having to do that all over the place is annoying. And clearly I'm not the only person who doesn't like this behaviour. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I don't see much virtue of having these characters when there is only > one column. So you can tell a newline in the data from a wrap due to line length. The need to be able to do that is not dependent on how many columns there are. > And clearly I'm not the only person who doesn't like this behaviour. It's just our usual negative reaction to any change whatsoever ;-). I was unimpressed with Leigh's changes too at the start, but I can see that there is value in it. I think that changing the behavior depending on how many columns there are is an incredibly ugly hack, and your assertions to the contrary aren't going to change my mind. If we think that this is such a bad idea it should be reverted, then let's revert it altogether. Another possibility, which I don't understand why it was dismissed so cavalierly, is to have EXPLAIN put out one row per logical line instead of using embedded newlines. regards, tom lane
On Thu, Dec 10, 2009 at 12:43 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > Tom Lane wrote: >> >> Andrew Dunstan <andrew@dunslane.net> writes: >> >>> >>> Yeah, I'm thinking we should probably suppress output of format.nl_right >>> (no matter what the format) where there is only one column. (This is even >>> uglier with unicode linestyle, btw). That's a sane rule and it's not an ugly >>> hack. >>> >> >> Yes it is. > > I don't see much virtue of having these characters when there is only one > column. I don't see much virtue in having these characters, period. The reason for having them is presumably to avoid confusion between two rows and one row with an embedded newline. If anything, this confusion is more likely with a single column than it is with multiple columns, since you don't have the context of the surrounding output. ...Robert
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> I don't see much virtue of having these characters when there is only >> one column. >> > > So you can tell a newline in the data from a wrap due to line length. > The need to be able to do that is not dependent on how many columns > there are. > If that's really what we want then I think we're doing a terrible job of it. Have a look at the output of: select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' as a,1 as b; How do we know from that where the linefeeds are, exactly? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> So you can tell a newline in the data from a wrap due to line length. >> The need to be able to do that is not dependent on how many columns >> there are. > If that's really what we want then I think we're doing a terrible job of > it. Have a look at the output of: > select > E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' > as a, 1 as b; > How do we know from that where the linefeeds are, exactly? It works quite nicely for me ... in HEAD that is: regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' as a, 1 as b; a | b ------------------------------------------------------+---xxxxxxx +| 1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +| xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | (1 row) regression=# The point here is exactly that previous versions didn't show the distinction well. regards, tom lane
Tom Lane wrote: > Why don't you > just do "\pset format unaligned" (or "\a" if you're lazy)? That's fair. Now that I see it, I guess I should have been doing that for copy&paste work anyway.
Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote: > Tom Lane wrote: >> Why don't you >> just do "\pset format unaligned" (or "\a" if you're lazy)? > > That's fair. Now that I see it, I guess I should have been > doing that for copy&paste work anyway. That'll cover my use cases. -Kevin
Tom Lane wrote: > regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' > as a, 1 as b; > a | b > ------------------------------------------------------+--- > xxxxxxx +| 1 > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +| > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | > (1 row) > > regression=# > > The point here is exactly that previous versions didn't show the > distinction well. > > > If we really want to make linefeeds visible, I think we should place the indicators immediately after the character preceding the line feed, not next to the column separator. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > If we really want to make linefeeds visible, I think we should place the > indicators immediately after the character preceding the line feed, not > next to the column separator. Then it's hard to tell it apart from an instance of that character in the data. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> If we really want to make linefeeds visible, I think we should place the >> indicators immediately after the character preceding the line feed, not >> next to the column separator. >> > > Then it's hard to tell it apart from an instance of that character in > the data. > > > Yeah, I just thought of that. Oh, well, old-ascii for me ;-) cheers andrew
Andrew Dunstan wrote: > > > Tom Lane wrote: > > regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' > > as a, 1 as b; > > a | b > > ------------------------------------------------------+--- > > xxxxxxx +| 1 > > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +| > > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | > > (1 row) > > > > regression=# > > > > The point here is exactly that previous versions didn't show the > > distinction well. > > > > > > > > If we really want to make linefeeds visible, I think we should place the > indicators immediately after the character preceding the line feed, not > next to the column separator. One idea would be to change the column _separator_ for newlines --- that way, they don't show up for single-column output. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, Dec 10, 2009 at 8:11 PM, Bruce Momjian <bruce@momjian.us> wrote: > Andrew Dunstan wrote: >> >> >> Tom Lane wrote: >> > regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' >> > as a, 1 as b; >> > a | b >> > ------------------------------------------------------+--- >> > xxxxxxx +| 1 >> > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +| >> > xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | >> > (1 row) >> > >> > regression=# >> > >> > The point here is exactly that previous versions didn't show the >> > distinction well. >> > >> > >> > >> >> If we really want to make linefeeds visible, I think we should place the >> indicators immediately after the character preceding the line feed, not >> next to the column separator. > > One idea would be to change the column _separator_ for newlines --- that > way, they don't show up for single-column output. Hilariously enough, that's exactly what we used to do. I am leaning toward the view that we should revert all the ASCII output format changes vs. 8.4 and let people use the new unicode mode if they want all the spiffy bells and whistles. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > Hilariously enough, that's exactly what we used to do. I am leaning > toward the view that we should revert all the ASCII output format > changes vs. 8.4 and let people use the new unicode mode if they want > all the spiffy bells and whistles. There are clearly use-cases for this feature; I think arguing to revert it is an extreme overreaction. We could reconsider which behavior ought to be default, perhaps. regards, tom lane
On tor, 2009-12-10 at 22:02 -0500, Robert Haas wrote: > On Thu, Dec 10, 2009 at 8:11 PM, Bruce Momjian <bruce@momjian.us> wrote: > > One idea would be to change the column _separator_ for newlines --- that > > way, they don't show up for single-column output. > > Hilariously enough, that's exactly what we used to do. Well, the only reason why that "worked" for this case is that it didn't work at all when the column in question was the first one.