Thread: [patch] A \pivot command for psql
Hi, I want to suggest a client-side \pivot command in psql, implemented in the attached patch. \pivot takes the current query in the buffer, execute it and display it pivoted by interpreting the result as: column1 => row in pivoted output column2 => column in pivoted output column3 => value at (row,column) in pivoted ouput The advantage over the server-side crosstab() from contrib is in ease of use, mostly because \pivot doesn't need in advance the list of columns that result from pivoting. Typical use cases are queries that produce values along two axes: SELECT a,b,aggr(something) FROM tbl GROUP a,b; \pivot displays immediately the matrix-like representation Or displaying "pairing" between columns, as in SELECT a,b,'X' FROM tblA [LEFT|RIGHT|FULL] JOIN tblB... which once pivoted shows in an easily readable way what "a" is/isn't in relation with any "b". Columns are sorted with strcmp(). I think a more adequate sort could be obtained through a separate query with ORDER BY just for these values (casted to their original type), but the patch doesn't do that yet. Also, \pivot could take optionally the query as an argument instead of getting it only from the query buffer. Anyway, does it look useful enough to be part of psql? I guess I should push this to commitfest if that's the case. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Attachment
"Daniel Verite" <daniel@manitou-mail.org> writes: > I want to suggest a client-side \pivot command in psql, implemented > in the attached patch. > \pivot takes the current query in the buffer, execute it and > display it pivoted by interpreting the result as: > column1 => row in pivoted output > column2 => column in pivoted output > column3 => value at (row,column) in pivoted ouput I can see the value of a feature like this, but doing it in psql sure seems like the wrong place. It would be unavailable to anything except interactive use. Is there a way to implement pivoting as a set-returning function? regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/09/2015 10:37 AM, Tom Lane wrote: > I can see the value of a feature like this, but doing it in psql > sure seems like the wrong place. It would be unavailable to > anything except interactive use. > > Is there a way to implement pivoting as a set-returning function? That's exactly what crosstab() in the tablefunc contrib module is. I've heard rumblings of someone looking to build crosstab/pivot into the core grammar, but have not yet seen any patches... Joe - -- Joe Conway Crunchy Data Enterprise PostgreSQL http://crunchydata.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAEBAgAGBQJVx6b7AAoJEDfy90M199hlOJYP/3WJTFrB22SlC7pxTFl/L8La 9F0DbNlCyyK/oUkYH+dy5MXBnwHTFAwj2sEik7xNhax7aPIMnXh095f0AaMjouVU S0J0dGb5quYYK+TUWBBL745nRIl786H2/XbZbP+L7pz2W72ITtTbKqMHpXVyzMiF DfEN9xnCd05MF0WiaAXtwtz8HXQkFJxD/r5kgmHvwMIBPvvzrAg6CwHh/8kQMlY2 1kvkYnKvFSNp+PrZ0CvANs6ZyqaDJN1w7nsXul7HAWu+KhBkHxAilnBkCjjOT5JD beF8E1KNo60k+3zjphEN1yKBl5tT7r9mYLhuOLuI0UdpNpdd8u+rSRTSFaIRMGQ7 UDQIXOtVHSlKSSOpXHH6FYvksUQMDVvPeSwKoI2imwAjSStkInHJMj6cs2uKxsZi 5P0sRnulx7Ur4M1mCq7t3+IKiaZ2KzO0WZ5ewJ1mMt5hIqD7QADjvDoPn0qozV5T lX3pY4PUaL+N8XwlgBb69vGgUeG3N4nEvDSiUyLbFC3au/osczRGloYBq8AgDNuX Dta9AOkzbNAA82ODUzFoFts8/1Ydu8vhwnpQiNcl772Hf0fpNvdFGtclc4K+2ToX 4k2WmezT5y8d6IdPPDqM92wbWXAOBIy1I+kaYnbnQpxn2QYzkKrPbJ2unTTKTdUa 5/uFA0fg37acwloBMux/ =5lhb -----END PGP SIGNATURE-----
Tom Lane wrote: > Is there a way to implement pivoting as a set-returning function? Not with the same ease of use. We have crosstab functions in contrib/tablefunc already, but the killer problem with PIVOT is that truly dynamic columns are never reachable directly. If we could do this: SELECT * FROM crosstab('select a,b,c FROM tbl'); and the result came back pivoted, with a column for each distinct value of b, there will be no point in a client-side pivot. But postgres (or I suppose any SQL interpreter) won't execute this, for not knowing beforehand what structure "*" is going to have. So what is currently required from the user, with dynamic columns, is more like: 1st pass: identify the columnsSELECT DISTINCT a FROM tbl; 2nd pass: inject the columns, in a second embedded query and in a record definition, with careful quoting: select * from crosstab( 'SELECT a,b,c FROM tbl ORDER BY 1', ' VALUES (col1),(col2),(col3)...' -- or 'select distinct...'again ) AS ct(b type, "col1" type, "col2" type, "col3" type) Compared to this, \pivot limited to the psql interpreter is a no-brainer, we could just write instead: => select a,b,c FROM tbl; => \pivot This simplicity is the whole point. It's the result of doing the operation client-side, where the record structure can be pivoted without the target structure being formally declared. Some engines have a built-in PIVOT syntax (Oracle, SQL server). I have looked only at their documentation. Their pivot queries look nicer and are possibly more efficient than with SET functions, but AFAIK one still needs to programmatically inject the list of column values into them, when that list is not static. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
"Daniel Verite" <daniel@manitou-mail.org> writes: > Tom Lane wrote: >> Is there a way to implement pivoting as a set-returning function? > Not with the same ease of use. We have crosstab functions > in contrib/tablefunc already, but the killer problem with PIVOT > is that truly dynamic columns are never reachable directly. I'm not sure how pushing it out to psql makes that better. There is no way to do further processing on something that psql has printed, so you've punted on solving that issue just as much if not more. I agree that the crosstab solution leaves a lot to be desired as far as ease of use goes, but I don't want to define fixing its ease-of-use problem as being "it's easy for manual use in psql". psql is a minority API, you know. Besides which, psql is not all that great for looking at very wide tables, so I'm not sure that this would be very useful for dynamic column sets anyway. regards, tom lane
On Sun, Aug 09, 2015 at 07:29:40PM +0200, Daniel Verite wrote: > Hi, > > I want to suggest a client-side \pivot command in psql, implemented > in the attached patch. > > \pivot takes the current query in the buffer, execute it and > display it pivoted by interpreting the result as: > > column1 => row in pivoted output > column2 => column in pivoted output > column3 => value at (row,column) in pivoted ouput This is really neat work, and thanks for the patch. This issue in this one as in the crosstab extension is that it's available only if you are using some particular piece of extra software, in this case the psql client. I'm working up a proposal to add (UN)PIVOT support to the back-end. Would you like to join in on that? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
2015-08-10 6:04 GMT+02:00 David Fetter <david@fetter.org>:
On Sun, Aug 09, 2015 at 07:29:40PM +0200, Daniel Verite wrote:
> Hi,
>
> I want to suggest a client-side \pivot command in psql, implemented
> in the attached patch.
>
> \pivot takes the current query in the buffer, execute it and
> display it pivoted by interpreting the result as:
>
> column1 => row in pivoted output
> column2 => column in pivoted output
> column3 => value at (row,column) in pivoted ouput
This is really neat work, and thanks for the patch.
This issue in this one as in the crosstab extension is that it's
available only if you are using some particular piece of extra
software, in this case the psql client.
I'm working up a proposal to add (UN)PIVOT support to the back-end.
Would you like to join in on that?
PIVOT, UNPIVOT should be preferred solution
Pavel
I can look on implementations in other db
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Fetter wrote: > I'm working up a proposal to add (UN)PIVOT support to the back-end. I was under the impression that a server-side PIVOT *with dynamic columns* was just unworkable as an SQL query, because it couldn't be prepared if it existed. I am wrong on that? I feel like you guys are all telling me that \pivot should happen on the server, but the point that it would not be realistic to begin with is not considered. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Tom Lane wrote: > I'm not sure how pushing it out to psql makes that better. There is > no way to do further processing on something that psql has printed, > so you've punted on solving that issue just as much if not more. It's the same spirit as \x : the only thing it achieves is better readability in certain cases, I don't expect more from it. But I admit that \pivot would be much more of a "niche use case" than \x > Besides which, psql is not all that great for looking at very wide tables, > so I'm not sure that this would be very useful for dynamic column sets > anyway. I use \x all the time with wide tables and \x and \pivot happen to play out well together (In fact I was pleasantly surprised by this) Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On Mon, Aug 10, 2015 at 07:10:41PM +0200, Daniel Verite wrote: > David Fetter wrote: > > > I'm working up a proposal to add (UN)PIVOT support to the back-end. > > I was under the impression that a server-side PIVOT *with dynamic > columns* was just unworkable as an SQL query, because it couldn't be > prepared if it existed. That depends on what you mean by "dynamic columns." The approach taken in the tablefunc extension is to use functions which return SETOF RECORD, which in turn need to be cast at runtime. At least one other implementation takes two separate approaches, both interesting at least from my point of view. The first is to spell out all the columns in the query, which is not "dynamic columns" in any reasonable sense, as "hand-craft one piece of SQL whose purpose is to generate the actual pivot SQL" is not a reasonable level of dynamic. The second, more on point, is to specify a serialization for the rows in the "dynamic columns" case. Their syntax is "PIVOT XML", but I would rather do something more like "PIVOT (SERIALIZATION XML)". A third idea I have only roughed out feels a bit like cheating: creating a function which returns two distinct rowtypes via REFCURSORs or similar. The first result, used to create a CAST, spells out the row type of the second. > I am wrong on that? I feel like you guys are all telling me that > \pivot should happen on the server, but the point that it would not > be realistic to begin with is not considered. I think that starting the conversation again, especially at this stage of the 9.6 cycle, is a very good thing, whatever its outcome. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/10/2015 04:03 PM, David Fetter wrote: >> I am wrong on that? I feel like you guys are all telling me that >> \pivot should happen on the server, but the point that it would >> not be realistic to begin with is not considered. > > I think that starting the conversation again, especially at this > stage of the 9.6 cycle, is a very good thing, whatever its > outcome. Talk to Atri Sharma -- he recently indicated to me that he might be working on built-in pivot in the near term future (if not already started). - -- Joe Conway Crunchy Data Enterprise PostgreSQL http://crunchydata.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAEBAgAGBQJVyUVSAAoJEDfy90M199hlhEgP/1mQpZ0JOR2kbaP5Iw6FY4q+ P1SLWbCWMhVDp97OAjbP0T34YmG5Rw0fDh4WLPbNBp6xyPTQQgIVFvQAu6kDSngk X9kbQHKGE+pNqc9hMc7CyuSse8Pw8VeQGRDU5a+8E3fPdi9rbB2YTDt7SJIXlavc zJ8kZlUj59Xw9Kdkpon8Jb/Nxn3JV4GWLTe4+nxRZoH/9POjslyM+rVGtrMlHA59 0NWWPnTsLfU1HNk+olf72ihZpmQM4KPog8PdWo0GyFqJhMwmYtE/WTJh4jNDygBe NXTQE7/I5OA6fYQniq+7Wsyhc5raOH16lVSSo0QquuTtGG2mrYsvd82Zx7J0SDQp NfVk4qgjJYMNBN9/hvPXZZ2+LReYqliloR2PqLqxgDn3DyGgSpSUs1JyDZRtoJEj P4jEVGVWnUbjKuNldRJZi1DmMTKVSS2RSnoC0RJ7DzAfUyQ4oH4FmX8jX5rZpoXN nJLtVPhIRpp0A2Lq849hXB3/LuNzjYmZ3VvGNUwffTD7d3XxQ/Zeb9ZtWZszfJUo zjMAh0wFwDtdVBYdFStzlByGTSEIqeqdGCDSHhiZlNw/E0xV2YjNzdUP9N34aWml i+KUgPMxTJ/GAineSXpjt+I6Y+cHA10JpQLOf6fVdrOp/R1Z2EqFfZl2GzbyXJ5n lz4QjhOfx6YG1Tdi5qYW =nyuc -----END PGP SIGNATURE-----
On Sun, Aug 9, 2015 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > psql is a minority API, you know. Not for me. psql has already got a bunch of bells and whistles to format things in particular ways that people have wanted, and I'm not really sure why the bar for this proposal should be any higher. Is this less useful than \pset linestype unicode, complete with trying to auto-detect whether to enable the feature? Than automatically switching between expanded mode and regular mode based on the width of the TTY? I considered those features rather frivolous, but it seemed to me that their inclusion was widely supported and that a number of people were really quite happy about them. I guess you could argue that those are inherently client-side features and this is not, and I'll grant that point. But a client-side feature in 9.5 beats a server-side feature in 10.6 every day of the week. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Aug 11, 2015 at 10:13:48AM -0400, Robert Haas wrote: > On Sun, Aug 9, 2015 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > psql is a minority API, you know. > > Not for me. psql has already got a bunch of bells and whistles to > format things in particular ways that people have wanted, and I'm not > really sure why the bar for this proposal should be any higher. Is > this less useful than \pset linestype unicode, complete with trying to > auto-detect whether to enable the feature? Than automatically > switching between expanded mode and regular mode based on the width of > the TTY? I considered those features rather frivolous, but it seemed > to me that their inclusion was widely supported and that a number of > people were really quite happy about them. > > I guess you could argue that those are inherently client-side features > and this is not, and I'll grant that point. But a client-side feature > in 9.5 9.5?!? I hope you meant 9.6 because if 9.5 is still open for new features, we're going to have to explain to all the people who thought they missed the cut-off why this one is jumping the line in front of them. > beats a server-side feature in 10.6 every day of the week. Having done some of the preliminary research into a server-side implementation, I see it a different way. First, this is more of a 9.6 feature than a 10.6, at least in terms of the first cut functionality, and I don't see good reasons why the performance would need to be terrible in the first cut. In particular, the PIVOT case looks like it could pretty easily use the FILTER machinery directly, while the UNPIVOT case could use what LATERAL does. Doubtless, further optimizations are possible for each, and for interesting sub-cases, but that's a project for later, as performance optimizations should be. Second, if we put this feature as-is in psql, we're stuck supporting it in psql until the end of time, even if (when, I believe) we have a fuller and likely not perfectly compatible feature on the back-end. That said, a thing in psql that could slice serialized output into columns would be handy as a broad, general part of reporting in psql, and would mesh quite nicely with a back-end PIVOT (SERIALIZATION FOO) or whatever syntax we land on. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote: > That depends on what you mean by "dynamic columns." The approach > taken in the tablefunc extension is to use functions which return > SETOF RECORD, which in turn need to be cast at runtime. For me, "PIVOT with dynamic columns" would be a pivot query whose output columns are not enumerated as input in the SQL query itself, in any form. > The second, more on point, is to specify a serialization for the rows > in the "dynamic columns" case. Their syntax is "PIVOT XML", but I > would rather do something more like "PIVOT (SERIALIZATION XML)". The SERIALIZATION looks interesting, but I believe these days JSON would make more sense than XML, both as easier for the client-side and because of all the json_* functions we now have to mix json with relational structures. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
David Fetter wrote: > Second, if we put this feature as-is in psql, we're stuck supporting > it in psql until the end of time, even if (when, I believe) we have a > fuller and likely not perfectly compatible feature on the back-end. To me, doing \pivot in psql vs PIVOT in the backend is a false dichotomy, both would be desirable to have in any order. Having PIVOT in SQL is more important because it will help in a broader range of cases. But, even if I fail to be convincing on this, it will still be much easier in a psql context to just type \pivot than turning a non-pivot query into a pivoted equivalent, even if we achieve the best possible (in ease of use) server-side SQL implementation. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On Tue, Aug 11, 2015 at 05:13:03PM +0200, Daniel Verite wrote: > David Fetter wrote: > > > That depends on what you mean by "dynamic columns." The approach > > taken in the tablefunc extension is to use functions which return > > SETOF RECORD, which in turn need to be cast at runtime. > > For me, "PIVOT with dynamic columns" would be a pivot query > whose output columns are not enumerated as input in the > SQL query itself, in any form. I'm pretty sure people will want to be able to specify them in some form. On one implementation, it looks like: select * from ( select times_purchased as "Purchase Frequency", state_code from customers t ) pivot xml ( count(state_code) for state_code in (select state_code from preferred_states) ) order by 1 Another basically punts by making you responsible for generating the SQL dynamically, a move I regard as a horrible UX failure. > > The second, more on point, is to specify a serialization for the rows > > in the "dynamic columns" case. Their syntax is "PIVOT XML", but I > > would rather do something more like "PIVOT (SERIALIZATION XML)". > > The SERIALIZATION looks interesting, but I believe these days JSON > would make more sense than XML, both as easier for the client-side and > because of all the json_* functions we now have to mix json with > relational structures. I proposed SERIALIZATION as a parameter precisely so we could use different ones for different cases. JSON is certainly popular this year, as XML was in prior years. I may be wrong, but I'm certain that there will be new ones, even popular ones, that haven't yet been invented. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote: > That said, a thing in psql that could slice serialized output into > columns would be handy as a broad, general part of reporting in > psql To avoid any confusion with server-side PIVOT, I suggest that the currently proposed command in psql should have a different name than \pivot. The general idea is indeed pivoting, but what it precisely does is project a resultset from a 3-column query onto cells in a 2D grid-like arrangement. It differs significantly from what existing PIVOT SQL commands do, looking closely at them it appears in particular that: - they don't care about a leftmost column to hold "row titles", whereas it's essential to the psql feature. - as mentioned upthread, the need to enumerate in advance the output columns is a crucial point with the SQL pivot, whereaspsql doesn't care, as it just displays a resultset that is already obtained. - they operate with an aggregate function at their heart, whereas it's also irrelevant to the psql display feature whetherthere's an aggregate in the query. For a different name, I've thought of these alternatives that belong to client-side vocabulary: \rotate \sheetview \gridview \grid3view (to insist that it works on 3 columns). \matrix \matview \matrixview \crosstab (at the risk of confusion with the contrib feature) Opinions? I'd go for \rotate personally. Also I'll try to demonstrate use case with concrete examples. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite