Re: [patch] Proposal for \crosstabview in psql - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: [patch] Proposal for \crosstabview in psql
Date
Msg-id CAKFQuwZB=Eb3xGrM8zqvo-f650rxPmYHshqHZQUiEQePaCcezw@mail.gmail.com
Whole thread Raw
In response to Re: [patch] Proposal for \crosstabview in psql  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Thu, Apr 7, 2016 at 1:26 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I wonder if the business of appending values of multiple columns
separated with spaces is doing us any good.  Why not require that
there's a single column in the cell?  If the user wants to put things
together, they can use format() or just || the fields together.  What
benefit is there to the ' '?  When I ran my first test queries over
pg_class I was surprised about this behavior:

alvherre=# select * from pg_class
alvherre=# \crosstabview relnatts relkind

 relnatts |                                                                       r                                                                        |                                               t                                                |                                                        i                                                         |                                                                           v
----------+------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------
       26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)                          |                                                                                                |                                                                                                                  |
       30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n 540 1 {=r/alvherre} (null)                                                        |                                                                                                |                                                                                                                  |
        3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)                             +| pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t n 540 1 (null) (null)    +| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5 688 0 0 f f p 0 f f f f f f f t n 0 0 (null) (null)                +| pg_group 11 11661 0 10 0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0 {=r/alvherre} (null)                                                          +


I'm tempted to rip that out, unless you have a reason not to.

In fact, I think even the grouping of values of multiple rows with \n is
not terribly great either.  Why not just require people to group the
values beforehand?  You can use "string_agg(column, E'\n')" to get the
same behavior, plus you can do other things such as sum() etc.

​Went and looked at the examples page and at first blush it seems like this module only understands text.  My specific concern here is dealing with "numbers-as-text" sorting.​

​As to the question of behavior when multiple columns (and rows?) are present: ​we need some sort of default do we not.  Nothing is precluding the user from doing their own aggregates and limiting the select-list.  That said I'm more inclined to error if the input data in not unique on (v,h).  I feel the possibility of a user query bug going unnoticed in that scenario is reasonably large since its likely that only some combinations of duplicates appear.  I'm a bit less tentative regarding column concatenation since I would expect that nearly every cell involved in the output would be noticeably affected.  Though, if we are going to protect against extra rows extending that to protect against extra columns seems fair.

Another option is, possibly conditioned on the first two columns being the headers, to only take the column in the third position (or, the first unassigned column).and display it.

Otherwise if multiple candidate columns are present and none are chosen for the cell we could just error and force the user to explicitly choose.

The concatenation behavior seems like the least useful default.  I'm inclined to favor the first unassigned input column.  And never allow (v,h) is violate uniqueness.

David J.




pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [patch] Proposal for \crosstabview in psql
Next
From: Jeff Janes
Date:
Subject: Re: GIN data corruption bug(s) in 9.6devel