[patch] Proposal for \rotate in psql - Mailing list pgsql-hackers

 Hi,

This is a reboot of my previous proposal for pivoting results in psql,
with a new patch that generalizes the idea further through a command
now named \rotate, and some examples.

So the concept is: having an existing query in the query buffer,
the user can specify two column numbers C1 and C2 (by default the 1st
and 2nd) as an argument to a \rotate command.

The query results are then displayed in a 2D grid such that each tuple
(vx, vy, va, vb,...) is shown as |va vb...| in a cell at coordinates (vx,vy).
The values vx,xy come from columns C1,C2 respectively and are
represented in the output as an horizontal and a vertical header.

A cell may hold several columns from several rows, growing horizontally and
vertically (\n inside the cell) if necessary to show all results.

The examples below should be read with a monospaced font as in psql,
otherwise they will look pretty bad.

1. Example with only 2 columns, querying login/group membership from the
catalog.
   Query:

SELECT r.rolname as username,r1.rolname as groupname
  FROM pg_catalog.pg_roles r LEFT JOIN pg_catalog.pg_auth_members m
  ON (m.member = r.oid)
  LEFT JOIN pg_roles r1 ON (m.roleid=r1.oid)
  WHERE r.rolcanlogin
  ORDER BY 1

Sample results:
  username  | groupname
------------+-----------
 daniel     | mailusers
 drupal     |
 dv        | admin
 dv        | common
 extc        | readonly
 extu        |
 foobar     |
 joel        |
 mailreader | readonly
 manitou    | mailusers
 manitou    | admin
 postgres   |
 u1        | common
 u2        | mailusers
 zaz        | mailusers


Applying \rotate gives:
           Rotated query results
  username  | admin | common | mailusers | readonly
------------+-------+--------+-----------+----------
 daniel     |        |         | X     |
 drupal     |        |         |         |
 dv        | X     | X      |         |
 extc        |        |         |         | X
 extu        |        |         |         |
 foobar     |        |         |         |
 joel        |        |         |         |
 mailreader |        |         |         | X
 manitou    | X     |         | X     |
 postgres   |        |         |         |
 u1        |        | X      |         |
 u2        |        |         | X     |
 zaz        |        |         | X     |

The 'X' inside cells is automatically added as there are only
2 columns. If there was a 3rd column, the content of that column would
be displayed instead (as in the next example).

What's good in that \rotate display compared to the classic output is that
it's more apparent, visually speaking, that such user belongs or not to such
group or another.

2. Example with a unicode checkmark added as 3rd column, and
   unicode linestyle and borders (to be seen with a mono-spaced font):

SELECT r.rolname as username,r1.rolname as groupname, chr(10003)
  FROM pg_catalog.pg_roles r LEFT JOIN pg_catalog.pg_auth_members m
  ON (m.member = r.oid)
  LEFT JOIN pg_roles r1 ON (m.roleid=r1.oid)
  WHERE r.rolcanlogin
  ORDER BY 1

        Rotated query results
┌────────────┬───────┬───��
�────┬───────────┬────────�
��─┐
│  username  │ admin │ common │ mailusers │ readonly │
├────────────┼───────┼───��
�────┼───────────┼────────�
��─┤
│ daniel     │     │        │ ✓        │      │
│ drupal     │     │        │       │           │
│ dv           │ ✓       │ ✓    │          │       │
│ extc       │     │        │       │ ✓     │
│ extu       │     │        │       │           │
│ foobar     │     │        │       │           │
│ joel       │     │        │       │           │
│ mailreader │     │        │       │ ✓     │
│ manitou    │ ✓       │          │ ✓          │       │
│ postgres   │     │        │       │           │
│ u1           │     │ ✓      │        │      │
│ u2           │     │        │ ✓        │      │
│ zaz        │     │        │ ✓        │      │
└────────────┴───────┴───��
�────┴───────────┴────────�
��─┘


What I like in that representation is that it looks good enough
to be pasted directly into a document in a word processor.

3. It can be rotated easily in the other direction, with:
   \rotate 2 1

(Cut horizontally to fit in a mail, the actual output is 116 chars wide).

                           Rotated query results
┌───────────┬────────┬───��
�────┬────┬──────┬──────┬─�
��──────┬──────┬────
│ username  │ daniel │ drupal │ dv │ extc │ extu │ foobar │
joel │ mai...
├───────────┼────────┼───��
�────┼────┼──────┼──────┼─�
��──────┼──────┼────
│ mailusers │ ✓       │          │    │      │      │      │
     │
│ admin     │     │        │ ✓  │      │      │      │
     │
│ common    │     │        │ ✓  │      │      │      │
     │
│ readonly  │     │        │    │ ✓    │      │      │
     │ ✓
└───────────┴────────┴───��
�────┴────┴──────┴──────┴─�
��──────┴──────┴────


4. Example with 3 columns and a count as the value to visualize along
two axis: date and category.
I'm using the number of mails posted per month in a few PG mailing lists,
broken down by list (which are tags in my schema).

Query:
 SELECT date_trunc('month', msg_date)::date as month,
   t.name,
   count(*) as cnt
 FROM mail JOIN mail_tags using(mail_id) JOIN tags t
on(t.tag_id=mail_tags.tag)
 WHERE t.tag_id in (7,8,12,34,79)
 AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
 GROUP BY date_trunc('month', msg_date)::date, t.name
 ORDER BY 1,2;

Results:
   month    |     name      | cnt
------------+-------------+------
 2014-05-01 | announce      |   19
 2014-05-01 | general      |  550
 2014-05-01 | hackers      | 1914
 2014-05-01 | interfaces  |    4
 2014-05-01 | performance |  122
 2014-06-01 | announce      |   10
 2014-06-01 | general      |  499
 2014-06-01 | hackers      | 2008
 2014-06-01 | interfaces  |   10
 2014-06-01 | performance |  137
 2014-07-01 | announce      |   12
 2014-07-01 | general      |  703
 2014-07-01 | hackers      | 1504
 2014-07-01 | interfaces  |    6
 2014-07-01 | performance |  142
 2014-08-01 | announce      |    9
 2014-08-01 | general      |  616
 2014-08-01 | hackers      | 1864
 2014-08-01 | interfaces  |   11
 2014-08-01 | performance |  116
 2014-09-01 | announce      |   10
 2014-09-01 | general      |  645
 2014-09-01 | hackers      | 2364
 2014-09-01 | interfaces  |    3
 2014-09-01 | performance |  105
 2014-10-01 | announce      |   13
 2014-10-01 | general      |  476
 2014-10-01 | hackers      | 2325
 2014-10-01 | interfaces  |   10
 2014-10-01 | performance |  137
 2014-11-01 | announce      |   10
 2014-11-01 | general      |  457
 2014-11-01 | hackers      | 1810
 2014-11-01 | performance |  109
 2014-12-01 | announce      |   11
 2014-12-01 | general      |  623
 2014-12-01 | hackers      | 2043
 2014-12-01 | interfaces  |    1
 2014-12-01 | performance |   71
(39 rows)

\rotate gives:
            Rotated query results
   month    | announce | general | hackers | interfaces | performance
------------+----------+---------+---------+------------+-------------
 2014-05-01 | 19       | 550     | 1914    | 4        | 122
 2014-06-01 | 10       | 499     | 2008    | 10     | 137
 2014-07-01 | 12       | 703     | 1504    | 6        | 142
 2014-08-01 | 9        | 616     | 1864    | 11     | 116
 2014-09-01 | 10       | 645     | 2364    | 3        | 105
 2014-10-01 | 13       | 476     | 2325    | 10     | 137
 2014-11-01 | 10       | 457     | 1810    |        | 109
 2014-12-01 | 11       | 623     | 2043    | 1        | 71

Advantage: we can figure out the trends, and notice empty slots,
  much quicker than with the previous output. It seems smaller
  but there is the same amount of information.


5. Example with an additional column showing if the count grows up or down
   compared to the previous month. This shows how the contents get stacked
   inside cells when they come from several columns and rows.

Query:

SELECT to_char(mon, 'yyyy-mm') as month,
 name,
 CASE when lag(name,1) over(order by name,mon)=name then
   case sign(cnt-(lag(cnt,1) over(order by name,mon)))
    when 1 then chr(8593)
    when 0 then chr(8597)
    when -1 then chr(8595)
    else ' ' end
 END,
 cnt
 from (SELECT date_trunc('month', msg_date)::date as mon, t.name,count(*) as
cnt
   FROM mail JOIN mail_tags using(mail_id) JOIN tags t
on(t.tag_id=mail_tags.tag)
    WHERE t.tag_id in (7,8,12,34,79)
    AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
    GROUP BY date_trunc('month', msg_date)::date, t.name) l order by 2,1;

Result:
  month  |    name     | case | cnt
---------+-------------+------+------
 2014-05 | announce    |      |   19
 2014-06 | announce    | ↓    |   10
 2014-07 | announce    | ↑    |   12
 2014-08 | announce    | ↓    |    9
 2014-09 | announce    | ↑    |   10
 2014-10 | announce    | ↑    |   13
 2014-11 | announce    | ↓    |   10
 2014-12 | announce    | ↑    |   11
 2014-05 | general     |      |  550
 2014-06 | general     | ↓    |  499
 2014-07 | general     | ↑    |  703
 2014-08 | general     | ↓    |  616
 2014-09 | general     | ↑    |  645
 2014-10 | general     | ↓    |  476
 2014-11 | general     | ↓    |  457
 2014-12 | general     | ↑    |  623
 2014-05 | hackers     |      | 1914
 2014-06 | hackers     | ↑    | 2008
 2014-07 | hackers     | ↓    | 1504
 2014-08 | hackers     | ↑    | 1864
 2014-09 | hackers     | ↑    | 2364
 2014-10 | hackers     | ↓    | 2325
 2014-11 | hackers     | ↓    | 1810
 2014-12 | hackers     | ↑    | 2043
 2014-05 | interfaces  |      |    4
 2014-06 | interfaces  | ↑    |   10
 2014-07 | interfaces  | ↓    |    6
 2014-08 | interfaces  | ↑    |   11
 2014-09 | interfaces  | ↓    |    3
 2014-10 | interfaces  | ↑    |   10
 2014-12 | interfaces  | ↓    |    1
 2014-05 | performance |      |  122
 2014-06 | performance | ↑    |  137
 2014-07 | performance | ↑    |  142
 2014-08 | performance | ↓    |  116
 2014-09 | performance | ↓    |  105
 2014-10 | performance | ↑    |  137
 2014-11 | performance | ↓    |  109
 2014-12 | performance | ↓    |   71
(39 rows)

\rotate:

               Rotated query results
  month  | announce | general | hackers | interfaces | performance
---------+----------+---------+---------+------------+-------------
 2014-05 | 19        | 550     | 1914    | 4         | 122
 2014-06 | ↓ 10     | ↓ 499   | ↑ 2008  | ↑ 10         | ↑ 137
 2014-07 | ↑ 12     | ↑ 703   | ↓ 1504  | ↓ 6         | ↑ 142
 2014-08 | ↓ 9      | ↓ 616   | ↑ 1864  | ↑ 11         | ↓ 116
 2014-09 | ↑ 10     | ↑ 645   | ↑ 2364  | ↓ 3         | ↓ 105
 2014-10 | ↑ 13     | ↓ 476   | ↓ 2325  | ↑ 10         | ↑ 137
 2014-11 | ↓ 10     | ↓ 457   | ↓ 1810  |        | ↓ 109
 2014-12 | ↑ 11     | ↑ 623   | ↑ 2043  | ↓ 1         | ↓ 71
(8 rows)

The output columns 3 and 4 of the same row get projected into the same
cell, laid out horizontally (separated by space).

6. Example with the same query but rotated differently so that
  it's split into two columns: the counts that go up from the previous
  and those that go down. I'm also cheating a bit by
  casting name and cnt to char(N) for a better alignment

SELECT to_char(mon, 'yyyy-mm') as month,
 name::char(12),
 CASE when lag(name,1) over(order by name,mon)=name then
   case sign(cnt-(lag(cnt,1) over(order by name,mon)))
    when 1 then chr(8593)
    when 0 then chr(8597)
    when -1 then chr(8595)
    else ' ' end
 END,
 cnt::char(8)
 from (SELECT date_trunc('month', msg_date)::date as mon, t.name,count(*) as
cnt
   FROM mail JOIN mail_tags using(mail_id) JOIN tags t
on(t.tag_id=mail_tags.tag)
    WHERE t.tag_id in (7,8,12,34,79)
    AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
    GROUP BY date_trunc('month', msg_date)::date, t.name) l order by 2,1;

 \rotate 1 3

+---------+-----------------------+-----------------------+
|  month  |          ↑        |        ↓          |
+---------+-----------------------+-----------------------+
| 2014-05 |              |              |
| 2014-06 | hackers     2008     +| announce     10     +|
|      | interfaces     10     +| general     499      |
|      | performance  137      |              |
| 2014-07 | announce     12     +| hackers     1504     +|
|      | general     703     +| interfaces     6      |
|      | performance  142      |              |
| 2014-08 | hackers     1864     +| announce     9     +|
|      | interfaces     11      | general     616     +|
|      |              | performance  116      |
| 2014-09 | announce     10     +| interfaces     3     +|
|      | general     645     +| performance  105      |
|      | hackers     2364      |              |
| 2014-10 | announce     13     +| general     476     +|
|      | interfaces     10     +| hackers     2325      |
|      | performance  137      |              |
| 2014-11 |              | announce     10     +|
|      |              | general     457     +|
|      |              | hackers     1810     +|
|      |              | performance  109      |
| 2014-12 | announce     11     +| interfaces     1     +|
|      | general     623     +| performance  71      |
|      | hackers     2043      |              |
+---------+-----------------------+-----------------------+

As there are several rows that match the vertical/horizontal filter,
(for example 3 results for 2014-06 as row and "arrow up" as column),
they are stacked vertically inside the cell, in addition to
"name" and "cnt" being shown side by side horizontally.

Note that no number show up for 2014-05; this is because they're not
associated with arrow up or down; empty as a column is discarded.
Maybe it shouldn't. In this case, the numbers for 2014-05 would be in a
column with an empty name.


Conclusion, the point of \rotate:

When analyzing query results, these rotated representations may be
useful or not depending on the cases, but the point is that they require
no effort to be obtained through \rotate X Y
It's so easy to play with various combinations to see if the result
makes sense, and if it reveals something about the data.
(it still reexecutes the query each time, tough).

We can get more or less the same results with crosstab/pivot, as it's the
same basic concept, but with much more effort spent on getting the SQL right,
plus the fact that columns not known in advance cannot be returned pivoted
in a single pass in SQL, a severe complication that the client-side doesn't
have.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fwd: Core dump with nested CREATE TEMP TABLE
Next
From: Tom Lane
Date:
Subject: Re: Fwd: Core dump with nested CREATE TEMP TABLE