Thread: list of extended statistics on psql

list of extended statistics on psql

From

Tatsuro Yamada

Date:

24 August 2020, 03:22:49

Hi!

I created a POC patch that adds a "\dz" command to psql for listing
extended statistics. I believe this feature will help DBAs and users who
want to see all extended statistics at a glance. :-D

I don't have a strong opinion about the name "\dz". I would have preferred
"\dx" or "\de*", but they are already assigned, so I used "\dz"
instead.

Please find the attached patch.
Any comments are welcome!

For Example:
=======================
CREATE TABLE t1 (a INT, b INT);
CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
ANALYZE t1;

CREATE TABLE t2 (a INT, b INT, c INT);
CREATE STATISTICS stts4 ON b, c FROM t2;
ANALYZE t2;

postgres=# \dz
                    List of extended statistics
 Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
--------+-------+-------+---------+-----------+--------------+-----
 public | t1    | stts1 | a, b    | f         | t            | f
 public | t1    | stts2 | a, b    | t         | t            | f
 public | t1    | stts3 | a, b    | t         | t            | t
 public | t2    | stts4 | b, c    | t         | t            | t
(4 rows)

postgres=# \?
...
   \dy     [PATTERN]      list event triggers
   \dz     [PATTERN]      list extended statistics
   \l[+]   [PATTERN]      list databases
...
=======================

I haven't written documentation or regression tests for this yet.
I'll add them later.

Thanks,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_poc1.patch

Re: list of extended statistics on psql

From

Pavel Stehule

Date:

24 August 2020, 04:12:42

On Mon, 24 Aug 2020 at 5:23, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote:

> Hi!
>
> I created a POC patch that allows showing a list of extended statistics by
> "\dz" command on psql. I believe this feature helps DBA and users who
> would like to know all extended statistics easily. :-D
>
> I have not a strong opinion to assign "\dz". I prefer "\dx" or "\de*"
> than "\dz" but they were already assigned. Therefore I used "\dz"
> instead of them.
>
> Please find the attached patch.
> Any comments are welcome!
>
> For Example:
> =======================
> CREATE TABLE t1 (a INT, b INT);
> CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
> CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
> CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
> ANALYZE t1;
>
> CREATE TABLE t2 (a INT, b INT, c INT);
> CREATE STATISTICS stts4 ON b, c FROM t2;
> ANALYZE t2;
>
> postgres=# \dz
>                     List of extended statistics
>  Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
> --------+-------+-------+---------+-----------+--------------+-----
>  public | t1    | stts1 | a, b    | f         | t            | f
>  public | t1    | stts2 | a, b    | t         | t            | f
>  public | t1    | stts3 | a, b    | t         | t            | t
>  public | t2    | stts4 | b, c    | t         | t            | t
> (4 rows)
>
> postgres=# \?
> ...
>   \dy     [PATTERN]      list event triggers
>   \dz     [PATTERN]      list extended statistics
>   \l[+]   [PATTERN]      list databases
> ...
> =======================
>
> For now, I haven't written a document and regression test for that.
> I'll create it later.

+1 good idea

Pavel

> Thanks,
> Tatsuro Yamada

Re: list of extended statistics on psql

From

Julien Rouhaud

Date:

24 August 2020, 05:54:36

On Mon, Aug 24, 2020 at 6:13 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> On Mon, 24 Aug 2020 at 5:23, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote:
>>
>> Hi!
>>
>> I created a POC patch that allows showing a list of extended statistics by
>> "\dz" command on psql. I believe this feature helps DBA and users who
>> would like to know all extended statistics easily. :-D
>>
>> I have not a strong opinion to assign "\dz". I prefer "\dx" or "\de*"
>> than "\dz" but they were already assigned. Therefore I used "\dz"
>> instead of them.
>>
>> Please find the attached patch.
>> Any comments are welcome!
>>
>> For Example:
>> =======================
>> CREATE TABLE t1 (a INT, b INT);
>> CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
>> CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
>> CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
>> ANALYZE t1;
>>
>> CREATE TABLE t2 (a INT, b INT, c INT);
>> CREATE STATISTICS stts4 ON b, c FROM t2;
>> ANALYZE t2;
>>
>> postgres=# \dz
>>                      List of extended statistics
>>   Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
>> --------+-------+-------+---------+-----------+--------------+-----
>>   public | t1    | stts1 | a, b    | f         | t            | f
>>   public | t1    | stts2 | a, b    | t         | t            | f
>>   public | t1    | stts3 | a, b    | t         | t            | t
>>   public | t2    | stts4 | b, c    | t         | t            | t
>> (4 rows)
>>
>> postgres=# \?
>> ...
>>    \dy     [PATTERN]      list event triggers
>>    \dz     [PATTERN]      list extended statistics
>>    \l[+]   [PATTERN]      list databases
>> ...
>> =======================
>>
>> For now, I haven't written a document and regression test for that.
>> I'll create it later.
>
>
> +1 good idea

+1 that's a good idea.  Please add it to the next commitfest!

You have a typo:

+    if (pset.sversion < 10000)
+    {
+        char        sverbuf[32];
+
+        pg_log_error("The server (version %s) does not support extended statistics.",
+                     formatPGVersionNumber(pset.sversion, false,
+                                           sverbuf, sizeof(sverbuf)));
+        return true;
+    }

the version test is missing a 0, the feature looks otherwise ok.
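
For context, a minimal sketch of why the constant matters. It assumes that
pset.sversion holds the integer server version number (90603 for 9.6.3,
100000 for 10.0) and that extended statistics first appeared in PostgreSQL 10.
The helper name server_too_old() is hypothetical and is not part of the patch:

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not in the patch): returns true when the server's
 * version number is below the minimum version number that is required.
 */
bool
server_too_old(int sversion, int min_version)
{
    return sversion < min_version;
}
```

With the threshold 10000, server_too_old(90603, 10000) is false, so a 9.6.3
server would not be rejected; with 100000 the same call returns true.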

How about using \dX rather than \dz?

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

24 August 2020, 07:41:32

Hi!

>> +1 good idea
> 
> +1 that's a good idea.  Please add it to the next commitfest!

Thanks!


> You have a typo:
> 
> +    if (pset.sversion < 10000)
> +    {
> +        char        sverbuf[32];
> +
> +        pg_log_error("The server (version %s) does not support extended statistics.",
> +                     formatPGVersionNumber(pset.sversion, false,
> +                                           sverbuf, sizeof(sverbuf)));
> +        return true;
> +    }
> 
> the version test is missing a 0, the feature looks otherwise ok.

Ouch, I fixed it in the attached patch.

The new patch includes:

  - Fix the version number check (10000 -> 100000)
  - Fix the sort order of the query that fetches extended stats info
  - Add handling of [PATTERN], e.g. \dz stts*
  - Add documentation and regression tests for \dz
  
> How about using \dX rather than \dz?

Thanks for your suggestion!
I'll switch to it if we reach consensus. :-D

Thanks,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_poc2.patch

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

27 August 2020, 06:13:09

Hi Julien and Pavel!

>> How about using \dX rather than \dz?
> 
> Thanks for your suggestion!
> I'll replace it if I got consensus. :-D


I re-read the help message for the \d* commands and realized that "\dX"
is the better choice.
There are already commands that differ only in letter case, so I followed
the same convention. Please find the attached patch. :-D
  
For example:
==========
   \da[S]  [PATTERN]      list aggregates
   \dA[+]  [PATTERN]      list access methods
==========

The attached patch uses "\dX" instead of "\dz":
==========
   \dx[+]  [PATTERN]      list extensions
   \dX     [PATTERN]      list extended statistics
==========

The regression test output for the feature is as follows:
==========
-- check printing info about extended statistics
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;
create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

\dX
                          List of extended statistics
 Schema | Table |   Name    |     Columns      | Ndistinct | Dependencies | MCV
--------+-------+-----------+------------------+-----------+--------------+-----
 public | hoge  | stts_hoge | col1, col2, col3 | t         | t            | t
 public | t1    | stts_1    | a, b             | f         | t            | f
 public | t1    | stts_2    | a, b             | t         | t            | f
 public | t1    | stts_3    | a, b             | t         | t            | t
 public | t2    | stts_4    | b, c             | t         | t            | t
(5 rows)

\dX stts_?
                    List of extended statistics
 Schema | Table |  Name  | Columns | Ndistinct | Dependencies | MCV
--------+-------+--------+---------+-----------+--------------+-----
 public | t1    | stts_1 | a, b    | f         | t            | f
 public | t1    | stts_2 | a, b    | t         | t            | f
 public | t1    | stts_3 | a, b    | t         | t            | t
 public | t2    | stts_4 | b, c    | t         | t            | t
(4 rows)

\dX *hoge
                          List of extended statistics
 Schema | Table |   Name    |     Columns      | Ndistinct | Dependencies | MCV
--------+-------+-----------+------------------+-----------+--------------+-----
 public | hoge  | stts_hoge | col1, col2, col3 | t         | t            | t
(1 row)
==========
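
The wildcard behaviour in the examples above can be sketched as follows.
This is a simplified illustration that assumes only that "*" matches any
sequence of characters and "?" matches any single character. psql's real
pattern handling also deals with quoting, case folding, and schema
qualification, none of which is modelled here. The function name
pattern_matches() is hypothetical and does not come from the patch:

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified matcher: "*" matches any sequence of characters
 * (including none) and "?" matches exactly one character.
 */
bool
pattern_matches(const char *pattern, const char *name)
{
    /* An exhausted pattern matches only an exhausted name. */
    if (*pattern == '\0')
        return *name == '\0';

    /* "*" matches nothing, or consumes one character and tries again. */
    if (*pattern == '*')
        return pattern_matches(pattern + 1, name) ||
               (*name != '\0' && pattern_matches(pattern, name + 1));

    /* Any other pattern character needs a name character to consume. */
    if (*name == '\0')
        return false;

    /* "?" matches any single character; otherwise require equality. */
    if (*pattern == '?' || *pattern == *name)
        return pattern_matches(pattern + 1, name + 1);

    return false;
}
```

Under these assumptions pattern_matches("stts_?", "stts_hoge") is false,
which is consistent with stts_hoge being absent from the "\dX stts_?" output.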


Thanks,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_command.patch

Re: list of extended statistics on psql

From

Julien Rouhaud

Date:

27 August 2020, 13:15:04

Hi Yamada-san,

On Thu, Aug 27, 2020 at 03:13:09PM +0900, Tatsuro Yamada wrote:
> 
> I re-read a help message of \d* commands and realized it's better to
> use "\dX".
> There are already cases where the commands differ due to differences
> in case, so I did the same way. Please find attached patch. :-D
> For example:
> ==========
>   \da[S]  [PATTERN]      list aggregates
>   \dA[+]  [PATTERN]      list access methods
> ==========
> 
> Attached patch uses "\dX" instead of "\dz":
> ==========
>   \dx[+]  [PATTERN]      list extensions
>   \dX     [PATTERN]      list extended statistics
> ==========


Thanks for updating the patch!  This alias will probably be easier to remember.


> 
> Results of regress test of the feature are the following:
> ==========
> -- check printing info about extended statistics
> create table t1 (a int, b int);
> create statistics stts_1 (dependencies) on a, b from t1;
> create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
> create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
> create table t2 (a int, b int, c int);
> create statistics stts_4 on b, c from t2;
> create table hoge (col1 int, col2 int, col3 int);
> create statistics stts_hoge on col1, col2, col3 from hoge;
> 
> \dX
>                           List of extended statistics
>  Schema | Table |   Name    |     Columns      | Ndistinct | Dependencies | MCV
> --------+-------+-----------+------------------+-----------+--------------+-----
>  public | hoge  | stts_hoge | col1, col2, col3 | t         | t            | t
>  public | t1    | stts_1    | a, b             | f         | t            | f
>  public | t1    | stts_2    | a, b             | t         | t            | f
>  public | t1    | stts_3    | a, b             | t         | t            | t
>  public | t2    | stts_4    | b, c             | t         | t            | t
> (5 rows)
> 
> \dX stts_?
>                     List of extended statistics
>  Schema | Table |  Name  | Columns | Ndistinct | Dependencies | MCV
> --------+-------+--------+---------+-----------+--------------+-----
>  public | t1    | stts_1 | a, b    | f         | t            | f
>  public | t1    | stts_2 | a, b    | t         | t            | f
>  public | t1    | stts_3 | a, b    | t         | t            | t
>  public | t2    | stts_4 | b, c    | t         | t            | t
> (4 rows)
> 
> \dX *hoge
>                           List of extended statistics
>  Schema | Table |   Name    |     Columns      | Ndistinct | Dependencies | MCV
> --------+-------+-----------+------------------+-----------+--------------+-----
>  public | hoge  | stts_hoge | col1, col2, col3 | t         | t            | t
> (1 row)
> ==========


Thanks also for the documentation and regression tests.  This overall looks
good, I just have two comments:

- there's a whitespace issue in the documentation part:

add_list_extended_stats_for_psql_by_dX_command.patch:10: tab in indent.
      <varlistentry>
warning: 1 line adds whitespace errors.

- You're sorting the output on schema, table, extended statistics and columns
  but I think the last one isn't required since extended statistics names are
  unique.

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

27 August 2020, 23:42:55

Hi Julien!
  

> Thanks also for the documentation and regression tests.  This overall looks
> good, I just have two comments:


Thank you for reviewing the patch! :-D


> - there's a whitespace issue in the documentation part:
> 
> add_list_extended_stats_for_psql_by_dX_command.patch:10: tab in indent.
>       <varlistentry>
> warning: 1 line adds whitespace errors.


Oops, I forgot to use "git diff --check". I fixed it.

  
> - You're sorting the output on schema, table, extended statistics and columns
>    but I think the last one isn't required since extended statistics names are
>    unique.


You are right.
The sort key "columns" was not necessary so I removed it.

Attached new patch includes the above two fixes:

   - Fix whitespace issue in the documentation part
   - Remove unnecessary sort key from the query
      (ORDER BY 1, 2, 3, 4 -> ORDER BY 1, 2, 3)


Thanks,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_command_r2.patch

Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

27 August 2020, 23:53:23

+1 for the general idea, and +1 for \dX being the syntax to use

IMO the per-type columns should show both the type being enabled as
well as it being built.

(How many more stat types do we expect -- Tomas?  I wonder if having one
column per type is going to scale in the long run.)

Also, the stat obj name column should be first, followed by a single
column listing both table and columns that it applies to.  Keep in mind
that in the future we might want to add stats that cross multiple tables
-- that's why the CREATE syntax is the way it is.  So we should give
room for that in psql's display too.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

28 August 2020, 02:07:43

Hi Alvaro!

It's been ages since we created a progress reporting feature together. :-D

>>> +1 good idea
>>
>> +1 that's a good idea.  Please add it to the next commitfest!
>
>+1 for the general idea, and +1 for \dX being the syntax to use

Thank you for voting!


> IMO the per-type columns should show both the type being enabled as
> well as it being built.

Hmm. I'm not sure how to get the status (enabled or disabled) of
extended stats. :(
Could you explain it more?


> Also, the stat obj name column should be first, followed by a single
> column listing both table and columns that it applies to.  Keep in mind
> that in the future we might want to add stats that cross multiple tables
> -- that's why the CREATE syntax is the way it is.  So we should give
> room for that in psql's display too.

If I understand correctly, your suggestion is the following, right?

* The current column order:
===================
 Schema | Table |  Name  | Columns | Ndistinct | Dependencies | MCV
--------+-------+--------+---------+-----------+--------------+-----
 public | t1    | stts_1 | a, b    | f         | t            | f
 public | t1    | stts_2 | a, b    | t         | t            | f
 public | t1    | stts_3 | a, b    | t         | t            | t
 public | t2    | stts_4 | b, c    | t         | t            | t
===================

* The suggested column order:
===================
   Name    | Schema | Table |     Columns      | Ndistinct | Dependencies | MCV
-----------+--------+-------+------------------+-----------+--------------+-----
 stts_1    | public | t1    | a, b             | f         | t            | f
 stts_2    | public | t1    | a, b             | t         | t            | f
 stts_3    | public | t1    | a, b             | t         | t            | t
 stts_4    | public | t2    | b, c             | t         | t            | t
===================

* In the future, extended stats that cross multiple tables might be
  shown like this (t1, t2):
===================
   Name    | Schema | Table  |     Columns      | Ndistinct | Dependencies | MCV
-----------+--------+--------+------------------+-----------+--------------+-----
 stts_5    | public | t1, t2 | a, b             | f         | t            | f
===================

If so, I can easily revise the column order as you suggested.
However, I currently have no idea how to show the status, or how to show
extended stats that cross multiple tables.

I suppose the current column order is sufficient if extended stats are
not going to change in PG14. Do you know of any plan to improve extended
stats in PG14, such as allowing them to cross multiple tables?


In addition, I currently use this query to get extended stats info from
pg_statistic_ext:

         SELECT
         stxnamespace::pg_catalog.regnamespace AS "Schema",
         c.relname AS "Table",
         stxname AS "Name",
         (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
          FROM pg_catalog.unnest(stxkeys) s(attnum)
          JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
          a.attnum = s.attnum AND NOT attisdropped)) AS "Columns",
         'd' = any(stxkind) AS "Ndistinct",
         'f' = any(stxkind) AS "Dependencies",
         'm' = any(stxkind) AS "MCV"
         FROM pg_catalog.pg_statistic_ext
         INNER JOIN pg_catalog.pg_class c
         ON stxrelid = c.oid
         ORDER BY 1, 2, 3;

Thanks,
Tatsuro Yamada

Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

28 August 2020, 03:26:17

On 2020-Aug-28, Tatsuro Yamada wrote:

> > IMO the per-type columns should show both the type being enabled as
> > well as it being built.
> 
> Hmm. I'm not sure how to get the status (enabled or disabled) of
> extended stats. :(
> Could you explain it more?

pg_statistic_ext_data.stxdndistinct is not null if the stats have been
built.  (I'm not sure whether there's an easier way to determine this.)


> * The suggested column order is like this:
> ===================
>    Name    | Schema | Table |     Columns      | Ndistinct | Dependencies | MCV
> -----------+--------+-------+------------------+-----------+--------------+-----
>  stts_1    | public | t1    | a, b             | f         | t            | f
>  stts_2    | public | t1    | a, b             | t         | t            | f
>  stts_3    | public | t1    | a, b             | t         | t            | t
>  stts_4    | public | t2    | b, c             | t         | t            | t
> ===================

I suggest to do this

    Name    | Schema | Definition               | Ndistinct | Dependencies | MCV
 -----------+--------+--------------------------+-----------+--------------+-----
  stts_1    | public | (a, b) FROM t1           | f         | t            | f

> I suppose that the current column order is sufficient if there is
> no improvement of extended stats on PG14. Do you know any plan to
> improve extended stats such as to allow it to cross multiple tables on PG14?

I suggest that changing it in the future is going to be an uphill
battle, so better get it right from the get go, without requiring a
future restructure.

> In addition,
> Currently, I use this query to get Extended stats info from pg_statistic_ext.

Maybe something like this would do

SELECT
 stxnamespace::pg_catalog.regnamespace AS "Schema",
 stxname AS "Name",
 format('%s FROM %s',
 (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
  FROM pg_catalog.unnest(stxkeys) s(attnum)
  JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
  a.attnum = s.attnum AND NOT attisdropped)),
  stxrelid::regclass) AS "Definition",
  CASE WHEN stxdndistinct IS NOT NULL THEN 'built' WHEN 'd' = any(stxkind) THEN 'enabled, not built' END AS "n-distinct",
  CASE WHEN stxddependencies IS NOT NULL THEN 'built' WHEN 'f' = any(stxkind) THEN 'enabled, not built' END AS "functional dependencies",
  CASE WHEN stxdmcv IS NOT NULL THEN 'built' WHEN 'm' = any(stxkind) THEN 'enabled, not built' END AS mcv
 FROM pg_catalog.pg_statistic_ext es
 INNER JOIN pg_catalog.pg_class c
 ON stxrelid = c.oid
 LEFT JOIN pg_catalog.pg_statistic_ext_data esd ON es.oid = esd.stxoid
 ORDER BY 1, 2, 3;


Re: list of extended statistics on psql

From

Tomas Vondra

Date:

29 August 2020, 21:47:34

On Thu, Aug 27, 2020 at 07:53:23PM -0400, Alvaro Herrera wrote:
>+1 for the general idea, and +1 for \dX being the syntax to use
>
>IMO the per-type columns should show both the type being enabled as
>well as it being built.
>
>(How many more stat types do we expect -- Tomas?  I wonder if having one
>column per type is going to scale in the long run.)
>

I wouldn't expect a huge number of types. I can imagine maybe twice the
current number of types, but not much more. But I'm not sure the output
is easy to read even now ...


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

29 August 2020, 21:54:58

On Thu, Aug 27, 2020 at 11:26:17PM -0400, Alvaro Herrera wrote:
>On 2020-Aug-28, Tatsuro Yamada wrote:
>
>> > IMO the per-type columns should show both the type being enabled as
>> > well as it being built.
>>
>> Hmm. I'm not sure how to get the status (enabled or disabled) of
>> extended stats. :(
>> Could you explain it more?
>
>pg_statistic_ext_data.stxdndistinct is not null if the stats have been
>built.  (I'm not sure whether there's an easier way to determine this.)
>

It's the only way, I think. Which types were requested is stored in

    pg_statistic_ext.stxkind

and what was built is in pg_statistic_ext_data. But if we want the
output to show both what was requested and which types were actually
built, that'll effectively double the number of columns needed :-(

Also, it might be useful to show the size of the statistics built, just
like we show for \d+ etc.

regards


Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

29 August 2020, 22:43:47

On 2020-Aug-29, Tomas Vondra wrote:

> But if we want the
> output to show both what was requested and which types were actually
> built, that'll effectively double the number of columns needed :-(

I was thinking it would be one column per type showing either disabled or enabled
or built.  But another idea is to show one type per line that's at least
enabled.

> Also, it might be useful to show the size of the statistics built, just
> like we show for \d+ etc.

\dX+  I  suppose?


Re: list of extended statistics on psql

From

Tomas Vondra

Date:

29 August 2020, 22:54:36

On Sat, Aug 29, 2020 at 06:43:47PM -0400, Alvaro Herrera wrote:
>On 2020-Aug-29, Tomas Vondra wrote:
>
>> But if we want the
>> output to show both what was requested and which types were actually
>> built, that'll effectively double the number of columns needed :-(
>
>I was thinking it would be one column per type showing either disabled or enabled
>or built.  But another idea is to show one type per line that's at least
>enabled.
>
>> Also, it might be useful to show the size of the statistics built, just
>> like we show for \d+ etc.
>
>\dX+  I  suppose?
>

Right. I've only used \d+ as an example of an existing command showing
sizes of the objects.

regards


Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

30 August 2020, 16:33:29

On 2020-Aug-30, Tomas Vondra wrote:

> On Sat, Aug 29, 2020 at 06:43:47PM -0400, Alvaro Herrera wrote:
> > On 2020-Aug-29, Tomas Vondra wrote:

> > > Also, it might be useful to show the size of the statistics built, just
> > > like we show for \d+ etc.
> > 
> > \dX+  I  suppose?
> 
> Right. I've only used \d+ as an example of an existing command showing
> sizes of the objects.

Yeah, I understood it that way too.

How can you measure the size of the stat objects in a query?  Are you
thinking of pg_column_size()?

I wonder how to report that.  Knowing that psql \-commands are not meant
for anything other than human consumption, maybe we can use a format()
string that says "built: %d bytes" when \dX+ is used (for each stat type),
and just "built" when \dX is used.  What do people think about this?
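
A rough sketch of the formatting idea, written in C for illustration only.
The function name format_built_status() and its parameters are hypothetical,
and the sketch makes no assumption about how the byte count is obtained:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: fills buf with the label for a statistics kind that
 * has been built. With verbose set (the \dX+ case) the size is included;
 * otherwise (plain \dX) only the word "built" is written.
 */
void
format_built_status(char *buf, size_t buflen, long size_bytes, bool verbose)
{
    if (verbose)
        snprintf(buf, buflen, "built: %ld bytes", size_bytes);
    else
        snprintf(buf, buflen, "built");
}
```

The alternative discussed in this thread is a separate size column, as \d+
does, which would keep the status text identical in both modes.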


Re: list of extended statistics on psql

From

Tomas Vondra

Date:

30 August 2020, 16:48:18

On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:
>On 2020-Aug-30, Tomas Vondra wrote:
>
>> On Sat, Aug 29, 2020 at 06:43:47PM -0400, Alvaro Herrera wrote:
>> > On 2020-Aug-29, Tomas Vondra wrote:
>
>> > > Also, it might be useful to show the size of the statistics built, just
>> > > like we show for \d+ etc.
>> >
>> > \dX+  I  suppose?
>>
>> Right. I've only used \d+ as an example of an existing command showing
>> sizes of the objects.
>
>Yeah, I understood it that way too.
>
>How can you measure the size of the stat objects in a query?  Are you
>thinking in pg_column_size()?
>

Either that or simply length() on the bytea value.

>I wonder how to report that.  Knowing that psql \-commands are not meant
>for anything other than human consumption, maybe we can use a format()
>string that says "built: %d bytes" when \dX+ is used (for each stat type),
>and just "built" when \dX is used.  What do people think about this?
>

I'd use the same approach as \d+, i.e. a separate column with the size.
Maybe that'd mean too many columns, though.


regards


Re: list of extended statistics on psql

From

Tom Lane

Date:

30 August 2020, 16:59:57

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:
>> I wonder how to report that.  Knowing that psql \-commands are not meant
>> for anything other than human consumption, maybe we can use a format()
>> string that says "built: %d bytes" when \dX+ is used (for each stat type),
>> and just "built" when \dX is used.  What do people think about this?

Seems a little too cute to me.

> I'd use the same approach as \d+, i.e. a separate column with the size.
> Maybe that'd mean too many columns, though.

psql already has \d commands with so many columns that you pretty much
have to use \x mode to make them legible; \df+ for instance.  I don't
mind if \dX+ is also in that territory.  It'd be good though if plain
\dX can fit in a normal terminal window.

            regards, tom lane

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

30 August 2020, 23:56:52

Hi Alvaro,

>>> IMO the per-type columns should show both the type being enabled as
>>> well as it being built.
>>
>> Hmm. I'm not sure how to get the status (enabled or disabled) of
>> extended stats. :(
>> Could you explain it more?
>
> pg_statistic_ext_data.stxdndistinct is not null if the stats have been
> built. (I'm not sure whether there's an easier way to determine this.)


Ah.. I see! Thank you.


> I suggest to do this
>
>    Name    | Schema | Definition               | Ndistinct | Dependencies | MCV
> -----------+--------+--------------------------+-----------+--------------+-----
>  stts_1    | public | (a, b) FROM t1           | f         | t            | f
>
>> I suppose that the current column order is sufficient if there is
>> no improvement of extended stats on PG14. Do you know any plan to
>> improve extended stats such as to allow it to cross multiple tables on PG14?
>
> I suggest that changing it in the future is going to be an uphill
> battle, so better get it right from the get go, without requiring a
> future restructure.


I understand your suggestions. I'll replace the "Columns" and "Table" columns with a "Definition" column.


>> Currently, I use this query to get Extended stats info from pg_statistic_ext.
>
> Maybe something like this would do
>
> SELECT
> stxnamespace::pg_catalog.regnamespace AS "Schema",
> stxname AS "Name",
> format('%s FROM %s',
>  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
>   FROM pg_catalog.unnest(stxkeys) s(attnum)
>   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
>   a.attnum = s.attnum AND NOT attisdropped)),
>   stxrelid::regclass) AS "Definition",
>   CASE WHEN stxdndistinct IS NOT NULL THEN 'built' WHEN 'd' = any(stxkind) THEN 'enabled, not built' END AS "n-distinct",
>   CASE WHEN stxddependencies IS NOT NULL THEN 'built' WHEN 'f' = any(stxkind) THEN 'enabled, not built' END AS "functional dependencies",
>   CASE WHEN stxdmcv IS NOT NULL THEN 'built' WHEN 'm' = any(stxkind) THEN 'enabled, not built' END AS mcv
>  FROM pg_catalog.pg_statistic_ext es
>  INNER JOIN pg_catalog.pg_class c
>  ON stxrelid = c.oid
>  LEFT JOIN pg_catalog.pg_statistic_ext_data esd ON es.oid = esd.stxoid
>  ORDER BY 1, 2, 3;

Great! It helped me a lot to understand your suggestions correctly. Thanks. :-D
I got the results below with your query.

========
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;
create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

insert into t1 select i,i from generate_series(1,100) i;
analyze t1;


Your query gave this result:

 Schema |   Name    |         Definition         |     n-distinct     | functional dependencies |        mcv
--------+-----------+----------------------------+--------------------+-------------------------+--------------------
 public | stts_1    | a, b FROM t1               |                    | built                   |
 public | stts_2    | a, b FROM t1               | built              | built                   |
 public | stts_3    | a, b FROM t1               | built              | built                   | built
 public | stts_4    | b, c FROM t2               | enabled, not built | enabled, not built      | enabled, not built
 public | stts_hoge | col1, col2, col3 FROM hoge | enabled, not built | enabled, not built      | enabled, not built
(5 rows)
========

I think "enabled, not built" is a little redundant. It would be better for the
status to have three patterns: "built", "not built", or nothing (NULL), like these:

   - "built": extended stats are defined and built (collected by the ANALYZE command)
   - "not built": extended stats are defined but have not been built yet
   - nothing (NULL): extended stats are not defined
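
The three patterns above can be sketched in a few lines of Python. This is only an illustration of the rule, not the patch itself: the function name `stat_status` and its parameters are made up here, while the kind letters 'd', 'f' and 'm' are the ones used in the quoted query.

```python
from typing import Optional, Sequence


def stat_status(kind: str, stxkind: Sequence[str], data: Optional[bytes]) -> Optional[str]:
    """Return the proposed status of one statistics kind.

    kind    -- kind letter as used in the quoted query: 'd', 'f' or 'm'
    stxkind -- kinds requested when the statistics object was created
    data    -- the built statistics value, or None if nothing was built
    """
    if data is not None:
        return "built"        # the value exists, so ANALYZE has built it
    if kind in stxkind:
        return "not built"    # requested, but not collected yet
    return None               # this kind was never requested


# A statistics object created with (dependencies) only, after ANALYZE:
print(stat_status("f", ["f"], b"\x00" * 40))    # built
print(stat_status("d", ["f"], None))            # None

# A statistics object with all kinds requested, before ANALYZE:
print(stat_status("d", ["d", "f", "m"], None))  # not built
```

The order of the two checks mirrors the CASE expression in the query: the built value is tested first, and the requested kinds are consulted only when no value exists.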

What do you think about it?


I will send a new patch that includes:

   - Replace "Columns" and "Table" column with "Definition"
   - Show the status (built/not built/null) of extended stats by using
     pg_statistic_ext_data

Thanks,
Tatsuro Yamada

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

31 August 2020, 01:24:23

On 2020/08/31 1:59, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:
>>> I wonder how to report that.  Knowing that psql \-commands are not meant
>>> for anything other than human consumption, maybe we can use a format()
>>> string that says "built: %d bytes" when \dX+ is used (for each stat type),
>>> and just "built" when \dX is used.  What do people think about this?
> 
> Seems a little too cute to me.
> 
>> I'd use the same approach as \d+, i.e. a separate column with the size.
>> Maybe that'd mean too many columns, though.
> 
> psql already has \d commands with so many columns that you pretty much
> have to use \x mode to make them legible; \df+ for instance.  I don't
> mind if \dX+ is also in that territory.  It'd be good though if plain
> \dX can fit in a normal terminal window.


Hmm. How about these instead of "built: %d bytes"?
I added three columns (N_size, D_size, M_size) to show size. See below:

===================
  postgres=# \dX
                                List of extended statistics
  Schema |   Name    |         Definition         | N_distinct | Dependencies |    Mcv
--------+-----------+----------------------------+------------+--------------+-----------
  public | stts_1    | a, b FROM t1               |            | built        |
  public | stts_2    | a, b FROM t1               | built      | built        |
  public | stts_3    | a, b FROM t1               | built      | built        | built
  public | stts_4    | b, c FROM t2               | not built  | not built    | not built
  public | stts_hoge | col1, col2, col3 FROM hoge | not built  | not built    | not built
(5 rows)

postgres=# \dX+
                                             List of extended statistics
  Schema |   Name    |         Definition         | N_distinct | Dependencies |    Mcv    | N_size | D_size | M_size
--------+-----------+----------------------------+------------+--------------+-----------+--------+--------+--------
  public | stts_1    | a, b FROM t1               |            | built        |           |        |     40 |
  public | stts_2    | a, b FROM t1               | built      | built        |           |     13 |     40 |
  public | stts_3    | a, b FROM t1               | built      | built        | built     |     13 |     40 |   6126
  public | stts_4    | b, c FROM t2               | not built  | not built    | not built |        |        |
  public | stts_hoge | col1, col2, col3 FROM hoge | not built  | not built    | not built |        |        |
===================

I used this query to get results of "\dX+".
===================
         SELECT
          stxnamespace::pg_catalog.regnamespace AS "Schema",
          stxname AS "Name",
          format('%s FROM %s',
            (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
             FROM pg_catalog.unnest(stxkeys) s(attnum)
             JOIN pg_catalog.pg_attribute a
             ON (stxrelid = a.attrelid
             AND a.attnum = s.attnum
             AND NOT attisdropped)),
          stxrelid::regclass) AS "Definition",
          CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
               WHEN 'd' = any(stxkind) THEN 'not built'
          END AS "N_distinct",
          CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
               WHEN 'f' = any(stxkind) THEN 'not built'
          END AS "Dependencies",
          CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
               WHEN 'm' = any(stxkind) THEN 'not built'
          END AS "Mcv",
        pg_catalog.length(stxdndistinct) AS "N_size",
        pg_catalog.length(stxddependencies) AS "D_size",
        pg_catalog.length(stxdmcv) AS "M_size"
        FROM pg_catalog.pg_statistic_ext es
        INNER JOIN pg_catalog.pg_class c
        ON stxrelid = c.oid
        LEFT JOIN pg_catalog.pg_statistic_ext_data esd
        ON es.oid = esd.stxoid
        ORDER BY 1, 2;
===================
  

Attached patch includes:

    - Replace "Columns" and "Table" column with "Definition"
    - Show the status (built/not built/null) of extended stats by
      using pg_statistic_ext_data
    - Add "\dX+" command to show size of extended stats

Please find the attached file! :-D


Thanks,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_and_dXplus_r3.patch

Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

31 August 2020, 14:28:38

On 2020-Aug-30, Tomas Vondra wrote:

> On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:

> > I wonder how to report that.  Knowing that psql \-commands are not meant
> > for anything other than human consumption, maybe we can use a format()
> > string that says "built: %d bytes" when \dX+ is used (for each stat type),
> > and just "built" when \dX is used.  What do people think about this?
> 
> I'd use the same approach as \d+, i.e. a separate column with the size.
> Maybe that'd mean too many columns, though.

Are you thinking in one size for all stats, or a combined size?  If the
former, then yes it'd be too many columns.

I'm trying to figure out what can the user *do* with that data.  Can
they make the sample size smaller/bigger if the stats data is too large?
Can they do that for each individual stats type?  If so, it'd make sense
to list each type's size separately.

If we do put each type in its own row -- at least "logical" row, say
string_agg(unnest(array_of_types), '\n') -- then we can put the size of each type
in a separate column with string_agg(unnest(array_of_sizes), '\n') 

 statname |   definition    |         type             |  size
----------+-----------------+--------------------------+-----------
 someobj  | (a, b) FROM tab | n-distinct: built        | 2000 bytes
                            | func-dependencies: built | 4000 bytes
 another  | (a, c) FROM tab | n-distinct: enabled      | <null>
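
As a rough illustration of the "logical row" idea above, the Python sketch below folds the per-type entries of one statistics object into newline-separated cells, which is what the two string_agg() calls would produce. The helper name `logical_row` and the tuple layout are assumptions made for this example only.

```python
def logical_row(statname, definition, kinds):
    """Build one logical row from per-type entries.

    kinds is a list of (label, status, size_in_bytes_or_None) tuples.
    The type and size cells hold one line per entry, joined with a newline.
    """
    types = "\n".join(f"{label}: {status}" for label, status, _ in kinds)
    sizes = "\n".join(
        "<null>" if size is None else f"{size} bytes" for _, _, size in kinds
    )
    return (statname, definition, types, sizes)


row = logical_row(
    "someobj",
    "(a, b) FROM tab",
    [("n-distinct", "built", 2000), ("func-dependencies", "built", 4000)],
)
print(row[2])
print(row[3])
```

Both cells are built from the same list in the same order, so line N of the type cell always corresponds to line N of the size cell.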


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tom Lane

Date:

31 August 2020, 14:58:11

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> If we do put each type in its own row -- at least "logical" row, say
> string_agg(unnest(array_of_types), '\n') -- then we can put the size of each type
> in a separate column with string_agg(unnest(array_of_sizes), '\n')

>  statname |   definition    |         type             |  size
> ----------+-----------------+--------------------------+-----------
>  someobj  | (a, b) FROM tab | n-distinct: built        | 2000 bytes
>                             | func-dependencies: built | 4000 bytes
>  another  | (a, c) FROM tab | n-distinct: enabled      | <null>

I guess I'm wondering why the size is of such interest that we
need it at all here.

            regards, tom lane

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

31 August 2020, 15:20:57

On Mon, Aug 31, 2020 at 10:58:11AM -0400, Tom Lane wrote:
>Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> If we do put each type in its own row -- at least "logical" row, say
>> string_agg(unnest(array_of_types), '\n') -- then we can put the size of each type
>> in a separate column with string_agg(unnest(array_of_sizes), '\n')
>
>>  statname |   definition    |         type             |  size
>> ----------+-----------------+--------------------------+-----------
>>  someobj  | (a, b) FROM tab | n-distinct: built        | 2000 bytes
>>                             | func-dependencies: built | 4000 bytes
>>  another  | (a, c) FROM tab | n-distinct: enabled      | <null>
>
>I guess I'm wondering why the size is of such interest that we
>need it at all here.
>

I agree it may not be important enough. I did use it during development
etc. but maybe it's not something we need to include in this list (even
if it's just in the \dX+ variant).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

31 August 2020, 15:30:53

On Mon, Aug 31, 2020 at 10:28:38AM -0400, Alvaro Herrera wrote:
>On 2020-Aug-30, Tomas Vondra wrote:
>
>> On Sun, Aug 30, 2020 at 12:33:29PM -0400, Alvaro Herrera wrote:
>
>> > I wonder how to report that.  Knowing that psql \-commands are not meant
>> > for anything other than human consumption, maybe we can use a format()
>> > string that says "built: %d bytes" when \dX+ is used (for each stat type),
>> > and just "built" when \dX is used.  What do people think about this?
>>
>> I'd use the same approach as \d+, i.e. a separate column with the size.
>> Maybe that'd mean too many columns, though.
>
>Are you thinking in one size for all stats, or a combined size?  If the
>former, then yes it'd be too many columns.
>

I wonder if trying to list info about all stats from the statistics
object in a single line is necessary. Maybe we should split the info
into one line per statistics, so for example

     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...

would result in three lines in the \dX output. The statistics name would
identify which lines belong together, but other than that the pieces are
mostly independent.

This would make it somewhat future-proof in case we add more statistics
types, because the number of columns would not increase. OTOH maybe it's
pointless and/or against the purpose of listing statistics objects.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Alvaro Herrera

Date:

31 August 2020, 16:18:09

On 2020-Aug-31, Tomas Vondra wrote:

> I wonder if trying to list info about all stats from the statistics
> object in a single line is necessary. Maybe we should split the info
> into one line per statistics, so for example
> 
>     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...
> 
> would result in three lines in the \dX output. The statistics name would
> identify which lines belong together, but other than that the pieces are
> mostly independent.

Yeah, that's what I'm suggesting.  I don't think we need to repeat the
name/definition for each line though.

It might be useful to know how does pspg show a single entry that's
split in three lines, though.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

31 August 2020, 16:32:00

On Mon, Aug 31, 2020 at 12:18:09PM -0400, Alvaro Herrera wrote:
>On 2020-Aug-31, Tomas Vondra wrote:
>
>> I wonder if trying to list info about all stats from the statistics
>> object in a single line is necessary. Maybe we should split the info
>> into one line per statistics, so for example
>>
>>     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...
>>
>> would result in three lines in the \dX output. The statistics name would
>> identify which lines belong together, but other than that the pieces are
>> mostly independent.
>
>Yeah, that's what I'm suggesting.  I don't think we need to repeat the
>name/definition for each line though.
>
>It might be useful to know how does pspg show a single entry that's
>split in three lines, though.
>

Ah, I didn't realize you're proposing that - I assumed it's broken
simply to make it readable, or something like that. I think the lines
are mostly independent, so I'd suggest to include the name of the object
on each line. The question is whether this independence will remain true
in the future - for example histograms would be built only on data not
represented by the MCV list, so there's a close dependency there.

Not sure about pspg, and I'm not sure it matters too much.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Pavel Stehule

Date:

31 August 2020, 18:38:11

On Mon, 31 Aug 2020 at 18:32, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

> On Mon, Aug 31, 2020 at 12:18:09PM -0400, Alvaro Herrera wrote:
> >On 2020-Aug-31, Tomas Vondra wrote:
> >
> >> I wonder if trying to list info about all stats from the statistics
> >> object in a single line is necessary. Maybe we should split the info
> >> into one line per statistics, so for example
> >>
> >>     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...
> >>
> >> would result in three lines in the \dX output. The statistics name would
> >> identify which lines belong together, but other than that the pieces are
> >> mostly independent.
> >
> >Yeah, that's what I'm suggesting. I don't think we need to repeat the
> >name/definition for each line though.
> >
> >It might be useful to know how does pspg show a single entry that's
> >split in three lines, though.
> >
>
> Ah, I didn't realize you're proposing that - I assumed it's broken
> simply to make it readable, or something like that. I think the lines
> are mostly independent, so I'd suggest to include the name of the object
> on each line. The question is whether this independence will remain true
> in the future - for example histograms would be built only on data not
> represented by the MCV list, so there's a close dependency there.
>
> Not sure about pspg, and I'm not sure it matters too much.

pspg mostly ignores multiline rows: the horizontal cursor always spans a single line. The only case where pspg detects multiline rows is sorting, where it makes sure multiline rows still show the correct content when displayed in a different order than the input.

Regards

Pavel

> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

02 September 2020, 23:45:17

Hi,

>      >> I wonder if trying to list info about all stats from the statistics
>      >> object in a single line is necessary. Maybe we should split the info
>      >> into one line per statistics, so for example
>      >>
>      >>     CREATE STATISTICS s (mcv, ndistinct, dependencies) ON ...
>      >>
>      >> would result in three lines in the \dX output. The statistics name would
>      >> identify which lines belong together, but other than that the pieces are
>      >> mostly independent.
>      >
>      >Yeah, that's what I'm suggesting.  I don't think we need to repeat the
>      >name/definition for each line though.
>      >
>      >It might be useful to know how does pspg show a single entry that's
>      >split in three lines, though.
>      >
> 
>     Ah, I didn't realize you're proposing that - I assumed it's broken
>     simply to make it readable, or something like that. I think the lines
>     are mostly independent, so I'd suggest to include the name of the object
>     on each line. The question is whether this independence will remain true
>     in the future - for example histograms would be built only on data not
>     represented by the MCV list, so there's a close dependency there.
> 
>     Not sure about pspg, and I'm not sure it matters too much.
> 
> 
> pspg almost ignores multiline rows - the horizontal cursor is one row every time. There is only one use case where pspg detects multiline rows - sorts, and pspg ensures correct content for multiline rows displayed in different (than input) order.
 



I try to summarize the discussion so far.
Is my understanding right? Please correct it if anything is wrong.


* Summary

   1. "\dX[+]" does not display the size of extended stats, since the size is
       useful only during development of the stats.

   2. Each row should have the stats name, definition, type, and status.
      For example:

      statname  |   definition      |           type              |
     -----------+-------------------+-----------------------------+
      someobj   | (a, b) FROM tab   | n-distinct: built           |
      someobj   | (a, b) FROM tab   | func-dependencies: built    |
      someobj   | (a, b) FROM tab   | mcv: built                  |
      sttshoge  | (a, b) FROM hoge  | n-distinct: required        |
      sttshoge  | (a, b) FROM hoge  | func-dependencies: required |
      sttscross | (a, b) FROM t1,t2 | n-distinct: required        |


My opinion is below:

   For 1., agreed. I will remove it in the next patch.
   For 2., I feel the design is not elegant, so I'd like to change it.
     The reasons are:

     - Even if the number of stats types doubled in the future, I think each
       type would be better shown as a column, not as a line.
       The repeated items (the stats name and definition) should be removed.
       It is okay to have many columns in the future, as with "\df+", because
       we can use "\x" mode to display them if needed.

     - The type column holds two kinds of data: one is the stats type and the
       other is the status. In data modeling for RDBMSs there is the principle
       "one fact in one place", so it would be better to split them.
       I'd like to suggest the design below for the output.

      statname  |   definition      | n-distinct | func-dependencies | mcv   |
     -----------+-------------------+------------+-------------------+-------|
      someobj   | (a, b) FROM tab   | built      | built             | built |
      sttshoge  | (a, b) FROM hoge  | required   | required          |       |
      sttscross | (a, b) FROM t1,t2 | required   |                   |       |
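
The relation between the two layouts can be sketched in Python: the one-row-per-type form from the summary is folded into one row per statistics object with one column per type. The function name `pivot` and the constant `KINDS` are made up for this sketch, and the data is the example from the tables above.

```python
KINDS = ("n-distinct", "func-dependencies", "mcv")


def pivot(long_rows):
    """Fold (statname, definition, kind, status) rows into wide rows.

    Each wide row is (statname, definition, <status per kind in KINDS>);
    a kind that never appears for an object stays None (an empty cell).
    """
    wide = {}
    for statname, definition, kind, status in long_rows:
        row = wide.setdefault((statname, definition), dict.fromkeys(KINDS))
        row[kind] = status
    return [
        (statname, definition) + tuple(row[k] for k in KINDS)
        for (statname, definition), row in wide.items()
    ]


long_rows = [
    ("someobj", "(a, b) FROM tab", "n-distinct", "built"),
    ("someobj", "(a, b) FROM tab", "func-dependencies", "built"),
    ("someobj", "(a, b) FROM tab", "mcv", "built"),
    ("sttshoge", "(a, b) FROM hoge", "n-distinct", "required"),
    ("sttshoge", "(a, b) FROM hoge", "func-dependencies", "required"),
    ("sttscross", "(a, b) FROM t1,t2", "n-distinct", "required"),
]

for wide_row in pivot(long_rows):
    print(wide_row)
```

In the wide form the name and definition appear once per object, and adding a new stats type means adding one entry to `KINDS`, that is, one more column.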


Any thoughts?


Thanks,
Tatsuro Yamada

Re: list of extended statistics on psql

From

Michael Paquier

Date:

17 September 2020, 05:55:31

On Thu, Sep 03, 2020 at 08:45:17AM +0900, Tatsuro Yamada wrote:
> I try to summarize the discussion so far.

Could you provide at least a rebased version of the patch?  The CF bot
is complaining here.
--
Michael

Attachment

signature.asc

Re: list of extended statistics on psql

From

Michael Paquier

Date:

30 September 2020, 06:19:47

On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:
> Could you provide at least a rebased version of the patch?  The CF bot
> is complaining here.

Not seeing this answered after two weeks, I have marked the patch as
RwF for now.
--
Michael

Attachment

signature.asc

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

28 October 2020, 05:41:40

Hi Michael-san and Hackers,

On 2020/09/30 15:19, Michael Paquier wrote:
> On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:
>> Could you provide at least a rebased version of the patch?  The CF bot
>> is complaining here.
> 
> Not seeing this answered after two weeks, I have marked the patch as
> RwF for now.
> --
> Michael


Sorry for the delayed reply.

I rebased the patch on the current HEAD and did some
refactoring.
I think the size of extended stats is not useful for DBAs.
Should I remove it?

Changes:
========
   - Use a keyword "defined" instead of "not built"
   - Use COALESCE function for size for extended stats

Results of \dX and \dX+:
========================
postgres=# \dX
                            List of extended statistics
    Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
-------------+-----------+-----------------+------------+--------------+---------
  public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
  hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
(2 rows)

postgres=# \dX+
                                         List of extended statistics
    Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv   | N_size | D_size | M_size
-------------+-----------+-----------------+------------+--------------+---------+--------+--------+--------
  public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined |      0 |      0 |      0
  hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built   |     13 |     40 |   6126
(2 rows)

Query of \dX+:
==============
         SELECT
         stxnamespace::pg_catalog.regnamespace AS "Schema",
         stxname AS "Name",
         pg_catalog.format('%s FROM %s',
           (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ')
            FROM pg_catalog.unnest(es.stxkeys) s(attnum)
            JOIN pg_catalog.pg_attribute a
            ON (es.stxrelid = a.attrelid
            AND a.attnum = s.attnum
            AND NOT a.attisdropped)),
         es.stxrelid::regclass) AS "Definition",
         CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
              WHEN 'd' = any(stxkind) THEN 'defined'
         END AS "N_distinct",
         CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
              WHEN 'f' = any(stxkind) THEN 'defined'
         END AS "Dependencies",
         CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
              WHEN 'm' = any(stxkind) THEN 'defined'
         END AS "Mcv",
         COALESCE(pg_catalog.length(stxdndistinct), 0) AS "N_size",
         COALESCE(pg_catalog.length(stxddependencies), 0) AS "D_size",
         COALESCE(pg_catalog.length(stxdmcv), 0) AS "M_size"
         FROM pg_catalog.pg_statistic_ext es
         LEFT JOIN pg_catalog.pg_statistic_ext_data esd
         ON es.oid = esd.stxoid
         INNER JOIN pg_catalog.pg_class c
         ON es.stxrelid = c.oid
         ORDER BY 1, 2;


Regards,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_and_dXplus_r4.patch

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

28 October 2020, 07:20:25

Hi,

> Results of \dX and \dX+:
> ========================
> postgres=# \dX
>                              List of extended statistics
>      Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
> -------------+-----------+-----------------+------------+--------------+---------
>    public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
>    hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
> (2 rows)


I used "ORDER BY 1, 2" in the query, but I realized the ordering of the
result was wrong, so I fixed it in the attached patch.
Please find the patch file. :-D
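
The message does not say what was wrong with the ordering. One plausible reading, which is an assumption and not confirmed in this thread, is that "ORDER BY 1" sorted on the regnamespace value behind the "Schema" column, which compares as an OID rather than as the schema name. The Python sketch below only illustrates that effect; 2200 is the OID normally assigned to "public", and 16390 is a made-up OID for a user-created schema.

```python
# Hypothetical catalog rows: (schema OID, schema name, statistics name).
rows = [
    (16390, "hoge1schema", "hoge1_ext"),  # user schema, made-up OID
    (2200, "public", "hoge1_ext"),        # usual OID of "public"
]

# Sorting on the OID puts "public" first, as in the \dX output quoted above.
by_oid = sorted(rows, key=lambda r: (r[0], r[2]))

# Sorting on the schema name gives the alphabetical order a reader expects.
by_name = sorted(rows, key=lambda r: (r[1], r[2]))

print([r[1] for r in by_oid])   # ['public', 'hoge1schema']
print([r[1] for r in by_name])  # ['hoge1schema', 'public']
```

If this reading is right, sorting on the schema name as text instead of on the regnamespace value would give the alphabetical order.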

Regards,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_and_dXplus_r5.patch

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

28 October 2020, 19:06:01

On Wed, Oct 28, 2020 at 03:07:56PM +0900, Tatsuro Yamada wrote:
>Hi Michael-san and Hackers,
>
>On 2020/09/30 15:19, Michael Paquier wrote:
>>On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:
>>>Could you provide at least a rebased version of the patch?  The CF bot
>>>is complaining here.
>>
>>Not seeing this answered after two weeks, I have marked the patch as
>>RwF for now.
>>--
>>Michael
>
>
>Sorry for the delayed reply.
>
>I re-based the patch on the current head and did some
>refactoring.
>I think the size of extended stats are not useful for DBA.
>Should I remove it?
>

I think it's an interesting / useful information, I'd keep it (in the
\dX+ output only, of course). But I think it needs to print the size
similarly to \d+, i.e. using pg_size_pretty - that'll include the unit
and make it more readable for large stats.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

28 October 2020, 19:07:04

On Wed, Oct 28, 2020 at 04:20:25PM +0900, Tatsuro Yamada wrote:
>Hi,
>
>>Results of \dX and \dX+:
>>========================
>>postgres=# \dX
>>                             List of extended statistics
>>     Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
>>-------------+-----------+-----------------+------------+--------------+---------
>>   public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
>>   hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
>>(2 rows)
>
>
>I used "Order by 1, 2" on the query but I realized the ordering of
>result was wrong so I fixed on the attached patch.
>Please find the patch file. :-D
>

Thanks. I'll take a look at the beginning of the 2020-11 commitfest, and
I hope to get this committed.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

29 October 2020, 01:22:47

Hi Tomas,

On 2020/10/29 4:07, Tomas Vondra wrote:
> On Wed, Oct 28, 2020 at 04:20:25PM +0900, Tatsuro Yamada wrote:
>> Hi,
>>
>>> Results of \dX and \dX+:
>>> ========================
>>> postgres=# \dX
>>>                             List of extended statistics
>>>     Schema    |   Name    |   Definition    | N_distinct | Dependencies |   Mcv
>>> -------------+-----------+-----------------+------------+--------------+---------
>>>   public      | hoge1_ext | a, b FROM hoge1 | defined    | defined      | defined
>>>   hoge1schema | hoge1_ext | a, b FROM hoge1 | built      | built        | built
>>> (2 rows)
>>
>>
>> I used "Order by 1, 2" on the query but I realized the ordering of
>> result was wrong so I fixed on the attached patch.
>> Please find the patch file. :-D
>>
> 
> Thanks. I'll take a look at the beginning of the 2020-11 commitfest, and
> I hope to get this committed.


Thanks for your reply and I'm glad to hear that.

I'm going to revise the patch as needed to get this committed in
the next commitfest.

Regards,
Tatsuro Yamada

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

29 October 2020, 01:34:44

Hi Tomas,

On 2020/10/29 4:06, Tomas Vondra wrote:
> On Wed, Oct 28, 2020 at 03:07:56PM +0900, Tatsuro Yamada wrote:
>> Hi Michael-san and Hackers,
>>
>> On 2020/09/30 15:19, Michael Paquier wrote:
>>> On Thu, Sep 17, 2020 at 02:55:31PM +0900, Michael Paquier wrote:
>>>> Could you provide at least a rebased version of the patch?  The CF bot
>>>> is complaining here.
>>>
>>> Not seeing this answered after two weeks, I have marked the patch as
>>> RwF for now.
>>> -- 
>>> Michael
>>
>>
>> Sorry for the delayed reply.
>>
>> I re-based the patch on the current head and did some
>> refactoring.
>> I think the size of extended stats are not useful for DBA.
>> Should I remove it?
>>
> 
> I think it's an interesting / useful information, I'd keep it (in the
> \dX+ output only, of course). But I think it needs to print the size
> similarly to \d+, i.e. using pg_size_pretty - that'll include the unit
> and make it more readable for large stats.


Thanks for your comment.
I addressed it: the size of extended stats is kept and is now shown with its unit.

Changes:
========
   - Use pg_size_pretty to show the size of extended stats by \dX+

Result of \dX+:
===============
    Schema    |    Name    |   Definition    | N_distinct | Dependencies |   Mcv   |  N_Size  |  D_Size  |   M_Size
-------------+------------+-----------------+------------+--------------+---------+----------+----------+------------
  hoge1schema | hoge1_ext  | a, b FROM hoge1 | built      | built        | built   | 13 bytes | 40 bytes | 6126 bytes
  public      | hoge1_ext1 | a, b FROM hoge1 | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
  public      | hoge1_ext2 | a, b FROM hoge1 | defined    |              |         | 0 bytes  |          |
(3 rows)

Please find the attached patch.

Regards,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_and_dXplus_r6.patch

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

04 November 2020, 03:04:48

Hi,

> I addressed it, so I keep the size of extended stats with the unit.
> 
> Changes:
> ========
>    - Use pg_size_pretty to show the size of extended stats by \dX+


I rebased the patch on the head and also added tab-completion.
Any feedback is welcome.


Preparing for tests:
===========
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;

create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;

create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

create schema foo;
create schema yama;
create statistics foo.stts_foo on col1, col2 from hoge;
create statistics yama.stts_yama (ndistinct, mcv) on col1, col3 from hoge;

insert into t1 select i,i from generate_series(1,100) i;
analyze t1;

Result of \dX:
==============
postgres=# \dX
                               List of extended statistics
  Schema |   Name    |         Definition         | N_distinct | Dependencies |   Mcv
--------+-----------+----------------------------+------------+--------------+---------
  foo    | stts_foo  | col1, col2 FROM hoge       | defined    | defined      | defined
  public | stts_1    | a, b FROM t1               |            | built        |
  public | stts_2    | a, b FROM t1               | built      | built        |
  public | stts_3    | a, b FROM t1               | built      | built        | built
  public | stts_4    | b, c FROM t2               | defined    | defined      | defined
  public | stts_hoge | col1, col2, col3 FROM hoge | defined    | defined      | defined
  yama   | stts_yama | col1, col3 FROM hoge       | defined    |              | defined
(7 rows)

Result of \dX+:
===============
postgres=# \dX+
                                                List of extended statistics
  Schema |   Name    |         Definition         | N_distinct | Dependencies |   Mcv   |  N_size  |  D_size  |   M_size
--------+-----------+----------------------------+------------+--------------+---------+----------+----------+------------
  foo    | stts_foo  | col1, col2 FROM hoge       | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
  public | stts_1    | a, b FROM t1               |            | built        |         |          | 40 bytes |
  public | stts_2    | a, b FROM t1               | built      | built        |         | 13 bytes | 40 bytes |
  public | stts_3    | a, b FROM t1               | built      | built        | built   | 13 bytes | 40 bytes | 6126 bytes
  public | stts_4    | b, c FROM t2               | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
  public | stts_hoge | col1, col2, col3 FROM hoge | defined    | defined      | defined | 0 bytes  | 0 bytes  | 0 bytes
  yama   | stts_yama | col1, col3 FROM hoge       | defined    |              | defined | 0 bytes  |          | 0 bytes
(7 rows)

Results of Tab-completion:
===============
postgres=# \dX <Tab>
foo.                 pg_toast.            stts_2               stts_hoge
information_schema.  public.              stts_3               yama.
pg_catalog.          stts_1               stts_4

postgres=# \dX+ <Tab>
foo.                 pg_toast.            stts_2               stts_hoge
information_schema.  public.              stts_3               yama.
pg_catalog.          stts_1               stts_4


Regards,
Tatsuro Yamada

Attachment

add_list_extended_stats_for_psql_by_dX_and_dXplus_r7.patch

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

08 November 2020, 21:53:34

Hi,

I took a look at this today, and I think the code is ready, but the
regression test needs a bit more work:

1) It's probably better to use somewhat more specific names for the
objects, especially when created in public schema. It decreases the
chance of a collision with other tests (which may be hard to notice
because of timing). I suggest we use "stts_" prefix or something like
that, per the attached 0002 patch. (0001 is just the v7 patch)

2) The test is failing intermittently because it's executed in parallel
with stats_ext test, which is also creating extended statistics. So
depending on the timing the \dX may list some of the stats_ext stuff.
I'm not sure what to do about this. Either this part needs to be moved
to a separate test executed in a different group, or maybe we should
simply move it to stats_ext.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

10 November 2020, 03:38:53

Hi Tomas,

> I took a look at this today, and I think the code is ready, but the
> regression test needs a bit more work:

Thanks for taking your time. :-D


> 1) It's probably better to use somewhat more specific names for the
> objects, especially when created in public schema. It decreases the
> chance of a collision with other tests (which may be hard to notice
> because of timing). I suggest we use "stts_" prefix or something like
> that, per the attached 0002 patch. (0001 is just the v7 patch)

I agree with your comment. Thanks.



> 2) The test is failing intermittently because it's executed in parallel
> with stats_ext test, which is also creating extended statistics. So
> depending on the timing the \dX may list some of the stats_ext stuff.
> I'm not sure what to do about this. Either this part needs to be moved
> to a separate test executed in a different group, or maybe we should
> simply move it to stats_ext.

I thought all tests related to meta-commands lived in psql.sql, but I
realize that's not true. For example, the test of \dRp is not in
psql.sql. Therefore, I moved the regression test of \dX to stats_ext.sql
so that it no longer fails when run in parallel.

The attached patches are as follows:
  - 0001-v8-Add-dX-command-on-psql.patch
  - 0002-Add-regression-test-of-dX-to-stats_ext.sql.patch

However, I feel the test of \dX is not elegant, so I'm going to try
creating another one since it would be better to be aware of the context
of existing extended stats tests.

Regards,
Tatsuro Yamada

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

10 November 2020, 08:12:19

Hi,

  
>> 2) The test is failing intermittently because it's executed in parallel
>> with stats_ext test, which is also creating extended statistics. So
>> depending on the timing the \dX may list some of the stats_ext stuff.
>> I'm not sure what to do about this. Either this part needs to be moved
>> to a separate test executed in a different group, or maybe we should
>> simply move it to stats_ext.
> 
> I thought all tests related to meta-commands exist in psql.sql, but I
> realize it's not true. For example, the test of \dRp does not exist in
> psql.sql. Therefore, I moved the regression test of \dX to stats_ext.sql
> to avoid the test failed in parallel.
> 
> Attached patches is following:
>   - 0001-v8-Add-dX-command-on-psql.patch
>   - 0002-Add-regression-test-of-dX-to-stats_ext.sql.patch
> 
> However, I feel the test of \dX is not elegant, so I'm going to try
> creating another one since it would be better to be aware of the context
> of existing extended stats tests.

I tried to create another version of the regression test (0003).
"\dX" was added after ANALYZE command or SELECT... from pg_statistic_ext.

Please find the attached file:
   - 0003-Add-regression-test-of-dX-to-stats_ext.sql-another-ver

Both regression tests 0002 and 0003 are okay for me, I think.
Could you choose one?

Regards,
Tatsuro Yamada

Attachment

0003-Add-regression-test-of-dX-to-stats_ext.sql-another-ver.patch

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

15 November 2020, 18:22:59

Thanks,

It's better to always post the whole patch series, so that cfbot can
test it properly. Sending just 0003 separately kind of breaks that.

Also, 0003 seems to only tweak the .sql file, not the expected output,
and there actually seem to be two places that mistakenly use \dx (so
listing extensions) instead of \dX. I've fixed both issues in the
attached patches.

However, I think the 0002 tests are better/sufficient - I prefer to keep
it compact, not interleaving with the tests testing various other stuff.
So I don't intend to commit 0003, unless there's something that I don't
see for some reason.

The one remaining thing I'm not sure about is naming of the columns with
size of statistics - N_size, D_size and M_size does not seem very clear.
Any clearer naming will however make the tables wider, though :-/


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

17 November 2020, 04:35:07

Hi Tomas,

Thanks for your comments and also revising patches.

On 2020/11/16 3:22, Tomas Vondra wrote:
> It's better to always post the whole patch series, so that cfbot can
> test it properly. Sending just 0003 separately kind breaks that.

I now understand how "cfbot" works, so I'll take care of that
when I send patches. Thanks.

> Also, 0003 seems to only tweak the .sql file, not the expected output,
> and there actually seems to be two places that mistakenly use \dx (so
> listing extensions) instead of \dX. I've fixed both issues in the
> attached patches.

Oops, sorry about that.

> However, I think the 0002 tests are better/sufficient - I prefer to keep
> it compact, not interleaving with the tests testing various other stuff.
> So I don't intend to commit 0003, unless there's something that I don't
> see for some reason.

I agree. With 0002 it is easier to modify test cases and check results
than with 0003. Therefore, I'll go with 0002.

> The one remaining thing I'm not sure about is naming of the columns with
> size of statistics - N_size, D_size and M_size does not seem very clear.
> Any clearer naming will however make the tables wider, though :-/

Yeah, I think so too, but I couldn't come up with suitable names for
the columns when I created the patch.
I don't like long names, but I'll switch to them to be clearer.
For example, s/N_size/Ndistinct_size/.

Please find the attached patches:
   - 0001: Replace column names
   - 0002: Recreate regression test based on 0001

Regards,
Tatsuro Yamada

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

30 November 2020, 02:19:10

Hi Tomas and hackers,

> I don't prefer a long name but I'll replace the name with it to be clearer.
> For example, s/N_size/Ndistinct_size/.
> 
> Please find attached patcheds:
>    - 0001: Replace column names
>    - 0002: Recreate regression test based on 0001


I rebased the patch set on the master (7e5e1bba03), and the regression
test is good. Therefore, I changed the status of the patch: "needs review".

I know that you proposed the new extended statistics[1], and it probably
conflicts with the patch. I hope my patch gets committed before yours
so that I don't have to spend time recreating it. :-)


[1] https://www.postgresql.org/message-id/flat/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com

Thanks,
Tatsuro Yamada

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

05 January 2021, 04:26:43

Hi,

>I rebased the patch set on the master (7e5e1bba03), and the regression
>test is good. Therefore, I changed the status of the patch: "needs review". 

Happy New Year!

I rebased my patches on HEAD.
Please find attached files. :-D

Thanks,
Tatsuro Yamada

Hi,

On 2021/01/08 0:56, Tomas Vondra wrote:
> On 1/7/21 3:47 PM, Alvaro Herrera wrote:
>> On 2021-Jan-07, Tomas Vondra wrote:
>>
>>> On 1/7/21 1:46 AM, Tatsuro Yamada wrote:
>>
>>>> I overlooked the check for MCV in the logic building query
>>>> because I created the patch as a new feature on PG14.
>>>> I'm not sure whether we should do back patch or not. However, I'll
>>>> add the check on the next patch because it is useful if you decide to
>>>> do the back patch on PG10, 11, 12, and 13.
>>>
>>> BTW perhaps a quick look at the other \d commands would show if there are
>>> precedents. I didn't have time for that.
>>
>> Yes, we do promise that new psql works with older servers.
>>
> 
> Yeah, makes sense. That means we need add the check for 12 / MCV.


Ah, I got it.
I fixed the patch to work with older servers by adding version checks, and I
tested the \dX command on older servers (PG10 - 13).
These results look fine.

0001:
      Added a check to handle pre-PG12 servers, which have neither MCV
      statistics nor pg_statistic_ext_data.
0002:
      This patch is the same as the previous patch (not changed).
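
The version check described in 0001 could look roughly like this in plain
SQL. It is a sketch under assumptions, not the patch's code: psql makes this
decision on the client side based on the server version, whereas here the
check is a query on server_version_num, and the second query relies on the
PG10/11 catalog layout in which the built ndistinct and dependencies data are
stored directly in pg_statistic_ext.

```sql
-- True on PG12 or later, where MCV statistics and
-- pg_statistic_ext_data are available.
SELECT pg_catalog.current_setting('server_version_num')::int >= 120000
       AS has_mcv_and_ext_data;

-- Sketch for PG10/11 only: there is no MCV column, and the built data is
-- read from pg_statistic_ext itself (stxndistinct, stxdependencies).
SELECT
    es.stxnamespace::pg_catalog.regnamespace AS "Schema",
    es.stxname AS "Name",
    CASE WHEN es.stxndistinct IS NOT NULL THEN 'built'
         WHEN 'd' = any(es.stxkind) THEN 'defined'
    END AS "Ndistinct",
    CASE WHEN es.stxdependencies IS NOT NULL THEN 'built'
         WHEN 'f' = any(es.stxkind) THEN 'defined'
    END AS "Dependencies"
FROM pg_catalog.pg_statistic_ext es
ORDER BY 1, 2;
```

On PG12 and later the second query fails, because those two columns were
moved to pg_statistic_ext_data, which is why a version check is needed at all.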

Please find the attached files.


>> I wonder the column names added by \dX+ is fine? For example,
>> "Ndistinct_size" and "Dependencies_size". It looks like long names,
>> but acceptable?
>>
> 
> Seems acceptable - I don't have a better idea. 

I see, thanks!


Thanks,
Tatsuro Yamada

Attachment

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

08 January 2021, 00:14:39


On 1/8/21 12:52 AM, Tatsuro Yamada wrote:
> Hi,
> 
> On 2021/01/08 0:56, Tomas Vondra wrote:
>> On 1/7/21 3:47 PM, Alvaro Herrera wrote:
>>> On 2021-Jan-07, Tomas Vondra wrote:
>>>
>>>> On 1/7/21 1:46 AM, Tatsuro Yamada wrote:
>>>
>>>>> I overlooked the check for MCV in the logic building query
>>>>> because I created the patch as a new feature on PG14.
>>>>> I'm not sure whether we should do back patch or not. However, I'll
>>>>> add the check on the next patch because it is useful if you decide to
>>>>> do the back patch on PG10, 11, 12, and 13.
>>>>
>>>> BTW perhaps a quick look at the other \d commands would show if 
>>>> there are
>>>> precedents. I didn't have time for that.
>>>
>>> Yes, we do promise that new psql works with older servers.
>>>
>>
>> Yeah, makes sense. That means we need add the check for 12 / MCV.
> 
> 
> Ah, I got it.
> I fixed the patch to work with older servers to add the checking 
> versions. And I tested \dX command on older servers (PG10 - 13).
> These results look fine.
> 
> 0001:
>       Added the check code to handle pre-PG12. It has not MCV and
>        pg_statistic_ext_data.
> 0002:
>       This patch is the same as the previous patch (not changed).
> 
> Please find the attached files.
> 

OK, thanks. I'll take a look and probably push tomorrow. FWIW I plan to 
squash the patches into a single commit.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: list of extended statistics on psql

From

Tomas Vondra

Date:

09 January 2021, 00:01:27


On 1/8/21 1:14 AM, Tomas Vondra wrote:
> 
> 
> On 1/8/21 12:52 AM, Tatsuro Yamada wrote:
>> Hi,
>>
>> On 2021/01/08 0:56, Tomas Vondra wrote:
>>> On 1/7/21 3:47 PM, Alvaro Herrera wrote:
>>>> On 2021-Jan-07, Tomas Vondra wrote:
>>>>
>>>>> On 1/7/21 1:46 AM, Tatsuro Yamada wrote:
>>>>
>>>>>> I overlooked the check for MCV in the logic building query
>>>>>> because I created the patch as a new feature on PG14.
>>>>>> I'm not sure whether we should do back patch or not. However, I'll
>>>>>> add the check on the next patch because it is useful if you decide to
>>>>>> do the back patch on PG10, 11, 12, and 13.
>>>>>
>>>>> BTW perhaps a quick look at the other \d commands would show if
>>>>> there are
>>>>> precedents. I didn't have time for that.
>>>>
>>>> Yes, we do promise that new psql works with older servers.
>>>>
>>>
>>> Yeah, makes sense. That means we need add the check for 12 / MCV.
>>
>>
>> Ah, I got it.
>> I fixed the patch to work with older servers to add the checking
>> versions. And I tested \dX command on older servers (PG10 - 13).
>> These results look fine.
>>
>> 0001:
>>       Added the check code to handle pre-PG12. It has not MCV and
>>        pg_statistic_ext_data.
>> 0002:
>>       This patch is the same as the previous patch (not changed).
>>
>> Please find the attached files.
>>
> 
> OK, thanks. I'll take a look and probably push tomorrow. FWIW I plan to
> squash the patches into a single commit.
> 

Attached is a patch I plan to commit - 0001 is the last submitted
version with a couple minor tweaks, mostly in docs/comments, and small
rework of branching to be more like the other functions in describe.c.

While working on that, I realized that 'defined' might be a bit
ambiguous, I initially thought it means 'NOT NULL' (which it does not).
I propose to change it to 'requested' instead. Tatsuro, do you agree, or
do you think 'defined' is better?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

12 January 2021, 01:57:33

Hi Tomas,

On 2021/01/09 9:01, Tomas Vondra wrote:
> On 1/8/21 1:14 AM, Tomas Vondra wrote:
>> On 1/8/21 12:52 AM, Tatsuro Yamada wrote:
>>> On 2021/01/08 0:56, Tomas Vondra wrote:
>>>> On 1/7/21 3:47 PM, Alvaro Herrera wrote:
>>>>> On 2021-Jan-07, Tomas Vondra wrote:
>>>>>> On 1/7/21 1:46 AM, Tatsuro Yamada wrote:
>>>>>
>>>>>>> I overlooked the check for MCV in the logic building query
>>>>>>> because I created the patch as a new feature on PG14.
>>>>>>> I'm not sure whether we should do back patch or not. However, I'll
>>>>>>> add the check on the next patch because it is useful if you decide to
>>>>>>> do the back patch on PG10, 11, 12, and 13.
>>>>>>
>>>>>> BTW perhaps a quick look at the other \d commands would show if
>>>>>> there are
>>>>>> precedents. I didn't have time for that.
>>>>>
>>>>> Yes, we do promise that new psql works with older servers.
>>>>>
>>>>
>>>> Yeah, makes sense. That means we need add the check for 12 / MCV.
>>>
>>>
>>> Ah, I got it.
>>> I fixed the patch to work with older servers to add the checking
>>> versions. And I tested \dX command on older servers (PG10 - 13).
>>> These results look fine.
>>>
>>> 0001:
>>>        Added the check code to handle pre-PG12. It has not MCV and
>>>         pg_statistic_ext_data.
>>> 0002:
>>>        This patch is the same as the previous patch (not changed).
>>>
>>> Please find the attached files.
>>>
>>
>> OK, thanks. I'll take a look and probably push tomorrow. FWIW I plan to
>> squash the patches into a single commit.
>>
> 
> Attached is a patch I plan to commit - 0001 is the last submitted
> version with a couple minor tweaks, mostly in docs/comments, and small
> rework of branching to be more like the other functions in describe.c.

Thanks for revising the patch.
I reviewed the 0001, and the branching and comments look good to me.
However, I added an alias name in processSQLNamePattern() on the patch:
s/"stxname"/"es.stxname"/


> While working on that, I realized that 'defined' might be a bit
> ambiguous, I initially thought it means 'NOT NULL' (which it does not).
> I propose to change it to 'requested' instead. Tatsuro, do you agree, or
> do you think 'defined' is better?

Regarding the status of extended stats, here are my thoughts:

  - "defined": it shows only that the extended stats are defined. We can't
               tell whether ANALYZE is still needed or not. I agree this
               name was ambiguous, so we should replace it with a more
               suitable name.
  - "requested": it shows that the extended stats need something. Of course,
               we know it means ANALYZE is needed because we wrote the patch.
               However, I feel it is still a little ambiguous for DBAs.
               To solve this, it would be better to explain the statuses
               in the documentation. For example,

======
The columns for each kind of extended stats (e.g. Ndistinct) show a status.
"requested" means that the data still needs to be gathered by ANALYZE. "built"
means that ANALYZE has finished and the planner can use it. NULL means that
this kind of stats does not exist.
======
======
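
The three statuses described above could be derived along these lines. This
is a sketch, assuming a PG12-14 server and a role that can read
pg_statistic_ext_data; it is not necessarily how the patch builds its query.

```sql
-- 'built'     : ANALYZE has stored data for this kind of statistics
-- 'requested' : the kind is listed in stxkind but no data is stored yet
-- NULL        : the kind was not part of CREATE STATISTICS
SELECT
    es.stxnamespace::pg_catalog.regnamespace AS "Schema",
    es.stxname AS "Name",
    CASE WHEN esd.stxdndistinct IS NOT NULL THEN 'built'
         WHEN 'd' = any(es.stxkind) THEN 'requested'
    END AS "Ndistinct",
    CASE WHEN esd.stxddependencies IS NOT NULL THEN 'built'
         WHEN 'f' = any(es.stxkind) THEN 'requested'
    END AS "Dependencies",
    CASE WHEN esd.stxdmcv IS NOT NULL THEN 'built'
         WHEN 'm' = any(es.stxkind) THEN 'requested'
    END AS "MCV"
FROM pg_catalog.pg_statistic_ext es
LEFT JOIN pg_catalog.pg_statistic_ext_data esd ON es.oid = esd.stxoid
ORDER BY 1, 2;
```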

What do you think? :-D


Thanks,
Tatsuro Yamada

Hi Tomas,

On 2021/01/13 7:48, Tatsuro Yamada wrote:
> On 2021/01/12 20:08, Tomas Vondra wrote:
>> On 1/12/21 2:57 AM, Tatsuro Yamada wrote:
>>> On 2021/01/09 9:01, Tomas Vondra wrote:
>> ...>
>>>> While working on that, I realized that 'defined' might be a bit
>>>> ambiguous, I initially thought it means 'NOT NULL' (which it does not).
>>>> I propose to change it to 'requested' instead. Tatsuro, do you agree, or
>>>> do you think 'defined' is better?
>>>
>>> Regarding the status of extended stats, I think the followings:
>>>
>>>   - "defined": it shows the extended stats defined only. We can't know
>>>                whether it needs to analyze or not. I agree this name was
>>>                 ambiguous. Therefore we should replace it with a more suitable
>>>                name.
>>>   - "requested": it shows the extended stats needs something. Of course,
>>>                we know it needs to ANALYZE because we can create the patch.
>>>                However, I feel there is a little ambiguity for DBA.
>>>                To solve this, it would be better to write an explanation of
>>>                the status in the document. For example,
>>>
>>> ======
>>> The column of the kind of extended stats (e. g. Ndistinct) shows some statuses.
>>> "requested" means that it needs to gather data by ANALYZE. "built" means ANALYZE
>>>   was finished, and the planner can use it. NULL means that it doesn't exists.
>>> ======
>>>
>>> What do you think? :-D
>>>
>>
>> Yes, that seems reasonable to me. Will you provide an updated patch?
> 
> 
> Sounds good. I'll send the updated patch today.



I updated the patch to add the explanation of the extended stats' statuses.
Please feel free to modify the patch to make it clearer.

The attached files are:
    0001: Add psql \dX and the updated documentation
    0002: Regression test for psql \dX
    app-psql.html: Created by "make html" command (You can check the
                   explanation of the statuses easily, probably)

Thanks,
Tatsuro Yamada

Hi,

> The above query is too simple, so we had better use the following query:
> 
> # This query works on PG10 or later
> SELECT
>      es.stxnamespace::pg_catalog.regnamespace::text AS "Schema",
>      es.stxname AS "Name",
>      pg_catalog.format('%s FROM %s',
>          (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ')
>           FROM pg_catalog.unnest(es.stxkeys) s(attnum)
>           JOIN pg_catalog.pg_attribute a
>           ON (es.stxrelid = a.attrelid
>           AND a.attnum = s.attnum
>           AND NOT a.attisdropped)),
>      es.stxrelid::regclass) AS "Definition",
>      CASE WHEN 'd' = any(es.stxkind) THEN 'defined'
>      END AS "Ndistinct",
>      CASE WHEN 'f' = any(es.stxkind) THEN 'defined'
>      END AS "Dependencies",
>      CASE WHEN 'm' = any(es.stxkind) THEN 'defined'
>      END AS "MCV"
> FROM pg_catalog.pg_statistic_ext es
> ORDER BY 1, 2;
> 
>   Schema |    Name    |    Definition    | Ndistinct | Dependencies |   MCV
> --------+------------+------------------+-----------+--------------+---------
>   public | hoge1_ext  | a, b FROM hoge1  | defined   | defined      | defined
>   public | hoge_t_ext | a, b FROM hoge_t | defined   | defined      | defined
> (2 rows)
> 
> 
> I'm going to create the WIP patch to use the above query.
> Any comments welcome. :-D


Attached patch is WIP patch.

The changes are:
   - Use pg_statistic_ext only
   - Remove these statuses: "requested" and "built"
   - Add new status: "defined"
   - Remove the size columns
   - Fix document

I'll create and send the regression test in the next patch if there are
no objections. Is that okay?

Regards,
Tatsuro Yamada

Attachment

WIP_psql_dX_using_pg_statistic_ext.patch

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

20 January 2021, 02:35:03

Hi Tomas,

On 2021/01/19 11:52, Tomas Vondra wrote:
> 
>> I'm going to create the WIP patch to use the above query.
>> Any comments welcome. :-D
> 
> Yes, I think using this simpler query makes sense. If we decide we need
> something more elaborate, we can improve that in future PostgreSQL versions
> (after adding view/function to core), but I'd leave that as a work for the
> future.


I see, thanks!


> Apologies for all the extra work - I haven't realized this flaw when pushing for showing more stuff :-(


Don't worry about it. We didn't notice the problem even when viewed by multiple
people on -hackers. Let's keep moving forward. :-D

I'll include a regression test in the next patch.

Regards,
Tatsuro Yamada

Re: list of extended statistics on psql

From

Tatsuro Yamada

Date:

20 January 2021, 06:41:57

Hi Tomas,

On 2021/01/20 11:35, Tatsuro Yamada wrote:
>> Apologies for all the extra work - I haven't realized this flaw when pushing for showing more stuff :-(
> 
> Don't worry about it. We didn't notice the problem even when viewed by multiple
> people on -hackers. Let's keep moving forward. :-D
> 
> I'll send a patch including a regression test on the next patch.


I created patches and my test results on PG10, 11, 12, and 14 are fine.

   0001:
     - Fix query to use pg_statistic_ext only
     - Replace statuses "requested" and "built" with "defined"
     - Remove the size columns
     - Fix document
     - Add schema name as a filter condition on the query

   0002:
     - Fix all results of \dX
     - Add new testcase by non-superuser

Please find attached files. :-D


Regards,
Tatsuro Yamada

Hi,

Here's a slightly more complete patch, tweaking the regression tests a
bit to detect this.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

list-stats-schema-fix.patch

Re: list of extended statistics on psql (\dX)

From

Tatsuro Yamada

Date:

08 July 2021, 04:46:41

Hi Tomas and Justin,

On 2021/06/07 4:47, Tomas Vondra wrote:
> Here's a slightly more complete patch, tweaking the regression tests a
> bit to detect this.

I tested your patch on PG14beta2 and PG15devel.
And they work fine.
=======================
  All 209 tests passed.
=======================

Next time I create a feature on psql, I will be careful to add
a check for schema visibility rules. :-D
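
A visibility check of the kind mentioned here can be sketched as below. It is
illustrative only and not taken from list-stats-schema-fix.patch: it uses
pg_statistics_obj_is_visible() to restrict the list to statistics objects
reachable through the current search_path, which is an assumption about the
kind of condition such a check would apply when no schema pattern is given.

```sql
-- List only the extended statistics that are visible in the current
-- search_path, i.e. those that can be referenced without a schema name.
SELECT
    es.stxnamespace::pg_catalog.regnamespace AS "Schema",
    es.stxname AS "Name"
FROM pg_catalog.pg_statistic_ext es
WHERE pg_catalog.pg_statistics_obj_is_visible(es.oid)
ORDER BY 1, 2;
```

With the test objects used earlier in this thread, a condition like this
would be expected to hide foo.stts_foo and yama.stts_yama unless those
schemas are added to search_path.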

Thanks,
Tatsuro Yamada

Re: list of extended statistics on psql (\dX)

From

Tomas Vondra

Date:

26 July 2021, 19:26:26

Hi,

I've pushed the last version of the fix, including the regression tests 
etc. Backpatch to 14, where \dX was introduced.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: list of extended statistics on psql (\dX)

From

Tatsuro Yamada

Date:

27 July 2021, 00:25:57

Hi Tomas and Justin,

On 2021/07/27 4:26, Tomas Vondra wrote:
> Hi,
> 
> I've pushed the last version of the fix, including the regression tests
> etc. Backpatch to 14, where \dX was introduced.


Thank you!


Regards,
Tatsuro Yamada