Re: list of extended statistics on psql - Mailing list pgsql-hackers
From | Tatsuro Yamada |
---|---|
Subject | Re: list of extended statistics on psql |
Date | |
Msg-id | 4948ebb7-0992-8df2-2b14-f84472b46cc5@nttcom.co.jp_1 Whole thread Raw |
In response to | Re: list of extended statistics on psql (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: list of extended statistics on psql
Re: list of extended statistics on psql |
List | pgsql-hackers |
Hi Tomas, >>> As for how to deal with this, I can think of about three ways: >>> >>> 1) simplify the command to only print information from pg_statistic_ext (so on information about which stats are builtor sizes) >>> >>> 2) extend pg_stats_ext with necessary information (e.g. sizes) >>> >>> 3) create a new system view, with necessary information (so that pg_stats_ext does not need to be modified) >>> >>> 4) add functions returning the necessary information, possibly only for statistics the user can access (similarly towhat pg_stats_ext does) >>> >>> Options 2-4 have the obvious disadvantage that this won't work on older releases (we can't add views or functions there).So I'm leaning towards #1 even if that means we have to remove some of the details. We can consider adding that fornew releases, though. >> >> >> Thanks for the useful advice. I go with option 1). >> The following query is created by using pg_stats_ext instead of pg_statistic_ext and pg_statistic_ext_data. However, Iwas confused >> about writing a part of the query for calculating MCV size because >> there are four columns related to MCV. For example, most_common_vals, most_common_val_nulls, most_common_freqs, and most_common_base_freqs. >> Currently, I don't know how to calculate the size of MCV by using the >> four columns. Thoughts? :-) > > Well, my suggestion was to use pg_statistic_ext, because that lists all statistics, while pg_stats_ext is filtering statisticsdepending on access privileges. I think that's more appropriate for \dX, the contents should not change dependingon the user. > > Also, let me clarify - with option (1) we'd not show the sizes at all. The size of the formatted statistics may be verydifferent from the on-disk representation, so I see no point in showing it in \dX. > > We might show other stats (e.g. number of MCV items, or the fraction of data represented by the MCV list), but the usercan inspect pg_stats_ext if needed. > > What we might do is to show those stats when a superuser is running this command, but I'm not sure that's a good idea (orhow difficult would it be to implement). Thanks for clarifying. I see that your suggestion was to use pg_statistic_ext, not pg_stats_ext. And we don't need the size of stats. If that's the case, we also can't get the status of stats since PG12 or later because we can't use pg_statistic_ext_data, as you know. Therefore, it would be better to replace the query with the old query that I sent five months ago like this: # the old query SELECT stxnamespace::pg_catalog.regnamespace AS "Schema", stxrelid::pg_catalog.regclass AS "Table", stxname AS "Name", (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ') FROM pg_catalog.unnest(stxkeys) s(attnum) JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND a.attnum = s.attnum AND NOT attisdropped)) AS "Columns", 'd' = any(stxkind) AS "Ndistinct", 'f' = any(stxkind) AS "Dependencies", 'm' = any(stxkind) AS "MCV" FROM pg_catalog.pg_statistic_ext stat ORDER BY 1,2; Schema | Table | Name | Columns | Ndistinct | Dependencies | MCV --------+--------+------------+---------+-----------+--------------+----- public | hoge1 | hoge1_ext | a, b | t | t | t public | hoge_t | hoge_t_ext | a, b | t | t | t (2 rows) The above query is so simple so that we would better to use the following query: # This query works on PG10 or later SELECT es.stxnamespace::pg_catalog.regnamespace::text AS "Schema", es.stxname AS "Name", pg_catalog.format('%s FROM %s', (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ') FROM pg_catalog.unnest(es.stxkeys) s(attnum) JOIN pg_catalog.pg_attribute a ON (es.stxrelid = a.attrelid AND a.attnum = s.attnum AND NOT a.attisdropped)), es.stxrelid::regclass) AS "Definition", CASE WHEN 'd' = any(es.stxkind) THEN 'defined' END AS "Ndistinct", CASE WHEN 'f' = any(es.stxkind) THEN 'defined' END AS "Dependencies", CASE WHEN 'm' = any(es.stxkind) THEN 'defined' END AS "MCV" FROM pg_catalog.pg_statistic_ext es ORDER BY 1, 2; Schema | Name | Definition | Ndistinct | Dependencies | Dependencies --------+------------+------------------+-----------+--------------+-------------- public | hoge1_ext | a, b FROM hoge1 | defined | defined | defined public | hoge_t_ext | a, b FROM hoge_t | defined | defined | defined (2 rows) I'm going to create the WIP patch to use the above queriy. Any comments welcome. :-D Thanks, Tatsuro Yamada
pgsql-hackers by date: