Re: Statistics Import and Export - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Statistics Import and Export
Date
Msg-id 3670503.1740173078@sss.pgh.pa.us
Whole thread Raw
In response to Re: Statistics Import and Export  (Andres Freund <andres@anarazel.de>)
Responses Re: Statistics Import and Export
Re: Statistics Import and Export
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Looking at the query log, the biggest culprit is a *lot* of additional
> queries, I think primarily these two:

> SELECT c.oid::regclass AS relation, current_setting('server_version_num') AS version, c.relpages, c.reltuples,
c.relallvisibleFROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE n.nspname = 'public' AND c.relname =
'alpha_neg_p2'

> SELECT c.oid::regclass AS relation, s.attname,s.inherited,current_setting('server_version_num') AS version,
s.null_frac,s.avg_width,s.n_distinct,s.most_common_vals,s.most_common_freqs,s.histogram_bounds,s.correlation,s.most_common_elems,s.most_common_elem_freqs,s.elem_count_histogram,s.range_length_histogram,s.range_empty_frac,s.range_bounds_histogram
FROMpg_stats s JOIN pg_namespace n ON n.nspname = s.schemaname JOIN pg_class c ON c.relname = s.tablename AND
c.relnamespace= n.oid WHERE s.schemaname = 'public' AND s.tablename = 'alpha_neg_p2' ORDER BY s.attname, s.inherited 

Oy.  Those are outright horrid, even without any consideration of
pre-preparing them.  We know the OID of the table we want to dump,
we should be doing "FROM pg_class WHERE oid = whatever" and lose
the join to pg_namespace altogether.  The explicit casts to regclass
are quite expensive too to fetch information that pg_dump already
has.  It already knows the server version, too.

Moreover, the first of these shouldn't be a separate query at all.
I objected to fetching pg_statistic content for all tables at once,
but relpages/reltuples/relallvisible is a pretty small amount of
new info.  We should just collect those fields as part of getTables'
main query of pg_class (which, indeed, is already fetching relpages).

On the second one, if we want to go through the pg_stats view then
we can't rely on table OID, but I don't see why we need the joins
to anything else.  "WHERE s.schemaname = 'x' AND s.tablename = 'y'"
seems sufficient.

I wonder whether we ought to issue different queries depending on
whether we're superuser.  The pg_stats view is rather expensive
because of its security restrictions, and if we're superuser we
could just look directly at pg_statistic.  Maybe those checks are
fast enough not to matter, but ...

            regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: explain analyze rows=%.0f
Next
From: Robert Haas
Date:
Subject: Re: explain analyze rows=%.0f