Re: progress report for ANALYZE - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: progress report for ANALYZE
Msg-id 20191105133850.GA2494@alvherre.pgsql
In response to Re: progress report for ANALYZE  (Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>)
Responses Re: progress report for ANALYZE  (Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>)
List pgsql-hackers
On 2019-Nov-05, Tatsuro Yamada wrote:

> ==============
> [Session1]
> \! pgbench -i
> create statistics pg_ext1 (dependencies) ON aid, bid from pgbench_accounts;
> create statistics pg_ext2 (mcv) ON aid, bid from pgbench_accounts;
> create statistics pg_ext3 (ndistinct) ON aid, bid from pgbench_accounts;

Wow, it takes a long time to compute these ...

Hmm, you normally wouldn't define stats that way; you'd do this instead:

create statistics pg_ext1 (dependencies, mcv, ndistinct) ON aid, bid from pgbench_accounts;

I'm not sure whether this makes an important difference in practice.
What I'm saying is that I'm not sure "number of ext stats" is
necessarily a useful number as shown.  I wonder if it's possible to
count the number of items that have been computed for each stats
object.  So if you do this:

create statistics pg_ext1 (dependencies, mcv) ON aid, bid from pgbench_accounts;
create statistics pg_ext2 (ndistinct, histogram) ON aid, bid from pgbench_accounts;

then the counter goes to 4.  But I also wonder if we need to publish
_which_ type of ext stats is currently being built, in a separate
column.
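
To make the difference between the two counts concrete, here is a small
sketch against the catalog.  It assumes the kinds requested for each
statistics object are listed in pg_statistic_ext.stxkind, one entry per
kind.  It only inspects the definitions; it does not report ANALYZE
progress, and it is not taken from the patch under discussion.

```sql
-- Sketch only: compare "number of stats objects" with "number of stats
-- kinds" for one table.  Assumes pg_statistic_ext.stxkind holds one
-- entry per kind requested in CREATE STATISTICS.
SELECT count(*)                      AS stats_objects,
       sum(array_length(stxkind, 1)) AS stats_kinds
FROM pg_statistic_ext
WHERE stxrelid = 'pgbench_accounts'::regclass;
```

With only the single three-kind pg_ext1 definition in place, this should
return one object and three kinds; with the three one-kind objects from
the quoted session, it should return three and three.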

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


