Re: multivariate statistics (v19) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: multivariate statistics (v19)
Msg-id 9f7d5c73-71d6-fbe0-c190-b321db46f88c@iki.fi
In response to Re: multivariate statistics (v19)  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: multivariate statistics (v19)  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
On 10/04/2016 10:49 AM, Dean Rasheed wrote:
> On 30 September 2016 at 12:10, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I fear that using "statistics" as the name of the new object might get a bit
>> awkward. "statistics" is a plural, but we use it as the name of a single
>> object, like "pants" or "scissors". Not sure I have any better ideas though.
>> "estimator"? "statistics collection"? Or perhaps it should be singular,
>> "statistic". I note that you actually called the system table
>> "pg_mv_statistic", in singular.
>
> I think it's OK. The functional dependency is a single statistic, but
> MCV lists and histograms are multiple statistics (multiple facts about
> the data sampled), so in general when you create one of these new
> objects, you are creating multiple statistics about the data.

OK. I don't really have any better ideas; I was just hoping that someone
else would.

> Also I find "CREATE STATISTIC" just sounds a bit clumsy compared to
> "CREATE STATISTICS".

Agreed.

> The convention for naming system catalogs seems to be to use the
> singular for tables and plural for views, so I guess we should stick
> with that.

However, for tables and views, each object you store in those catalogs is
a "table" or a "view", whereas with this thing, the object you store is
"statistics". Would you have a catalog table called "pg_scissor"?

We call the current system table "pg_statistic", though. I agree we 
should call it pg_mv_statistic, in singular, to follow the example of 
pg_statistic.

Of course, the user-friendly system view on top of that is called 
"pg_stats", just to confuse things more :-).

> It doesn't seem like the end of the world that it doesn't
> match the user-facing syntax. A bigger concern is the use of "mv" in
> the name, because as has already been pointed out, this table may also
> in the future be used to store univariate expression and partial
> statistics, so I think we should drop the "mv" and go with something
> like pg_statistic_ext, or some other more general name.

Also, "mv" makes me think of materialized views, which is completely 
unrelated to this.

- Heikki



