Re: WIP: multivariate statistics / proof of concept - Mailing list pgsql-hackers

From David Rowley
Subject Re: WIP: multivariate statistics / proof of concept
Msg-id CAApHDvp_ONYK=u0c_tmzfmmq-SoHbEhCeO_z755oTZFn+Bo-Wg@mail.gmail.com
In response to Re: WIP: multivariate statistics / proof of concept  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses Re: WIP: multivariate statistics / proof of concept
List pgsql-hackers
On Thu, Oct 30, 2014 at 12:48 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 29 October 2014, 12:31, Petr Jelinek wrote:
>> I've not really gotten around to looking at the patch yet, but I'm also
>> wondering if it would be simple to allow functional statistics
>> too. The pg_mv_statistic name seems to indicate multi columns, but how
>> about stats on date(datetime_column), or perhaps any non-volatile
>> function. This would help to solve the problem highlighted here
>> http://www.postgresql.org/message-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
>> . Without giving it too much thought, perhaps any expression that can be
>> indexed should be allowed to have stats? Would that be really difficult
>> to implement in comparison to what you've already done with the patch so
>> far?
>>
>
> I would not over-complicate requirements for the first version of this,
> I think it's already complicated enough.

My thoughts, exactly. I'm not willing to put more features into the
initial version of the patch. Actually, I'm thinking about ripping out
some experimental features (particularly "hashed MCV" and "associative
rules").


That's fair, but I didn't mean to imply that you should work on that too, or that it should be part of this patch.
I was thinking more that I don't really agree with the table name for the new stats. At some later date someone will want to add expression stats, so we'd better come up with a design that is friendly towards that. For now, the only problem I can see is that the table name might not suit expression stats. I'd hate to see someone have to invent a third table to support them when we could likely come up with something that can be extended later and still makes sense both today and in the future.

I was just looking at how expression indexes are stored in pg_index. For an expression index, the expression is stored in the indexprs column, which is of type pg_node_tree. So quite possibly, at some point in the future, the new stats table could just have an extra column added, and for today we'd only need to come up with a future-proof name. Perhaps pg_statistic_ext or pg_statisticx, with functions and source files named along those lines?
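
For reference, a rough sketch of how an indexed expression shows up in pg_index. The table and index names (events, events_created_date_idx) are made up for illustration and are not part of the patch; pg_get_expr() is the existing function that turns a pg_node_tree value back into readable SQL.

```sql
-- Hypothetical table and index, used only for illustration.
CREATE TABLE events (id int, created_at timestamp);
CREATE INDEX events_created_date_idx ON events (date(created_at));

-- For an expression index, indkey holds 0 in the expression's position
-- and indexprs holds the expression tree; pg_get_expr() decompiles it.
SELECT indexrelid::regclass AS index_name,
       indkey,
       pg_get_expr(indexprs, indrelid) AS expression
FROM pg_index
WHERE indrelid = 'events'::regclass
  AND indexprs IS NOT NULL;
```

A stats catalog could presumably follow the same pattern later: a column-number array for plain columns plus a nullable pg_node_tree column for expressions. That is only a guess at a possible layout, not something the current patch defines.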

Regards

David Rowley
