Re: multivariate statistics (v19) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: multivariate statistics (v19)
Msg-id 58b46a5b-b3b5-f7c7-39e2-a0d062da9bf8@2ndquadrant.com
In response to Re: multivariate statistics (v19)  (Ants Aasma <ants.aasma@eesti.ee>)
List pgsql-hackers
On 08/10/2016 03:29 PM, Ants Aasma wrote:
> On Wed, Aug 3, 2016 at 4:58 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> 2) combining multiple statistics
>>
>> I think the ability to combine multivariate statistics (covering different
>> subsets of conditions) is important and useful, but I'm starting to think
>> that the current implementation may not be the correct one (which is why I
>> haven't written the SGML docs about this part of the patch series yet).
>
> While researching this topic a few years ago I came across a paper on
> this exact topic called "Consistently Estimating the Selectivity of
> Conjuncts of Predicates" [1]. While effective it seems to be quite
> heavy-weight, so would probably need support for tiered optimization.
>
> [1] https://courses.cs.washington.edu/courses/cse544/11wi/papers/markl-vldb-2005.pdf
>

I think I read that paper some time ago, and IIRC it solves the
same problem but in a very different way: instead of combining the
statistics directly, it relies on the "partial" selectivities and then
estimates the total selectivity using the maximum-entropy principle.

I think it's a nice idea and it probably works fine in many cases, but
it discards part of the information that we could get by matching the
statistics against each other directly. I'll keep the paper in mind,
though, and we can revisit this solution later.
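
To make the maximum-entropy idea a bit more concrete, here is a rough
sketch in Python. It is not code from the patch and not necessarily the
algorithm used in the paper: it assumes the known "partial"
selectivities are given as a dict, and it uses plain iterative
proportional fitting over all 2^n combinations of predicate outcomes.
The name maxent_selectivity and the input layout are made up for
illustration.

```python
from itertools import product


def maxent_selectivity(n, known, iterations=200):
    """Estimate the selectivity of p0 AND p1 AND ... AND p(n-1).

    `known` maps a frozenset of predicate indexes to the selectivity of
    their conjunction (each strictly between 0 and 1). This layout is an
    assumption made for this sketch.
    """
    # One "atom" per combination of predicate outcomes (1 = satisfied).
    atoms = list(product((0, 1), repeat=n))
    # Start from the uniform distribution, which has maximum entropy.
    weight = {atom: 1.0 / len(atoms) for atom in atoms}

    for _ in range(iterations):
        for preds, sel in known.items():
            # Atoms on which every predicate in `preds` is satisfied.
            inside = {a for a in atoms if all(a[i] for i in preds)}
            mass = sum(weight[a] for a in inside)
            # Rescale so the atoms inside sum to `sel` and the rest to
            # 1 - sel; the total stays at 1.
            for a in atoms:
                if a in inside:
                    weight[a] *= sel / mass
                else:
                    weight[a] *= (1.0 - sel) / (1.0 - mass)

    # Selectivity of the full conjunction: the atom with all bits set.
    return weight[(1,) * n]


# Three predicates: single-column selectivities for all of them, plus
# one multivariate selectivity covering the first two.
known = {
    frozenset({0}): 0.5,
    frozenset({1}): 0.4,
    frozenset({2}): 0.2,
    frozenset({0, 1}): 0.3,
}
estimate = maxent_selectivity(3, known)
print(round(estimate, 4))
```

With these inputs nothing ties the third predicate to the other two, so
the estimate should come out close to 0.3 * 0.2 = 0.06. Note that the
only inputs are the per-subset selectivities, which is the part I meant
above: the statistics themselves are never matched against each other.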

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services