Re: proposal : cross-column stats - Mailing list pgsql-hackers

From Joshua Tolley
Subject Re: proposal : cross-column stats
Msg-id 4d065f00.ce05ec0a.3fd2.4143@mx.google.com
In response to Re: proposal : cross-column stats  (Nathan Boley <npboley@gmail.com>)
Responses Re: proposal : cross-column stats  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
On Sun, Dec 12, 2010 at 07:10:44PM -0800, Nathan Boley wrote:
> Another quick note: I think that storing the full contingency table is
> wasteful since the marginals are already stored in the single column
> statistics. Look at copulas [2] ( FWIW I think that Josh Tolley was
> looking at this a couple years back ).

Josh Tolley still looks at it occasionally, though time hasn't permitted any
sort of significant work for quite some time. The multicolstat branch on my
git.postgresql.org repository will create an empirical copula for each
multi-column index, and stick it in pg_statistic. It doesn't yet do anything
useful with that information, nor am I convinced it's remotely bug-free. In a
brief PGCon discussion with Tom a while back, it was suggested a good place
for the planner to use these stats would be clausesel.c, which is responsible
for handling clauses such as "...WHERE foo > 4 AND foo > 5".
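
For anyone unfamiliar with the term, here is a rough sketch of what an
empirical copula over two columns might look like. This is an illustration
only, not the code in the multicolstat branch: the function names, the fixed
grid of bins, and the use of Python are all assumptions made for the example.
Each column is replaced by its rank, so the marginals become uniform and only
the dependence between the columns is left in the grid, which is why it need
not repeat what the single-column statistics already hold.

```python
from bisect import bisect_right


def ranks(values):
    """Return the 1-based rank of each value (ties get the highest rank)."""
    ordered = sorted(values)
    return [bisect_right(ordered, v) for v in values]


def empirical_copula(xs, ys, bins=4):
    """Hypothetical helper: joint frequencies of rank-transformed (x, y) pairs.

    Returns a bins x bins grid whose cells sum to 1. Row i covers the i-th
    quantile slice of xs, column j the j-th quantile slice of ys.
    """
    n = len(xs)
    grid = [[0.0] * bins for _ in range(bins)]
    for rx, ry in zip(ranks(xs), ranks(ys)):
        # Integer arithmetic maps rank 1..n onto bin 0..bins-1.
        i = (rx - 1) * bins // n
        j = (ry - 1) * bins // n
        grid[i][j] += 1.0 / n
    return grid


def joint_fraction(grid, x_bins, y_bins):
    """Fraction of rows falling in the given x and y quantile bins."""
    return sum(grid[i][j] for i in x_bins for j in y_bins)


# Two perfectly correlated columns: all mass sits on the diagonal.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10, 20, 30, 40, 50, 60, 70, 80]
grid = empirical_copula(xs, ys, bins=4)

# Lowest quarter of x AND lowest quarter of y.
both_low = joint_fraction(grid, [0], [0])
# Assuming independence would instead multiply the marginals: 0.25 * 0.25.
independent_guess = 0.25 * 0.25
```

In this toy case the grid says a quarter of the rows satisfy both conditions,
where multiplying the two single-column selectivities would guess a sixteenth.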

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
