Re: Why we don't want hints Was: Slow count(*) again... - Mailing list pgsql-performance

From Bruce Momjian
Subject Re: Why we don't want hints Was: Slow count(*) again...
Date
Msg-id 201102162122.p1GLMQS29138@momjian.us
In response to Re: Why we don't want hints Was: Slow count(*) again...  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Why we don't want hints Was: Slow count(*) again...  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
Kevin Grittner wrote:
> Shaun Thomas <sthomas@peak6.com> wrote:
>
> > how difficult would it be to add that syntax to the JOIN
> > statement, for example?
>
> Something like this syntax?:
>
> JOIN WITH (correlation_factor=0.3)
>
> Where 1.0 might mean that for each value on the left there was only
> one distinct value on the right, and 0.0 would mean that they were
> entirely independent?  (Just as an off-the-cuff example -- I'm not
> at all sure that this makes sense, let alone is the best thing to
> specify.  I'm trying to get at *syntax* here, not particular knobs.)

I am not excited about the idea of putting these correlations in
queries.  What would be more interesting would be for analyze to build a
correlation coefficient matrix showing how columns are correlated:

        a    b    c
    a   1    .4   0
    b   .1   1    -.3
    c   .2   .3   1

and those correlations could be used to weigh how the single-column
statistics should be combined.
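
As a rough sketch of the idea, here is how such a matrix could be
computed from sampled column values and then used to weigh two
single-column selectivities.  Everything here is an assumption made for
illustration: the function names are hypothetical, Pearson's r is just
one possible correlation measure (it gives a symmetric matrix, unlike
the off-the-cuff numbers above), and the blending rule is a guess at
what "weigh" might mean, not anything analyze or the planner does.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no correlation information.
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build {row: {col: r}} from a dict of column name -> sampled values."""
    names = list(columns)
    return {
        row: {col: pearson(columns[row], columns[col]) for col in names}
        for row in names
    }


def combined_selectivity(sel_a, sel_b, corr):
    """Blend two single-column selectivities using a correlation weight.

    Hypothetical rule: with corr = 0 the columns are treated as
    independent (sel_a * sel_b); with |corr| = 1 one predicate is assumed
    to imply the other (min(sel_a, sel_b)); in between, interpolate
    linearly.
    """
    weight = min(abs(corr), 1.0)
    independent = sel_a * sel_b
    dependent = min(sel_a, sel_b)
    return (1.0 - weight) * independent + weight * dependent


# Hypothetical sample rows for three columns.
sample = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [1, -1, -1, 1],
}
matrix = correlation_matrix(sample)
estimate = combined_selectivity(0.1, 0.2, matrix["a"]["b"])
```

With a blend like this, strongly correlated columns would no longer
have their selectivities multiplied as if they were independent, which
is the usual source of badly underestimated row counts.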

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
