On 10 December 2013 23:43, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Dec 10, 2013 at 3:26 PM, Jim Nasby <jim@nasby.net> wrote:
>>> I agree that looking for information on block level sampling
>>> specifically, and its impact on estimation quality is likely to not
>>> turn up very much, and whatever it does turn up will have patent
>>> issues.
>>
>>
>> We have an entire analytics dept. at work that specializes in finding
>> patterns in our data. I might be able to get some time from them to at least
>> provide some guidance here, if the community is interested. They could
>> really only serve in a consulting role though.
>
> I think that Greg had this right several years ago: it would probably
> be very useful to have the input of someone with a strong background
> in statistics. It doesn't seem that important that they already know a
> lot about databases, provided they can understand what our constraints
> are, and what is important to us. It might just be a matter of having
> them point us in the right direction.
err, so what does stats target mean exactly in statistical theory?
Waiting for a statistician, and confirming his credentials before you
believe him above others here, seems like wasted time.
What your statistician will tell you is that YMMV, depending on the data.
So we'll still need a parameter to fine-tune things when the default
is off. We can argue about the default later, at various levels of
rigour.
Block sampling, with parameter to specify sample size. +1
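To make the proposal concrete, here is a rough sketch (in Python, purely illustrative, not PostgreSQL code) of what block sampling means versus row sampling: pick whole blocks at random and take every row in each chosen block, until the requested sample size is reached. The function name and table layout are invented for the example; the point is that rows within a block are often correlated, which is exactly why the statistician's answer is "YMMV, depending on the data".

```python
import random

def block_sample(table, block_size, target_rows, rng=random):
    """Illustrative block-level sampling: draw whole blocks at random
    and keep every row in each chosen block until ~target_rows rows
    are collected. (Hypothetical helper, not the ANALYZE implementation.)"""
    blocks = [table[i:i + block_size] for i in range(0, len(table), block_size)]
    rng.shuffle(blocks)
    sample = []
    for block in blocks:
        sample.extend(block)          # take every row in the chosen block
        if len(sample) >= target_rows:
            break
    return sample[:target_rows]

# A worst case for block sampling: values are clustered by block,
# e.g. data loaded in sorted order. 1000 rows, 100 blocks of 10.
table = [v for v in range(100) for _ in range(10)]
sample = block_sample(table, block_size=10, target_rows=100)
print(len(sample))       # 100 rows, but drawn from only ~10 blocks
print(len(set(sample)))  # few distinct values -> ndistinct underestimated
```

With uniformly shuffled data the same sample would look much like a row-level sample; with clustered data it badly skews ndistinct estimates, which is why a tuning parameter for sample size matters.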
--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services