Re: ANALYZE sampling is too good - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: ANALYZE sampling is too good
Date
Msg-id CA+U5nMKQrTZ=SF93rY=uXYwcXDBtHjXWsP+X1THQnqSQLG57Yg@mail.gmail.com
In response to Re: ANALYZE sampling is too good  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 10 December 2013 19:49, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Dec 10, 2013 at 11:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> However, these things presume that we need to continue scanning most
>> of the blocks of the table, which I don't think needs to be the case.
>> There is a better way.
>
> Do they? I think it's one opportunistic way of ameliorating the cost.
>
>> Back in 2005/6, I advocated a block sampling method, as described by
>> Chaudhuri et al (ref?)
>
> I don't think that anyone believes that not doing block sampling is
> tenable, fwiw. Clearly some type of block sampling would be preferable
> for most or all purposes.

If we have one way of reducing the cost of ANALYZE, I'd suggest we don't
need two ways - especially if the second way involves the interaction of
otherwise not fully related parts of the code.

Or, to put it clearly, let's go with block sampling and then see if that
needs even more work.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


