Re: ANALYZE sampling is too good - Mailing list pgsql-hackers

From Andres Freund
Subject Re: ANALYZE sampling is too good
Date
Msg-id 20131206092114.GH7814@awork2.anarazel.de
Whole thread Raw
In response to Re: ANALYZE sampling is too good  (Peter Geoghegan <pg@heroku.com>)
Responses Re: ANALYZE sampling is too good
Re: ANALYZE sampling is too good
Re: ANALYZE sampling is too good
List pgsql-hackers
On 2013-12-05 17:52:34 -0800, Peter Geoghegan wrote:
> Has anyone ever thought about opportunistic ANALYZE piggy-backing on
> other full-table scans? That doesn't really help Greg, because his
> complaint is mostly that a fresh ANALYZE is too expensive, but it
> could be an interesting, albeit risky approach.

What I've been thinking of is

a) making it piggy back on scans vacuum is doing instead of doing
separate ones all the time (if possible, analyze needs to be more
frequent). Currently with quite some likelihood the cache will be gone
again when revisiting.

b) make analyze incremental. In lots of bigger tables most of the table
is static - and we actually *do* know that, thanks to the vm. So keep a
rawer form of what ends in the catalogs around somewhere, chunked by the
region of the table the statistic is from. Everytime a part of the table
changes, re-sample only that part. Then recompute the aggregate.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: ANALYZE sampling is too good
Next
From: Boszormenyi Zoltan
Date:
Subject: Re: Backup throttling