Re: ANALYZE sampling is too good - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: ANALYZE sampling is too good
Date	December 6, 2013 09:21:32
Msg-id	20131206092114.GH7814@awork2.anarazel.de
In response to	Re: ANALYZE sampling is too good (Peter Geoghegan <pg@heroku.com>)
Responses	Re: ANALYZE sampling is too good Re: ANALYZE sampling is too good Re: ANALYZE sampling is too good
List	pgsql-hackers

On 2013-12-05 17:52:34 -0800, Peter Geoghegan wrote:
> Has anyone ever thought about opportunistic ANALYZE piggy-backing on
> other full-table scans? That doesn't really help Greg, because his
> complaint is mostly that a fresh ANALYZE is too expensive, but it
> could be an interesting, albeit risky approach.

What I've been thinking of is

a) making it piggyback on the scans vacuum is already doing, instead of
doing separate ones all the time (where possible, since analyze needs to
be more frequent). Currently it is quite likely that the cache will be
gone again by the time the table is revisited.

b) making analyze incremental. In lots of bigger tables most of the table
is static - and we actually *do* know that, thanks to the visibility map
(vm). So keep a rawer form of what ends up in the catalogs around
somewhere, chunked by the region of the table the statistic is from.
Every time a part of the table changes, re-sample only that part. Then
recompute the aggregate.
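
To make (b) a bit more concrete, here is a rough sketch of the idea in
Python rather than anything resembling actual backend code. All names in
it (ChunkStats, IncrementalAnalyze, resample, aggregate) are made up for
illustration, the "table" is just a dict of row lists keyed by region, and
the caller is assumed to know which regions changed (the role the vm would
play). It only tracks a row count, a null fraction and a mean, which is far
less than what ANALYZE really computes.

```python
import random
from dataclasses import dataclass


@dataclass
class ChunkStats:
    """Raw, mergeable statistics for one region of the table (hypothetical)."""
    n_rows: int
    n_nulls: int
    sample: list  # non-null values sampled from this region


def sample_chunk(rows, sample_size, rng):
    """Scan a single region and build its raw statistics."""
    non_null = [v for v in rows if v is not None]
    k = min(sample_size, len(non_null))
    return ChunkStats(len(rows), len(rows) - len(non_null),
                      rng.sample(non_null, k))


class IncrementalAnalyze:
    """Keeps per-region raw stats; only changed regions are rescanned."""

    def __init__(self, chunks, sample_size=100, seed=0):
        self.rng = random.Random(seed)
        self.sample_size = sample_size
        self.scans = 0   # number of region scans performed so far
        self.stats = {}  # chunk_id -> ChunkStats
        for chunk_id, rows in chunks.items():
            self.resample(chunk_id, rows)

    def resample(self, chunk_id, rows):
        """Re-sample one region, e.g. after it stopped being all-visible."""
        self.stats[chunk_id] = sample_chunk(rows, self.sample_size, self.rng)
        self.scans += 1

    def aggregate(self):
        """Recompute table-wide numbers from the stored per-region stats."""
        total = sum(s.n_rows for s in self.stats.values())
        nulls = sum(s.n_nulls for s in self.stats.values())
        non_null = total - nulls
        # Weight each region's sample mean by the non-null rows it stands for.
        weighted = sum((s.n_rows - s.n_nulls) * (sum(s.sample) / len(s.sample))
                       for s in self.stats.values() if s.sample)
        return {
            "n_rows": total,
            "null_frac": nulls / total if total else 0.0,
            "mean": weighted / non_null if non_null else None,
        }


if __name__ == "__main__":
    table = {0: [1, 2, 3, None], 1: [10, 10, 10, 10]}
    analyzer = IncrementalAnalyze(table)
    print(analyzer.aggregate())
    # Region 1 changed; region 0 is static and is not scanned again.
    analyzer.resample(1, [20, 20])
    print(analyzer.aggregate(), "scans:", analyzer.scans)
```

The point of storing the per-region form is that the aggregate step never
touches the table itself, so the cost of a refresh scales with the changed
regions rather than with the table size. Whether real statistics such as
histograms or n_distinct estimates can be merged this cheaply is a separate
question that the sketch does not answer.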

Greetings,

Andres Freund

--
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
