Hi,
On 2023-01-19 13:22:28 -0800, Peter Geoghegan wrote:
> On Thu, Jan 19, 2023 at 12:56 PM Andres Freund <andres@anarazel.de> wrote:
> > But in contrast to dead_tuples, where I think we can just stop analyze from
> > updating it unless we crashed recently, I do think we need to update reltuples
> > in vacuum. So computing an accurate value seems like the least unreasonable
> > thing I can see.
>
> I agree, but there is no reasonable basis for treating scanned_pages
> as a random sample, especially if it's only a small fraction of all of
> rel_pages -- treating it as a random sample is completely
> unjustifiable.

Agreed.
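To make the bias concrete, here is a small illustrative simulation (hypothetical numbers, not PostgreSQL code): if live-tuple density is skewed across the table, say because the tail was recently bulk-deleted, then extrapolating from the pages vacuum happened to scan badly misestimates the total.

```python
# Illustrative only: why treating vacuum's scanned_pages as a random
# sample is unjustifiable. All numbers are made up for the example.
rel_pages = 1000
# Dense live tuples in the first half, sparse in the second half
# (e.g. the tail of the table was recently bulk-deleted).
tuples_per_page = [100] * 500 + [10] * 500
actual = sum(tuples_per_page)

# Suppose vacuum scanned only the first 100 pages -- a contiguous
# prefix, not a random sample of the relation.
scanned = tuples_per_page[:100]
naive_estimate = sum(scanned) / len(scanned) * rel_pages

print(actual, naive_estimate)  # 55000 vs. 100000.0: nearly 2x too high
```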
> And so it seems to me that the only thing that can be done is to either make
> VACUUM behave somewhat like ANALYZE in at least some cases, or to have it
> invoke ANALYZE directly (or indirectly) in those same cases.

Yea. Hence my musing about potentially addressing this by visiting
additional blocks during the heap scan, chosen using vacuum's block
sampling logic.

IME most of the time in analyze isn't spent on IO for the sample blocks
themselves, but on CPU and on IO for toasted columns. A trimmed down
version that just computes relallvisible should be a good bit faster.
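A rough sketch of that block-sampling idea (illustrative Python, not PostgreSQL's actual sampling code; `live_tuples_in` is a hypothetical callback standing in for reading a heap page and counting live tuples): pick blocks uniformly at random and extrapolate the per-page average over the whole relation.

```python
import random

def estimate_reltuples(rel_pages, live_tuples_in, sample_size, seed=0):
    """Estimate total live tuples from a random sample of blocks.

    live_tuples_in(blkno) is a hypothetical stand-in for visiting a
    heap page and counting its live tuples.
    """
    rng = random.Random(seed)
    sample = rng.sample(range(rel_pages), min(sample_size, rel_pages))
    seen = sum(live_tuples_in(b) for b in sample)
    # Extrapolate the observed per-page average over the relation.
    return seen / len(sample) * rel_pages

# A skewed table (dense head, sparse tail) totalling 55000 live tuples:
# a random sample hits both regions, so the estimate lands near the
# true value instead of extrapolating only from the dense prefix.
tuples_per_page = [100] * 500 + [10] * 500
est = estimate_reltuples(1000, lambda b: tuples_per_page[b], 100)
```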

Greetings,

Andres Freund