Re: PROC_IN_ANALYZE stillborn 13 years ago - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: PROC_IN_ANALYZE stillborn 13 years ago
Date	August 6, 2020 22:02:26
Msg-id	2648863.1596751346@sss.pgh.pa.us
In response to	Re: PROC_IN_ANALYZE stillborn 13 years ago (Andres Freund <andres@anarazel.de>)
Responses	Re: PROC_IN_ANALYZE stillborn 13 years ago
List	pgsql-hackers

Andres Freund <andres@anarazel.de> writes:
> In fact using conceptually like a new snapshot for each sample tuple
> actually seems like it'd be somewhat of an improvement over using a
> single snapshot.

Dunno, that feels like a fairly bad idea to me.  It seems like it would
overemphasize the behavior of whatever queries happened to be running
concurrently with the ANALYZE.  I do follow the argument that using a
single snapshot for the whole ANALYZE overemphasizes a single instant
in time, but I don't think that leads to the conclusion that we shouldn't
use a snapshot at all.

Another angle that would be worth considering, aside from the issue
of whether the sample used for pg_statistic becomes more or less
representative, is what impact all this would have on the tuple count
estimates that go to the stats collector and pg_class.reltuples.
Right now, we don't have a great story at all on how the stats collector's
count is affected by combining VACUUM/ANALYZE table-wide counts with
the incremental deltas reported by transactions happening concurrently
with VACUUM/ANALYZE.  Would changing this behavior make that better,
or worse, or about the same?

            regards, tom lane
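
Editorial illustration (not part of the original message): the question
about combining table-wide counts with incremental deltas can be made
concrete with a toy model. This is a minimal sketch, not PostgreSQL code.
It assumes that a table-wide report overwrites the stored live-tuple
count while per-transaction deltas are added to it; the names ToyCounter,
report_delta, report_table_wide and run are hypothetical and exist only
for this example.

```python
class ToyCounter:
    """Hypothetical stand-in for a per-table live-tuple counter."""

    def __init__(self, live=0):
        self.live = live

    def report_delta(self, inserted, deleted=0):
        # Assumption: a committing transaction adds its net change.
        self.live += inserted - deleted

    def report_table_wide(self, estimate):
        # Assumption: a table-wide report replaces the stored count.
        self.live = estimate


def run(events, start):
    """Apply (kind, value) events in arrival order; return the final count."""
    counter = ToyCounter(start)
    for kind, value in events:
        if kind == "delta":
            counter.report_delta(value)
        else:
            counter.report_table_wide(value)
    return counter.live


# Scenario: the table starts with 1000 rows and a concurrent transaction
# inserts 100 more, so the true final count is 1100.

# The scan did not see the new rows (estimate 1000) and the delta arrived
# before the table-wide report: the delta is overwritten.
lost = run([("delta", 100), ("table_wide", 1000)], start=1000)

# The scan did see the new rows (estimate 1100) and the delta arrived
# after the table-wide report: the rows are counted twice.
double = run([("table_wide", 1100), ("delta", 100)], start=1000)

# The scan did not see the new rows and the delta arrived afterwards:
# the count happens to come out right.
ok = run([("table_wide", 1000), ("delta", 100)], start=1000)

print(lost, double, ok)
```

In this model the result depends both on what the scan could see and on
the order in which the two kinds of report arrive, which is why changing
the visibility rule used by the scan could shift the error in either
direction.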
