Re: PROC_IN_ANALYZE stillborn 13 years ago - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PROC_IN_ANALYZE stillborn 13 years ago
Msg-id 2648863.1596751346@sss.pgh.pa.us
In response to Re: PROC_IN_ANALYZE stillborn 13 years ago  (Andres Freund <andres@anarazel.de>)
Responses Re: PROC_IN_ANALYZE stillborn 13 years ago  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> In fact using conceptually like a new snapshot for each sample tuple
> actually seems like it'd be somewhat of an improvement over using a
> single snapshot.

Dunno, that feels like a fairly bad idea to me.  It seems like it would
overemphasize the behavior of whatever queries happened to be running
concurrently with the ANALYZE.  I do follow the argument that using a
single snapshot for the whole ANALYZE overemphasizes a single instant
in time, but I don't think that leads to the conclusion that we shouldn't
use a snapshot at all.

Another angle that would be worth considering, aside from the issue
of whether the sample used for pg_statistic becomes more or less
representative, is what impact all this would have on the tuple count
estimates that go to the stats collector and pg_class.reltuples.
Right now, we don't have a great story at all on how the stats collector's
count is affected by combining VACUUM/ANALYZE table-wide counts with
the incremental deltas reported by transactions happening concurrently
with VACUUM/ANALYZE.  Would changing this behavior make that better,
or worse, or about the same?

            regards, tom lane
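
The counting concern in the second paragraph can be illustrated with a toy
model. This is a sketch under stated assumptions, not PostgreSQL code: the
names ToyCollector and final_estimate are hypothetical, and the model assumes
only that a table-wide report overwrites the stored count while a
per-transaction report adds a delta to it. Under those assumptions, the final
count depends on whether the scan saw the concurrent insert and on which
report arrives first.

```python
class ToyCollector:
    """Hypothetical stand-in for a per-table live-tuple counter."""

    def __init__(self, live_tuples):
        self.live_tuples = live_tuples

    def report_table_wide(self, count):
        # Assumption: a table-wide count replaces the stored value.
        self.live_tuples = count

    def report_delta(self, delta):
        # Assumption: a committing transaction adds its net row change.
        self.live_tuples += delta


def final_estimate(scan_sees_insert, delta_arrives_first,
                   initial=1000, inserted=100):
    """Return the stored count after one scan and one concurrent insert."""
    collector = ToyCollector(initial)
    scanned = initial + (inserted if scan_sees_insert else 0)
    if delta_arrives_first:
        collector.report_delta(inserted)
        collector.report_table_wide(scanned)
    else:
        collector.report_table_wide(scanned)
        collector.report_delta(inserted)
    return collector.live_tuples


if __name__ == "__main__":
    true_count = 1000 + 100
    for sees in (True, False):
        for first in (True, False):
            got = final_estimate(sees, first)
            print(sees, first, got, got - true_count)
```

In this toy model two of the four orderings land on the true count of 1100,
one double-counts the insert, and one loses it. Whether a change in ANALYZE's
snapshot behavior shifts real workloads toward or away from the bad orderings
is exactly the open question above; the sketch does not answer it.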


