On 2020-08-05 19:55:49 -0400, Alvaro Herrera wrote:
> ... which means the flag I had added two days earlier has never been
> used for anything. We've carried the flag forward to this day for
> almost 13 years, dutifully turning it on and off ... but never checking
> it anywhere.
>
> I propose to remove it, as in the attached patch.

I'm mildly against that, because I'd really like to start making use of the flag. Not so much for cancellations, but to avoid the drastic impact analyze has on bloat. In OLTP workloads with big tables, when cost limiting is left enabled for analyze (or IO is slow), the snapshot that analyze holds often makes it, by far, the transaction with the oldest xmin.
It's not entirely trivial to fix (just ignoring it could lead to detoasting issues), but also not that hard.
Only mildly against because it'd not be hard to reintroduce once we need it.
Good points, both.
The most obvious way to avoid long analyze snapshots is to make the analysis take multiple snapshots as it runs, rather than try to invent some clever way of ignoring the analyze snapshots (which as Alvaro points out, we never did). All we need to do is to have an analyze snapshot last for at most N rows, but keep scanning until we have the desired sample size. Doing that would mean the analyze sample wouldn't come from a single snapshot, but then who cares? There is no requirement for consistency - the sample would be arguably *more* stable because it comes from multiple points in time, not just one.
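To make the idea concrete, here is a rough toy model in Python, not PostgreSQL code. It scans a table represented as lists of rows, pretends to take a fresh snapshot every rows_per_snapshot rows, and keeps scanning until the whole table has been seen. The function name, its parameters, and the use of plain reservoir sampling are all assumptions for illustration; snapshot acquisition is reduced to a counter, and the detoasting concern mentioned upthread is not modelled at all.

```python
import random

def analyze_sample(blocks, sample_size, rows_per_snapshot, rng=None):
    """Reservoir-sample rows from `blocks` (an iterable of row lists),
    refreshing a pretend snapshot every `rows_per_snapshot` rows.

    Returns (sample, snapshots_taken). Hypothetical sketch only.
    """
    rng = rng or random.Random()
    sample = []
    seen = 0
    snapshots_taken = 0
    # Start "expired" so a snapshot is taken before the first row is read.
    rows_under_snapshot = rows_per_snapshot

    for block in blocks:
        for row in block:
            if rows_under_snapshot >= rows_per_snapshot:
                # Stand-in for: release the old snapshot, acquire a new one.
                snapshots_taken += 1
                rows_under_snapshot = 0
            rows_under_snapshot += 1
            seen += 1

            if len(sample) < sample_size:
                # Fill the reservoir first.
                sample.append(row)
            else:
                # Replace an existing entry with probability sample_size/seen.
                j = rng.randrange(seen)
                if j < sample_size:
                    sample[j] = row

    return sample, snapshots_taken


if __name__ == "__main__":
    table = [list(range(i * 10, i * 10 + 10)) for i in range(10)]  # 100 rows
    rows, snaps = analyze_sample(table, sample_size=30, rows_per_snapshot=25)
    print(len(rows), "rows sampled using", snaps, "snapshots")
```

In this model no single snapshot is held for more than rows_per_snapshot rows, so the length of time any one snapshot could hold back the xmin horizon is bounded by N rather than by the size of the table.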