On Thu, 18 Mar 2021 at 14:37, Peter Geoghegan <pg@bowt.ie> wrote:
> They usually involve some *combination* of Postgres problems,
> application code problems, and DBA error. Not any one thing. I've seen
> problems with application code that runs DDL at scheduled intervals,
> which interacts badly with vacuum -- but only really on the rare
> occasions when freezing is required!
What I've seen is an application that regularly ran ANALYZE on a
table. This worked fine as long as vacuums took less than the interval
between analyzes (in this case 1h) but once vacuum started taking
longer than that interval autovacuum would cancel it every time due to
the conflicting lock.
That would have just continued until the wraparound vacuum which
wouldn't self-cancel except that there was also a demon running which
would look for sessions stuck on a lock and kill the blocker -- which
included killing the wraparound vacuum.
And yes, this demon is obviously a terrible idea but of course it was
meant for killing buggy user queries. It wasn't expecting to find
autovacuum jobs blocking things. The real surprise for that user was
that VACUUM could be blocked by things that someone would reasonably
want to run regularly like ANALYZE.
--
greg