Re: PG12 autovac issues - Mailing list pgsql-general

From Julien Rouhaud
Subject Re: PG12 autovac issues
Msg-id 20200327192303.GA20785@nol
In response to Re: PG12 autovac issues  (Michael Paquier <michael@paquier.xyz>)
List pgsql-general
On Fri, Mar 27, 2020 at 02:12:04PM +0900, Michael Paquier wrote:
> On Thu, Mar 26, 2020 at 09:46:47AM -0500, Justin King wrote:
> > Nope, it was just these tables that were looping over and over while
> > nothing else was getting autovac'd.  I'm happy to share the full log
> > if you'd like.
> 
> Thanks, that could help.  If that's very large, it could be a problem
> to send that to the lists, but you could send me directly a link to
> it and I'll try to extract more information for the lists.  While
> testing for reproducing the issue, I have noticed that basically one
> set of catalog tables happened to see this "skipping redundant" log. 
> And I am wondering if we have a match with the set of catalog tables
> looping.
> 
> > I did have to remove it from this state, but I can undo my workaround
> > and, undoubtedly, it'll end up back there.  Let me know if there's
> > something specific you'd like me to provide when it happens!
> 
> For now I think it's fine.  Note that Julien and I have an environment
> where the issue can be reproduced easily (it takes roughly 12 hours
> until the wraparound cutoffs are reached with the benchmark and
> settings used), and we are checking things using a patched instance
> with 2aa6e33 reverted.  I think that we are accumulating enough
> evidence that this change was not a good idea anyway thanks to the
> information you sent, so likely we'll finish first by a revert of
> 2aa6e33 from the master and REL_12_STABLE branches, before looking at
> the issues with the catalogs for those anti-wraparound and
> non-aggressive jobs (this looks like a relcache issue with the so-said
> catalogs).

FTR we reached the 200M transaction mark earlier, and I can see multiple logs of
the form "automatic vacuum to prevent wraparound", i.e. non-aggressive
anti-wraparound autovacuums, all on shared relations.

As those vacuums weren't skipped, autovacuum didn't get stuck in a loop on them
and continued its work normally.  This happened ~4h ago and hasn't occurred
again, even though the 200M threshold has been reached again multiple times.
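
For anyone who wants to check their own logs for the same pattern, here is a
minimal sketch that tallies autovacuum log entries by kind.  It assumes the
PG12 message wording ("automatic [aggressive] vacuum [to prevent wraparound]
of table ..."); the function name and the sample lines are made up for
illustration and are not taken from the reported logs.

```python
import re
from collections import Counter

# Assumed PG12 wording of the autovacuum log message; adjust if your
# version or lc_messages setting produces something different.
PATTERN = re.compile(
    r'automatic (?P<aggressive>aggressive )?vacuum '
    r'(?P<wraparound>to prevent wraparound )?of table "(?P<table>[^"]+)"'
)

def classify(lines):
    """Count autovacuum log entries per (aggressiveness, purpose, table)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # not an autovacuum completion message
        aggressiveness = "aggressive" if m.group("aggressive") else "non-aggressive"
        purpose = "anti-wraparound" if m.group("wraparound") else "regular"
        counts[(aggressiveness, purpose, m.group("table"))] += 1
    return counts

# Hypothetical sample lines, for illustration only.
SAMPLE = [
    'LOG:  automatic vacuum to prevent wraparound of table "postgres.pg_catalog.pg_authid": index scans: 0',
    'LOG:  automatic vacuum to prevent wraparound of table "postgres.pg_catalog.pg_authid": index scans: 0',
    'LOG:  automatic aggressive vacuum to prevent wraparound of table "postgres.pg_catalog.pg_database": index scans: 0',
    'LOG:  automatic vacuum of table "postgres.public.t": index scans: 1',
    'LOG:  checkpoint starting: time',
]

if __name__ == "__main__":
    for key, n in sorted(classify(SAMPLE).items()):
        print(n, *key)
```

Entries counted as ("non-aggressive", "anti-wraparound", ...) correspond to the
log form quoted above; a table that shows up there over and over while nothing
else gets vacuumed would match the looping behaviour reported in this thread.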
