Re: Status of autovacuum and the sporadic stats failures ? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Status of autovacuum and the sporadic stats failures ?
Date
Msg-id 20112.1170797059@sss.pgh.pa.us
In response to Re: Status of autovacuum and the sporadic stats failures ?  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Status of autovacuum and the sporadic stats failures ?  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Also note this message:
>> If this theory is correct, then we can improve the reliability of the
>> stats test a good deal if we put a sleep() at the *start* of the test,
>> to let any old backends get out of the way.  It seems worth a try
>> anyway.  I'll add this to HEAD and if the stats failure noise seems to
>> go down, we can back-port it.

> Apparently it wasn't enough to completely eliminate the problems.  Did
> it reduce them?  I haven't been watching the buildfarm closely enough to
> know for sure.

It doesn't seem to have helped much if at all :-(.

The $64 question in my mind is whether the failures represent pgstats
not working at all, or just being pretty slow when the system is under
load.  It seems likely to be the latter, but ...  I don't want to just
keep jacking the sleep up indefinitely, anyway; that will slow the
regression tests down for little reason.

I'm tempted to propose replacing the fixed sleep with a short plpgsql
function that sleeps for a second, checks to see if the stats have
changed, and repeats if not, giving up only after perhaps 30 seconds.
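
A rough sketch of that sleep-check-repeat loop, written in Python rather
than plpgsql so it can be run without a server.  Everything named here
(wait_for_change, read_value, baseline, and the injectable sleep/clock
arguments) is hypothetical and only for illustration; in the real test
read_value would be a query against the stats views.

```python
import time
from typing import Callable, Optional


def wait_for_change(
    read_value: Callable[[], int],
    baseline: int,
    interval: float = 1.0,   # sleep one second between checks
    timeout: float = 30.0,   # give up after roughly 30 seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Sleep, then check whether read_value() differs from baseline.

    Returns the seconds waited when a change is seen, or None if the
    timeout expires without any change.
    """
    start = clock()
    while clock() - start < timeout:
        sleep(interval)
        if read_value() != baseline:
            return clock() - start
    return None
```

On a fast machine this returns after about one interval; only a loaded
machine pays for the longer wait.  The return value is the observed
delay, which is the number one would want to collect, though this does
nothing about getting it reported out of the regression run.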

It'd be interesting to try to gather stats on the length of the delay
taken, but I don't see a good way to do that within the current
regression-test infrastructure.
        regards, tom lane

