Re: Reduced power consumption in autovacuum launcher process - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Reduced power consumption in autovacuum launcher process |
Date | |
Msg-id | CA+Tgmoa5EFqmKH3t8KY98dS=1cbfbPosMXGpzOi3ZmsCG+21RA@mail.gmail.com |
In response to | Reduced power consumption in autovacuum launcher process (Peter Geoghegan <peter@2ndquadrant.com>) |
Responses | Re: Reduced power consumption in autovacuum launcher process |
List | pgsql-hackers |
On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>>> Another concern is, what happens when we receive a signal, generically
>>> handled or otherwise, and have to SetLatch() to avoid time-out
>>> invalidation? Should we just live with a spurious
>>> AutoVacLauncherMain() iteration, or should we do something like check
>>> if the return value of WaitLatch indicates that we woke up due to a
>>> SetLatch() call, which must have been within a signal handler, and
>>> that we should therefore goto just before WaitLatch() and elide the
>>> spurious iteration? Given that we can expect some signals to occur
>>> relatively frequently, spurious iterations could be a real concern.
>>
>> Really? I suspect that it doesn't much matter exactly how many
>> machine language instructions we execute on each wake-up, within
>> reasonable bounds, of course. Maybe some testing is in order?
>
> There's only one way to get around the time-out invalidation problem
> that I'm aware of - call SetLatch() in the handler. I'd be happy to
> hear alternatives, but until we have an alternative, we're stuck
> managing this in each and every signal handler.
>
> Once we've had the latch set to handle this, and control returns to
> the auxiliary process loop, we now have to decide from within the
> auxiliary if we can figure out that all that happened was a "required"
> wake-up, and thus we shouldn't really go through with another
> iteration. That, or we can simply do the iteration.
>
> I have my doubts that it is acceptable to wake-up spuriously in
> response to routine events that there are generic handlers for. Maybe
> this needs to be decided on a case-by-case basis.

I'm confused. If the process gets hit with a signal, it's already
woken up, isn't it? Whatever system call it was blocked on may or may
not get restarted depending on the platform and what the signal
handler does, but from an OS perspective, the process has already been
allocated a time slice and will run until either the time slice is
exhausted or it again blocks.

>> On another note, I might be inclined to write something like:
>>
>> if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && !PostmasterIsAlive())
>>     proc_exit(1);
>>
>> ...so as to avoid calling that function unnecessarily on every iteration.
>
> Hmm. I'm not so sure. We're now relying on the return value of
> WaitLatch(), which isn't guaranteed to report all wake-up events
> (although I don't believe it would be a problem in this exact case).
> Previously, we called PostmasterIsAlive() once a second anyway, and
> that wasn't much of a problem.

Ah. OK.

>>> Incidentally, should I worry about the timeout long for WaitLatch()
>>> overflowing?
>>
>> How would that happen?
>
> struct timeval is comprised of two longs - one representing seconds,
> and the other representing the balance of microseconds. Previously, we
> didn't combine them into a single microsecond representation. Now, we
> do.
>
> There could perhaps be a very large "nap", as determined by
> launcher_determine_sleep(), so that the total number of microseconds
> passed to WaitLatch() would exceed the maximum long size that can be
> safely represented on some or all platforms. On most 32-bit machines,
> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 =
> 2,147,483,647 microseconds = only about 35 minutes. There are corner
> cases, such as if someone were to set autovacuum_naptime to something
> silly.

OK. In that case, my feeling is "yes, you need to worry about that".
I'm not sure exactly what the best solution is: we could either
twiddle the WaitLatch interface some more, or restrict
autovacuum_naptime to at most 30 minutes, or maybe there's some other
option.
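To make the overflow arithmetic above concrete, here is a minimal sketch in C. It is not the WaitLatch() interface or any code from the patch: nap_to_usec(), clamp_timeout_usec() and MAX_TIMEOUT_USEC are hypothetical names invented for illustration, and int32_t stands in for a 32-bit long so the behaviour is the same on any platform. It shows one possible "other option": do the seconds-to-microseconds conversion in 64 bits and clamp the result to what a 32-bit long can hold.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical cap: the largest microsecond count a 32-bit signed long
 * can carry, i.e. (2^31) - 1 = 2,147,483,647 usec, about 35 minutes.
 */
#define MAX_TIMEOUT_USEC INT32_MAX

/*
 * Combine a seconds/microseconds pair (as in struct timeval) into one
 * microsecond count. The arithmetic is done in 64 bits, so it cannot
 * overflow for any seconds value that fits in 32 bits.
 */
static int64_t
nap_to_usec(int64_t secs, int64_t usecs)
{
    assert(secs >= 0 && secs <= INT32_MAX);
    assert(usecs >= 0 && usecs < 1000000);
    return secs * INT64_C(1000000) + usecs;
}

/*
 * Clamp a microsecond count so that it fits in a 32-bit long.
 * A caller using this would simply wake up early and sleep again.
 */
static int32_t
clamp_timeout_usec(int64_t total_usec)
{
    if (total_usec > MAX_TIMEOUT_USEC)
        return MAX_TIMEOUT_USEC;
    return (int32_t) total_usec;
}
```

With these helpers, a one-minute nap passes through unchanged (60,000,000 usec), while a one-hour nap (3,600,000,000 usec) exceeds the 32-bit range and is clamped to MAX_TIMEOUT_USEC. Clamping trades one extra wake-up roughly every 35 minutes for leaving the interface and the autovacuum_naptime limit alone; whether that is preferable to the two options named above is a judgment call this sketch does not settle.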
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company