Thread: Reduced power consumption in autovacuum launcher process
On 18 July 2011 01:48, Robert Haas <robertmhaas@gmail.com> wrote:
> Might be good to start a new thread for each auxiliary process, or we
> may get confused.

Right - I've done so in this reply. There isn't a problem as such with the AV Launcher patch - there's a problem with most or all remaining auxiliary processes that needs to be worked out once and for all. I refer to the generic signal handler/timeout invalidation issues already described.

>> ISTM that these generic handlers ought to be handling this
>> generically, and that there should be a Latch for just this purpose
>> for each process within PGPROC. We already have this Latch in PGPROC:
>>
>> Latch waitLatch; /* allow us to wait for sync rep */
>>
>> Maybe its purpose should be expanded to "current process Latch"?
>
> I think that could be a really good idea, though I haven't looked at
> it closely enough to be sure.
>
>> Another concern is, what happens when we receive a signal, generically
>> handled or otherwise, and have to SetLatch() to avoid time-out
>> invalidation? Should we just live with a spurious
>> AutoVacLauncherMain() iteration, or should we do something like check
>> if the return value of WaitLatch indicates that we woke up due to a
>> SetLatch() call, which must have been within a signal handler, and
>> that we should therefore goto just before WaitLatch() and elide the
>> spurious iteration? Given that we can expect some signals to occur
>> relatively frequently, spurious iterations could be a real concern.
>
> Really? I suspect that it doesn't much matter exactly how many
> machine language instructions we execute on each wake-up, within
> reasonable bounds, of course. Maybe some testing is in order?

There's only one way to get around the time-out invalidation problem that I'm aware of - call SetLatch() in the handler. I'd be happy to hear alternatives, but until we have an alternative, we're stuck managing this in each and every signal handler.
Once we've had the latch set to handle this, and control returns to the auxiliary process loop, we now have to decide from within the auxiliary process whether all that happened was a "required" wake-up, and thus we shouldn't really go through with another iteration. That, or we can simply do the iteration.

I have my doubts that it is acceptable to wake up spuriously in response to routine events that there are generic handlers for. Maybe this needs to be decided on a case-by-case basis.

> On another note, I might be inclined to write something like:
>
> if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && !PostmasterIsAlive())
>     proc_exit(1);
>
> ...so as to avoid calling that function unnecessarily on every iteration.

Hmm. I'm not so sure. We're now relying on the return value of WaitLatch(), which isn't guaranteed to report all wake-up events (although I don't believe it would be a problem in this exact case). Previously, we called PostmasterIsAlive() once a second anyway, and that wasn't much of a problem.

>> Incidentally, should I worry about the timeout long for WaitLatch()
>> overflowing?
>
> How would that happen?

struct timeval consists of two longs - one representing seconds, and the other representing the balance of microseconds. Previously, we didn't combine them into a single microsecond representation. Now, we do.

There could perhaps be a very large "nap", as determined by launcher_determine_sleep(), such that the total number of microseconds passed to WaitLatch() would exceed the largest long value that can be safely represented on some or all platforms. On most 32-bit machines, sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 = 2,147,483,647 microseconds = only about 35 minutes. There are corner cases, such as if someone were to set autovacuum_naptime to something silly.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On Jul 18, 2011, at 15:12, Peter Geoghegan wrote:
> struct timeval consists of two longs - one representing seconds, and
> the other representing the balance of microseconds. Previously, we
> didn't combine them into a single microsecond representation. Now, we
> do.

I haven't actually looked at the code, so maybe I'm missing something obvious - but why don't we use a double precision float (double) instead of a long for the combined representation?

best regards,
Florian Pflug
On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>>> Another concern is, what happens when we receive a signal, generically
>>> handled or otherwise, and have to SetLatch() to avoid time-out
>>> invalidation? Should we just live with a spurious
>>> AutoVacLauncherMain() iteration, or should we do something like check
>>> if the return value of WaitLatch indicates that we woke up due to a
>>> SetLatch() call, which must have been within a signal handler, and
>>> that we should therefore goto just before WaitLatch() and elide the
>>> spurious iteration? Given that we can expect some signals to occur
>>> relatively frequently, spurious iterations could be a real concern.
>>
>> Really? I suspect that it doesn't much matter exactly how many
>> machine language instructions we execute on each wake-up, within
>> reasonable bounds, of course. Maybe some testing is in order?
>
> There's only one way to get around the time-out invalidation problem
> that I'm aware of - call SetLatch() in the handler. I'd be happy to
> hear alternatives, but until we have an alternative, we're stuck
> managing this in each and every signal handler.
>
> Once we've had the latch set to handle this, and control returns to
> the auxiliary process loop, we now have to decide from within the
> auxiliary if we can figure out that all that happened was a "required"
> wake-up, and thus we shouldn't really go through with another
> iteration. That, or we can simply do the iteration.
>
> I have my doubts that it is acceptable to wake up spuriously in
> response to routine events that there are generic handlers for. Maybe
> this needs to be decided on a case-by-case basis.

I'm confused. If the process gets hit with a signal, it's already woken up, isn't it?
Whatever system call it was blocked on may or may not get restarted depending on the platform and what the signal handler does, but from an OS perspective, the process has already been allocated a time slice and will run until either the time slice is exhausted or it again blocks.

>> On another note, I might be inclined to write something like:
>>
>> if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && !PostmasterIsAlive())
>>     proc_exit(1);
>>
>> ...so as to avoid calling that function unnecessarily on every iteration.
>
> Hmm. I'm not so sure. We're now relying on the return value of
> WaitLatch(), which isn't guaranteed to report all wake-up events
> (although I don't believe it would be a problem in this exact case).
> Previously, we called PostmasterIsAlive() once a second anyway, and
> that wasn't much of a problem.

Ah. OK.

>>> Incidentally, should I worry about the timeout long for WaitLatch()
>>> overflowing?
>>
>> How would that happen?
>
> struct timeval consists of two longs - one representing seconds, and
> the other representing the balance of microseconds. Previously, we
> didn't combine them into a single microsecond representation. Now, we
> do.
>
> There could perhaps be a very large "nap", as determined by
> launcher_determine_sleep(), such that the total number of microseconds
> passed to WaitLatch() would exceed the largest long value that can be
> safely represented on some or all platforms. On most 32-bit machines,
> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 =
> 2,147,483,647 microseconds = only about 35 minutes. There are corner
> cases, such as if someone were to set autovacuum_naptime to something
> silly.

OK. In that case, my feeling is "yes, you need to worry about that". I'm not sure exactly what the best solution is: we could either twiddle the WaitLatch interface some more, or restrict autovacuum_naptime to at most 30 minutes, or maybe there's some other option.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>> There could perhaps be a very large "nap", as determined by
>> launcher_determine_sleep(), such that the total number of microseconds
>> passed to WaitLatch() would exceed the largest long value that can be
>> safely represented on some or all platforms. On most 32-bit machines,
>> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 =
>> 2,147,483,647 microseconds = only about 35 minutes. There are corner
>> cases, such as if someone were to set autovacuum_naptime to something
>> silly.

> OK. In that case, my feeling is "yes, you need to worry about that".
> I'm not sure exactly what the best solution is: we could either
> twiddle the WaitLatch interface some more, or restrict
> autovacuum_naptime to at most 30 minutes, or maybe there's some other
> option.

A wakeup once every half hour would surely not be an issue from a power consumption standpoint. However, I'm not sure I understand here: aren't we trying to remove the timeouts completely?

regards, tom lane
On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>>> There could perhaps be a very large "nap", as determined by
>>> launcher_determine_sleep(), such that the total number of microseconds
>>> passed to WaitLatch() would exceed the largest long value that can be
>>> safely represented on some or all platforms. On most 32-bit machines,
>>> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 =
>>> 2,147,483,647 microseconds = only about 35 minutes. There are corner
>>> cases, such as if someone were to set autovacuum_naptime to something
>>> silly.
>
>> OK. In that case, my feeling is "yes, you need to worry about that".
>> I'm not sure exactly what the best solution is: we could either
>> twiddle the WaitLatch interface some more, or restrict
>> autovacuum_naptime to at most 30 minutes, or maybe there's some other
>> option.
>
> A wakeup once every half hour would surely not be an issue from a power
> consumption standpoint. However, I'm not sure I understand here: aren't
> we trying to remove the timeouts completely?

Well, in the case of the AV launcher, the issue is that the main loop is timed by definition, cf. autovacuum_naptime, and the WaitLatch() interface is apparently designed so that we can't sleep longer than 35 minutes. We can avoid waking every second simply to check whether the postmaster has died, but waking up every autovacuum_naptime seems de rigueur barring a major redesign of the mechanism.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A wakeup once every half hour would surely not be an issue from a power
>> consumption standpoint. However, I'm not sure I understand here: aren't
>> we trying to remove the timeouts completely?

> Well, in the case of the AV launcher, the issue is that the main loop
> is timed by definition, cf. autovacuum_naptime, and the WaitLatch()
> interface is apparently designed so that we can't sleep longer than 35
> minutes.

Hmm. Well, it's not too late to rethink the WaitLatch API, if we think that that might be a significant limitation. Offhand I don't see that it is, though.

regards, tom lane
On 18.07.2011 18:32, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> A wakeup once every half hour would surely not be an issue from a power
>>> consumption standpoint. However, I'm not sure I understand here: aren't
>>> we trying to remove the timeouts completely?
>
>> Well, in the case of the AV launcher, the issue is that the main loop
>> is timed by definition, cf. autovacuum_naptime, and the WaitLatch()
>> interface is apparently designed so that we can't sleep longer than 35
>> minutes.
>
> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think
> that that might be a significant limitation.

Right, we can easily change the timeout argument to be in milliseconds instead of microseconds.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 18.07.2011 18:32, Tom Lane wrote:
>> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think
>> that that might be a significant limitation.

> Right, we can easily change the timeout argument to be in milliseconds
> instead of microseconds.

On the whole I'd be more worried about giving up the shorter waits than the longer ones --- it's not too hard to imagine using submillisecond timeouts in the future, as machines get faster. If we really wanted to fix this, I think we need to move to a wider datatype.

regards, tom lane
On Mon, Jul 18, 2011 at 03:12:20PM -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> > On 18.07.2011 18:32, Tom Lane wrote:
> >> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think
> >> that that might be a significant limitation.
>
> > Right, we can easily change the timeout argument to be in milliseconds
> > instead of microseconds.
>
> On the whole I'd be more worried about giving up the shorter waits than
> the longer ones --- it's not too hard to imagine using submillisecond
> timeouts in the future, as machines get faster. If we really wanted to
> fix this, I think we need to move to a wider datatype.
>
> regards, tom lane

You could also tag the high bit to allow you to encode larger ranges, by having microseconds for values with the high bit = 0 and milliseconds for values with the high bit = 1. Then you could have the best of both without punching up the width of the datatype.

Regards,
Ken
On Mon, Jul 18, 2011 at 3:28 PM, ktm@rice.edu <ktm@rice.edu> wrote:
> On Mon, Jul 18, 2011 at 03:12:20PM -0400, Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> > On 18.07.2011 18:32, Tom Lane wrote:
>> >> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think
>> >> that that might be a significant limitation.
>>
>> > Right, we can easily change the timeout argument to be in milliseconds
>> > instead of microseconds.
>>
>> On the whole I'd be more worried about giving up the shorter waits than
>> the longer ones --- it's not too hard to imagine using submillisecond
>> timeouts in the future, as machines get faster. If we really wanted to
>> fix this, I think we need to move to a wider datatype.
>
> You could also tag the high bit to allow you to encode larger ranges
> by having microseconds for the values with the high bit = 0 and
> milliseconds for the values with the high bit = 1. Then you could
> have the best of both without punching up the width of the datatype.

Considering that we're just talking about a function call here, and not something that ever goes out to disk, that seems like entirely too much notational complexity.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 18 July 2011 20:06, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think
>> that that might be a significant limitation.
>
> Right, we can easily change the timeout argument to be in milliseconds
> instead of microseconds.

+1

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Attached is a revision of this patch that now treats the latch in PGPROC, waitLatch, as the generic "process latch", rather than just using it for sync rep; it is initialised appropriately as a shared latch generically, within InitProcGlobal(), and ownership is subsequently set within InitProcess(). We were doing so before, though only for the benefit of sync rep.

The idea here is to have a per-process latch to guard against time-out invalidation issues from within each generic signal handler, by calling SetLatch() on our generic latch there, much as we already do from within non-generic archiver process signal handlers on the archiver's static, non-generic latch. (The archiver has its MyProc pointer set to NULL, and we allow for the possibility that generic handlers may be registered within processes that have a NULL MyProc - though latch timeout invalidation issues then become the responsibility of the process exclusively, just as is currently the case with the archiver. In other words, such processes had better not register a generic signal handler.)

It doesn't really matter that the SetLatch() call will usually be unnecessary, because, as Heikki once pointed out, redundantly setting a latch that is already set is very cheap. We don't check if it's set directly in advance of setting the latch, because the Latch struct is "logically opaque" and there is no "public function" to check if it's set, nor should there be; the checking simply happens in SetLatch().

On 18 July 2011 20:06, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Right, we can easily change the timeout argument to be in milliseconds
> instead of microseconds.

I've done so in this latest revision as a precautionary measure. I don't see much point in sub-millisecond granularity, and besides, the Windows implementation will not provide that granularity anyway as things stand.

Thoughts?

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes:
> Attached is a revision of this patch that now treats the latch in
> PGPROC, waitLatch, as the generic "process latch", rather than just
> using it for sync rep; it is initialised appropriately as a shared
> latch generically, within InitProcGlobal(), and ownership is
> subsequently set within InitProcess(). We were doing so before, though
> only for the benefit of sync rep.

Now that I've got the WaitLatch code fully swapped into my head, I'm thinking of pushing on to review/commit this patch of Peter's.

> On 18 July 2011 20:06, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Right, we can easily change the timeout argument to be in milliseconds
>> instead of microseconds.

> I've done so in this latest revision as a precautionary measure. I
> don't see much point in sub-millisecond granularity, and besides, the
> Windows implementation will not provide that granularity anyway as
> things stand.

I did not see any objections to such a change. I think we should pull out this aspect and commit it to 9.1 as well as HEAD. That will provide one less gotcha for anyone who develops against the 9.1 latch code and later needs to port to 9.2.

regards, tom lane
On 9 August 2011 23:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now that I've got the WaitLatch code fully swapped into my head,
> I'm thinking of pushing on to review/commit this patch of Peter's.

Thanks for giving this your attention. I had already planned to produce a new revision this weekend, so I'd appreciate it if you could hold off until Sunday or Monday.

> I did not see any objections to such a change. I think we should pull
> out this aspect and commit it to 9.1 as well as HEAD. That will provide
> one less gotcha for anyone who develops against the 9.1 latch code and
> later needs to port to 9.2.

That is a good point. I'm aware that someone already made the mistake of giving the value of the timeout as milliseconds rather than microseconds at one point, so this seems to be a fertile source of confusion.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes:
> On 9 August 2011 23:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now that I've got the WaitLatch code fully swapped into my head,
>> I'm thinking of pushing on to review/commit this patch of Peter's.

> Thanks for giving this your attention. I had already planned to
> produce a new revision this weekend, so I'd appreciate it if you could
> hold off until Sunday or Monday.

Actually, I'm nearly done with it already. Perhaps you could start thinking about the other polling loops.

regards, tom lane
On 10 August 2011 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually, I'm nearly done with it already. Perhaps you could start
> thinking about the other polling loops.

Fair enough. I'm slightly surprised that there doesn't need to be some bikeshedding about my idea to treat the PGPROC latch as the generic, per-process latch.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes:
> On 10 August 2011 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Actually, I'm nearly done with it already. Perhaps you could start
>> thinking about the other polling loops.

> Fair enough. I'm slightly surprised that there doesn't need to be some
> bikeshedding about my idea to treat the PGPROC latch as the generic,
> per-process latch.

No, I don't find that unreasonable, especially not since Simon had made that the de facto situation anyhow by having it be initialized for all backends in proc.c and set unconditionally by some of the standard signal handlers. I am working on renaming it to procLatch (I find "waitLatch" a bit too generic) and fixing a bunch of pre-existing bugs that I now see in that code, like failure to save/restore errno in signal handlers that used to only set a flag but now also call SetLatch.

regards, tom lane
Peter Geoghegan <peter@2ndquadrant.com> writes:
> Attached is a revision of this patch that now treats the latch in
> PGPROC, waitLatch, as the generic "process latch", rather than just
> using it for sync rep; it is initialised appropriately as a shared
> latch generically, within InitProcGlobal(), and ownership is
> subsequently set within InitProcess(). We were doing so before, though
> only for the benefit of sync rep.

Applied with some corrections, notably:

* Signal handlers mustn't change errno and mustn't assume MyProc is set, since they might get invoked before it's set or after it's cleared. Calling SetLatch outside the save/restore of errno is right out, of course, but I also found that quite a lot of handlers had previously been hacked to call SetLatch or latch_sigusr1_handler without any save/restore at all. I'm not sure if any of those were your mistake; I think a lot of them were pre-existing bugs in the sync rep patch.

* avlauncher loop was missing a ResetLatch call. I also added a comment explaining the difference from the normal pattern for using WaitLatch.

* I didn't add a SetLatch call in procsignal_sigusr1_handler. I'm unconvinced that it's necessary, and if it is, we probably need to rethink using SIGUSR1 internally in unix_latch.c. It does not make sense to set the procLatch when we get a signal that's directed towards releasing some other latch.

regards, tom lane
On Wed, Aug 10, 2011 at 3:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <peter@2ndquadrant.com> writes:
>> On 10 August 2011 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Actually, I'm nearly done with it already. Perhaps you could start
>>> thinking about the other polling loops.
>
>> Fair enough. I'm slightly surprised that there doesn't need to be some
>> bikeshedding about my idea to treat the PGPROC latch as the generic,
>> per-process latch.
>
> No, I don't find that unreasonable, especially not since Simon had made
> that the de facto situation anyhow by having it be initialized for all
> backends in proc.c and set unconditionally by some of the standard
> signal handlers. I am working on renaming it to procLatch (I find
> "waitLatch" a bit too generic) and

That was the direction I wanted to go in anyway, as you guessed.

> fixing a bunch of pre-existing bugs
> that I now see in that code, like failure to save/restore errno in
> signal handlers that used to only set a flag but now also call SetLatch.

Thanks for looking at the code; sounds like we nipped a few would-have-been-bugs there.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services