Thread: Re: [PATCHES] WIP archive_timeout patch
Simon Riggs <simon@2ndquadrant.com> writes:
> WIP archive_timeout.
> All we need to do is add LWLock support to archiver.
> Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

regards, tom lane

On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > WIP archive_timeout.
> > All we need to do is add LWLock support to archiver.
> > Thoughts/ideas/hints welcome.
>
> Hint: this isn't the archiver's problem, and so you don't need to get
> the archiver involved in the solution. I'd suggest bgwriter as a
> reasonably appropriate place instead.

OK

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:
> On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > WIP archive_timeout.
> > > All we need to do is add LWLock support to archiver.
> > > Thoughts/ideas/hints welcome.
> >
> > Hint: this isn't the archiver's problem, and so you don't need to get
> > the archiver involved in the solution. I'd suggest bgwriter as a
> > reasonably appropriate place instead.
>
> OK

A slightly fuller answer:

Yes, that's a safer place than archiver, so I'll add it to bgwriter as
you suggest.

Should have a patch complete by Tuesday, since travelling now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Simon Riggs <simon@2ndquadrant.com> writes:
>> Revised patch enclosed, now believed to be production ready. This
>> implements regular log switching using the archive_timeout GUC.

> Further patch enclosed implementing these changes plus the record type
> version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one. I'm not sure how much trouble it's worth taking to
prevent this scenario, though. If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...

regards, tom lane

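The interaction described above (checkpoint and segment switch each treating the other's WAL record as activity) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyWal`, `run_idle`, and the record names are hypothetical and do not correspond to anything in xlog.c; the model simply runs one checkpoint attempt and one switch attempt per cycle.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy WAL stream (hypothetical): counts records, performs no real I/O."""
    segment: int = 0                                   # index of current segment
    records_in_segment: list = field(default_factory=list)
    records_since_checkpoint: int = 0

    def insert(self, kind):
        """Append a record of the given kind to the current segment."""
        self.records_in_segment.append(kind)
        self.records_since_checkpoint += 1

    def checkpoint(self):
        """Skip the checkpoint if nothing was logged since the last one."""
        if self.records_since_checkpoint == 0:
            return False
        self.insert("CHECKPOINT")
        self.records_since_checkpoint = 0
        return True

    def request_switch(self):
        """Skip the switch if the current segment holds no records."""
        if not self.records_in_segment:
            return False
        # The switch is itself a WAL record, so the next checkpoint
        # will see it as "something happened".
        self.insert("XLOG_SWITCH")
        self.segment += 1
        self.records_in_segment = []
        return True


def run_idle(wal, cycles):
    """Run idle cycles (no user activity) and count forced switches."""
    switches = 0
    for _ in range(cycles):
        wal.checkpoint()
        if wal.request_switch():
            switches += 1
    return switches
```

In this model, a stream that has seen a single real record keeps switching on every idle cycle, because the checkpoint record makes the new segment non-empty and the switch record makes the next checkpoint non-trivial. A stream that never saw any record stays quiet.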
Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>>> Revised patch enclosed, now believed to be production ready. This
>>> implements regular log switching using the archive_timeout GUC.
>
>> Further patch enclosed implementing these changes plus the record type
>> version of pg_xlogfile_name_offset()
>
> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.
>
> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log. The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one. I'm not sure how much trouble it's worth taking to
> prevent this scenario, though. If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

Actually, this behaviour IMHO even has its advantages - if you can be
sure that at least one WAL file will be archived every 5 minutes, then
it's easy to monitor the replication - you can just watch the logfile of
the slave, and send a failure notice if no logfile is imported at least
every 10 minutes or so.

Of course, for this to be useful, the documentation would have to tell
people about that behaviour, and it couldn't easily be changed in the
next release...

greetings, Florian Pflug

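The monitoring idea above could be sketched roughly as follows. Note the substitution: instead of watching the slave's logfile, this sketch looks at file modification times in an archive directory, which is an assumption made for illustration. The function names and the 600-second default (the "10 minutes or so" from the message) are hypothetical.

```python
import os


def segment_mtimes(archive_dir):
    """Return the modification times of the regular files in archive_dir."""
    with os.scandir(archive_dir) as entries:
        return [entry.stat().st_mtime for entry in entries if entry.is_file()]


def archive_is_stale(mtimes, now, max_age_seconds=600):
    """Return True if no file is newer than max_age_seconds.

    An empty list counts as stale: nothing has been archived at all.
    """
    if not mtimes:
        return True
    return now - max(mtimes) > max_age_seconds
```

A cron job might call `archive_is_stale(segment_mtimes(path), time.time())` and send a failure notice when it returns True. This only works as a liveness check if a segment is guaranteed to arrive at least once per checkpoint interval, which is exactly the behaviour under discussion.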
On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> >> Revised patch enclosed, now believed to be production ready. This
> >> implements regular log switching using the archive_timeout GUC.
>
> > Further patch enclosed implementing these changes plus the record type
> > version of pg_xlogfile_name_offset()
>
> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.

Code location: sure.

> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log. The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one. I'm not sure how much trouble it's worth taking to
> prevent this scenario, though. If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

> I noticed a minor annoyance while testing: when the system is
> completely idle, you get a forced segment switch every
> checkpoint_timeout seconds, even though there is nothing
> useful to log. The checkpoint code is smart enough not to do
> a checkpoint if nothing has happened since the last one, and
> the xlog switch code is smart enough not to do a switch if
> nothing has happened since the last one ... but they aren't
> talking to each other and so each one's change looks like
> "something happened" to the other one. I'm not sure how much
> trouble it's worth taking to prevent this scenario, though.
> If you can't afford a WAL file switch every five minutes, you
> probably shouldn't be using archive_timeout anyway ...

Um, I would have thought practical timeouts would be more than
5 minutes rather than less. So this does seem like a problem to me :-(

Andreas

Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
>> I noticed a minor annoyance while testing: when the system is completely
>> idle, you get a forced segment switch every checkpoint_timeout seconds,
>> even though there is nothing useful to log. The checkpoint code is
>> smart enough not to do a checkpoint if nothing has happened since the
>> last one, and the xlog switch code is smart enough not to do a switch
>> if nothing has happened since the last one ... but they aren't talking
>> to each other and so each one's change looks like "something happened"
>> to the other one.

> I noticed that minor annoyance and understood that I had fixed it before
> submitting. That was the reason for putting the code in bgwriter to
> check whether the pointer had moved before attempting the switch...
> perhaps that functionality has been removed?

No, the original form of the patch was equally vulnerable. AFAICS the
only way to prevent this would be for XLogRequestSwitch (or really
XLogInsert, which does the heavy lifting for this) to suppress a switch
if the current segment is empty *or* contains only a checkpoint WAL
record. Basically it'd have to pretend the checkpoint record is not
there. This is doable but seems a bit weird --- in particular, that
would mean that pg_switch_xlog sometimes returns a pointer less than
pg_current_xlog_location, which might confuse backup scripts.

On the whole I'm leaning towards not changing it. As Florian mentioned,
guaranteed segment-every-checkpoint isn't completely without its uses.
And people who are looking for low WAL volume ought to be stretching
out their checkpoint intervals anyway.

regards, tom lane

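The suppression rule discussed above (skip the switch when the segment is empty or holds only a checkpoint record) and its side effect can be sketched in a few lines. Everything here is hypothetical: the function names, the record sizes, and in particular the choice to report the segment start when a switch is suppressed, which is just one way of "pretending the checkpoint record is not there". It is not the behaviour of XLogRequestSwitch or pg_switch_xlog.

```python
CHECKPOINT = "CHECKPOINT"


def segment_is_effectively_empty(records):
    """True if the segment is empty or contains only checkpoint records.

    records is a list of (kind, length_in_bytes) tuples.
    """
    return all(kind == CHECKPOINT for kind, _length in records)


def request_switch(segment_start, records):
    """Return (switched, reported_location) for a toy segment.

    Assumption: a suppressed switch reports the segment start,
    ignoring any checkpoint record already written there.
    """
    current_location = segment_start + sum(length for _kind, length in records)
    if segment_is_effectively_empty(records):
        return False, segment_start
    return True, current_location
```

With a lone checkpoint record of, say, 80 bytes at offset 1000, the current write location is 1080 while the suppressed switch reports 1000. That gap is the "pointer less than pg_current_xlog_location" oddity that might confuse backup scripts.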
On Fri, 2006-08-18 at 08:52 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
> >> I noticed a minor annoyance while testing: when the system is completely
> >> idle, you get a forced segment switch every checkpoint_timeout seconds,
> >> even though there is nothing useful to log. The checkpoint code is
> >> smart enough not to do a checkpoint if nothing has happened since the
> >> last one, and the xlog switch code is smart enough not to do a switch
> >> if nothing has happened since the last one ... but they aren't talking
> >> to each other and so each one's change looks like "something happened"
> >> to the other one.
>
> > I noticed that minor annoyance and understood that I had fixed it before
> > submitting. That was the reason for putting the code in bgwriter to
> > check whether the pointer had moved before attempting the switch...
> > perhaps that functionality has been removed?
>
> No, the original form of the patch was equally vulnerable. AFAICS the
> only way to prevent this would be for XLogRequestSwitch (or really
> XLogInsert, which does the heavy lifting for this) to suppress a switch
> if the current segment is empty *or* contains only a checkpoint WAL
> record. Basically it'd have to pretend the checkpoint record is not
> there. This is doable but seems a bit weird --- in particular, that
> would mean that pg_switch_xlog sometimes returns a pointer less than
> pg_current_xlog_location, which might confuse backup scripts.
>
> On the whole I'm leaning towards not changing it. As Florian mentioned,
> guaranteed segment-every-checkpoint isn't completely without its uses.
> And people who are looking for low WAL volume ought to be stretching
> out their checkpoint intervals anyway.

Agreed.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com