Thread: Re: [PATCHES] WIP archive_timeout patch

Re: [PATCHES] WIP archive_timeout patch

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> WIP archive_timeout.
> All we need to do is add LWLock support to archiver.
> Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution.  I'd suggest bgwriter as a
reasonably appropriate place instead.

            regards, tom lane

Re: [PATCHES] WIP archive_timeout patch

From: Simon Riggs
On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > WIP archive_timeout.
> > All we need to do is add LWLock support to archiver.
> > Thoughts/ideas/hints welcome.
>
> Hint: this isn't the archiver's problem, and so you don't need to get
> the archiver involved in the solution.  I'd suggest bgwriter as a
> reasonably appropriate place instead.

OK

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] WIP archive_timeout patch

From: Simon Riggs
On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:
> On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > WIP archive_timeout.
> > > All we need to do is add LWLock support to archiver.
> > > Thoughts/ideas/hints welcome.
> >
> > Hint: this isn't the archiver's problem, and so you don't need to get
> > the archiver involved in the solution.  I'd suggest bgwriter as a
> > reasonably appropriate place instead.
>
> OK

A slightly fuller answer:

Yes, that's a safer place than the archiver, so I'll add it to bgwriter
as you suggest. I should have a patch complete by Tuesday, since I'm
travelling now.

--
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com


Re: [PATCHES] WIP archive_timeout patch

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
>> Revised patch enclosed, now believed to be production ready. This
>> implements regular log switching using the archive_timeout GUC.

> Further patch enclosed implementing these changes plus the record type
> version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log.  The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one.  I'm not sure how much trouble it's worth taking to
prevent this scenario, though.  If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...
        regards, tom lane
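
The interaction described above can be sketched with a toy model. This is
only an illustration in Python, not PostgreSQL code: the ToyWal class, its
fields, and idle_cycles() are hypothetical names, and the "has anything
happened" test is reduced to comparing record counts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyWal:
    """Toy stand-in for the WAL stream; not PostgreSQL's real data structures."""
    records: List[str] = field(default_factory=list)
    seen_by_checkpoint: int = 0  # record count right after the last checkpoint
    seen_by_switch: int = 0      # record count right after the last switch

    def checkpoint(self) -> bool:
        """Write a checkpoint record unless nothing was inserted since the last one."""
        if len(self.records) == self.seen_by_checkpoint:
            return False
        self.records.append("CHECKPOINT")
        self.seen_by_checkpoint = len(self.records)
        return True

    def switch(self) -> bool:
        """Write a switch record unless nothing was inserted since the last one."""
        if len(self.records) == self.seen_by_switch:
            return False
        self.records.append("SWITCH")
        self.seen_by_switch = len(self.records)
        return True


def idle_cycles(wal: ToyWal, n: int) -> int:
    """Run n checkpoint-then-switch cycles with no user activity; count switches."""
    switches = 0
    for _ in range(n):
        wal.checkpoint()   # sees the previous SWITCH record as "something happened"
        if wal.switch():   # sees the new CHECKPOINT record as "something happened"
            switches += 1
    return switches
```

In this model a single user record followed by five idle cycles yields five
switches, whereas calling switch() twice in a row with no checkpoint in
between performs only the first one.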


Re: [PATCHES] WIP archive_timeout patch

From: "Florian G. Pflug"
Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>>> Revised patch enclosed, now believed to be production ready. This
>>> implements regular log switching using the archive_timeout GUC.
> 
>> Further patch enclosed implementing these changes plus the record type
>> version of pg_xlogfile_name_offset()
> 
> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.
> 
> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log.  The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one.  I'm not sure how much trouble it's worth taking to
> prevent this scenario, though.  If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

Actually, this behaviour IMHO even has its advantages - if you can be
sure that at least one WAL file will be archived every 5 minutes, then
it's easy to monitor the replication - you can just watch the logfile of
the slave, and send a failure notice if no logfile is imported at least
every 10 minutes or so.

Of course, for this to be useful, the documentation would have to tell
people about that behaviour, and it couldn't easily be changed in the next
release...

greetings, Florian Pflug
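
A monitoring check along these lines might look like the sketch below. It
is a simplification, not what the message describes literally: instead of
parsing the slave's logfile, it looks at file modification times in a
directory where imported WAL files are assumed to land. The function
names, the directory argument, and the 600-second default are all
hypothetical.

```python
import os
import time
from typing import Optional


def newest_mtime(directory: str) -> Optional[float]:
    """Return the latest modification time among regular files, or None if there are none."""
    mtimes = [entry.stat().st_mtime
              for entry in os.scandir(directory)
              if entry.is_file()]
    return max(mtimes) if mtimes else None


def replication_stalled(directory: str,
                        max_age_seconds: float = 600.0,
                        now: Optional[float] = None) -> bool:
    """True if no file in directory was modified within max_age_seconds.

    An empty directory counts as stalled. The `now` parameter exists only
    so the check can be exercised with a fixed clock.
    """
    current = time.time() if now is None else now
    latest = newest_mtime(directory)
    return latest is None or (current - latest) > max_age_seconds
```

A cron job could call replication_stalled() on the standby and send the
failure notice when it returns True. The threshold is deliberately about
twice the expected switch interval so that one slow transfer does not
raise an alarm.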


Re: [PATCHES] WIP archive_timeout patch

From: Simon Riggs
On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> >> Revised patch enclosed, now believed to be production ready. This
> >> implements regular log switching using the archive_timeout GUC.
> 
> > Further patch enclosed implementing these changes plus the record type
> > version of pg_xlogfile_name_offset()
> 
> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.

Code location: sure.

> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log.  The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one.  I'm not sure how much trouble it's worth taking to
> prevent this scenario, though.  If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: [PATCHES] WIP archive_timeout patch

From: "Zeugswetter Andreas DCP SD"
> I noticed a minor annoyance while testing: when the system is
> completely idle, you get a forced segment switch every
> checkpoint_timeout seconds, even though there is nothing
> useful to log.  The checkpoint code is smart enough not to do
> a checkpoint if nothing has happened since the last one, and
> the xlog switch code is smart enough not to do a switch if
> nothing has happened since the last one ... but they aren't
> talking to each other and so each one's change looks like
> "something happened"
> to the other one.  I'm not sure how much trouble it's worth
> taking to prevent this scenario, though.  If you can't afford
> a WAL file switch every five minutes, you probably shouldn't
> be using archive_timeout anyway ...

Um, I would have thought practical timeouts would be more than
5 minutes rather than less. So this does seem like a problem to me :-(

Andreas


Re: [PATCHES] WIP archive_timeout patch

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
>> I noticed a minor annoyance while testing: when the system is completely
>> idle, you get a forced segment switch every checkpoint_timeout seconds,
>> even though there is nothing useful to log.  The checkpoint code is
>> smart enough not to do a checkpoint if nothing has happened since the
>> last one, and the xlog switch code is smart enough not to do a switch
>> if nothing has happened since the last one ... but they aren't talking
>> to each other and so each one's change looks like "something happened"
>> to the other one.

> I noticed that minor annoyance and understood that I had fixed it before
> submitting. That was the reason for putting the code in bgwriter to
> check whether the pointer had moved before attempting the switch...
> perhaps that functionality has been removed?

No, the original form of the patch was equally vulnerable.  AFAICS the
only way to prevent this would be for XLogRequestSwitch (or really
XLogInsert, which does the heavy lifting for this) to suppress a switch
if the current segment is empty *or* contains only a checkpoint WAL
record.  Basically it'd have to pretend the checkpoint record is not
there.  This is doable but seems a bit weird --- in particular, that
would mean that pg_switch_xlog sometimes returns a pointer less than
pg_current_xlog_location, which might confuse backup scripts.

On the whole I'm leaning towards not changing it.  As Florian mentioned,
guaranteed segment-every-checkpoint isn't completely without its uses.
And people who are looking for low WAL volume ought to be stretching
out their checkpoint intervals anyway.
        regards, tom lane
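
The difference between the applied behaviour and the suppression rule
discussed above can be written as two predicates. This is a Python sketch
under stated assumptions, not the logic in xlog.c: a segment is modelled
as a plain list of record-kind strings, and both function names are
hypothetical.

```python
from typing import List


def switch_as_applied(segment_records: List[str]) -> bool:
    """Switch whenever the current segment holds any record at all."""
    return len(segment_records) > 0


def switch_with_suppression(segment_records: List[str]) -> bool:
    """Switch only if the segment holds something other than checkpoint records.

    An empty segment and a checkpoint-only segment are treated alike,
    i.e. the checkpoint record is pretended not to be there.
    """
    return any(kind != "CHECKPOINT" for kind in segment_records)
```

The two predicates disagree only for a segment containing nothing but a
checkpoint record, which is exactly the idle-system case.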


Re: [PATCHES] WIP archive_timeout patch

From: Simon Riggs
On Fri, 2006-08-18 at 08:52 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:
> >> I noticed a minor annoyance while testing: when the system is completely
> >> idle, you get a forced segment switch every checkpoint_timeout seconds,
> >> even though there is nothing useful to log.  The checkpoint code is
> >> smart enough not to do a checkpoint if nothing has happened since the
> >> last one, and the xlog switch code is smart enough not to do a switch
> >> if nothing has happened since the last one ... but they aren't talking
> >> to each other and so each one's change looks like "something happened"
> >> to the other one.
> 
> > I noticed that minor annoyance and understood that I had fixed it before
> > submitting. That was the reason for putting the code in bgwriter to
> > check whether the pointer had moved before attempting the switch...
> > perhaps that functionality has been removed?
> 
> No, the original form of the patch was equally vulnerable.  AFAICS the
> only way to prevent this would be for XLogRequestSwitch (or really
> XLogInsert, which does the heavy lifting for this) to suppress a switch
> if the current segment is empty *or* contains only a checkpoint WAL
> record.  Basically it'd have to pretend the checkpoint record is not
> there.  This is doable but seems a bit weird --- in particular, that
> would mean that pg_switch_xlog sometimes returns a pointer less than
> pg_current_xlog_location, which might confuse backup scripts.
> 
> On the whole I'm leaning towards not changing it.  As Florian mentioned,
> guaranteed segment-every-checkpoint isn't completely without its uses.
> And people who are looking for low WAL volume ought to be stretching
> out their checkpoint intervals anyway.

Agreed.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com