Thread: Re: [ADMIN] does wal archiving block the current client connection?

Re: [ADMIN] does wal archiving block the current client connection?

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> OK, I'm on it.

What solution have you got in mind?  I was thinking about an fcntl lock
to ensure only one archiver is active in a given data directory.  That
would fix the problem without affecting anything outside the archiver.
Not sure what's the most portable way to do it though.
        regards, tom lane
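
A minimal sketch of the fcntl interlock described above, assuming a POSIX system. The function name acquire_archiver_lock and the lock-file path are hypothetical and not taken from any PostgreSQL source; the only technique shown is a non-blocking advisory write lock taken with fcntl(F_SETLK).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take an exclusive advisory lock on lock_path (a hypothetical
 * file, e.g. somewhere inside the data directory).
 *
 * Returns an open file descriptor on success.  Returns -1 if another
 * process already holds the lock or if an error occurs; errno is set.
 * The kernel drops the lock when the holder exits or closes the file.
 */
int
acquire_archiver_lock(const char *lock_path)
{
    struct flock fl = {0};
    int          fd;

    assert(lock_path != NULL);

    fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    fl.l_type = F_WRLCK;        /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file": lock it all */

    /* F_SETLK does not block: it fails at once if the lock is taken. */
    if (fcntl(fd, F_SETLK, &fl) < 0)
    {
        int save_errno = errno;

        close(fd);
        errno = save_errno;
        return -1;
    }
    return fd;
}
```

A second archiver started against the same directory would get -1 here and could exit or retry later. Because the kernel releases the lock when its holder dies, no stale-lock cleanup is needed, which is the main attraction over a pid file. The portability concern still applies: fcntl locks are per-process and their behavior on some network filesystems is unreliable.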


Re: [ADMIN] does wal archiving block the current client connection?

From: Simon Riggs
On Fri, 2006-05-19 at 12:03 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > OK, I'm on it.
> 
> What solution have you got in mind?  I was thinking about an fcntl lock
> to ensure only one archiver is active in a given data directory.  That
> would fix the problem without affecting anything outside the archiver.
> Not sure what's the most portable way to do it though.

I was trying to think of a better way than using an archiver.pid file in
pg_xlog/archive_status...

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



Re: [ADMIN] does wal archiving block the current client

From: Simon Riggs
On Fri, 2006-05-19 at 17:27 +0100, Simon Riggs wrote:
> On Fri, 2006-05-19 at 12:03 -0400, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > OK, I'm on it.
> > 
> > What solution have you got in mind?  I was thinking about an fcntl lock
> > to ensure only one archiver is active in a given data directory.  That
> > would fix the problem without affecting anything outside the archiver.
> > Not sure what's the most portable way to do it though.
> 
> I was trying to think of a better way than using an archiver.pid file in
> pg_xlog/archive_status...

Yesterday I posted to -patches with a new archiver.pid interlock
mechanism. This will prevent server startup when the archiver is first
activated, but once running will clean up and restart again.

This doesn't quite get to the nub of the problem: archiver is designed
to keep archiving files, even in the event that the postmaster explodes.
It will keep archiving until they're all gone. 

My recent patch will prevent server startup, so if you do a fast restart
to bounce the server and change parameters you'll have to keep the
server down while the archiver completes (or you kill it).

The archiver's Spartan diligence is great if postmaster does fail, but
archiver can't tell the difference between a normal shutdown and a
postmaster crash. If the postmaster sent a SIGUSR2 on normal shutdown,
we would be able to interrupt the outer loop and shut down much faster. A
starting postmaster might then reasonably wait a little while for the
old archiver to quit before starting the new one.

What do you think?

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com
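
A rough sketch of the SIGUSR2 idea from the message above, assuming a POSIX system. All names here (shutdown_requested, install_shutdown_handler, archive_one_file, archive_pending_files) are hypothetical, and archive_one_file is only a stand-in for the real copy step. The sketch checks the flag between files, which is one possible place to interrupt the work, not necessarily where the archiver would do it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the signal handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t shutdown_requested = 0;

static void
handle_sigusr2(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Install the SIGUSR2 handler.  Returns 0 on success, -1 on failure. */
int
install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigusr2;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Hypothetical stand-in for archiving one WAL file; always succeeds. */
static int
archive_one_file(int file_index)
{
    (void) file_index;
    return 0;
}

/*
 * Archive up to nfiles files, stopping early once a normal shutdown
 * has been signalled.  Returns the number of files archived.
 */
int
archive_pending_files(int nfiles)
{
    int archived = 0;

    assert(nfiles >= 0);
    for (int i = 0; i < nfiles; i++)
    {
        if (shutdown_requested)
            break;              /* normal shutdown: stop promptly */
        if (archive_one_file(i) != 0)
            break;              /* archiving failed: retry later */
        archived++;
    }
    return archived;
}
```

With this shape, a postmaster crash sends no signal, so the loop would run to completion, while a normal shutdown sets the flag and ends the loop before the next file.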



Re: [ADMIN] does wal archiving block the current client connection?

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> This doesn't quite get to the nub of the problem: archiver is designed
> to keep archiving files, even in the event that the postmaster explodes.
> It will keep archiving until they're all gone. 

I think we just need a PostmasterIsAlive check in the per-file loop.
        regards, tom lane
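
A minimal sketch of a liveness check inside the per-file loop, assuming a POSIX system. This is not the actual PostmasterIsAlive implementation: parent_is_alive and archive_while_parent_alive are hypothetical names, and the check relies on the assumption that getppid() changes once the original parent exits and the child is reparented.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Report whether the process that started us is still our parent.
 * When a parent exits, the child is reparented, so getppid() no
 * longer matches the pid recorded at startup.
 */
static bool
parent_is_alive(pid_t original_parent)
{
    return getppid() == original_parent;
}

/*
 * Process up to nfiles files, checking before each one that the
 * original parent still exists.  Returns the number of files handled.
 */
int
archive_while_parent_alive(pid_t original_parent, int nfiles)
{
    int archived = 0;

    assert(nfiles >= 0);
    for (int i = 0; i < nfiles; i++)
    {
        if (!parent_is_alive(original_parent))
            break;              /* parent is gone: close up shop */

        /* Placeholder: copying one WAL file would happen here. */
        archived++;
    }
    return archived;
}
```

A child would record getppid() once at startup and pass that value in. Checking per file rather than once per outer-loop pass bounds the delay after the parent's death to the time taken to archive a single file.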


Re: [ADMIN] does wal archiving block the current client

From: Simon Riggs
On Tue, 2006-05-23 at 10:53 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > This doesn't quite get to the nub of the problem: archiver is designed
> > to keep archiving files, even in the event that the postmaster explodes.
> > It will keep archiving until they're all gone. 
> 
> I think we just need a PostmasterIsAlive check in the per-file loop.

...which would mean the archiver would not outlive postmaster in the
event it crashes...which is exactly the time you want it to keep going.

Granted, that's an easy change.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



Re: [ADMIN] does wal archiving block the current client connection?

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> On Tue, 2006-05-23 at 10:53 -0400, Tom Lane wrote:
>> I think we just need a PostmasterIsAlive check in the per-file loop.

> ...which would mean the archiver would not outlive postmaster in the
> event it crashes...which is exactly the time you want it to keep going.

Postmaster crashes are not a problem in practice; we've been careful to
keep the postmaster doing so little that there's no material risk of it
failing.  If the postmaster dies it's almost certainly because someone
killed it, and you really want the child processes to close up shop too.

(If we did want the archiver to keep running, it shouldn't have any
PostmasterIsAlive check at all; I can't see a reason why completing
one iteration of the outer loop is a better time to stop than any
other time.)
        regards, tom lane


Re: [ADMIN] does wal archiving block the current client

From: Simon Riggs
On Tue, 2006-05-23 at 11:09 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Tue, 2006-05-23 at 10:53 -0400, Tom Lane wrote:
> >> I think we just need a PostmasterIsAlive check in the per-file loop.
> 
> > ...which would mean the archiver would not outlive postmaster in the
> > event it crashes...which is exactly the time you want it to keep going.
> 
> Postmaster crashes are not a problem in practice; we've been careful to
> keep the postmaster doing so little that there's no material risk of it
> failing.  If the postmaster dies it's almost certainly because someone
> killed it, and you really want the child processes to close up shop too.
> 
> (If we did want the archiver to keep running, it shouldn't have any
> PostmasterIsAlive check at all; I can't see a reason why completing
> one iteration of the outer loop is a better time to stop than any
> other time.)

This does at least solve the fast restart problem, so look on -patches
in a few minutes.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



Re: [ADMIN] does wal archiving block the current client

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> My recent patch will prevent server startup, so if you do a fast restart
> to bounce the server and change parameters you'll have to keep the
> server down while the archiver completes (or you kill it).

BTW, I was not planning on having it do that.  The archiver subprocess
should fail to start (and the PM keep trying to start it).  Not take
down the entire database.
        regards, tom lane