Thread: pg_autovacuum Win32 Service startup delay

pg_autovacuum Win32 Service startup delay

From
"Dave Page"
Date:
When starting as a service at boot time on Windows, pg_autovacuum may
fail to start because the PostgreSQL service is still starting up. This
patch causes the service to attempt a second connection 30 seconds after
the initial connection failure before giving up entirely.

Regards, Dave


Re: pg_autovacuum Win32 Service startup delay

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> When starting as a service at boot time on Windows, pg_autovacuum may
> fail to start because the PostgreSQL service is still starting up. This
> patch causes the service to attempt a second connection 30 seconds after
> the initial connection failure before giving up entirely.

Hm.  In the event that the system crashed beforehand, it could require much
more than 30 seconds to finish replaying the old WAL log.  So the above
doesn't seem super robust to me.  Would it be reasonable to try every 30
seconds for five minutes, or some such?  (Five minutes at least has a
defensible rationale, ie it's the default checkpoint interval and we
expect we can replay the log at least as fast as it was created
initially.)

            regards, tom lane

Re: pg_autovacuum Win32 Service startup delay

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Mon, Jan 24, 2005 at 06:57:54PM -0500, Tom Lane wrote:
>> (Five minutes at least has a defensible rationale, ie it's the default
>> checkpoint interval and we expect we can replay the log at least as
>> fast as it was created initially.)

> Hmm, I remember Mark Wong from OSDL saying that replaying the logs
> after a crash took longer than the six hours it had taken to generate
> them.

Six hours?  Did he have checkpoints disabled somehow?

            regards, tom lane

Re: pg_autovacuum Win32 Service startup delay

From
Alvaro Herrera
Date:
On Mon, Jan 24, 2005 at 06:57:54PM -0500, Tom Lane wrote:

> (Five minutes at least has a defensible rationale, ie it's the default
> checkpoint interval and we expect we can replay the log at least as
> fast as it was created initially.)

Hmm, I remember Mark Wong from OSDL saying that replaying the logs
after a crash took longer than the six hours it had taken to generate
them.  Simon commented that it was unexpected, but there was no further
comment on the issue.

(On his test the server is generating the logs as fast as it can, so it
may not be important, but anyway ... )

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Political science is the science of understanding why
 politicians act the way they do"  (netfunny.com)

Re: pg_autovacuum Win32 Service startup delay

From
"Matthew T. O'Connor"
Date:
Dave Page wrote:

>When starting as a service at boot time on Windows, pg_autovacuum may
>fail to start because the PostgreSQL service is still starting up. This
>patch causes the service to attempt a second connection 30 seconds after
>the initial connection failure before giving up entirely.

In the windows service world, is there any reason pg_autovacuum should
ever give up?  The reason I had it give up was so that it didn't
accidentally run against a different postgresql instance.  I don't think
that will happen in the windows service world.  I think it should keep
trying to do its job until it's told to exit.

Matthew


Re: pg_autovacuum Win32 Service startup delay

From
"Michael Paesold"
Date:
Tom Lane wrote:

> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>> On Mon, Jan 24, 2005 at 06:57:54PM -0500, Tom Lane wrote:
>>> (Five minutes at least has a defensible rationale, ie it's the default
>>> checkpoint interval and we expect we can replay the log at least as
>>> fast as it was created initially.)
>
>> Hmm, I remember Mark Wong from OSDL saying that replaying the logs
>> after a crash took longer than the six hours it had taken to generate
>> them.
>
> Six hours?  Did he have checkpoints disabled somehow?

No, I remember they were talking about recovery from backup using PITR.
(i.e. not simple crash recovery, but replaying the logs from the whole
benchmark session)

Best Regards,
Michael Paesold


Re: pg_autovacuum Win32 Service startup delay

From
Tom Lane
Date:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> In the windows service world, is there any reason pg_autovacuum should
> ever give up?

I was a bit worried about the scenario in which J Random Luser tries to
start the server twice and ends up with two autovacuum daemons attached
to the same postmaster.  I'm not sure if this is possible, probable,
or dangerous ... but it seems like a point to consider.

            regards, tom lane

Re: pg_autovacuum Win32 Service startup delay

From
Harald Armin Massa
Date:
Matthew T. O'Connor wrote:

> In the windows service world, is there any reason pg_autovacuum should
> ever give up?  The reason I had it give up was so that it didn't
> accidentally run against a different postgresql instance.  I don't think
> that will happen in the windows service world.  I think it should keep
> trying to do its job until it's told to exit.

A "never giving up" pg_autovacuum seems a little bit rude to me. It's
like the salesman who keeps trying to sell me something I clearly have
no use for.

Especially if something goes wrong in setting up pg_autovacuum: wrong
user, wrong password. The service keeps running, the service keeps using
resources, everything seems perfectly normal... but nothing happens. (And
if everything looks "perfect", checking the logs is not the first thing
you do, is it?)

So: I think a reasonable compromise is to keep pg_autovacuum trying for
some time (maybe 5 minutes as Tom recommended) and after that give up.

Harald



Re: pg_autovacuum Win32 Service startup delay

From
"Matthew T. O'Connor"
Date:
Tom Lane wrote:

>"Matthew T. O'Connor" <matthew@zeut.net> writes:
>>In the windows service world, is there any reason pg_autovacuum should
>>ever give up?
>
>I was a bit worried about the scenario in which J Random Luser tries to
>start the server twice and ends up with two autovacuum daemons attached
>to the same postmaster.  I'm not sure if this is possible, probable,
>or dangerous ... but it seems like a point to consider.

It is a good point to consider.  Let me be a little more detailed in my
explanation and see if that helps:
* A never-give-up pg_autovacuum would only be used when run as a windows
service.
* The windows service control manager can still kill pg_autovacuum, so
you shouldn't be able to start more than one that way.
* You have always been able to run multiple pg_autovacuums. It's not
advisable, but its only bad side effect would be excessive, or more
than expected, vacuum commands.
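
To make the first two points concrete, here is a rough C sketch of a
"keep trying until told to exit" loop. It is not taken from
pg_autovacuum: connect_until_stopped, connect_fn, wait_fn and the demo_*
helpers are hypothetical names made up for illustration. The idea is
that the wait callback sleeps for the retry interval but returns early,
reporting true, if the service control manager has asked the service to
stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types, not part of pg_autovacuum. */
typedef bool (*connect_fn)(void *ctx);
/* Waits up to 'seconds'; returns true if a stop was requested meanwhile. */
typedef bool (*wait_fn)(void *ctx, unsigned int seconds);

/*
 * Retry the connection until it succeeds or a stop is requested.
 * Returns true once connected, false if told to stop first.
 */
bool
connect_until_stopped(connect_fn try_connect, wait_fn wait_or_stop,
                      void *ctx, unsigned int interval_secs)
{
    assert(try_connect != NULL && wait_or_stop != NULL);
    assert(interval_secs > 0);

    for (;;)
    {
        if (try_connect(ctx))
            return true;
        if (wait_or_stop(ctx, interval_secs))
            return false;
    }
}

/* Stand-ins for a real server and a real stop event (illustration only). */
struct demo_service
{
    int failures_left;      /* connection attempts that will still fail */
    int waits_before_stop;  /* a stop is "requested" on this many waits */
    int waits;              /* waits performed so far */
};

bool
demo_connect(void *ctx)
{
    struct demo_service *d = ctx;

    if (d->failures_left > 0)
    {
        d->failures_left--;
        return false;
    }
    return true;
}

bool
demo_wait(void *ctx, unsigned int seconds)
{
    struct demo_service *d = ctx;

    (void) seconds;         /* the stand-in does not really sleep */
    d->waits++;
    return d->waits >= d->waits_before_stop;
}
```

In a real Windows service the wait callback could presumably be built on
WaitForSingleObject() against a stop event with a timeout, but that
wiring is an assumption and is not shown here.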


Re: pg_autovacuum Win32 Service startup delay

From
Tom Lane
Date:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> Tom Lane wrote:
>> I was a bit worried about the scenario in which J Random Luser tries to
>> start the server twice and ends up with two autovacuum daemons attached
>> to the same postmaster.  I'm not sure if this is possible, probable,
>> or dangerous ... but it seems like a point to consider.

> It is a good point to consider.  Let me be a little more detailed in my
> explanation and see if that helps:
> * A never-give-up pg_autovacuum would only be used when run as a windows
> service.
> * The windows service control manager can still kill pg_autovacuum, so
> you shouldn't be able to start more than one that way.
> * You have always been able to run multiple pg_autovacuums. It's not
> advisable, but its only bad side effect would be excessive, or more
> than expected, vacuum commands.

OK, that seems to take care of my worries above.

I agree with the point someone else made that if the service keeps trying
to start forever, it wouldn't be obvious to the user that it wasn't
working.  So a limited time window seems best ... but I think it needs
to be at least five minutes.

            regards, tom lane

Re: pg_autovacuum Win32 Service startup delay

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 24 January 2005 23:58
> To: Dave Page
> Cc: pgsql-patches@postgresql.org
> Subject: Re: [PATCHES] pg_autovacuum Win32 Service startup delay
>
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > When starting as a service at boot time on Windows, pg_autovacuum may
> > fail to start because the PostgreSQL service is still starting up. This
> > patch causes the service to attempt a second connection 30 seconds after
> > the initial connection failure before giving up entirely.
>
> Hm.  In the event that the system crashed beforehand, it could require
> much more than 30 seconds to finish replaying the old WAL log.  So the
> above doesn't seem super robust to me.  Would it be reasonable to try
> every 30 seconds for five minutes, or some such?  (Five minutes at least
> has a defensible rationale, ie it's the default checkpoint interval and
> we expect we can replay the log at least as fast as it was created
> initially.)

OK, revised patch attached. This version tries every 30 seconds for 5
minutes then gives up.
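
For readers without the attachment, the retry policy described above
might look roughly like the following C sketch. This is not the patch
itself: connect_with_retry, the callback types and the fake_* helpers
are hypothetical names, and the real code would call its own connection
and sleep routines. The sketch tries once immediately, then once per
interval until the window has elapsed, so 30 seconds over 5 minutes
works out to at most 11 attempts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types, not part of pg_autovacuum. */
typedef bool (*connect_fn)(void *ctx);
typedef void (*sleep_fn)(void *ctx, unsigned int seconds);

/*
 * Try to connect at once, then again every interval_secs until
 * window_secs have been spent waiting.  Returns the number of the
 * attempt that succeeded, or 0 if every attempt failed.
 */
int
connect_with_retry(connect_fn try_connect, sleep_fn do_sleep, void *ctx,
                   unsigned int interval_secs, unsigned int window_secs)
{
    unsigned int waited = 0;
    int          attempts = 0;

    assert(try_connect != NULL && do_sleep != NULL);
    assert(interval_secs > 0);

    for (;;)
    {
        attempts++;
        if (try_connect(ctx))
            return attempts;
        if (waited >= window_secs)
            return 0;           /* window used up: give up */
        do_sleep(ctx, interval_secs);
        waited += interval_secs;
    }
}

/* A stand-in server that becomes reachable after N failed attempts. */
struct fake_server
{
    int          failures_before_ready;
    int          calls;         /* connection attempts seen so far */
    unsigned int slept_secs;    /* total time "slept" between attempts */
};

bool
fake_connect(void *ctx)
{
    struct fake_server *s = ctx;

    s->calls++;
    return s->calls > s->failures_before_ready;
}

void
fake_sleep(void *ctx, unsigned int seconds)
{
    struct fake_server *s = ctx;

    s->slept_secs += seconds;   /* record the wait instead of sleeping */
}
```

Calling connect_with_retry(fake_connect, fake_sleep, &server, 30, 300)
mirrors the 30 second / 5 minute policy; passing the sleep routine in as
a callback is only there so the loop can be exercised without actually
waiting.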

Regards, Dave.
