Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database) - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date
Msg-id 20121116131733.GB4454@alvh.no-ip.org
Whole thread Raw
In response to Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
List pgsql-hackers
Kohei KaiGai escribió:
> 2012/10/22 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > Here's an updated version of this patch, which also works in
> > an EXEC_BACKEND environment.  (I haven't tested this at all on Windows,
> > but I don't see anything that would create a portability problem there.)
> >
> I also tried to check the latest patch "briefly".
> Let me comment on several points randomly.

Thanks.

> Once bgworker process got crashed, postmaster tries to restart it about 60
> seconds later. My preference is this interval being configurable with initial
> registration parameter; that includes "never restart again" option.
> Probably, some extensions don't want to restart again on unexpected crash.

The main issue with specifying a crash-restart policy, I thought, was
that after a postmaster restart cycle there is no reasonable way to know
whether any bgworker should restart or not: had any particular bgworker
crashed?  Remember that any backend crash, and some bgworker crashes,
will cause the postmaster to reinitialize the whole system.  So I don't
see that it will be handled consistently; if we offer the option, module
developers might be led into thinking that a daemon that stops is never
to run again, but that's not going to be the case.

But I'm not really wedded to this behavior.

> Stop is one other significant event for bgworker, not only start-up.
> This interface has no way to specify when we want to stop bgworker.
> Probably, we can offer two options. Currently, bgworkers are simultaneously
> terminated with other regular backends. In addition, I want an option to make
> sure bgworker being terminated after all the regular backends exit.
> It is critical issue for me, because I try to implement parallel
> calculation server
> with bgworker, thus, it should not be terminated earlier than regular backend.

I can do that.  I considered stop time as well, but couldn't think of
any case in which I would want the worker to persist beyond backends
exit, so I ended up not adding the option.  But I think it's reasonably
easy to add.

> Regarding to process restart, this interface allows to specify the timing to
> start bgworker using BgWorkerStart_* label. On the other hand,
> bgworker_should_start_now() checks correspondence between pmState
> and the configured start-time using equal matching. It can make such a
> scenarios, for instance, a bgworker launched at PM_INIT state, then it got
> crashed. Author expected it shall be restarted, however, pmState was
> already progressed to PM_RUN, so nobody can restart this bgworker again.
> How about an idea to provide the start time with bitmap? It allows extensions
> to specify multiple candidates to launch bgworker.

No, I think you must be misreading the code.  If a daemon specified
BgWorkerStart_PostmasterStart then it will start whenever pmState is
PM_INIT *or any later state*.  This is why the switch has all those
"fall throughs".

> do_start_bgworker() initializes process state according to normal manner,
> then it invokes worker->bgw_main(). Once ERROR is raised inside of the
> main routine, the control is backed to the sigsetjmp() on do_start_bgworker(),
> then the bgworker is terminated.
> I'm not sure whether it is suitable for bgworker that tries to connect database,
> because it may raise an error everywhere, such as zero division and so on...
> I think it is helpful to provide a utility function in the core; that
> handles to abort
> transactions in progress, clean-up resources, and so on.

Yeah, I considered this too --- basically this is the question I posted
elsewhere, "what else do we need to provide for modules"?  A sigsetjmp()
standard block or something is probably part of that.

> It ensures bgworker must be registered within shared_preload_libraries.
> Why not raise an error if someone try to register bgworker later?

Hm, sure, we can do that I guess.

> How about to move bgw_name field into tail of BackgroundWorker structure?
> It makes simplifies the logic in RegisterBackgroundWorker(), if it is
> located as:
>     typedef struct BackgroundWorker
>     {
>         int         bgw_flags;
>         BgWorkerStartTime bgw_start_time;
>         bgworker_main_type  bgw_main;
>         void       *bgw_main_arg;
>         bgworker_sighdlr_type bgw_sighup;
>         bgworker_sighdlr_type bgw_sigterm;
>         char        bgw_name[1];  <== (*)
>     } BackgroundWorker;

Makes sense.

> StartOneBackgroundWorker always scan the BackgroundWorkerList from
> the head. Isn't it available to save the current position at static variable?
> If someone tries to manage thousand of bgworkers, it makes a busy loop. :(

We could try that.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Dimitri Fontaine
Date:
Subject: Re: Materialized views WIP patch
Next
From: Markus Wanner
Date:
Subject: Re: logical changeset generation v3 - comparison to Postgres-R change set format