
From: Alvaro Herrera
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date:
Msg-id: 1348232909-sup-3587@alvh.no-ip.org
In response to: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)  (Amit Kapila <amit.kapila@huawei.com>)
Responses: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)  (Amit Kapila <amit.kapila@huawei.com>)
List: pgsql-hackers
Excerpts from Amit Kapila's message of Fri Sep 21 02:26:49 -0300 2012:
> On Thursday, September 20, 2012 7:13 PM Alvaro Herrera wrote:

> > Well, there is a difficulty here which is that the number of processes
> > connected to databases must be configured during postmaster start
> > (because it determines the size of certain shared memory structs).  So
> > you cannot just spawn more tasks if all max_worker_tasks are busy.
> > (This is a problem only for those workers that want to be connected as
> > backends.  Those that want libpq connections do not need this and are
> > easier to handle.)
>
> Are you referring to shared memory structs that need to be allocated for each worker task?
> I am not sure whether they can be shared across multiple slaves or will be required for each slave.
> However, even if that is not possible, other mechanisms can be used to get the work done by existing slaves.

I mean stuff like PGPROC entries and such.  Currently, they are
allocated based on autovacuum_max_workers + max_connections +
max_prepared_transactions IIRC.  So by following identical reasoning we
would just have to add a hypothetical new max_bgworkers to the mix;
however, as I said above, we don't really need that, because we can count
the number of registered workers at postmaster start time and use that
to size PGPROC.
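
To illustrate what I mean (a sketch only; CountRegisteredBgWorkers() is
a made-up helper name, not anything in the tree):

    /*
     * Sketch: fold the number of workers registered during
     * shared_preload_libraries into the existing sizing formula.
     * CountRegisteredBgWorkers() is hypothetical.
     */
    int     nworkers = CountRegisteredBgWorkers();

    MaxBackends = MaxConnections + autovacuum_max_workers + 1 + nworkers;

    /*
     * PGPROC and the other per-backend shmem arrays are then sized
     * from MaxBackends (plus max_prepared_xacts for the prepared
     * transaction slots).
     */

Since registration happens before shared memory is created, the count
is fixed for the life of the postmaster, which is exactly why the
number cannot grow later.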

Shared memory used by each worker (or by a group of workers) that's not
part of core structs should be allocated by the worker itself via
RequestAddinShmemSpace.
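
For what it's worth, the usual pattern for that in a module loaded via
shared_preload_libraries looks like the below (a minimal sketch;
MyWorkerState and the "my_worker_state" key are placeholders):

    #include "postgres.h"
    #include "fmgr.h"
    #include "storage/ipc.h"
    #include "storage/shmem.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    /* Placeholder struct; actual contents are up to the worker. */
    typedef struct MyWorkerState
    {
        int         counter;
    } MyWorkerState;

    static MyWorkerState *my_state = NULL;
    static shmem_startup_hook_type prev_shmem_startup_hook = NULL;

    static void
    my_shmem_startup(void)
    {
        bool        found;

        if (prev_shmem_startup_hook)
            prev_shmem_startup_hook();

        /* Create, or attach to, our chunk of the add-in shmem area. */
        my_state = ShmemInitStruct("my_worker_state",
                                   sizeof(MyWorkerState), &found);
        if (!found)
            memset(my_state, 0, sizeof(MyWorkerState));
    }

    void
    _PG_init(void)
    {
        /* Only effective when loaded via shared_preload_libraries. */
        RequestAddinShmemSpace(MAXALIGN(sizeof(MyWorkerState)));

        prev_shmem_startup_hook = shmem_startup_hook;
        shmem_startup_hook = my_shmem_startup;
    }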

> If not the above, then where is there a need for dynamic worker tasks, as mentioned by Simon?

Well, I think there are many uses for dynamic workers, or short-lived
workers (start, do one thing, stop and not be restarted).

In my design, a worker is always restarted if it stops; otherwise there
is no principled way to know whether it should be running or not (after
a crash, should we restart a registered worker?  We don't know whether
it stopped before the crash).  So it seems to me that, at least for this
first shot, we should consider workers as processes that are going to be
running for as long as the postmaster is alive.  On a crash, if they
have a backend connection, they are stopped and then restarted.
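
To make the policy concrete, registering such an always-running worker
would look more or less like this (a sketch against the bgworker API as
it eventually shipped in 9.3; check bgworker.h for the exact field
list, and module boilerplate is omitted):

    #include "postgres.h"
    #include "postmaster/bgworker.h"

    static void
    my_worker_main(Datum main_arg)
    {
        /*
         * Set up signal handling, then loop for the life of the
         * postmaster; if we exit or crash, the postmaster starts a
         * fresh copy after bgw_restart_time seconds.
         */
        for (;;)
        {
            /* ... wait on a latch and do the actual work ... */
        }
    }

    void
    _PG_init(void)
    {
        BackgroundWorker worker;

        memset(&worker, 0, sizeof(worker));
        worker.bgw_name = "always-on worker";
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
                           BGWORKER_BACKEND_DATABASE_CONNECTION;
        worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
        worker.bgw_restart_time = 1;    /* seconds; BGW_NEVER_RESTART
                                         * would opt out of restarts */
        worker.bgw_main = my_worker_main;
        RegisterBackgroundWorker(&worker);
    }

The point is that restart-on-exit is the default posture, and opting
out is the explicit choice.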

> > One thing I am not going to look into is how this new capability can be
> > used for parallel query.  I feel we have enough use cases without it
> > that we can develop a fairly powerful feature.  After that is done and
> > proven (and committed), we can look into how we can use this to implement
> > these short-lived workers for stuff such as parallel query.
>
>   Agreed, and I also meant to say the same thing.

Great.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


