Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date
Msg-id 004801cd97b9$b42b6860$1c823920$@kapila@huawei.com
In response to Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Thursday, September 20, 2012 7:13 PM Alvaro Herrera wrote:
Excerpts from Amit Kapila's message of Thu Sep 20 02:10:23 -0300 2012:


>>   Why can't worker tasks also be permanent, controlled through
>>   configuration? What I mean to say is that if the user has a need for
>>   parallel operations, he can configure max_worker_tasks and that many
>>   worker tasks will get created. Otherwise, without such a parameter, we
>>   might not be sure whether such daemons will be of use to database
>>   users who don't need any background ops.
>
>>   The dynamism will come into the scene when we need to allocate such
>>   daemons for a particular op (query), because the operation might need
>>   a certain number of worker tasks but no such task is available. At
>>   that time it needs to be decided whether to spawn a new task or change
>>   the parallelism of the operation so that it can be executed with the
>>   available number of worker tasks.

> Well, there is a difficulty here which is that the number of processes
> connected to databases must be configured during postmaster start
> (because it determines the size of certain shared memory structs).  So
> you cannot just spawn more tasks if all max_worker_tasks are busy.
> (This is a problem only for those workers that want to be connected as
> backends.  Those that want libpq connections do not need this and are
> easier to handle.)

Are you referring to the shared memory structs that need to be allocated for each worker task?
I am not sure whether they can be shared across multiple slaves or will be required for each slave.
However, even if that is not possible, other mechanisms can be used to get the work done by the existing slaves.

If not the above, then where is the need for the dynamic worker tasks mentioned by Simon?

> The design we're currently discussing actually does not require a new
> GUC parameter at all.  This is why: since the workers must be registered
> before postmaster start anyway (in the _PG_init function of a module
> that's listed in shared_preload_libraries) then we have to run a
> registering function during postmaster start.  So postmaster can simply
> count how many it needs and size those structs from there.  Workers that
> do not need a backend-like connection don't have a shmem sizing
> requirement so are not important for this.  Configuration is thus
> simplified.

> BTW I am working on this patch and I think I have a workable design in
> place; I just couldn't get the code done before the start of this
> commitfest.  (I am missing handling the EXEC_BACKEND case though, but I
> will not even look into that until the basic Unix case is working).

> One thing I am not going to look into is how is this new capability be
> used for parallel query.  I feel we have enough use cases without it,
> that we can develop a fairly powerful feature.  After that is done and
> proven (and committed) we can look into how we can use this to implement
> these short-lived workers for stuff such as parallel query.
Agreed; this is what I meant to say as well.

With Regards,
Amit Kapila.



