Re: pg_background (and more parallelism infrastructure patches) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_background (and more parallelism infrastructure patches)
Msg-id CA+TgmoZLx-dmeKa-Y+tGrcm+OWTZ4qZKjBYL2Prn=HVdtRQBMw@mail.gmail.com
In response to Re: pg_background (and more parallelism infrastructure patches)  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: pg_background (and more parallelism infrastructure patches)
List pgsql-hackers
On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
>> Attached is a contrib module that lets you launch arbitrary command in
>> a background worker, and supporting infrastructure patches for core.
>
> Cool.
>
> I assume this 'fell out' of the work towards parallelism? Do you think
> all of the patches (except the contrib one) are required for that or is
> some, e.g. 3), only required to demonstrate the others?

I'm fairly sure that patches 3, 4, and 5 are all required in some form
as building blocks for parallelism.  Patch 1 contains two functions,
one of which (shm_mq_set_handle) I think is generally useful for
people using background workers, but not absolutely required; and the
other of which (shm_mq_sendv) is infrastructure for patch 3 that might
not be necessary with different design choices.  Patch 2 is only
included because pg_background can benefit from it; we could instead
use an eoxact callback, at the expense of doing cleanup at
end-of-transaction rather than end-of-query.  But it's a mighty small
patch and seems like a reasonable extension to the API, so I lean
toward including it.
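
To make the idea behind a gather-style send such as shm_mq_sendv concrete,
here is a rough sketch, not the code from patch 1. It assumes the point is
to submit several separate buffers (say, a message-type byte and a message
body) as one queue message without first copying them into a temporary
buffer. Chunk, Queue, and queue_sendv are hypothetical names, and the
"queue" is just a flat byte array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one piece of a message. */
typedef struct Chunk
{
    const void *data;
    size_t      len;
} Chunk;

#define QUEUE_CAPACITY 128

/* Hypothetical stand-in for a message queue: a flat byte array. */
typedef struct Queue
{
    unsigned char buf[QUEUE_CAPACITY];
    size_t        used;
} Queue;

/*
 * Append one message assembled from 'count' chunks.  The message is
 * stored as a size_t length header followed by the chunk bytes in order.
 * Returns 0 on success, or -1 if the message does not fit.
 */
int
queue_sendv(Queue *q, const Chunk *chunks, int count)
{
    size_t total = 0;
    int    i;

    for (i = 0; i < count; i++)
        total += chunks[i].len;

    if (total > QUEUE_CAPACITY ||
        q->used + sizeof(total) + total > QUEUE_CAPACITY)
        return -1;

    memcpy(q->buf + q->used, &total, sizeof(total));
    q->used += sizeof(total);

    for (i = 0; i < count; i++)
    {
        memcpy(q->buf + q->used, chunks[i].data, chunks[i].len);
        q->used += chunks[i].len;
    }
    return 0;
}
```

A caller would build a two-element Chunk array (type byte, then body) and
pass it with a count of 2; the receiver sees one five-byte message for a
one-byte type plus a four-byte body.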

>> Patch 3 adds the ability for a backend to request that the protocol
>> messages it would normally send to the frontend get redirected to a
>> shm_mq.  I did this by adding a couple of hook functions.  The best
>> design is definitely arguable here, so if you'd like to bikeshed, this
>> is probably the patch to look at.
>
> Uh. This doesn't sound particularly nice. Shouldn't this rather be
> clearly layered by making reading/writing from the client a proper API
> instead of adding hook functions here and there?

I don't know exactly what you have in mind here.  There is an API for
writing to the client that is used throughout the backend, but right
now "the client" always has to be a socket.  Hooking a couple of parts
of that API lets us write someplace else instead.  If you've got
another idea how to do this, suggest away...
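
For readers who want a picture of the hook approach, the following is a
minimal sketch under stated assumptions, not the code in patch 3. It
assumes the backend funnels outgoing protocol messages through one
function, and that swapping a function pointer is enough to send them
somewhere other than the socket. All names here (put_message_hook,
send_message, redirect_to_queue, Sink) are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type: deliver one protocol message somewhere. */
typedef int (*put_message_fn)(char msgtype, const char *body, size_t len);

#define SINK_CAPACITY 256

/* Stand-in for a destination; a real one would be a socket or a shm_mq. */
typedef struct Sink
{
    char   data[SINK_CAPACITY];
    size_t used;
} Sink;

Sink socket_sink;   /* pretend this is the client socket */
Sink queue_sink;    /* pretend this is a shared memory queue */

/* Store the type byte followed by the body; -1 if it does not fit. */
int
sink_append(Sink *sink, char msgtype, const char *body, size_t len)
{
    if (sink->used >= SINK_CAPACITY ||
        len > SINK_CAPACITY - sink->used - 1)
        return -1;
    sink->data[sink->used++] = msgtype;
    memcpy(sink->data + sink->used, body, len);
    sink->used += len;
    return 0;
}

int
put_to_socket(char msgtype, const char *body, size_t len)
{
    return sink_append(&socket_sink, msgtype, body, len);
}

int
put_to_queue(char msgtype, const char *body, size_t len)
{
    return sink_append(&queue_sink, msgtype, body, len);
}

/* By default, messages go to "the client", i.e. the socket. */
put_message_fn put_message_hook = put_to_socket;

/* Callers use this entry point and never learn where the bytes go. */
int
send_message(char msgtype, const char *body, size_t len)
{
    return put_message_hook(msgtype, body, len);
}

/* A background worker would call this once at startup. */
void
redirect_to_queue(void)
{
    put_message_hook = put_to_queue;
}
```

The design trade-off is the one under discussion: existing callers need no
changes, but the redirection is implicit rather than a layered API.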

> Also, you seem to have only touched receiving from the client, and not
> sending back to the subprocess. Is that actually sufficient? I'd expect
> that for this facility to be fully useful it'd have to be two way
> communication. But perhaps I'm overestimating what it could be used for.

Well, the basic shm_mq infrastructure can be used to send any kind of
messages you want between any pair of processes that care to establish
them.  But in general I expect that data is going to flow mostly in
one direction - the user backend will launch workers and give them an
initial set of instructions, and then results will stream back from
the workers to the user backend.  Other messaging topologies are
certainly possible, and probably useful for something, but I don't
really know exactly what those things will be yet, and I'm not sure
the FEBE protocol will be the right tool for the job anyway.  But
error propagation, which is the main thrust of this, seems like a need
that will likely be pretty well ubiquitous.

>> This patch also adds a function to
>> help you parse an ErrorResponse or NoticeResponse and re-throw the
>> error or notice in the originating backend.  Obviously, parallelism is
>> going to need this kind of functionality, but I suspect a variety of
>> other applications people may develop using background workers may
>> want it too; and it's certainly important for pg_background itself.
>
> I would have had use for it previously.

Cool.  I know Petr was interested as well (possibly for the same project?).
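
As an illustration of the parsing half of that helper, here is a sketch,
not the function from the patch. It relies only on the documented wire
format: the body of an ErrorResponse or NoticeResponse is a sequence of
fields, each a one-byte field code followed by a NUL-terminated string,
with a single zero byte ending the list. ParsedError and parse_error_body
are hypothetical names, and re-throwing the error in the originating
backend is not shown.

```c
#include <stddef.h>
#include <string.h>

/* A few of the standard fields; pointers refer into the caller's buffer. */
typedef struct ParsedError
{
    const char *severity;   /* field code 'S' */
    const char *sqlstate;   /* field code 'C' */
    const char *message;    /* field code 'M' */
    const char *detail;     /* field code 'D' */
    const char *hint;       /* field code 'H' */
} ParsedError;

/*
 * Parse the body of an ErrorResponse or NoticeResponse (the bytes after
 * the message type and length word).  Returns 0 on success, or -1 if the
 * body is malformed.  Unrecognized field codes are skipped.
 */
int
parse_error_body(const char *body, size_t len, ParsedError *out)
{
    ParsedError empty = {NULL, NULL, NULL, NULL, NULL};
    size_t      pos = 0;

    *out = empty;
    while (pos < len)
    {
        char        code = body[pos++];
        const char *value;
        const char *end;

        if (code == '\0')
            return 0;           /* terminator: end of field list */

        value = body + pos;
        end = memchr(value, '\0', len - pos);
        if (end == NULL)
            return -1;          /* field string is not terminated */
        pos += (size_t) (end - value) + 1;

        switch (code)
        {
            case 'S': out->severity = value; break;
            case 'C': out->sqlstate = value; break;
            case 'M': out->message = value; break;
            case 'D': out->detail = value; break;
            case 'H': out->hint = value; break;
            default:  break;    /* ignore fields this sketch does not track */
        }
    }
    return -1;                  /* ran out of bytes before the terminator */
}
```

The receiving backend would then pass the extracted severity, SQLSTATE,
and message to its own error reporting so the user sees the worker's
error as if it had been raised locally.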

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


