Thread: bg worker: overview

bg worker: overview

From
Markus Wanner
Date:
Hi,

I've been working on modularizing Postgres-R to ease review and maybe 
allow code reuse. As threatened at the Cluster Meeting in Tokyo and 
again at CHAR(10), I'm now presenting more results of that effort: the 
background workers infrastructure module.

Postgres-R so far used custom backends to apply transactions from remote 
nodes. These were controlled by an additional coordinator process, which 
acted as a job dispatcher and obviously didn't have a client connection. 
There were obvious similarities between that and the existing autovacuum 
component, with its launcher that controls multiple worker processes.

I've combined these two components into a single, general purpose 
background worker infrastructure component, which is now capable of 
serving autovacuum as well as Postgres-R. And it might be of use for other 
purposes as well, most prominently parallel query processing. Basically 
anything that needs a backend connected to a database to do any kind of 
background processing, possibly parallelized.

Overall, this module represents quite a large portion of the Postgres-R 
patch. 15% by lines inserted (2912 vs 19332) and as much as 95% by lines 
deleted (1422 vs 1482).

With this further modularization, I hope to increase understandability 
and wish to encourage more hackers to have a look at (parts of) the 
Postgres-R source code. Of course, I highly appreciate reviews and 
discussions. And it would be very nice to see this module reused. Please 
don't hesitate to ask questions, if you need help.

(I don't dare to add these patches to the commit fest, as this 
refactoring doesn't have any immediate benefit for Postgres itself, at 
the moment.)

Regards

Markus Wanner

P.S.: git addicts, everything's up here:
http://git.postgres-r.org/?p=bgworker


Re: bg worker: overview

From
"Kevin Grittner"
Date:
Markus Wanner <markus@bluegap.ch> wrote:
> (I don't dare to add these patches to the commit fest, as this 
> refactoring doesn't have any immediate benefit for Postgres
> itself, at the moment.)
You could submit them as Work In Progress patches....
-Kevin


Re: bg worker: overview

From
Markus Wanner
Date:
Hi,

On 07/13/2010 08:45 PM, Kevin Grittner wrote:
> You could submit them as Work In Progress patches....

Okay, I added them. I guess they get more attention that way.

Regards

Markus


Re: bg worker: overview

From
Dimitri Fontaine
Date:
Hi,

We've been talking about this topic on -performance:

Markus Wanner <markus@bluegap.ch> writes:
> I've combined these two components into a single, general purpose background
> worker infrastructure component, which is now capable of serving autovacuum as
> well as Postgres-R. And it might be of use for other purposes as well, most
> prominently parallel query processing. Basically anything that needs a
> backend connected to a database to do any kind of background processing,
> possibly parallelized.

Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 13, 2010 at 16:42, Dimitri Fontaine <dfontaine@hi-media.com> wrote:
>> So a supervisor daemon with a supervisor API that would have to support
>> autovacuum as a use case, then things like pgagent, PGQ and pgbouncer,
>> would be very welcome.
>>
>> What about starting a new thread about that? Or you already know you
>> won't want to push the extensibility of PostgreSQL there?
>
> +1 on this idea in general, if we can think up a good API - this seems
> very useful to me, and you have some good examples there of cases
> where it'd definitely be a help.

So, do you think we could use your work as a base for allowing custom
daemon code? I guess we need to think about how to separate external
code and internal code, so a second layer could be necessary here.

As far as the API goes, I have several ideas but nothing that I have
already implemented, so I'd prefer to follow Markus there :)

Regards,
--
dim


Re: bg worker: overview

From
Markus Wanner
Date:
Hi,

On 07/15/2010 03:45 PM, Dimitri Fontaine wrote:
> We've been talking about this topic on -performance:

Thanks for pointing out this discussion, I'm not following -performance
too closely.

> So, do you think we could use your work as a base for allowing custom
> daemon code?

Daemon code? That sounds like it could be an addition to the
coordinator, which I'm somewhat hesitant to extend, as it's a pretty
critical process (especially for Postgres-R).

With step 3, which adds support for sockets, you can use the
coordinator to listen on pretty much any kind of socket you want. That
might be helpful in some cases (just as it is required for connecting to
the GCS).

However, note that the coordinator is designed to be just a message
passing or routing process, which should not do any kind of time
consuming processing. It must *coordinate* things (well, jobs) and react
promptly. Nothing else.

On the other side, the background workers have a connection to exactly
one database. They are supposed to do work on that database.

> I guess we need to think about how to separate external
> code and internal code, so a second layer could be necessary here.

The background workers can easily load external libraries - just as a
normal backend can with LOAD. That would also provide better
encapsulation (i.e. an error would only tear down that backend, not the
coordinator). You'd certainly have to communicate between the
coordinator and the background worker. I'm not sure how well that fits
your use case.

The thread on -performance is talking quite a bit about connection
pooling. The only way I can imagine some sort of connection pooling to
be implemented on top of bgworkers would be to let the coordinator
listen on an additional port and pass on all requests to the bgworkers
as jobs (using imessages). And of course send back the responses to the
client. I'm not sure how that overhead compares to using pgpool or
pgbouncer. Those are also separate processes through which all of your
data must flow. They use plain system sockets, imessages use signals and
shared memory.

I don't know enough about the pgagent or PgQ use cases to comment,
sorry. Hope that's helpful, anyway.

Regards

Markus


Re: bg worker: overview

From
Jaime Casanova
Date:
On Thu, Jul 15, 2010 at 1:28 PM, Markus Wanner <markus@bluegap.ch> wrote:
>
> However, note that the coordinator is designed to be just a message
> passing or routing process, which should not do any kind of time
> consuming processing. It must *coordinate* things (well, jobs) and react
> promptly. Nothing else.
>

so, merging this with the autovacuum will drop our hopes of having a
time based autovacuum? not that i'm working on that nor i was thinking
on working on that... just asking to know what the implications are,
and what the future improvements could be if we go this route

--
Jaime Casanova         www.2ndQuadrant.com
PostgreSQL support and training


Re: bg worker: overview

From
Markus Wanner
Date:
Hi,

On 07/15/2010 09:51 PM, Jaime Casanova wrote:
> so, merging this with the autovacuum will drop our hopes of having a
> time based autovacuum? not that i'm working on that nor i was thinking
> on working on that... just asking to know what the implications are,
> and what the future improves could be if we go this route

Not at all. Autovacuum should work exactly as before (seen from the
outside, the implementation is a bit different). The coordinator is an
async event processor. Events may originate from sockets, or may be driven
by time, as is the case for autovacuum (with a precision of roughly one
second, I don't remember exactly).
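
A minimal sketch of such an event processor, written in Python rather than taken from the patch: one round of the loop waits for either a readable socket or a timer deadline, whichever comes first. The function name run_once, its parameters and the callback convention are assumptions made for illustration; the real coordinator is C code and may be structured quite differently.

```python
import selectors
import socket
import time


def run_once(sel, deadline, on_timer):
    """Run one round of the loop: dispatch socket events, then the timer if due.

    `sel` is a selector whose registered `data` is a callback taking the socket.
    `deadline` is a time.monotonic() value at which `on_timer` should fire.
    """
    # Never sleep past the next time-driven event.
    timeout = max(0.0, deadline - time.monotonic())
    for key, _mask in sel.select(timeout):
        # Socket-driven event: hand the ready socket to its callback.
        key.data(key.fileobj)
    if time.monotonic() >= deadline:
        # Time-driven event, e.g. "check whether a database needs vacuuming".
        on_timer()
```

The callbacks are expected to return quickly, in line with the rule that the coordinator only routes work and leaves anything time consuming to the workers.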

(There's nothing that needs to be merged with autovacuum. It already is
merged, if you want).

Regards

Markus Wanner


Re: bg worker: overview

From
Alvaro Herrera
Date:
Excerpts from Jaime Casanova's message of Thu Jul 15 15:51:10 -0400 2010:
> On Thu, Jul 15, 2010 at 1:28 PM, Markus Wanner <markus@bluegap.ch> wrote:
> >
> > However, note that the coordinator is designed to be just a message
> > passing or routing process, which should not do any kind of time
> > consuming processing. It must *coordinate* things (well, jobs) and react
> > promptly. Nothing else.
> 
> so, merging this with the autovacuum will drop our hopes of having a
> time based autovacuum?

I don't think so, but I didn't know we had hopes for time-based
autovacuum.  What exactly do you mean?

Initially there were some thoughts on schedule-based autovacuum
parameters, but it seems that interest has dropped for that feature, so
I haven't pushed much for that.  However, I don't think that this patch
series affects that in any way.

BTW I think this patch series makes sense, though I haven't looked at it
in detail.  I guess it means I'll have to have a look at the IMessages
stuff as well.


Re: imessages up-date

From
Markus Wanner
Date:
Hi,

On 07/15/2010 10:37 PM, Alvaro Herrera wrote:
> BTW I think this patch series makes sense, though I haven't looked at it
> in detail.  I guess it means I'll have to have a look at the IMessages
> stuff as well.

Yes, only after adding these patches to the commit fest did I realize that
I'd have to add the dynshmem and imessages patches for bgworker to be
of any use.

A few words about imessages: as these can be of arbitrary size, that
whole concept sort of includes the problem of dynamically allocating
memory (from the shared area). Splitting that out into its own
patch (and module), namely dynshmem, made things a lot simpler for
imessages.

Now imessages is just a singly-linked list (queue) of messages per
backend, including the coordinator (but obviously not the postmaster).
Protected by a spinlock per queue. IMessageActivate() adds a message to
the queue of another backend, while IMessageCheck() returns the newest
message from the queue for the calling backend, if any.
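
To make the queue semantics above concrete, here is a toy model in Python (not the C code from the patch). Only the names IMessageActivate() and IMessageCheck() come from the mail; the class IMessageQueues, its methods and the tuple layout are hypothetical. A threading.Lock stands in for the per-queue spinlock, and the sketch assumes messages are handed out in arrival order, which the mail does not spell out.

```python
import threading
from collections import deque


class IMessageQueues:
    """Toy model: one message queue and one lock per backend."""

    def __init__(self, backend_ids):
        # The lock plays the role of the per-queue spinlock.
        self._queues = {b: deque() for b in backend_ids}
        self._locks = {b: threading.Lock() for b in backend_ids}

    def activate(self, sender, recipient, msg_type, payload=b""):
        """Append a message to another backend's queue (cf. IMessageActivate)."""
        with self._locks[recipient]:
            self._queues[recipient].append((sender, msg_type, payload))

    def check(self, backend):
        """Return the next pending message for `backend`, or None (cf. IMessageCheck)."""
        with self._locks[backend]:
            queue = self._queues[backend]
            return queue.popleft() if queue else None
```

Because each queue has its own lock, a sender only contends with other senders to the same recipient and with that recipient itself.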

The harder part has gone to the dynshmem patch, which I've already
explained [1].

@Kevin: how do we proceed WRT the commit fest? Having bgworker in there,
but not its dependencies (dynshmem and imessages) is what I'd call a
conflict. OTOH, both of those patches have been published way before the
commit fest started as well.

Alvaro, do you have time to review some of the autovacuum related
changes, i.e. bgworker stuff?

Regards

Markus


Re: imessages up-date

From
"Kevin Grittner"
Date:
Markus Wanner <markus@bluegap.ch> wrote:
> On 07/15/2010 10:37 PM, Alvaro Herrera wrote:
>> BTW I think this patch series makes sense, though I haven't
>> looked at it in detail.  I guess it means I'll have to have a
>> look at the IMessages stuff as well.
> 
> Yes, only after adding these patches to the commit fest, I
> realized that I'd have to add the dynshmem and imessages patches
> for bgworker to be of any use.
> @Kevin: how do we proceed WRT the commit fest? Having bgworker in
> there, but not its dependencies (dynshmem and imessages) is what
> I'd call a conflict. OTOH, both of those patches have been
> published way before the commit fest started as well.
Since these two patches were posted before the commit fest started,
and are prerequisites for six properly submitted patches, I'm going
with the "spirit of the law" and saying it's OK to add them.  Does
the application allow that?
-Kevin


Re: imessages up-date

From
Markus Wanner
Date:
Hi,

On 07/16/2010 04:01 PM, Kevin Grittner wrote:
> Since these two patches were posted before the commit fest started,
> and are prerequisites for six properly submitted patches, I'm going
> with the "spirit of the law" and saying it's OK to add them.  Does
> the application allow that?

Yes, it does. I just added those two (under miscellaneous).

Markus


Re: bg worker: overview

From
Simon Riggs
Date:
On Tue, 2010-07-13 at 16:30 +0200, Markus Wanner wrote:

> I've combined these two components into a single, general purpose 
> background worker infrastructure component

I think many people want such a feature, so the requirement is good.

The code itself merely reflects your design, so what I would really like
to see is a full explanation of this. If the generalisation is to be
accepted we need a very clear explanation of how it works and details of
the API since that is what's needed to allow other people besides
yourself to begin using it for patches in 9.1.

If we can see the docs in SGML/README form then we'll be able to
more quickly agree how to proceed. After that, reviewing your patch
against that design will be easy/ier.

Let's go for this in stages. If we can get something basic and useful
for lots of people in this commitfest, we can layer on the other stuff
later.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: bg worker: overview

From
Markus Wanner
Date:
Hello Simon,

On 07/17/2010 12:30 PM, Simon Riggs wrote:
> The code itself merely reflects your design, so what I would really like
> to see is a full explanation of this.

Are the descriptive mails I sent for each patch going in the right 
direction and just need to be extended, in your opinion? Or are you 
really missing something in there?

It's easier to answer more specific questions.

> If the generalisation is to be
> accepted we need a very clear explanation of how it works and details of
> the API since that is what's needed to allow other people besides
> yourself to begin using it for patches in 9.1.

Understood.

> If we can see the docs on that SGML/README form then we'll be able to
> more quickly agree how to proceed. After that, reviewing your patch
> against that design will be easy/ier.

I don't think SGML makes much sense, as there are not many user visible 
changes that need to go into the manual (except for the GUCs, those 
certainly need to be mentioned in the manual).

If you agree, I'd add the currently sent descriptions to README files in 
the source.

I think that I commented the source code pretty extensively, however, 
that's a subjective feeling.

I'm under the impression, that I commented the source code pretty well.

> Let's go for this in stages. If we can get something basic and useful
> for lots of people in this commitfest, we can layer on the other stuff
> later.
>



Re: bg worker: overview

From
Markus Wanner
Date:
Sorry, hit send too early.

On 07/17/2010 01:47 PM, Markus Wanner wrote:
> I think that I commented the source code pretty extensively, however,
> that's a subjective feeling.

Take this phrase.

> I'm under the impression, that I commented the source code pretty well.

Scratch that, please.

:-)


You wrote:
>> Let's go for this in stages. If we can get something basic and useful
>> for lots of people in this commitfest, we can layer on the other stuff
>> later.

Hm.. agreed. That's why I split into dynshmem, imessages and steps 1 - 6 
of bgworker.

I'm expecting controversy and discussion about dynshmem, which is a 
dependency of all of my other patches. So, maybe we should start there?

Or what kind of staging did you have in mind?

Regards

Markus


Re: bg worker: overview

From
Simon Riggs
Date:
On Sat, 2010-07-17 at 13:47 +0200, Markus Wanner wrote:

> Are the descriptive mails I sent for each patch going into the right 
> direction and just need to be extended, in your opinion? Or are you 
> really missing something in there?

Not detailed enough, for me, by a long way. Your notes read like an
update for someone that's been following your work in detail up to this
point. I apologise that I have not been able to do that.

If I was going to write a module that used the facilities you are
providing, what would I need to know?
e.g. http://developer.postgresql.org/pgdocs/postgres/indexam.html

> It's easier to answer more specific questions. 

Agreed. I don't have enough information to ask any. It's hard to write
simple and clear design specs but no harder than writing the code;
reading someone else's code to discover what it does is very hard (for
me).

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: bg worker: overview

From
Dimitri Fontaine
Date:
Markus Wanner <markus@bluegap.ch> writes:
> Daemon code? That sounds like it could be an addition to the
> coordinator, which I'm somewhat hesitant to extend, as it's a pretty
> critical process (especially for Postgres-R).
[...]
> However, note that the coordinator is designed to be just a message
> passing or routing process, which should not do any kind of time
> consuming processing. It must *coordinate* things (well, jobs) and react
> promptly. Nothing else.

Yeah, I guess user daemons would have to be workers, not plugins you
want to load into the coordinator.

> On the other side, the background workers have a connection to exactly
> one database. They are supposed to do work on that database.

Is that because of the way backends are started, and to avoid having to
fork new ones too often?

> The background workers can easily load external libraries - just as a
> normal backend can with LOAD. That would also provide better
> encapsulation (i.e. an error would only tear down that backend, not the
> coordinator). You'd certainly have to communicate between the
> coordinator and the background worker. I'm not sure how well that fits
> your use case.

Pretty well I think.

> The thread on -performance is talking quite a bit about connection
> pooling. The only way I can imagine some sort of connection pooling to
> be implemented on top of bgworkers would be to let the coordinator
> listen on an additional port and pass on all requests to the bgworkers
> as jobs (using imessages). And of course send back the responses to the
> client. I'm not sure how that overhead compares to using pgpool or
> pgbouncer. Those are also separate processes through which all of your
> data must flow. They use plain system sockets, imessages use signals and
> shared memory.

Yeah. The connection pool is better outside of code. Let's think PGQ and
an internal task scheduler first, if we think of any generalisation.

Regards,
-- 
dim


Re: bg worker: overview

From
Markus Wanner
Date:
Hi,

On 07/23/2010 09:45 PM, Dimitri Fontaine wrote:
> Yeah, I guess user daemons would have to be workers, not plugins you
> want to load into the coordinator.

Okay.

>> On the other side, the background workers have a connection to exactly
>> one database. They are supposed to do work on that database.
>
> Is that because of the way backends are started, and to avoid having to
> fork new ones too often?

For one, yes, I want to avoid having to start ones too often. I did look 
into letting these background workers switch the database connection, 
but that turned out not to be worth the effort.

Would you prefer a background worker that's not connected to a database, 
or why are you asking?

>> The background workers can easily load external libraries - just as a
>> normal backend can with LOAD. That would also provide better
>> encapsulation (i.e. an error would only tear down that backend, not the
>> coordinator). You'd certainly have to communicate between the
>> coordinator and the background worker. I'm not sure how well that fits
>> your use case.
>
> Pretty well I think.

Go ahead, re-use the background workers. That's what I've published them 
for ;-)

> Yeah. The connection pool is better outside of code. Let's think PGQ and
> an internal task scheduler first, if we think of any generalisation.

To be honest, I still don't quite grok the concept behind PGQ. So I 
cannot really comment on this.

Regards

Markus Wanner


Re: bg worker: overview

From
Dimitri Fontaine
Date:
Markus Wanner <markus@bluegap.ch> writes:
> For one, yes, I want to avoid having to start ones too often. I did look
> into letting these background workers switch the database connection, but
> that turned out not to be worth the effort.
>
> Would you prefer a background worker that's not connected to a database, or
> why are you asking?

Trying to figure out how it would fit the PGQ and pgagent needs. But
maybe user defined daemons should be sub-coordinators (I used to think
about them as "supervisors") able to talk to the coordinator to get a
backend connected to some given database and distribute work to it.

You're using iMessage as the data exchange, how are you doing the work
distribution? What do you use to tell the backend what processing
you're interested in?

> Go ahead, re-use the background workers. That's what I've published
> them for

Hehe :) The aim of this thread would be to have your input as far as
designing an API would go, now that we're about on track as to what the
aim is.

> To be honest, I still don't quite grok the concept behind PGQ. So I cannot
> really comment on this.

In short, the idea is a clock that ticks and associates
current_txid() with now(), so that you're able to say "give me 3s worth of
transaction activity from this queue". It then provides facilities to
organise a queue into batches at consumer request, and for more details,
see there:
http://github.com/markokr/skytools-dev/blob/master/sql/ticker/pgqd.c
http://github.com/markokr/skytools-dev/blob/master/sql/ticker/ticker.c
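
The tick idea can be illustrated with a small Python model. This is not how PGQ is implemented; the class Ticker, its methods and the in-memory lists are assumptions for illustration (the mail describes ticks kept in a table per database). Each tick pairs a timestamp with a transaction id, so a time window can be translated into a range of transaction ids.

```python
import bisect


class Ticker:
    """Toy tick log: parallel lists of tick times and transaction ids."""

    def __init__(self):
        self._times = []
        self._txids = []

    def tick(self, now, txid):
        # Ticks are assumed to arrive in increasing time order.
        self._times.append(now)
        self._txids.append(txid)

    def txid_range(self, start, end):
        """Return (first_txid, last_txid) of ticks with start <= time <= end, or None."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end) - 1
        if lo > hi:
            return None
        return self._txids[lo], self._txids[hi]
```

A request for "3s worth of activity" ending at time t would then be txid_range(t - 3, t), and the returned pair bounds the batch handed to a consumer.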

But the important thing as far as making it a child of the coordinator
goes would be, I guess, that it's some C code running as a daemon and
running SQL queries from time to time. The SQL queries are calling C
user defined functions, provided by the PGQ backend module.

Regards,
--
dim


Re: bg worker: overview

From
Markus Wanner
Date:
Hey Dimitri,

On 07/24/2010 07:26 PM, Dimitri Fontaine wrote:
> Trying to figure out how it would fit the PGQ and pgagent needs. But
> maybe user defined daemons should be sub-coordinators (I used to think
> about them as "supervisors") able to talk to the coordinator to get a
> backend connected to some given database and distribute work to it.

Hm.. sounds like an awful lot of work to me, but if you need the 
separation and security of a separate process...

To simplify, you might want to start a bgworker on database 'postgres', 
which then acts as a sub-coordinator (and doesn't really need to use its 
database connection).

> You're using iMessage as the data exchange, how are you doing the work
> distribution? What do you use to tell the backend what is the processing
> you're interested in?

Well, there are different types of imessages defined in imsg.h. If you 
are coding something within Postgres, you'd just add all the required 
message types there. There's no such thing as an external registration 
for new message types.

For example, for autovacuum, there are two message types: 
IMSGT_PERFORM_VACUUM, that's sent from the coordinator to a bgworker, 
and initiates a vacuum job there. Then there's IMSGT_FORCE_VACUUM, which 
is sent from a backend to the coordinator to inform it that a certain 
database urgently needs vacuuming.

For Postgres-R, things are a bit more complicated. The first IMSGT_CSET 
message starts the application of a remote transaction. Further 
IMSGT_CSET messages may follow. The IMSGT_ORDERING message finally 
completes the job.

So, imessage types cannot be mapped to jobs directly. See 
include/postmaster/coordinator.h, enum worker_state. Those are the 
possible states a worker can be in (job types).

Adding a job would consist of adding a worker_state, plus at least one 
imessage type. Once the worker is done with its job, it returns 
IMSGT_READY to the coordinator.
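
The relationship between message types and jobs can be sketched as a small state machine, here in Python rather than the C of the patch. The message type names are the ones mentioned above; the WorkerState members and the Worker class are hypothetical stand-ins, not the contents of coordinator.h. The sketch shows why message types do not map one-to-one to jobs: a vacuum job is a single message, while applying a remote transaction spans several.

```python
from enum import Enum, auto


class WorkerState(Enum):
    # Hypothetical stand-ins for the worker_state values in coordinator.h.
    IDLE = auto()
    VACUUMING = auto()
    APPLYING_CSETS = auto()


class Worker:
    """Toy background worker driven by the message types named in the mail."""

    def __init__(self):
        self.state = WorkerState.IDLE

    def handle(self, msg_type):
        """Consume one message; return "IMSGT_READY" once the job is done, else None."""
        if msg_type == "IMSGT_PERFORM_VACUUM" and self.state is WorkerState.IDLE:
            self.state = WorkerState.VACUUMING
            # A real worker would vacuum here; the toy job finishes immediately.
            return self._finish()
        if msg_type == "IMSGT_CSET" and self.state in (
            WorkerState.IDLE,
            WorkerState.APPLYING_CSETS,
        ):
            # The first change set starts the job, later ones continue it.
            self.state = WorkerState.APPLYING_CSETS
            return None
        if msg_type == "IMSGT_ORDERING" and self.state is WorkerState.APPLYING_CSETS:
            return self._finish()
        raise ValueError(f"unexpected {msg_type} in state {self.state.name}")

    def _finish(self):
        # Report back to the coordinator and become available again.
        self.state = WorkerState.IDLE
        return "IMSGT_READY"
```

Adding a new kind of job to this model means adding one state and at least one message type that enters it, which mirrors the description above.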

I'm open to refinements, such as assigning a certain range of message 
types to external use or some such. However, I have no idea how to avoid 
clashing message type ids, then. Maybe those should still be part of imsg.h?

>> Go ahead, re-use the background workers. That's what I've published
>> them for
>
> Hehe :) The aim of this thread would be to have your input as far as
> designing an API would go, now that we're about on track as to what the
> aim is.

Oh, sure. :-)

> In very short, the idea is a clock that ticks and associate
> current_txid() to now(), so that you're able to say "give me 3s worth of
> transactions activity from this queue". It then provides facilities to
> organise a queue into batches at consumer request, and for more details,
> see there:
>
>    http://github.com/markokr/skytools-dev/blob/master/sql/ticker/pgqd.c
>    http://github.com/markokr/skytools-dev/blob/master/sql/ticker/ticker.c

Okay, thanks for the pointers. However, comments are relatively sparse 
in there as well...

> But the important thing as far as making it a child of the coordinator
> goes would be, I guess, that it's some C code running as a daemon and
> running SQL queries from time to time. The SQL queries are calling C
> user defined functions, provided by the PGQ backend module.

You could certainly define jobs, which don't ever terminate. And calling 
SQL queries certainly sounds more like a background job to me, than 
something belonging to the sphere of the coordinator. Sorry, if my first 
impulse has been misleading.

So, the bgworker infrastructure could probably satisfy the internal 
communication needs. But how does this ticker daemon talk to the 
outside? Does it need to open a socket and listen there? Or do the 
requests to that queue come in via SQL?

Regards

Markus


Re: bg worker: overview

From
Dimitri Fontaine
Date:
Markus Wanner <markus@bluegap.ch> writes:
> To simplify, you might want to start a bgworker on database 'postgres',
> which then acts as a sub-coordinator (and doesn't really need to use its
> database connection).

Yeah, that sounds like the simplest way forward, so that it's easy for
this "user daemon" to communicate with the coordinator. Typically this
would start a backend and LOAD a module, which would enter the user
daemon main loop, I guess.

Then all the usual backend code facilities are there, even SQL and
calling user defined functions.

> Well, there are different types of imessages defined in imsg.h. If you are
> coding something within Postgres, you'd just add all the required messages
> types there. There's no such thing as an external registration for new
> message types.

Given that imessages can have a payload, maybe the simplest way there
would be to add a "IMSGT_EXEC_QUERY_PARAMS" message type, the payload of
which would be composed of the SQL text and its parameters. I get it
that requiring a bgworker backend connected to a given database is
already part of the API, right?
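
One possible payload layout for such a message, sketched in Python: the SQL text and its parameters are stored as length-prefixed UTF-8 fields. The layout and the function names pack_query and unpack_query are assumptions for illustration; nothing like this exists in the patches, and NULL parameters or binary values are not handled.

```python
import struct


def pack_query(sql, params):
    """Encode SQL text plus text parameters as length-prefixed UTF-8 fields."""
    fields = [sql] + list(params)
    # Field count first, then each field as a 4-byte length followed by its bytes.
    out = [struct.pack("!I", len(fields))]
    for field in fields:
        data = field.encode("utf-8")
        out.append(struct.pack("!I", len(data)))
        out.append(data)
    return b"".join(out)


def unpack_query(payload):
    """Decode a payload produced by pack_query into (sql, params)."""
    (count,) = struct.unpack_from("!I", payload, 0)
    offset = 4
    fields = []
    for _ in range(count):
        (length,) = struct.unpack_from("!I", payload, offset)
        offset += 4
        fields.append(payload[offset:offset + length].decode("utf-8"))
        offset += length
    return fields[0], fields[1:]
```

Explicit lengths keep the receiving worker from having to scan for terminators, and the payload can be copied into shared memory as one contiguous block.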

> So, the bgworker infrastructure could probably satisfy the internal
> communication needs. But how does this ticker daemon talk to the outside?
> Does it need to open a socket and listen there? Or do the requests to that
> queue come in via SQL?


The ticker's only job is to manage a "ticks" table per database. All it
needs for that is a libpq connection, really, but given your model it'd
be a single backend (worker) that would send imessages to the
coordinator so that a background worker would tick, by executing this
SQL: select pgq.ticker().

So that would be a lot of changes to follow your facilities, it's
unclear to me how much we're twisting it so that it fits. Well, maybe
that's not a good example after all. Dave, want to see about pgagent?

Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architect