Hi,
I've been working on modularizing Postgres-R to ease review and maybe
allow code reuse. As threatened at the Cluster Meeting in Tokyo and
again at CHAR(10), I'm now presenting more results of that effort: the
background workers infrastructure module.
Postgres-R has so far used custom backends to apply transactions from
remote nodes. These were controlled by an additional coordinator process,
which acted as a job dispatcher and, naturally, didn't have a client
connection. There were obvious similarities between that and the existing
autovacuum component, with its launcher that controls multiple worker
processes.
I've combined these two components into a single, general-purpose
background worker infrastructure component, which is now capable of
serving autovacuum as well as Postgres-R. It might be of use for other
purposes as well, most prominently parallel query processing: basically
anything that needs a backend connected to a database to do some kind of
background processing, possibly parallelized.
Overall, this module represents quite a large portion of the Postgres-R
patch: 15% by lines inserted (2912 of 19332) and as much as 96% by lines
deleted (1422 of 1482).
With this further modularization, I hope to make the code easier to
understand and to encourage more hackers to have a look at (parts of)
the Postgres-R source code. Of course, I highly appreciate reviews and
discussions. And it would be very nice to see this module reused. Please
don't hesitate to ask questions if you need help.
(I don't dare add these patches to the commit fest, as this refactoring
doesn't have any immediate benefit for Postgres itself at the moment.)
Regards
Markus Wanner
P.S.: git addicts, everything's up here:
http://git.postgres-r.org/?p=bgworker