Re: tie user processes to postmaster was:(Re: [HACKERS] scheduler in core) - Mailing list pgsql-hackers
From | Dimitri Fontaine |
---|---|
Subject | Re: tie user processes to postmaster was:(Re: [HACKERS] scheduler in core) |
Date | |
Msg-id | m2vddpdl5i.fsf@hi-media.com |
In response to | Re: tie user processes to postmaster was:(Re: [HACKERS] scheduler in core) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: tie user processes to postmaster was:(Re: [HACKERS] scheduler in core) |
List | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> writes:
> This seems like a solution in search of a problem to me. The most
> salient aspect of such processes is that they would necessarily run
> as the postgres user

I happen to run my PGQ tickers and londiste daemons as a "londiste" user, which I make a superuser (at least while installing, as they need to install some PL/C stuff). Then there's pgbouncer too, which I always run as the postgres system user, if only to be able to open a socket in the same directory where postgres opens its own (/var/run/postgresql on my system).

The precedents are archive_command and restore_command. They run as the postgres user too, don't they? I think we could have made walreceiver and walsender generic out-of-core facilities too, within this model.

The other common use case is scheduling maintenance (vacuum, cluster some table, maintain a materialized view, backup), all of which can run as the postgres user too; the only adaptation might be to use a security definer function.

So, beyond the scheduler-only use case, if you want to see some C code that I'd like to be able to run as a postmaster's child, have a look at pgqd, the ticker daemon of the next skytools version, here:

  http://github.com/markokr/skytools-dev/blob/master/sql/ticker/pgqd.c
  http://github.com/markokr/skytools-dev/blob/master/sql/ticker/ticker.c

You'll see mainly a C daemon which connects to some database and calls stored procedures there. There could in fact be separate schedules: the main loop for ticking the snapshots, another one for managing the retry event queue, and yet another one for managing the maintenance operations.

What I think I'd like to have is a user process supervisor as a postmaster child. Its job would be to start and stop the user processes within the right time frames, and to notice their death. A restart policy should be attached to each child, which is either permanent, transient or temporary.
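
To make the three restart policies concrete, here is a minimal sketch, written in Python only for brevity (a real implementation would be C inside the backend). It assumes the policies carry the same meaning as in Erlang's supervisor: permanent children are always restarted, transient ones only after an abnormal exit, temporary ones never. The names `RestartPolicy` and `should_restart` are hypothetical, and treating exit code 0 as a "normal" exit is an assumption too.

```python
from enum import Enum


class RestartPolicy(Enum):
    """Hypothetical per-child restart policies, modelled on Erlang's."""
    PERMANENT = "permanent"   # always restarted
    TRANSIENT = "transient"   # restarted only after an abnormal exit
    TEMPORARY = "temporary"   # never restarted


def should_restart(policy: RestartPolicy, exit_code: int) -> bool:
    """Decide whether the supervisor restarts a child that just died.

    Assumption: exit_code == 0 means the child exited normally.
    """
    if policy is RestartPolicy.PERMANENT:
        return True
    if policy is RestartPolicy.TRANSIENT:
        return exit_code != 0
    return False  # TEMPORARY children are left dead
```

With this reading, a ticker daemon would probably be permanent, while a one-shot maintenance job would be transient or temporary.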
To avoid infinitely restarting a process, the supervisor has 2 GUCs, supervisor_max_restarts and supervisor_max_time, allowing at most supervisor_max_restarts restarts within supervisor_max_time. Being unable to manage a "user" permanent child process (worker) would trigger a postmaster stop.

All of this is heavily inspired by the Erlang approach, which I've found simple and effective:

  http://erlang.org/doc/man/supervisor.html

The supervised processes will have to offer a main entry point, which will get called in the child process once the supervisor has forked, and must be prepared to receive SIGHUP, SIGINT and SIGTERM signals. The setup will get done with the existing custom_variable_classes, and more generally I guess we're reusing the PGXS and custom .so infrastructure (shared_preload_libraries).

The main good reason to have this is to allow extension authors to develop autonomous daemons in a portable way, benefiting from all the effort PostgreSQL has put into having a fork() model available on Windows et al.

I guess we need a way to start the same supervised daemon extension code more than once too, for example several pgbouncer setups on different ports in different pooling modes.

> I still haven't seen a good reason for not using cron or Task Scheduler
> or other standard tools.

We're past the scheduler alone. You won't turn archive_command, restore_command, walsender, walreceiver, pgbouncer or PGQ into cron jobs, but you could have them managed by the postmaster, as plugins.

Your good reason would be less code to keep an eye on :)

Back to scheduling: you can back up the maintenance schedule with the database itself. If all the jobs do is call some function (in my case the only exception is pg_dump), then you don't need to re-validate them when you upgrade your OS, or migrate from CentOS to Debian, or from a developer workstation running Windows to a production server running some Unix variant. Once more, nothing you couldn't implement already.
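
Coming back to the restart limit for a moment, here is a rough sketch of how a "supervisor_max_restarts in supervisor_max_time" check could work. It assumes a sliding window in the spirit of Erlang's restart intensity: once more than max_restarts restarts fall within the last max_time seconds, the supervisor gives up. It is in Python only to keep it short; `RestartLimiter` and `record_restart` are made-up names, and nothing like this exists in PostgreSQL today.

```python
from collections import deque


class RestartLimiter:
    """Hypothetical sliding-window limiter for child restarts."""

    def __init__(self, max_restarts: int, max_time: float):
        self.max_restarts = max_restarts  # stands in for supervisor_max_restarts
        self.max_time = max_time          # stands in for supervisor_max_time (seconds)
        self._restarts = deque()          # timestamps of recent restarts

    def record_restart(self, now: float) -> bool:
        """Record a restart at time `now`.

        Returns True if the restart is within the limit, False if the
        limit is exceeded and the supervisor should give up.
        """
        self._restarts.append(now)
        # Forget restarts that have fallen out of the time window.
        while self._restarts and now - self._restarts[0] > self.max_time:
            self._restarts.popleft()
        return len(self._restarts) <= self.max_restarts


# Usage: at most 3 restarts in any 60-second window.
limiter = RestartLimiter(max_restarts=3, max_time=60.0)
results = [limiter.record_restart(t) for t in (0.0, 10.0, 20.0, 30.0)]
```

In a real supervisor `now` would come from a monotonic clock, and a False result for a permanent child is the point where the proposal says the postmaster would stop.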
Maybe PostgreSQL is growing fast enough that now is the time to look at how to enable non-core things to be easily shipped with the core product? Do we need a PostgreSQL distribution? I know David Wheeler's opinion on that, and think PGAN + pg_restore-friendly extensions + supervised helper daemons will be huge enablers.

Regards,
--
dim