Proposed structure for coexisting major versions - Mailing list pgsql-hackers
From | Oliver Elphick |
---|---|
Subject | Proposed structure for coexisting major versions |
Date | |
Msg-id | 1067207093.14956.50.camel@linda.lfix.co.uk Whole thread Raw |
Responses |
Re: Proposed structure for coexisting major versions
|
List | pgsql-hackers |
I sent this out a few days ago to pgsql-general. I'm resending it in the hope of some comment, in particular whether there is any interest in incorporating this into PostgreSQL itself, since this may make a difference to how I approach it. If people aren't in favour of including it, I will go ahead with it as a Debian package project only (unless someone reveals a horrid flaw in it that makes it unworkable!) ================================================================= The attached proposal is written primarily for Debian. Its motivation is that the current package upgrade process is pretty flaky and also that the current packaging does not really provide for multiple simultaneous postmasters, which are wanted by people hosting several database clusters. I assume this proposal may also be of interest to Red Hat packagers. I am copying it to the PostgreSQL list in case there are any obvious problems that will make it impracticable or harmful and in the hope of suggestions for its improvement. If the general idea is acceptable to the core team, I will rework it to be incorporated into 7.5. The major changes would be in the directory names, which currently assume Debian's file structure. PostgreSQL client wrapper proposal ================================== Current situation ----------------- When a new major version of PostgreSQL is released, it is necessary to dump and reload the database. The old software must be used for the dump, and the new software for the reload. This is a major problem for Red Hat and Debian, because a dump and reload is not required by every upgrade and by the time the need for a dump is realised, the old software may be deleted. Debian has certain rather unreliable procedures to save the old software and use it to do a dump, but these procedures often go wrong. Red Hat's installation environment is so rigid that it is not practicable for the Red Hat packages to attempt an automatic upgrade. At the moment, Debian offers a debconf choice for whether to attempt automatic upgrading; if it fails or is not allowed, a manual upgrade must be done, either from a pre-existing dump or by manual invocation of the postgresql-dump script. There was once an upstream program called pg_upgrade which could be used for in-place upgrading. This does not currently work and does not seem to be a high priority with upstream developers. It is possible to run different versions of PostgreSQL simultaneously, and indeed to run the same version on separate database clusters simultaneously. To do so, each postmaster must listen on a different port, so each client must specify the correct port. If it were possible to have two separate versions of the PostgreSQL packages installed simultaneously, it would be simple to do database upgrades by dumping from the old version and uploading to the new. However, the current structure of the packaging does not permit this. Proposed changes ---------------- I propose to change the Debian packaging to create a new package for each major version. The criterion for creating a new package is that initdb is required when upgrading from the previous version. Thus, the existing postgresql packages will become postgresql-7.3 and the new packages which are currently in experimental will become postgresql-7.4 (and similarly for all the binary packages). I propose to backport this to the current woody version (7.2.1) so that upgrading to sarge will be made much easier. When a new set of versioned packages is put in the archive, the previous set should depend on the new set, so as to force its installation. This will enable old packages to be removed. All versioned packages should provide postgresql, but there should also be a dummy postgresql package which will depend on postgresql-7.2. When sarge becomes the new stable release, that package can be dropped from the archive. Each versioned package will install into /usr/*/postgresql/{version}. In order to allow users easily to select the right package when working, I propose to modify the Debian package's pg_wrapper program to read the current version set by/for any user and to fork the correct executable with the correct library versions according to those preferences. /usr/bin will provide executables as soft-links to pg_wrapper. This procedure will also allow separate database clusters to be maintained for the use of different groups of users; these clusters need not all be of the same major version. This will allow much greater flexibility for those people who need to make application software changes consequent on a PostgreSQL upgrade. Not really for developers ------------------------- Since this proposal deals only with major versions of the software, it is probably not a suitable structure for PostgreSQL developers who may want to be running minor versions in parallel for debugging. However, it is important that this structure should not interfere with their ability to do that. Detailed structure ------------------ The system will be managed by two simple datafiles, one of which will tabulate clusters against major versions and a unique port number. The other will tabulate user against cluster and default database. The default database need not be specified, and will default to the user's own login name, in accordance with current upstream behaviour. Configuration: -------------- [In accordance with Debian policy, all configuration files are in /etc; they are currently soft-linked to the files in $PGDATA; this is controlled by an additional flag to initdb: --debian-conffile] /etc/default/postgresql: Default values for all versions /etc/postgresql/cluster_ports: maps clusters to versions and ports /etc/postgresql/user_clusters: maps users against clusters and dbs /etc/postgresql/config-{version}/: Default per-version configuration postgresql.conf, etc. /etc/postgresql/{port}/ Configuration for a specific cluster containing cluster-specific files soft-linked to PGDATA/ (esp. pg_hba.conf) /etc/default/postgresql will replace the current /etc/postgresql/postmaster.conf and /etc/postgresql/postgresql.env. Any per-cluster information currently in those files must be stored in /etc/postgresql/cluster_ports or under /etc/postgresql/{port}/ Package files and programs: /usr/lib/postgresql/{version}/: files for a specific version /usr/share/postgresql/{version}: /usr/share/doc/postgresql/{version}: /usr/bin/{program}-{version}: scripts calling "pg_wrapper -v {version}" /usr/lib/postgresql/bin/{program}-{version} Common programs: /usr/bin/pg_wrapper: environment chooser and program selector /usr/bin/{program} soft-links to pg_wrapper /usr/lib/postgresql/bin/{program} soft-links to pg_wrapper for administrator programs cluster_ports file ------------------ # This file maps PostgreSQL database clusters against their database # versions, the ports on which their postmasters are to listen, the Unix # logins of their administrators and the directories where their data are # to be found. # # Cluster: the name of the cluster # Active: (yes/no) whether a postmaster should be started at boot time # Version: Which major version of the software is needed # User: The Unix login of the cluster owner and administrator # Port: The port on which this cluster's postmaster should listen # Pgdata: The pathname of the data directory for this cluster # # This file will be altered by initdb, which will add or amend lines when # new clusters are created. # # CLUSTER ACTIVE VERSION USER PORT PGDATA postgresql yes 7.3 postgres 5432 /var/lib/postgres/7.3/data developers yes 7.4 postgres 5433 /var/lib/postgres/7.4/data mycustomer01 yes 7.3 cust01 5436 /usr3/cust/cust01/data user_clusters file ------------------ # This file maps users against the database clusters to which they will # connect by default. Any user may use "pg-wrapper -c" to set a new # default to be stored in ~/.postgresqlrc and the existence of that file # will supersede anything stored here, unless FORCED is set. If a # database is specified, that will be the one connected to by client tools # if none is specified on the command line. If the database specified # here is "*", this is interpreted as the database whose name is the same # as the user's login. (Setting the database to "*" will provide the # current default upstream behaviour for command line tools.) # # When scanning this file, the first matching line will be used. It is a # good idea to provide a default explicitly, with a final line with both # user and group set to "*". If there is no default, the implicit default # is to connect to the cluster whose postmaster is listening on port 5432 # and to the database matching the user's login name. # # If FORCED is set, matching users will not be able to override the # choices preselected for them. In order to enforce this, pg_wrapper will # have SGID mode set and will be owned by postgres.postgres; all the # /usr/lib/postgresql/{version}/ directories will have mode 750 set and be # owned by postgres.postgres. [NOTE: is this feature beneficial? access # can be restricted in other ways. On the other hand, its use helps # prevent ignorant users' causing problems by private enterprise.] # # This file may be altered by pg_wrapper -u when run by root. # # In the context of this file, user and group refer to the Unix login or # group, not to PostgreSQL users and groups # # USER GROUP CLUSTER FORCED DATABASE cust01 * mycustomer01 no template1 * cust01 mycustomer01 yes apps * * postgresql no * Changes to existing programs: ----------------------------- initdb ------ Add options to specify parameters to be written into cluster_ports: -C {cluster} --cluster={cluster} -p {port} --port={port} The administrator login recorded will be that of the owner of the current login session. The version will be that of the initdb instance, and the database path will be the specified PGDATA. When the --debian-conffile option is specified, the configuration files will be written in /etc/postgresql/config-{version} (if they do not exist already). Any options that apply to a specific cluster only will be written to /etc/postgresql/{port}. This will always include pg_hba.conf and pg_ident.conf. postmaster ---------- Read the config files in order: /etc/postgresql/[port}/*, then /etc/postgresql/config-{version}/* for any missing options and finally /etc/default/postgresql for any more that are not yet set. The presence of inappropriate options in /etc/default/postgresql will not be an error, since appropriate options change from one version to another. pg_hba.conf and pg_indent.conf are only ever read from /etc/postgresql/[port}/* psql ---- Abandon the current non-standard error abort if a connection database is not specified; psql is not expected to be run directly and all connection parameters should be provided by pg_wrapper as specified above. other command-line programs --------------------------- *All* command-line programs will be accessed through pg_wrapper, including those which are currently in /usr/lib/postgresql/bin/ It will be possible for the postgres Unix user to execute programs directly from /usr/lib/postgresql/{version}/bin, but he should be advised against doing this in normal circumstances. pg_wrapper ---------- [pg_wrapper is a program written for Debian in C. It is soft-linked to the various command-line clients and its job is to set the environment correctly for the linked program (because Debian does not allow a program to depend on the correct setting of environment variables). It is not currently allowed to be called under its own name.] Allow the program to be called directly. When called directly, it will allow a user to display his own connection choices, or to change them for the current session, or for all sessions (by writing ~/.postgresqlrc). When called by root, it will allow user_clusters to be changed. Add options: -u {user} root only - change user_clusters entry for the specified user. -g {group} root only - change user_clusters entry for the specified group. -f root only, with -u or -g - set the FORCE flag, so that the user cannot override the choice setby -c and -d -c {cluster} specify the cluster for the user -d {database} specify the connection database for the user in that cluster -s (if -u and -g are not given) save the settings in ~/.postgresqlrc When called as a wrapper with PGHOST not set or through the loopback interface or from the machine's own IP address, pg_wrapper will read user_clusters first to find the default (or forced) behaviour. If FORCE is not specified, it will then read ~/.postgresqlrc, if it is present and if PGCLUSTER is not set in the environment, and will set PGCLUSTER according to what it finds. It will then set PGPORT to the appropriate value (from cluster_ports) for the cluster in $PGCLUSTER. Version specific links of pg_wrapper will be provided, to enable the use of the appropriate software version when connecting to a remote machine (which may have a different version or multiple versions of PostgreSQL on it). The same behaviour may be achieved with another new command-line option: pg_wrapper -v {major version} [-h {remotehost}] -c {command} {command options} If -h is given or PGHOST is set and the specified host is not the local machine, the connection is made to the remote machine accoding to whatever is set in the current environment. pg_wrapper -l will list the current settings and show whether or not they are forced and whether they aretemporary or saved. Errors: If PGCLUSTER is set to a value which is not found in cluster_ports, pg_wrapper will exit with return code16. If an attempt is made to connect to the local system with disallowed options (trying to get round FORCED), exit with return code 20 /etc/init.d/postgresql ---------------------- This [boot-time] script must now start a postmaster for each active instance in clusters_ports, setting PGPORT and PGDATA and the effective user id appropriately for each instance and starting the correct version of the software for each cluster. pg_ctl ------ This script must become cluster-aware and be able to start the correct version of the postmaster for each cluster. Add options -c {cluster} read the appropriate settings from cluster_ports (and abort if the EUID is wrong). -A operate on all clusters for the current EUID -a (with start) operate on active clusters for the current EUID -r operate only on running clusters - those for which postmaster is running - for the current EUID -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "Whosoever therefore shall be ashamed of me and of my words in this adulterousand sinful generation; of him also shall the Son of man be ashamed, when he cometh in the glory of hisFather with the holy angels." Mark 8:38
pgsql-hackers by date: