RFC tool to support development / operations work with slony replicated databases

From: Andrew Hammond
Msg-id: 5a0a9d6f0703051707i26b553c3v721ba61db9b8e786@mail.gmail.com
List: pgsql-general
Hello All,

I've been working on the design of a tool to help both developers
and operations staff work with slony replicated databases. I think
the problem described below is a general one for people working with
systems that are both in production and under ongoing development
and maintenance. I would therefore like both to solicit the input of
the community and to share the results. Documentation (which is
still somewhat drafty) follows.

Thank you for your time,
Andrew Hammond


Current Approach

A common problem in the database world is handling revisions to the
database that accompany revisions in the software running against
that database. Our current method is to include upgrade.sql and
downgrade.sql scripts with each software release.

Problem Statement

This approach will fail once we start using slony, because we need
to handle DML differently from DDL and DCL: DML is applied only on
the origin and replicates from there, while DDL and DCL must be
applied to every node in the cluster. We also need a way to apply
slonik scripts. Ordering matters in the application of these
scripts.

After talking about it for a while, we agreed that developers want a
way to apply their updates without stepping on each other's toes while
in the process of developing and testing their work.

Design

Interface

updatemydatabase [-f target] [-y] [--force-upgrade | --force-downgrade]
[-U pguser] [-h pghost] [-p pgport] [-d pgdatabase]
[--cluster clustername]

-f
    Optional. Defaults to the current working directory. Specifies
the target to be upgraded (or downgraded) to. This may be either a
full or a relative path, and either a directory or a file.

-y
    Optional. If set, assume yes to all questions. This is intended
for use when running the program in tool mode.

-U -h -p -d
    Optional. As for psql and the other PostgreSQL command line
utilities.

--cluster
    Optional. Defaults to the database name. Specifies the name of
the slony cluster to work with. This should have a one-letter short
form that conforms with other similar tools; we still have to figure
out what those are, though.

    Since we will be using a Python connector based on libpq, we
will automatically respect the standard PostgreSQL environment
variables, including the .pgpass file for handling passwords.
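
For illustration, here is a minimal sketch of how the tool could
build its connection, assuming the psycopg2 driver (any libpq-based
connector behaves the same way; the connect() helper and its
parameters are just illustrative):

import psycopg2

def connect(dbname=None, user=None, host=None, port=None):
    # Anything not given on the command line is omitted from the
    # conninfo string, so libpq falls back to PGHOST, PGPORT, PGUSER,
    # PGDATABASE and reads passwords from ~/.pgpass.
    parts = []
    if dbname: parts.append("dbname=%s" % dbname)
    if user:   parts.append("user=%s" % user)
    if host:   parts.append("host=%s" % host)
    if port:   parts.append("port=%s" % port)
    return psycopg2.connect(" ".join(parts))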

Limitations

    * We are not trying to deal with databases that contain more
than one slony replication cluster.
    * We are not going to deal with the case where various sets have
different origins.
    * We assume that this is run from the same machine that is
currently running the slons, and that we can connect to every
database in the cluster.
    * Aside from generating the slonik preamble, we are not going to
try to auto-generate slonik scripts that do anything more
complicated than EXECUTE SCRIPT. At least not initially; maybe we
can get more clever later.
    * We will not try to be clever about detecting changes to files.
Alfred floated the idea of using the SVN Id tag to detect whether a
file had changed since it was last applied, and then forcing a
downgrade/upgrade cycle. That seems like a lot of code for a corner
case. Alfred and Long agreed that it's probably better to establish
a convention instead: do not edit files after they're committed
unless leaving them unedited would cause inefficiencies when applied
to the production database. Instead, create a new file. If you are
forced to edit a committed file, then email the dev list.
    * Along the same lines of not being clever, we assume there is
only one set, and that its number is 1.
    * We will not assume the existence of a node 1. The whole point
of increasing availability by replicating is that we don't have to
rely on the existence of any given database.

Data Structure

Each release will include a directory that has the same name as the
full release tag. This directory must contain all the scripts to be
applied. The release may include directories of scripts from prior
releases in the same parent directory. The scripts may have an
arbitrary name, but must end with one of the suffixes .dml.sql,
.ddl.sql, .dcl.sql or .slonik. Script names should incorporate the
number of the bug they address.

    * /my/base/directory
          o 3.10.0
                + create_foo_23451.ddl.sql
                + populate_foo_23451.dml.sql
                + alter_bar_add_column_reference_foo_23451.ddl.sql
                + update_bar_reference_foo_23451.dml.sql
                + alter_bar_column_not_null_23451.ddl.sql
                + subscribe_foo_23451.slonik
                + cleanup_some_data_migration_stuff_23451.ddl.sql
                + fix_bug_24341.ddl.sql -- these are poorly chosen
names, but hey, it's an example...
                + fix_bug_24341.dml.sql
                + fix_bug_24341.slonik
                + drop_broken_node_30031.slonik
          o 3.10.1
                + another_table_29341.ddl.sql
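
A short sketch of how the tool might enumerate one release directory
under this layout (scripts_in_release() is a hypothetical helper,
not part of any existing tool):

import os, re

SUFFIX_RE = re.compile(r'\.(dml\.sql|ddl\.sql|dcl\.sql|slonik)$')

def scripts_in_release(release_dir):
    # Map each script name to its kind ('dml.sql', 'ddl.sql',
    # 'dcl.sql' or 'slonik'), rejecting anything with a bad suffix.
    scripts = {}
    for name in sorted(os.listdir(release_dir)):
        m = SUFFIX_RE.search(name)
        if not m:
            raise ValueError("bad suffix: %s" % name)
        scripts[name] = m.group(1)
    return scripts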

Inside the script, we add some semantics to what are usually comments.
An example is probably the best way to show this.

-- alter_bar_column_not_null_23451.ddl.sql
-- Witty comment about why this column needs to be not null.
--dep update_bar_reference_foo_23451.dml.sql

ALTER TABLE bar ALTER COLUMN foo_id DROP NOT NULL;

--upgrade

ALTER TABLE bar ALTER COLUMN foo_id SET NOT NULL;

At the top of the script, before any line that isn't either a
comment or whitespace, you can have zero or more comments of the
form --dep <filename> in SQL or #dep <filename> in slonik scripts.
This is how you define which other files the current file depends
on. All files for a given version depend on all files of all
previous versions. Filenames are all characters following the space
after dep until the end of the line, not including any whitespace at
the tail of the line. Don't use filenames that end with whitespace;
it's annoying.

I don't see any need to get even more restrictive and disallow
whitespace in filenames. Doing so would allow brief comments on the
same line, but more involved comments will take more than a single
line anyway. Thoughtful selection of filenames should eliminate the
need for brief comments, and tab completion eliminates the annoyance
of typing them.

The other additional semantic is the --upgrade (or #upgrade) tag.
This defines where the downgrade section ends and the upgrade
section begins.
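
Putting the two semantics together, here is a sketch of a parser for
these tags (parse_script() is hypothetical; pass leader='--' for SQL
files and leader='#' for slonik files):

def parse_script(path, leader='--'):
    deps, downgrade, upgrade = [], [], []
    section = downgrade
    in_header = True
    for line in open(path):
        stripped = line.strip()
        if in_header and stripped.startswith(leader + 'dep '):
            # The filename is everything after the space following
            # 'dep', minus trailing whitespace.
            deps.append(stripped[len(leader) + 4:].rstrip())
            continue
        if stripped == leader + 'upgrade':
            section = upgrade   # the downgrade section ends here
            in_header = False
            continue
        if stripped and not stripped.startswith(leader):
            in_header = False   # first real statement ends the header
        section.append(line)
    return deps, ''.join(downgrade), ''.join(upgrade)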

Finally, we need to add a table to the database which lists the
files that have been applied.

CREATE TABLE applied_files
(   release     varchar      -- references dbinfo.version?
,   file_name   text
,   applied_on  timestamptz default now()
,   primary key (release, file_name)
    -- more meta-info?
);

Does anyone have an opinion about what schema this table should go into?

Algorithm

Determining the Target

The parameter passed to -f must be either a relative or absolute path
to either a file or a directory. Obviously, it's an error to pass -f
something that doesn't exist.

A target must consist of a version tag and may include a filename.
If there is no -f parameter, then the cwd is assumed to be the
parameter. If the parameter is a directory, then the name of that
directory is inspected. If it looks like \d+\.\d+\.\d+.* (i.e.
3.10.0 or 3.10.1 or 3.10.2b13) then this is the target version tag.
If it doesn't match this pattern, then look for sub-directories
within this directory which do match this pattern. The highest one
found (sorted by dotted numeric version, not alphabetically) is the
target version tag. If no sub-directories that match the pattern are
found, then fail.

If the -f parameter is a file then the version tag must be the name of
the directory in which that file resides.
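
A sketch of that resolution (resolve_target() and version_key() are
illustrative names; the simple sort key here ignores pre-release
suffixes such as the b13 in 3.10.2b13):

import os, re

VERSION_RE = re.compile(r'^\d+\.\d+\.\d+.*$')

def version_key(tag):
    # Dotted-numeric sort, so 3.10.0 sorts above 3.9.9.
    return tuple(int(p) for p in re.findall(r'\d+', tag)[:3])

def resolve_target(path):
    path = os.path.abspath(path)
    if os.path.isfile(path):
        # A file's version tag is the directory it lives in.
        return (os.path.basename(os.path.dirname(path)),
                os.path.basename(path))
    name = os.path.basename(path)
    if VERSION_RE.match(name):
        return name, None
    candidates = [d for d in os.listdir(path)
                  if os.path.isdir(os.path.join(path, d))
                  and VERSION_RE.match(d)]
    if not candidates:
        raise SystemExit("no release directories under %s" % path)
    return max(candidates, key=version_key), None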

Deciding Between Upgrade and Downgrade

The current version and file set are obtained from the database. If
the target version is higher than our current version, we're
upgrading. If the target is a lower version than our current
version, we're downgrading. If the target is the same version as our
current version and no filename has been supplied, then we are
upgrading.

If the target is the same as the current version, and a filename has
been supplied and that filename has not been applied (according to the
file set in the database) we're upgrading. If it has been applied,
then we're downgrading.
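
That decision, as a sketch (reusing version_key() from the earlier
sketch; current_version and applied_files come from the database):

def decide(current_version, applied_files, target_version,
           target_file=None):
    cur = version_key(current_version)
    tgt = version_key(target_version)
    if tgt > cur:
        return 'upgrade'
    if tgt < cur:
        return 'downgrade'
    # Same version: no filename, or a filename not yet applied,
    # means upgrade; an already-applied filename means downgrade.
    if target_file is None or target_file not in applied_files:
        return 'upgrade'
    return 'downgrade'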

If we're downgrading and there is no filename involved, then we need
to apply all the downgrades necessary to achieve the target version
(do not downgrade any of the patches for that version). If we're
upgrading then we need to apply all the upgrades necessary to achieve
the target version (including all the patches for that version).

When file names are involved then an upgrade means to apply all the
files necessary to achieve the version prior to the target version,
plus any files from the target version upon which the target file
depends followed by the target file itself, of course. For a downgrade
the opposite effect is desired: downgrade all the versions greater
than the target version, then downgrade any file which depends on the
target file followed by downgrading the target file.

The process of upgrading or downgrading is incremental: by patch
number, then by minor version number, and finally by major version
number.

For example, to upgrade from an existing version to a new version,
the update tool checks what the current version of the database is,
as well as what files have been applied. If there are more files to
be applied for the current version, it applies them. It then looks
for the next revision up by incrementing the patch number. If that
doesn't exist, it sets the patch number to zero and increments the
minor version. If that doesn't exist, it sets the minor version to
zero and increments the major version. If that doesn't exist, then
we're done. Downgrades are the same except that they decrement
instead of increment.
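
One step of that walk might look like this (next_version() is
illustrative; exists(tag) would test for a release directory named
tag, and the downgrade walk is the mirror image with decrements):

def next_version(major, minor, patch, exists):
    for cand in ('%d.%d.%d' % (major, minor, patch + 1),
                 '%d.%d.0' % (major, minor + 1),
                 '%d.0.0' % (major + 1,)):
        if exists(cand):
            return cand
    return None  # nothing higher exists, so we're done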

As each upgrade file is applied, its name is inserted into the
applied_files table. Downgrades delete the file name from the
applied_files table once the downgrade has been applied.
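
In terms of the applied_files table above, that bookkeeping is just
the following (a sketch; cur is an open cursor, and each statement
runs in the same transaction as the script itself so a failed script
leaves no stale record):

def record_upgrade(cur, release, file_name):
    cur.execute("INSERT INTO applied_files (release, file_name)"
                " VALUES (%s, %s)", (release, file_name))

def record_downgrade(cur, release, file_name):
    cur.execute("DELETE FROM applied_files"
                " WHERE release = %s AND file_name = %s",
                (release, file_name))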

Once all the upgrade files for a given version are applied, if we
are upgrading beyond that version, it is time to update the
database's dbinfo.version. Further updates are then applied
incrementally. For downgrades, dbinfo.version may be updated once
all the files for a given version have been deleted from
applied_files, assuming that you are downgrading beyond that
version.

Applying Different Types of Changes

The method of application for updates varies depending on whether
the database is replicated or not. It also varies depending on the
nature of the update, as determined by its suffix. A database is
assumed to be replicated if it has a _cluster schema.

dml
    This is the easiest category of changes. The update tool needs
only to connect to the origin, issue a BEGIN statement, send all the
DML and issue a COMMIT if it succeeds. Otherwise it should issue a
ROLLBACK and abort the upgrade/downgrade process with an appropriate
error message. (A sketch of this path and the ddl path follows this
list.)
ddl / dcl
    These changes need to be applied globally to the cluster. The
update tool must connect to all the databases in the cluster, issue
a BEGIN statement, run the DDL / DCL script to see if it completes,
and finally issue a ROLLBACK. This is to verify that the script will
succeed on all the nodes. Abort with a suitable error message if the
script doesn't apply to all members of the cluster. We might want to
be able to skip the verification step by having a
--trust-me-i-know-what-im-doing parameter. Once the scripts are
verified, they are applied via slonik EXECUTE SCRIPT. The slonik
preamble (cluster name, and connection info for all nodes) should be
drawn from the slony cluster schema in the origin database.
slonik
    These are intended to manipulate the slony cluster directly. For
example, creating a new set, adding some tables and sequences and then
subscribing a bunch of nodes to it. The update tool will generate the
preamble for this.
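
A sketch of the dml and ddl/dcl paths (apply_dml(), apply_ddl() and
run_slonik_execute_script() are hypothetical; the last would write
the verified script to a temporary file and invoke slonik with the
generated preamble):

def apply_dml(conn_origin, sql):
    # Origin only: BEGIN is implicit, COMMIT on success, ROLLBACK
    # and abort otherwise.
    cur = conn_origin.cursor()
    try:
        cur.execute(sql)
        conn_origin.commit()
    except Exception as e:
        conn_origin.rollback()
        raise SystemExit("DML failed on origin, aborting: %s" % e)

def apply_ddl(conns, sql, run_slonik_execute_script, trust_me=False):
    if not trust_me:
        # Dry run on every node: execute, then ROLLBACK. This proves
        # the script applies everywhere before slonik touches
        # anything.
        for conn in conns:
            cur = conn.cursor()
            try:
                cur.execute(sql)
            except Exception as e:
                raise SystemExit("DDL verification failed: %s" % e)
            finally:
                conn.rollback()
    run_slonik_execute_script(sql)  # applied via EXECUTE SCRIPT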

Implementation protocols/rules

   1. The main set number is 1.
   2. Temporary file names will be /tmp/{ddl|dml}-pid-timestamp.sql.
This file must be removed after execution, unless execution fails;
in that case, do we want to leave it around to facilitate debugging?
   3. The number of the temporary set (to hold new tables) is
arbitrary, but should probably be based on the PID.
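
To make rules 1 and 2 concrete, here is a sketch of the preamble
generation mentioned under ddl / dcl, and of the temporary file
naming. It assumes the Slony-I catalog tables sl_node and sl_path in
the _clustername schema; sl_path holds one row per server/client
pair, so taking any conninfo for a given server is a simplification:

import os, time

def slonik_preamble(cur, cluster):
    cur.execute('SELECT no_id FROM "_%s".sl_node ORDER BY no_id'
                % cluster)
    nodes = [row[0] for row in cur.fetchall()]
    cur.execute('SELECT pa_server, pa_conninfo FROM "_%s".sl_path'
                % cluster)
    conninfo = dict(cur.fetchall())
    lines = ['cluster name = %s;' % cluster]
    for n in nodes:
        lines.append("node %d admin conninfo = '%s';"
                     % (n, conninfo[n]))
    return '\n'.join(lines)

def temp_script_path(kind):
    # Rule 2: /tmp/{ddl|dml}-pid-timestamp.sql, removed after a
    # successful execution.
    return '/tmp/%s-%d-%d.sql' % (kind, os.getpid(), int(time.time()))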
