Thread: Best practices for migrating a development database to a release database

Best practices for migrating a development database to a release database

From
Collin Peters
Date:
I have searched the Internet... but haven't found much relating to this.

I am wondering what the best practices are for migrating a
development database to a release database.  Here is the simplest
example of my situation (real world would be more complex).

Say you have two versions of your application.  A release version and a
development version.  After a month of developing you are ready to
release a new version.  There have been many changes to the development
database that are not in the release database.  However, the release
database contains all your real information (customers, etc...).  What
is the best practice for migrating the development database to the
release database?

I have thought of the following situations:
-Simply track all the changes you made to the development database and
make the same changes to the release database
-Back up the release database... overwrite it with the development
database... then copy all your real data back into the release database
(this last step is probably quite difficult)
-Perhaps some combination of the two


Does anybody have any recommendations?

Regards,
Collin Peters

Re: Best practices for migrating a development database to a release database

From
Thomas F. O'Connell
Date:
One thing I used to do (and I won't necessarily claim it as a best
practice) was to maintain my entire data model (tables, functions,
indexes, sequences) as SQL (plus postgres extensions) CREATE statements
in text files that were version controlled (via CVS). I had an entire
set of utilities that could modify the existing database as necessary
to treat the SQL files as authoritative. For anything new, the create
statements sufficed, but for modifications, some objects had to be
regenerated. When it was time to release, we would export the textual
SQL schema to the production server, make the necessary updates using
my utilities, and then restart services.
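
For illustration, a minimal sketch of what one of those version-controlled
files might contain (the table, index, and function names here are made
up, not from the actual setup):

-- schema/customer.sql, kept in CVS and treated as authoritative
CREATE TABLE customer (
    id      serial PRIMARY KEY,
    name    text NOT NULL,
    created timestamp NOT NULL DEFAULT now()
);

CREATE INDEX customer_name_idx ON customer (name);

CREATE OR REPLACE FUNCTION customer_count() RETURNS bigint AS '
    SELECT count(*) FROM customer;
' LANGUAGE sql;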

Since I'm deploying postgres in new environments now, and I left these
utilities behind at another job (where they're still in use), I've been
thinking more about the concept of schema version control. But I'm
similarly interested in any concepts of best practices in this area.

-tfo

On Sep 10, 2004, at 1:55 PM, Collin Peters wrote:

> I have searched the Internet... but haven't found much relating to
> this.
>
> I am wondering what the best practices are for migrating a
> development database to a release database.  Here is the simplest
> example of my situation (real world would be more complex).
>
> Say you have two versions of your application.  A release version and
> a development version.  After a month of developing you are ready to
> release a new version.  There have been many changes to the
> development database that are not in the release database.  However,
> the release database contains all your real information (customers,
> etc...).  What is the best practice for migrating the development
> database to the release database?
>
> I have thought of the following situations:
> -Simply track all the changes you made to the development database and
> make the same changes to the release database
> -Back up the release database... overwrite it with the development
> database... then copy all your real data back into the release
> database (this last step is probably quite difficult)
> -Perhaps some combination of the two
>
> Does anybody have any recommendations?
>
> Regards,
> Collin Peters


Re: Best practices for migrating a development database to a release database

From
Martijn van Oosterhout
Date:
On Sat, Sep 11, 2004 at 02:29:42AM -0500, Thomas F. O'Connell wrote:
> One thing I used to do (and I won't necessarily claim it as a best
> practice) was to maintain my entire data model (tables, functions,
> indexes, sequences) as SQL (plus postgres extensions) CREATE statements
> in text files that were version controlled (via CVS). I had an entire
> set of utilities that could modify the existing database as necessary
> to treat the SQL files as authoritative. For anything new, the create
> statements sufficed, but for modifications, some objects had to be
> regenerated. When it was time to release, we would export the textual
> SQL schema to the production server, make the necessary updates using
> my utilities, and then restart services.

One thing I was thinking about at my job, and which I would really have
liked, is some kind of version control linked with the database. Say, for
example, that I'd be able to 'check out' a database function, edit it, and
check it in again. This would require some kind of backing store, and I
was wondering whether that backing store would live in the database too.
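
Just to make the idea concrete, a rough sketch of what such an in-database
backing store might look like (the table and the example function name are
invented):

-- a backing store for object versions, living in the database itself
CREATE TABLE schema_history (
    object_name text      NOT NULL,
    version     integer   NOT NULL,
    definition  text      NOT NULL,
    checked_in  timestamp NOT NULL DEFAULT now(),
    PRIMARY KEY (object_name, version)
);

-- "checking in" a function would apply it and record its full source
INSERT INTO schema_history (object_name, version, definition)
VALUES ('add_customer(text)', 3,
        'CREATE OR REPLACE FUNCTION add_customer(name text) ...');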

I always found it annoying when I had function definitions in separate
files which could be checked into CVS, but there was no guarantee that
those files had any relationship with what was in the database.

Maybe I should sketch something out that could be merged with psql or
something... I don't suppose anything like this exists anywhere
already?
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: Best practices for migrating a development database to a release database

From
Csaba Nagy
Date:
Besides version-controlled schema files, we have a guy who writes
migration scripts based on the old schema and the new (development)
schema (frozen, e.g., by branching in CVS).
Usually there are 3 steps involved (sketched below):
 - a pre-migration script, which prepares the database for the new
schema by adding the new structures needed for the data migration;
 - a data migration script, which moves data around between the old and
the new structures;
 - a finalization script, which removes the old structures that are no
longer needed.
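
As a sketch of those three steps (using an invented example of splitting a
phone column out of a customer table, not our real schema):

-- step 1: pre-migration, add only the new structures
BEGIN;
CREATE TABLE customer_phone (
    customer_id integer NOT NULL REFERENCES customer(id),
    phone       text    NOT NULL
);
COMMIT;

-- step 2: data migration, move data from the old structures into the new ones
BEGIN;
INSERT INTO customer_phone (customer_id, phone)
SELECT id, phone FROM customer WHERE phone IS NOT NULL;
COMMIT;

-- step 3: finalization, drop the old structures once the new release is live
BEGIN;
ALTER TABLE customer DROP COLUMN phone;
COMMIT;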

I don't think any of these steps can be computed automatically as a diff
between the old and new schemas...
We usually arrange it anyway so that after step 1 has been executed, both
the old version of the application and the new version can work at the
same time, and the new version will only use the data migrated by step 2,
but I suppose our application is not very typical (we have lots of
distinct customers which live in the same database but have distinct data).
This also means we try to make minimal changes to the database and try to
have only additions, no modifications, which makes migration easier.

HTH,
Csaba.


On Sat, 2004-09-11 at 09:29, Thomas F. O'Connell wrote:
> One thing I used to do (and I won't necessarily claim it as a best
> practice) was to maintain my entire data model (tables, functions,
> indexes, sequences) as SQL (plus postgres extensions) CREATE statements
> in text files that were version controlled (via CVS). I had an entire
> set of utilities that could modify the existing database as necessary
> to treat the SQL files as authoritative. For anything new, the create
> statements sufficed, but for modifications, some objects had to be
> regenerated. When it was time to release, we would export the textual
> SQL schema to the production server, make the necessary updates using
> my utilities, and then restart services.
>
> Since I'm deploying postgres in new environments now, and I left these
> utilities behind at another job (where they're still in use), I've been
> thinking more about the concept of schema version control. But I'm
> similarly interested in any concepts of best practices in this area.
>
> -tfo
>
> On Sep 10, 2004, at 1:55 PM, Collin Peters wrote:
>
> > I have searched the Internet... but haven't found much relating to
> > this.
> >
> > I am wondering what the best practices are for migrating a
> > development database to a release database.  Here is the simplest
> > example of my situation (real world would be more complex).
> >
> > Say you have two versions of your application.  A release version and
> > a development version.  After a month of developing you are ready to
> > release a new version.  There have been many changes to the
> > development database that are not in the release database.  However,
> > the release database contains all your real information (customers,
> > etc...).  What is the best practice for migrating the development
> > database to the release database?
> >
> > I have thought of the following situations:
> > -Simply track all the changes you made to the development database and
> > make the same changes to the release database
> > -Back up the release database... overwrite it with the development
> > database... then copy all your real data back into the release
> > database (this last step is probably quite difficult)
> > -Perhaps some combination of the two
> >
> > Does anybody have any recommendations?
> >
> > Regards,
> > Collin Peters
>
>


Re: Best practices for migrating a development database to a release database

From
Vivek Khera
Date:
>>>>> "CP" == Collin Peters <cpeters@mcrt.ca> writes:

CP> I have thought of the following situations:
CP> -Simply track all the changes you made to the development database and
CP> make the same changes to the release database
CP> -Back up the release database... overwrite it with the development
CP> database... then copy all your real data back into the release
CP> database (this last step is probably quite difficult)
CP> -Perhaps some combination of the two

You need one more layer: the staging server.

What we do is develop on local workstations, prepare the release on a
staging server, then push the staging server info to the production
box, or run the same updating script on production.

Any schema changes are done via scripts within transactions.  The
renames, alters, grants, etc., are all tested on the staging server
with a current copy (pg_dump/restore) from the live server, so we know
there won't be any surprises on the live data (or close to it).  It
also lets us know how long some things might take.
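
A trivial sketch of what one of those transactional change scripts looks
like (the table and role names are invented for the example):

BEGIN;
ALTER TABLE orders RENAME COLUMN custid TO customer_id;
ALTER TABLE orders ADD COLUMN shipped_at timestamp;
GRANT SELECT, INSERT, UPDATE ON orders TO app_user;
-- if anything above fails, the whole script rolls back and the live
-- schema is left untouched
COMMIT;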

For example, this weekend we need to add a primary key to a 65 million
row table that just logs events.  Until now it really didn't need a PK
since it was never updated and the queries were all aggregates.
However, to run Slony replication it needs a PK...  The test procedure
of doing it on the staging server pointed out some flaws in the
conversion script that were not noticed when running on the
development server, because that dataset was so small.  These flaws
would have made the DB unusable for something like 5 days (if it ever
completed -- I don't know, because I aborted that test) while the
update occurred, and once done would have left the application without
access to the revised table.  Naturally, we found better ways to do it
that trimmed the expected time down to about 1.5 hours or less.

You really have to take each situation separately.  The easy version of
the PK-adding script works fine on tables up to about 60k or 100k rows,
so we used that on some other, smaller tables.
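
For reference, the "easy way" amounts to something like this (table name
invented); the full-table UPDATE and the index build are what blow up on
a 65 million row table:

BEGIN;
CREATE SEQUENCE event_log_id_seq;
ALTER TABLE event_log ADD COLUMN id integer;
UPDATE event_log SET id = nextval('event_log_id_seq');  -- touches every row
ALTER TABLE event_log ALTER COLUMN id
    SET DEFAULT nextval('event_log_id_seq');
ALTER TABLE event_log ALTER COLUMN id SET NOT NULL;
ALTER TABLE event_log ADD PRIMARY KEY (id);  -- builds the unique index
COMMIT;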

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/