Hi list.
2 cases I'm interested in:
1) Migrating data from one database to another
2) Distributing data over many databases, and later merging
In what ways can you design tables to easier facilitate the above cases?
I am aware of multi-master replication software, as described here:
http://en.wikipedia.org/wiki/Multi-master_replication
For this question, I'm more interested in schema design, so that a
home-brewed database synchronization can perform synchronization.
I have some experience with this. One of my previous projects was to
reliably migrate data from one database to another, where the 2
databases had problems like:
- Many foreign keys weren't enforced
- Some fields needed special treatment (eg: should be unique, or
behave like a foreign key ref, even if db schema doesn't specify it.
In other cases they need to be updated during the migration).
- Most auto-incrementing primary keys (and related foreign key
references) needed to be updated during migration, because they are
already used in the destination database for other records.
- Many tables are undocumented, some fields have an unknown purpose
- Some tables didn't have fields that can be used as a 'natural' key
for the purpose of migration (eg: tables which only exist to link
together other tables, or tables where there are duplicate records).
I wrote a Python script (using SQLAlchemy and Elixir) to do the above
for our databases.
Are there any existing migration tools which could have helped with
the above? (it would have required a *lot* of user help).
Are there recommended ways of designing tables so that synchronization
is easier?
The main thing I've read about is ensuring that all records have a
natural key of some kind, eg GUID. Also, your migration app needs to
have rules for conflict resolution.
David.