Database design questions - Mailing list pgsql-general
From | David |
---|---|
Subject | Database design questions |
Date | |
Msg-id | 18c1e6480806180143j653acb16w5aa07969e7112b24@mail.gmail.com Whole thread Raw |
Responses |
Re: Database design questions
|
List | pgsql-general |
Hi list. There are some database design-related issues I've pondered about for some time. But first: * Is this the correct list to ask these questions on? * Should I split this into separate threads instead of 1 thread for all my questions? Assuming there isn't a problem, here are my questions: ========== Question 1: Storing app defaults. If you have a table like this: table1 - id - field1 - field2 - field3 table2 - id - table1_id - field1 - field2 - field3 table1 & table2 are setup as 1-to-many. If I want to start providing user-customizable defaults to the database (ie, we don't want apps to update database schema), is it ok database design to add a table2 record, with a NULL table1_id field? In other words, if table1 has no matching table2 record, then the app will use the table2 record with a NULL table1_id field to get defaults. This looks messy however. Is there a better way to do it? A few other ways I can think of: 1) Have an extra table1 record (with string fields containing 'DEFAULT'), against which the extra table2 record is linked. 2) Have a new table, just for defaults, like this: table2_defaults - field1 - field2 - field3 Which is the cleanest way? Is there another method I should use instead? ========== Question 2: Backwards-compatible field addition If you have an existing table, and apps which use it, then how do you add new fields to the table (for new apps), but which might affect existing apps negatively? eg: I start with a table like this: table1 - id - field1 - field2 - field3 Later, I want to add a use case, where there is new behaviour, if a new field is set in the table, like this: table1 - id - field1 - field2 - field3 - field4 - NEW - if unset, do old behaviour. if set, do something else The problem is, that existing apps (besides your new app) won't know about field4, so they will keep using the old behaviour for new records (where field4 is set), which you don't want. The most obvious thing to do is to update all apps using table1, so they also check the value of field4. Is there another, more backwards-compatible way to add field4 for the new behaviour, without having to update all the apps? A few things I can think of: 1) table1 becomes a view of an updated table, with a 'WHERE field4 IS NULL' clause. Problem with this is that some RDBMS (Postgresql specifically) don't let you run update statements on views. 2) Apps use stored procedures for all database access. Maybe ok for new apps, not so much for existing apps which use regular SQL. 3) All apps use the same library for accessing database Then you update the library and all apps automagically know about the extra field. Again, maybe ok for new apps, not so much for existing apps. 4) Make a new table (copy of the old one), with the extra field. Then your app checks both tables, or just the new one if applicable. This can work, but you may end up with a lot of app-specific tables, where the main difference between the tables is extra columns, and which apps use the tables. 5) Have a 'db version' column in the table. Older apps only operate on records at or before the version the programmer knew about at the time. This can work, but it seems like a very non-standard, hackish way of designing database tables. Also it's a pain for all apps to have to hardcode a db version number. 6) Find a clever way to use table inheritance I haven't thought it through, but here are some docs I've read on the subject: http://www.postgresql.org/docs/8.1/static/ddl-inherit.html Any other ideas? ========== Question 3: Temporal databases http://en.wikipedia.org/wiki/Temporal_database I haven't used them before, but I like the idea of never deleting/updating records so you have a complete history (a bit like source code version control). How well do temporal databases work? Do RDBMS (ie Postgresql) need add-ons to make it effective, or can you just add extra temporal columns to all your tables and add them to your app queries? Does this increase app complexity and increase server load a lot? Are there Python libraries which simplify this? (eg: add-ons for Elixir or SQLAlchemy). Or should apps all implement their own 'temporal data access' module, which transparently uses the current date & time until queried for historical data? ========== Question 4: Data synchronization 2 cases I'm interested in: 1) Migrating data from one database to another 2) Distributing data over many databases, and later merging In what ways can you design tables to easier facilitate the above cases? I am aware of multi-master replication software, as described here: http://en.wikipedia.org/wiki/Multi-master_replication For this question, I'm more interested in schema design, so that a home-brewed database synchronization can perform synchronization. I have some experience with this. One of my previous projects was to reliably migrate data from one database to another, where the 2 databases had problems like: - Many foreign keys weren't enforced - Some fields needed special treatment (eg: should be unique, or behave like a foreign key ref, even if db schema doesn't specify it. In other cases they need to be updated during the migration). - Most auto-incrementing primary keys (and related foreign key references) needed to be updated during migration, because they are already used in the destination database for other records. - Many tables are undocumented, some fields have an unknown purpose - Some tables didn't have fields that can be used as a 'natural' key for the purpose of migration (eg: tables which only exist to link together other tables, or tables where there are duplicate records). I wrote a Python script (using SQLAlchemy and Elixir) to do the above for our databases. Are there any existing migration tools which could have helped with the above? (it would have required a *lot* of user help). Are there recommended ways of designing tables so that synchronization is easier? The main thing I've read about is ensuring that all records have a natural key of some kind, eg GUID. Also, your migration app needs to have rules for conflict resolution. ========== Thanks in advance for any suggestions :-) David.
pgsql-general by date: