Re: How to synchronize tables in remote (production) and local databases in case if the structure of local database's table has been modified - Mailing list pgsql-general

From Jim Nasby
Subject Re: How to synchronize tables in remote (production) and local databases in case if the structure of local database's table has been modified
Date
Msg-id 54384423.6040000@BlueTreble.com
In response to How to synchronize tables in remote (production) and local databases in case if the structure of local database's table has been modified  (Vitaly Isaev <visaev@team112.ru>)
List pgsql-general
On 10/8/14, 2:49 AM, Vitaly Isaev wrote:
> Hello,
> I have faced a complicated case of table synchronisation. The full description of the problem (with some
> figures) is posted here:
> http://stackoverflow.com/questions/26237661/postgresql-update-table-with-new-records-from-the-same-table-on-remote-server
> Here is a partial repost of my Stack Overflow topic:
>
>     We have a PostgreSQL server running in production and plenty of workstations with isolated development
>     environments. Each one has its own local PostgreSQL server (with no replication with the production server).
>     Developers need to receive updates stored in the production server periodically.
>     I am trying to figure out how to dump the contents of several selected tables from the server in order to update
>     the tables on development workstations. The biggest challenge is that the tables I'm trying to synchronize may
>     have diverged (developers may add - but not delete - new fields to the tables through the Django ORM, while the
>     schema of the production database remains unchanged for a long time).
>
>     Therefore the updated records and new fields of the tables stored on workstations /must be/ preserved against
>     overwriting.
>
>     I guess that direct dumps (e.g. pg_dump -U remote_user -h remote_server -t table_to_copy source_db | psql
>     target_db) are not suitable here.
>
>     *UPD*: If possible I would also like to avoid the use of a third (intermediate) database while transferring the
>     data from the production database to the workstations.
>
> I have no idea how to work it out. Any help will be appreciated.
> Sincerely,

The way I've handled this in the past is to use pg_dump -a -t. That should be robust against added columns in the
destination.
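
For example, using the same placeholder names as in the question above, a data-only dump piped straight into the dev
database might look like this (a sketch, not a tested command):

pg_dump -a -t table_to_copy -U remote_user -h remote_server source_db | psql target_db

Since a data-only dump writes COPY statements with an explicit column list, the rows should load even when the
destination table has extra columns, as long as those columns are nullable or have defaults.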

My overall recommendation though is *never put data in a dev database that you can't recreate*. If you need test data,
create a script that generates that data for you.
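
As a minimal sketch of what such a script could look like (the table and column names are made up for illustration):

-- seed_dev_data.sql: regenerate throwaway test rows
TRUNCATE my_app_table;
INSERT INTO my_app_table (id, name, created_at)
SELECT i, 'test user ' || i, now() - (i || ' days')::interval
FROM generate_series(1, 1000) AS i;

Running it with something like "psql dev -f seed_dev_data.sql" gives you a fresh, reproducible data set any time.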

It's basically guaranteed that you'll sometimes need to blow away dev databases and start over, so it's best to just
build the infrastructure to support that and be able to do it as often as you want.

BTW, there is a trick you can use to speed this process up. If you create a database that has most everything you need
in it, you can then use that as a template for creating new databases via createdb -T. That is MUCH faster than
manually inserting data. So what I'd suggest is something like:

createdb master
<load production schema and necessary data into master>
createdb -T master dev

... do development work

When you need to rebuild your dev database you would then do:

dropdb dev
createdb -T master dev

I actually kept multiple master copies around, based on the date of the dump from production. It was sometimes useful
for testing things. I also kept the production dumps in version control (note that I was careful not to dump any
sensitive data).
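
For example, the dated masters could be managed along the same lines as above (the naming convention is just an
illustration):

createdb master_20141008
<load the 2014-10-08 production dump into master_20141008>
createdb -T master_20141008 dev

That way dev can always be rebuilt against a specific, known snapshot of production.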
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

