Thread: How to synchronize tables in remote (production) and local databases when the structure of the local database's table has been modified

Hello,

I have faced a complicated case of table synchronisation. The full
description of the problem (with some figures) is posted here:
http://stackoverflow.com/questions/26237661/postgresql-update-table-with-new-records-from-the-same-table-on-remote-server

Here is a partial repost of my Stack Overflow topic:

    We have a PostgreSQL server running in production and plenty of
    workstations, each with an isolated development environment. Each one
    has its own local PostgreSQL server (with no replication to the
    production server). Developers need to receive updates stored on the
    production server periodically.

    I am trying to figure out how to dump the contents of several selected
    tables from the server in order to update the tables on the development
    workstations. The biggest challenge is that the tables I'm trying to
    synchronize may have diverged (developers may add - but not delete -
    new fields to the tables through the Django ORM, while the schema of
    the production database remains unchanged for a long time).

    Therefore the updated records and new fields of the tables stored on
    the workstations *must be* preserved against overwriting.

    I guess that direct dumps (e.g. pg_dump -U remote_user -h remote_server
    -t table_to_copy source_db | psql target_db) are not suitable here.

    UPD: If possible I would also like to avoid the use of a third
    (intermediate) database while transferring the data from the production
    database to the workstations.

I have no idea how to work it out. Any help will be appreciated.

Sincerely,
--
Vitaly Isaev
software engineer
Team112.ru
On Wed, Oct 8, 2014 at 12:49 AM, Vitaly Isaev <visaev@team112.ru> wrote:
> I am trying to figure out how to dump the contents of several selected
> tables from the server in order to update the tables on development
> workstations. The biggest challenge is that the tables I'm trying to
> synchronize may have diverged (developers may add - but not delete - new
> fields to the tables through the Django ORM, while the schema of the
> production database remains unchanged for a long time).

The COPY trick will probably help you. Note that I specify a column
list in the last COPY statement.

skonoplev@[local]:5432 ~=#
create table t (i integer);
CREATE TABLE

skonoplev@[local]:5432 ~=#
insert into t select * from generate_series(1, 5);
INSERT 0 5

skonoplev@[local]:5432 ~=#
copy t to '/tmp/t.dump';
COPY 5

skonoplev@[local]:5432 ~=#
truncate t;
TRUNCATE TABLE

skonoplev@[local]:5432 ~=#
alter table t add s text;
ALTER TABLE

skonoplev@[local]:5432 ~=#
copy t(i) from '/tmp/t.dump';
COPY 5

skonoplev@[local]:5432 ~=#
select * from t;
 i | s
---+---
 1 |
 2 |
 3 |
 4 |
 5 |
(5 rows)
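For what it's worth, the same trick should also work directly between the
two servers, with no intermediate file, by piping psql's \copy from one
connection into another. Untested sketch; the host, user, database, and
table names are the placeholders from the original message:

psql -h remote_server -U remote_user -d source_db \
    -c "\copy t(i) to stdout" \
  | psql -d target_db -c "\copy t(i) from stdin"

Since \copy moves the data through the client connection, no server-side
file access is needed on either machine.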


--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray.ru@gmail.com


On 10/8/14, 2:49 AM, Vitaly Isaev wrote:
> Hello,
> I have faced a complicated case of table synchronisation. The full
> description of the problem (with some figures) is posted here:
> http://stackoverflow.com/questions/26237661/postgresql-update-table-with-new-records-from-the-same-table-on-remote-server
> Here is a partial repost of my Stack Overflow topic:
>
>     We have a PostgreSQL server running in production and plenty of
>     workstations, each with an isolated development environment. Each one
>     has its own local PostgreSQL server (with no replication to the
>     production server). Developers need to receive updates stored on the
>     production server periodically.
>
>     I am trying to figure out how to dump the contents of several selected
>     tables from the server in order to update the tables on the development
>     workstations. The biggest challenge is that the tables I'm trying to
>     synchronize may have diverged (developers may add - but not delete -
>     new fields to the tables through the Django ORM, while the schema of
>     the production database remains unchanged for a long time).
>
>     Therefore the updated records and new fields of the tables stored on
>     the workstations /must be/ preserved against overwriting.
>
>     I guess that direct dumps (e.g. pg_dump -U remote_user -h remote_server
>     -t table_to_copy source_db | psql target_db) are not suitable here.
>
>     *UPD*: If possible I would also like to avoid the use of a third
>     (intermediate) database while transferring the data from the production
>     database to the workstations.
>
> Have no idea how to work it out. Any help will be appreciated.
> Sincerely,

The way I've handled this in the past is to use pg_dump -a -t. That should be robust against added columns in the
destination.
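Concretely, reusing the command from the original message (untested
sketch):

pg_dump -a -U remote_user -h remote_server -t table_to_copy source_db | psql target_db

A data-only dump (-a) emits COPY statements with an explicit column list,
which is why columns that exist only on the workstation are left alone.
One caveat: if the target table already contains some of the incoming
rows, the COPY will fail on duplicate keys, so you'd want to truncate or
filter first.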

My overall recommendation though is *never put data in a dev database that you can't recreate*. If you need test data,
create a script that generates that data for you.
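For a table like the t(i) in the COPY example above, such a script
(hypothetical name) can be as simple as a checked-in SQL file that
rebuilds deterministic data from scratch:

-- test_data.sql (hypothetical): regenerate dev test data from scratch
truncate t;
insert into t (i) select g from generate_series(1, 100) g;

Running it with psql -f test_data.sql gives every developer the same
reproducible starting point.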

It's basically guaranteed that you'll sometimes need to blow away dev databases and start over, so it's best to just
build the infrastructure to support that and be able to do it as often as you want.

BTW, there is a trick you can use to speed this process up. If you create a database that has most everything you need
in it, you can then use that as a template for creating new databases via createdb -T. That is MUCH faster than manually
inserting data. So what I'd suggest is something like:

createdb master
<load production schema and necessary data into master>
createdb -T master dev

... do development work

When you need to rebuild your dev database you would then do:

dropdb dev
createdb -T master dev

I actually kept multiple master copies around, based on the date of the dump from production. It was sometimes useful
for testing things. I also kept the production dumps in version control (note that I was careful not to dump any
sensitive data).
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com