Thread: pg_dump ORDER BY
Hi, I did a mistake lastnight. I have 2 databases of the same structure: db1, db2 Now I need to pg_dump them and do a diff. Therefor I need the contents of the tables to be ordered. Is there a way to achive that ? TIA -- Nabil Sayegh
> Hi, > > I did a mistake lastnight. > I have 2 databases of the same structure: db1, db2 > > Now I need to pg_dump them and do a diff. > Therefor I need the contents of the tables to be ordered. If you have databases (db1 and db2) and two tables in each (t1 and t2), I think it would be easiest to dump each table independently (pg_dump -t t1 d1 > /tmp/d1t1) and diff that to its sister table. You could pretty easily turn diff into COPY input. If you do a full pg_dump of the database, you'll get non-data stuff, like all of the sequences, index declarations, etc. (Unless *that's* what you want to reconcile.) -- Joel Burton, Director of Information Systems -*- jburton@scw.org Support Center of Washington (www.scw.org)
> Hm, as both databases are of the same structure, this shouldn't be a > problem. My problem is that updated rows in a table will appear at the > bottom of the table if not ordered. That leads to falsealarms of diff > :( Perhaps either: (1) sort the dumped output (assuming you have a sortable column as the first dumped column). You could cut the file into pieces a do this, or in emacs, narrow-to-region of the COPY and sort that (and even do diff in emacs...) or (2) CLUSTER your tables to your primary key. This will force that sort order back onto them, and, according to my test, will dump them in the primary key order. (CLUSTER has some drawbacks, like dropping ref integrity & such. A better solution is: SELECT * INTO foo_temp FROM foo ORDER BY primarykeycolumn; TRUNCATE foo; INSERT INTO foo SELECT * FROM foo_temp; This preserves triggers, ref.int., etc., and will put things physically in the right order. (The TRUNCATE won't call any delete triggers you have, but the INSERT will, so if you log/handle inserts in a special way, this might fill up your log/take some time.) ) If you get better advice that's not cross posted to the list, please do so. I'd be interested in hearing other solutions. -- Joel Burton, Director of Information Systems -*- jburton@scw.org Support Center of Washington (www.scw.org)
Joel Burton wrote: > If you have databases (db1 and db2) and two tables in each > (t1 and t2), I think it would be easiest to dump each table > independently (pg_dump -t t1 d1 > /tmp/d1t1) and diff that to its > sister table. > > You could pretty easily turn diff into COPY input. > > If you do a full pg_dump of the database, you'll get non-data stuff, > like all of the sequences, index declarations, etc. (Unless *that's* > what you want to reconcile.) Hm, as both databases are of the same structure, this shouldn't be a problem. My problem is that updated rows in a table will appear at the bottom of the table if not ordered. That leads to falsealarms of diff :( cu -- Nabil Sayegh
Joel Burton wrote: > SELECT * INTO foo_temp FROM foo ORDER BY primarykeycolumn; > > TRUNCATE foo; > > INSERT INTO foo SELECT * FROM foo_temp; > > This preserves triggers, ref.int., etc., and will put things physically > in the right order. (The TRUNCATE won't call any delete triggers you > have, but the INSERT will, so if you log/handle inserts in a special > way, this might fill up your log/take some time.) ) That sounds interesting, but as i only need the dump sorted (not the real data, the order will be destroyed the next time we update) it would be enough to do the SELECT INTO. cu -- Nabil Sayegh