Thread: pg_dump ORDER BY

pg_dump ORDER BY

From
Nabil Sayegh
Date:
Hi,

I did a mistake lastnight.
I have 2 databases of the same structure: db1, db2

Now I need to pg_dump them and do a diff.
Therefor I need the contents of the tables to be ordered.

Is there a way to achive that ?

TIA
--
 Nabil Sayegh

Re: pg_dump ORDER BY

From
"Joel Burton"
Date:
> Hi,
>
> I did a mistake lastnight.
> I have 2 databases of the same structure: db1, db2
>
> Now I need to pg_dump them and do a diff.
> Therefor I need the contents of the tables to be ordered.

If you have databases (db1 and db2) and two tables in each
(t1 and t2), I think it would be easiest to dump each table
independently (pg_dump -t t1 d1 > /tmp/d1t1) and diff that to its
sister table.

You could pretty easily turn diff into COPY input.

If you do a full pg_dump of the database, you'll get non-data stuff,
like all of the sequences, index declarations, etc. (Unless *that's*
what you want to reconcile.)

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: pg_dump ORDER BY

From
"Joel Burton"
Date:
> Hm, as both databases are of the same structure, this shouldn't be a
> problem. My problem is that updated rows in a table will appear at the
> bottom of the table if not ordered. That leads to falsealarms of diff
> :(

Perhaps either:

(1) sort the dumped output (assuming you have a sortable column
as the first dumped column). You could cut the file into pieces a do
this, or in emacs, narrow-to-region of the COPY and sort that (and
even do diff in emacs...)

or

(2) CLUSTER your tables to your primary key. This will force that
sort order back onto them, and, according to my test, will dump
them in the primary key order. (CLUSTER has some drawbacks, like
dropping ref integrity & such. A better solution is:

SELECT * INTO foo_temp FROM foo ORDER BY primarykeycolumn;

TRUNCATE foo;

INSERT INTO foo SELECT * FROM foo_temp;

This preserves triggers, ref.int., etc., and will put things physically
in the right order. (The TRUNCATE won't call any delete triggers you
have, but the INSERT will, so if you log/handle inserts in a special
way, this might fill up your log/take some time.) )


If you get better advice that's not cross posted to the list, please
do so. I'd be interested in hearing other solutions.
--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: pg_dump ORDER BY

From
Nabil Sayegh
Date:
Joel Burton wrote:

> If you have databases (db1 and db2) and two tables in each
> (t1 and t2), I think it would be easiest to dump each table
> independently (pg_dump -t t1 d1 > /tmp/d1t1) and diff that to its
> sister table.
>
> You could pretty easily turn diff into COPY input.
>
> If you do a full pg_dump of the database, you'll get non-data stuff,
> like all of the sequences, index declarations, etc. (Unless *that's*
> what you want to reconcile.)

Hm, as both databases are of the same structure, this shouldn't be a
problem.
My problem is that updated rows in a table will appear at the bottom of
the
table if not ordered. That leads to falsealarms of diff :(

cu
--
 Nabil Sayegh

Re: pg_dump ORDER BY

From
Nabil Sayegh
Date:
Joel Burton wrote:

> SELECT * INTO foo_temp FROM foo ORDER BY primarykeycolumn;
>
> TRUNCATE foo;
>
> INSERT INTO foo SELECT * FROM foo_temp;
>
> This preserves triggers, ref.int., etc., and will put things physically
> in the right order. (The TRUNCATE won't call any delete triggers you
> have, but the INSERT will, so if you log/handle inserts in a special
> way, this might fill up your log/take some time.) )

That sounds interesting, but as i only need the dump sorted (not the
real data,
the order will be destroyed the next time we update) it would be enough
to
do the SELECT INTO.

cu
--
 Nabil Sayegh