Re: compare two databases - Mailing list pgsql-admin

From Reece Hart
Subject Re: compare two databases
Date
Msg-id 1069091514.2618.23.camel@tallac
Whole thread Raw
In response to Re: compare two databases  (Andrew Sullivan <andrew@libertyrms.info>)
List pgsql-admin
You might consider comparing the schemas with pg_dump -s and diff (as was already suggested), then comparing the data with a (e.g.) perl DBI script connected to two databases simultaneously.

If the two databases are truly descended from the same design (e.g., restored from the same dump), then I'd expect the dump order to be the same and the diff probably won't be corrupted with false differences. I'd strip lines matching /^--/ to facilitate the diff.

A simple-minded implementation (I excel here) of the perl-DBI part doesn't require table-specific code. I'd do it like this: 1) connect to both databases; 2) for each (common) table, create a statement which selects rows in some canonical order (e.g., sorted by primary key or some non-degenerate multi-column tuple), and then fetch into a hash one-by-one (cursors would be better, but IIRC the DBI code doesn't support them). Because you're fetching into a hash, you can write a single routine to compare all elements of the common hash keys (you'll already know of differences between keys/columns themselves from the schema diff). This is not the speed-optimal solution, but it shouldn't be too bad in total time and you'd probably spend even more time implementing the optimal solution.

Obviously, this will miss certain diabolical changes like column or table renames, but it should suffice for most mutations of two instances from a common schema and content.

-Reece

-- 
Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0

pgsql-admin by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: compare two databases
Next
From: ow
Date:
Subject: Re: pg_restore and FK constraints with large dbs