Thread: Postgres schema comparison.

Postgres schema comparison.

From
Stef
Date:
Hi all,

I've got a master database with many other databases that
all have (or is supposed to have) the same exact same schema
as the master database (the master database is basically an empty
template database containing the schema definition).

The problem is that none of the schemas actually match the master schema.
e.g. missing columns, columns not in the correct order (attnum), missing indexes
and primary keys, and in severe cases, missing sequences and tables.

I have the wonderful job of re-synch'ing  all the schemas out there not
conforming to the master. I've looked everywhere for something that
will help doing this. I'm specifically looking for a way to do a sumcheck
or something similar on tables and/or schema as a whole to be able to
do a table comparison with the master database.

It will be a bonus to pick up exactly what is missing, but for now, just identifying
differences is what I want to achieve. I'm using postgres 7.3 mostly, but
I may want to use this for 7.4 and 8.0 databases as well.

Has anybody got some suggestions of what I can do or use to do this.

TIA
Kind Regards
Stefan

Re: Postgres schema comparison.

From
Markus Schaber
Date:
Hi, Stef,

Stef schrieb:

> It will be a bonus to pick up exactly what is missing, but for now, just identifying
> differences is what I want to achieve. I'm using postgres 7.3 mostly, but
> I may want to use this for 7.4 and 8.0 databases as well.
>
> Has anybody got some suggestions of what I can do or use to do this.

There are (at least) two independently developed pgdiff applications,
they can be found at:

http://pgdiff.sourceforge.net/

http://gborg.postgresql.org/project/pgdiff/projdisplay.php

I did not try the first one, but the latter one worked on some of my
datas, but fails on others. I filed a bug report some time ago, but got
no answer, so I'm afraid this tool currently is unmaintained:
http://gborg.postgresql.org/project/pgdiff/bugs/bugupdate.php?895

But maybe a pg_dump --schema-only on all the databases, and then
manually diffing the files may already fulfil your needs.


Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: [ADMIN] Postgres schema comparison.

From
John DeSoi
Date:
On Mar 7, 2005, at 4:33 AM, Stef wrote:

> I have the wonderful job of re-synch'ing  all the schemas out there not
> conforming to the master. I've looked everywhere for something that
> will help doing this. I'm specifically looking for a way to do a
> sumcheck
> or something similar on tables and/or schema as a whole to be able to
> do a table comparison with the master database.
>

Develop a function that builds a string describing the tables/schemas
you want to compare. Then have your function return the md5 sum of the
string as the result. This will give you a 32 character value you can
use to determine if there is a mismatch.


John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


Re: [ADMIN] Postgres schema comparison.

From
Stef
Date:
Markus Schaber mentioned :
=> There are (at least) two independently developed pgdiff applications,
=> they can be found at:
=>
=> http://pgdiff.sourceforge.net/
=>
=> http://gborg.postgresql.org/project/pgdiff/projdisplay.php

Thanks a lot!

=> I did not try the first one, but the latter one worked on some of my
=> datas, but fails on others. I filed a bug report some time ago, but got
=> no answer, so I'm afraid this tool currently is unmaintained:
=> http://gborg.postgresql.org/project/pgdiff/bugs/bugupdate.php?895
=>
=> But maybe a pg_dump --schema-only on all the databases, and then
=> manually diffing the files may already fulfil your needs.

I've tested something similar, that seems to work ok for me for now :
pg_dump -s -t [TABLE] [DBNAME] | grep -v "^--" | md5sum

The problem I have with this, is that I have to run the command per table,
and seeing that I have over 500 tables in each database, this takes quite a
long time.

I'll test some of the above pgdiffs, and see if either can do it better.

Kind Regards
Stefan

Re: [ADMIN] Postgres schema comparison.

From
Markus Schaber
Date:
Hi, Stef,

Stef schrieb:

> The problem I have with this, is that I have to run the command per table,
> and seeing that I have over 500 tables in each database, this takes quite a
> long time.

Some weeks ago, I posted here a script that uses psql to create split
dumps. Maybe you can reuse some of its logics to create per-table
md5sums for all tables in a database automatically.

Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Attachment

Re: [ADMIN] Postgres schema comparison.

From
Tom Lane
Date:
Stef <svb@ucs.co.za> writes:
> Markus Schaber mentioned :
> => But maybe a pg_dump --schema-only on all the databases, and then
> => manually diffing the files may already fulfil your needs.

> I've tested something similar, that seems to work ok for me for now :
> pg_dump -s -t [TABLE] [DBNAME] | grep -v "^--" | md5sum

> The problem I have with this, is that I have to run the command per table,

Why?

If the problem is varying order of table declarations, try 8.0's
pg_dump.

            regards, tom lane

Re: [ADMIN] Postgres schema comparison.

From
Stef
Date:
John DeSoi mentioned :
=> Develop a function that builds a string describing the tables/schemas
=> you want to compare. Then have your function return the md5 sum of the
=> string as the result. This will give you a 32 character value you can
=> use to determine if there is a mismatch.

OK, this may be exactly what I need. I've compiled and installed contrib/pgcrypto
and I want to use either one of :
 Result data type | Schema |  Name  | Argument data types
------------------+--------+--------+---------------------
 bytea            | public | digest | bytea, text
 bytea            | public | digest | text, text

Is it possible to somehow pass the output of : "\d [TABLE NAME]"
to this function? If not, what would return me consistent text
that will describe the columns, indexes and primary keys of a table?

Kind Regards
Stefan

Re: [ADMIN] Postgres schema comparison.

From
Stef
Date:
Markus Schaber mentioned :
=> Some weeks ago, I posted here a script that uses psql to create split
=> dumps. Maybe you can reuse some of its logics to create per-table
=> md5sums for all tables in a database automatically.


Thanks, but I've got something very similar to this already. I almost
thought you managed to split the output of the single schema dump of
"pg_dump --schema-only" onto portions belonging to the various tables.
That would be very impressive :)

Kind Regards
Stefan

Re: [ADMIN] Postgres schema comparison.

From
Stef
Date:
Tom Lane mentioned :
=> > The problem I have with this, is that I have to run the command per table,
=>
=> Why?
=>
=> If the problem is varying order of table declarations, try 8.0's
=> pg_dump.

Yes, this will solve the global schema check, but I will still need to split
it into "per table" dumps , to do "per table" comparisons.

Kind Regards
Stefan

Re: [ADMIN] Postgres schema comparison.

From
John DeSoi
Date:
On Mar 7, 2005, at 10:09 AM, Stef wrote:

> Is it possible to somehow pass the output of : "\d [TABLE NAME]"
> to this function? If not, what would return me consistent text
> that will describe the columns, indexes and primary keys of a table?
>

I'm not sure you can use \d directly, but if you startup psql with the
-E option it will show you all the SQL it is using to run the \d
command. It should be fairly easy to get the strings you need from the
results of running a similar query. The psql source is a good place to
look also.


John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


Re: [SOLVED] Postgres schema comparison.

From
Stef
Date:
John DeSoi mentioned :
=> I'm not sure you can use \d directly, but if you startup psql with the
=> -E option it will show you all the SQL it is using to run the \d
=> command. It should be fairly easy to get the strings you need from the
=> results of running a similar query. The psql source is a good place to
=> look also.

Sometimes you just need to see things from a different perspective.
Thanks!

Here's my final solution that runs in less than a minute for +- 543 tables :
for x in $(psql -tc "select relname from pg_class where relkind = 'r' and relname not like 'pg_%'")
do
   echo "$(psql -tc "select  encode(digest('$(psql -c  '\d '${x}'' mer9188_test | tr -d \"\'\")', 'md5'), 'hex')"
mer9188_test| grep -v "^$"|tr -d " "):${x}" 
done > compare_list.lst

Re: [ADMIN] Postgres schema comparison.

From
"Jim Buttafuoco"
Date:
Stef,

I use dblink to attach to both databases and query pg_namespace, pg_class, pg_attribute ... to get the diffs.  See
attached as an example.  look for the dblink_connect lines to specify your database.  You will need to install
contrib/dblink.  I used this with 7.4.X series and have NOT tested yet with 8.0.X.

You can adjust the output to fit your needs.

Jim


---------- Original Message -----------
From: Stef <svb@ucs.co.za>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-ADMIN@postgresql.org, pgsql-sql@postgresql.org
Sent: Mon, 7 Mar 2005 17:31:55 +0200
Subject: Re: [ADMIN] [SQL] Postgres schema comparison.

> Tom Lane mentioned :
> => > The problem I have with this, is that I have to run the command per table,
> =>
> => Why?
> =>
> => If the problem is varying order of table declarations, try 8.0's
> => pg_dump.
>
> Yes, this will solve the global schema check, but I will still need to split
> it into "per table" dumps , to do "per table" comparisons.
>
> Kind Regards
> Stefan
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
------- End of Original Message -------


Attachment

Re: [ADMIN] Postgres schema comparison.

From
Stef
Date:
Jim Buttafuoco mentioned :
=> I use dblink to attach to both databases and query pg_namespace, pg_class, pg_attribute ... to get the diffs.  See
=> attached as an example.  look for the dblink_connect lines to specify your database.  You will need to install
=> contrib/dblink.  I used this with 7.4.X series and have NOT tested yet with 8.0.X.

Thanks!

This is something I haven't even thought of.
Only some of the machines have dblink installed at the moment,
but that's the same work as having to install pgcrypto everywhere.

This is actually more thorough. It seems to be working with some minor changes on 7.3
(The dblink functions don't allow multiple connections, and take only one argument, so
I created temp tables in stead). This is actually very fast.

Thanks again.

Kind Regards
Stefan

Re: [SOLVED] Postgres schema comparison.

From
Stef
Date:
Hi all,

If anyone is interested, here's the final solution
that I'm using to build a list of tables and their md5sums
based on what the psql interface queries when you do '\d [TABLE NAME]'

I attached the function I created, and this is the SQL I run :
select relname||':'||get_table_checksum(relname) from pg_class where relkind = 'r' and relname not like ('pg_%') and
relnamenot like ('sql_%') order by relname; 

This gives the same result for a specific table across  all versions of postgres  >= 7.3,
and runs for a minute or so for +- 450 tables on my machine.
It may break if you have some exotic definitions that I didn't test for,
but I think it's pretty solid as it is here.

Kind Regards
Stefan

Stef mentioned :
=> Here's my final solution that runs in less than a minute for +- 543 tables :
=> for x in $(psql -tc "select relname from pg_class where relkind = 'r' and relname not like 'pg_%'")
=> do
=>    echo "$(psql -tc "select  encode(digest('$(psql -c  '\d '${x}'' mer9188_test | tr -d \"\'\")', 'md5'), 'hex')"
mer9188_test| grep -v "^$"|tr -d " "):${x}" 
=> done > compare_list.lst
Attachment

Re: [ADMIN] Postgres schema comparison.

From
Matteo Beccati
Date:
Hi,

> I have the wonderful job of re-synch'ing  all the schemas out there not
> conforming to the master. I've looked everywhere for something that
> will help doing this. I'm specifically looking for a way to do a sumcheck
> or something similar on tables and/or schema as a whole to be able to
> do a table comparison with the master database.
>
> It will be a bonus to pick up exactly what is missing, but for now, just identifying
> differences is what I want to achieve. I'm using postgres 7.3 mostly, but
> I may want to use this for 7.4 and 8.0 databases as well.
>
> Has anybody got some suggestions of what I can do or use to do this.


I've made a simple PHP script which compares the schemas of two
databases. It was made in a hurry and is far from being complete, but it
  works for my purposes :)

If you want give it a try, let me know


Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com