Thread: Comparing two (largish) tables on different servers
This is probably a silly question. Our runtime deployment of database servers (7.4) involves some redundant/duplicate databases. In order to compare tables(about 5 gigs each) on different servers I unload the things (takes a while etc.), sort them with a UNIX sort and thendo a cksum on them. Is there any way to do this from inside postgres that anyone knows of ? I looked through the manual and the contrib stuffand didn't see much ... Thanks, Greg Williamson DBA GlobeXplorer LLC
Idea :Write a program which connects on the two databases, creates a cursor on each to return the rows in order, then compare them as they come (row 1 from cursor 1 == row 1 from cursor 2, etc). Fetchin batchs. If there's a difference you can then know which row.I hope you have an index to sort on, to save you a huge disk sort. On Tue, 9 Nov 2004 14:41:00 -0800, Gregory S. Williamson <gsw@globexplorer.com> wrote: > This is probably a silly question. > > Our runtime deployment of database servers (7.4) involves some > redundant/duplicate databases. In order to compare tables (about 5 gigs > each) on different servers I unload the things (takes a while etc.), > sort them with a UNIX sort and then do a cksum on them. > > Is there any way to do this from inside postgres that anyone knows of ? > I looked through the manual and the contrib stuff and didn't see much ... > > Thanks, > > Greg Williamson > DBA > GlobeXplorer LLC > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
Gregory S. Williamson wrote: >Is there any way to do this from inside postgres that anyone knows of >? I looked through the manual and the contrib stuff and didn't see >much ... Not really "inside postgres"; but could you do something like: mkfifo db1 psql -h "db1" -t -q -c "$query" > db1 mkfifo db2 psql -h "db2" -t -q -c "$query" > db2 diff -u -0 db1 db2 That should work with most shells under Unix. . . Have fun, Sam
On Wed, Nov 10, 2004 at 09:18:21AM +0000, Sam Mason wrote: > > mkfifo db1 > psql -h "db1" -t -q -c "$query" > db1 > mkfifo db2 > psql -h "db2" -t -q -c "$query" > db2 > diff -u -0 db1 db2 This should work for small data sets, but the OP said the tables were about 5G. Unless you use a cursor, psql will fetch the entire result before writing anything. Also, some implementations of diff might read all of the data from one file before reading much from the other file, especially if the files have differences. Hope you have lots of memory.... -- Michael Fuhr http://www.fuhr.org/~mfuhr/