Re: pg_dump and pgpool - Mailing list pgsql-general
From | Scott Marlowe |
---|---|
Subject | Re: pg_dump and pgpool |
Date | |
Msg-id | 1104416174.5893.52.camel@state.g2switchworks.com |
In response to | Re: pg_dump and pgpool (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: pg_dump and pgpool
Re: pg_dump and pgpool |
List | pgsql-general |
On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> >> No, we'd be throwing more, and more complex, queries. Instead of a
> >> simple lookup there would be some kind of join, or at least a lookup
> >> that uses a multicolumn key.
>
> > I'm willing to bet the performance difference is less than noise.
>
> [ shrug... ] I don't have a good handle on that, and neither do you.
> What I am quite sure about though is that pg_dump would become internally
> a great deal messier and harder to maintain if it couldn't use OIDs.
> Look at the DumpableObject manipulations and ask yourself what you're
> going to do instead if you have to use a primary key that is of a
> different kind (different numbers of columns and datatypes) for each
> system catalog. Ugh.

Wait, do you mean it's impossible to throw a single SQL query with a proper join clause that USES OIDs but doesn't return them? Or that it's impossible to throw a single query without joining on OIDs? I don't mind joining on OIDs, I just don't want them crossing the connection is all. And yes, it might be ugly, but I can't imagine it being unmaintainable for some reason.

> I don't think it's worth that price to support a fundamentally bogus
> approach to backup.

But it's not bogus. It allows me to compare two databases running under a pgpool synchronous cluster and KNOW if there are inconsistencies in data between them, so it is quite useful to me.

> IMHO you don't want extra layers of software in
> between pg_dump and the database --- each one just introduces another
> risk of getting a wrong backup. You've yet to explain what the
> *benefit* of putting pgpool in there is for this problem.

Actually, it ensures that I get the right backup, because pgpool will cause the backup to fail if there are any differences between the two backend servers, thus telling me that I have an inconsistency. That's the primary reason I want this.

The secondary reason, which I can work around, is that I'm running the individual databases on machines that only answer to the pgpool machine's IP, so remote backups aren't possible and only the pgpool machine would be capable of doing the backups. But we have (like so many other companies) a centralized backup server. I can always allow that machine to connect to the database(s) to do backups, but my fear is that by allowing anything other than pgpool to hit those backend databases, they could be placed out of sync with each other. Admittedly, a backup process shouldn't be updating the database, so this, as I said, isn't really a big deal. More of a mild kink really. As long as all access is happening through pgpool, they should stay coherent with each other.
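
For what it's worth, the consistency check described above (dump both backends and see whether they agree) could also be approximated outside pgpool. The following is only a rough sketch in Python, not something pgpool or pg_dump provides: the function names `build_pg_dump_cmd`, `dump_digest` and `compare_dumps` are made up for illustration, and the host and database names are placeholders. It also assumes the two dumps are textually comparable, which may not hold if the backends differ in OIDs or physical row order, which is exactly the problem discussed in this thread.

```python
import difflib
import hashlib


def build_pg_dump_cmd(host: str, dbname: str) -> list[str]:
    """Build a pg_dump command line for one backend.

    host and dbname are placeholders; -h is pg_dump's host option.
    The command is only constructed here, not executed.
    """
    return ["pg_dump", "-h", host, dbname]


def dump_digest(dump_text: str) -> str:
    """Return a SHA-256 hex digest of a plain-text dump."""
    return hashlib.sha256(dump_text.encode("utf-8")).hexdigest()


def compare_dumps(dump_a: str, dump_b: str) -> list[str]:
    """Compare two plain-text dumps.

    Returns an empty list when the dumps are identical, otherwise
    the lines of a unified diff between them.
    """
    if dump_digest(dump_a) == dump_digest(dump_b):
        return []
    return list(
        difflib.unified_diff(
            dump_a.splitlines(),
            dump_b.splitlines(),
            fromfile="backend_a",
            tofile="backend_b",
            lineterm="",
        )
    )
```

In use, one would run the command from `build_pg_dump_cmd` against each backend (for example with `subprocess.run(..., capture_output=True, text=True)`), pass the two outputs to `compare_dumps`, and treat a non-empty result as a sign that the backends have drifted apart.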