Re: pg_dump and pgpool - Mailing list pgsql-general
From | Scott Marlowe |
---|---|
Subject | Re: pg_dump and pgpool |
Date | |
Msg-id | 1104427339.5893.65.camel@state.g2switchworks.com |
In response to | Re: pg_dump and pgpool (Tatsuo Ishii <t-ishii@sra.co.jp>) |
List | pgsql-general |
On Thu, 2004-12-30 at 09:46, Tatsuo Ishii wrote:
> > On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> > > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > > > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> > > >> No, we'd be throwing more, and more complex, queries. Instead of a
> > > >> simple lookup there would be some kind of join, or at least a lookup
> > > >> that uses a multicolumn key.
> > >
> > > > I'm willing to bet the performance difference is less than noise.
> > >
> > > [ shrug... ] I don't have a good handle on that, and neither do you.
> > > What I am quite sure about though is that pg_dump would become internally
> > > a great deal messier and harder to maintain if it couldn't use OIDs.
> > > Look at the DumpableObject manipulations and ask yourself what you're
> > > going to do instead if you have to use a primary key that is of a
> > > different kind (different numbers of columns and datatypes) for each
> > > system catalog. Ugh.
> >
> > Wait, do you mean it's impossible to throw a single SQL query with a
> > proper join clause that USES OIDs but doesn't return them? Or that it's
> > impossible to throw a single query without joining on OIDs? I don't
> > mind joining on OIDs, I just don't want them crossing the connection is
> > all. And yes, it might be ugly, but I can't imagine it being
> > unmaintainable for some reason.
> >
> > > I don't think it's worth that price to support a fundamentally bogus
> > > approach to backup.
> >
> > But it's not bogus. It allows me to compare two databases running under
> > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > data between them, so it is quite useful to me.
> >
> > > IMHO you don't want extra layers of software in
> > > between pg_dump and the database --- each one just introduces another
> > > risk of getting a wrong backup. You've yet to explain what the
> > > *benefit* of putting pgpool in there is for this problem.
> >
> > Actually, it ensures that I get the right backup, because pgpool will
> > cause the backup to fail if there are any differences between the two
> > backend servers, thus telling me that I have an inconsistency.
> >
> > That's the primary reason I want this. The secondary reason, which I
> > can work around, is that I'm running the individual databases on
> > machines that only answer the specific IP of the pgpool machine, so
> > remote backups aren't possible, and only the pgpool machine would be
> > capable of doing the backups, but we have (like so many other companies)
> > a centralized backup server. I can always allow that machine to connect
> > to the database(s) to do backup, but my fear is that by allowing
> > anything other than pgpool to hit those backend databases they could be
> > placed out of sync with each other. Admittedly, a backup process
> > shouldn't be updating the database, so this, as I said, isn't really a
> > big deal. More of a mild kink really. As long as all access is
> > happening through pgpool, they should stay coherent to each other.
>
> Pgpool could be modified so that it has "no SELECT replication mode",
> where pgpool runs SELECT on only master server. I could do this if you
> think it's useful.
>
> However problem is pg_dump is not only running SELECT but also
> modifying database (counting up OID counter), i.e. it creates
> temporary tables. Is this a problem for you?

Does it? I didn't know it used temp tables. It's not that big of a
deal, and I'm certain I can work around it. I just really like the idea
of a cluster of pg servers running synchronously behind a redirector and
looking, for all the world, like one database. But I think it would
take log shipping for it to work the way I'm envisioning. I'd much
rather see work go into making pgpool run atop >2 servers than this
exercise in (_very_) likely futility.
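
The consistency check discussed in this message (a backup that fails when the two backends disagree) can be approximated outside pgpool by comparing the rows each backend returns for the same query. The following is only a minimal sketch under stated assumptions, not pgpool's actual mechanism: the names `digest_rows` and `backends_agree` are hypothetical, and the row lists stand in for results fetched from each server by whatever client is in use.

```python
import hashlib


def digest_rows(rows):
    """Return an order-independent SHA-256 digest of an iterable of rows.

    Each row is rendered with repr() and the rendered rows are sorted, so
    two backends that return the same rows in a different order still
    produce the same digest. (Hypothetical helper for illustration.)
    """
    h = hashlib.sha256()
    for line in sorted(repr(tuple(row)) for row in rows):
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def backends_agree(rows_a, rows_b):
    """True if both backends returned the same multiset of rows."""
    return digest_rows(rows_a) == digest_rows(rows_b)


if __name__ == "__main__":
    # Stand-ins for the same SELECT run against each backend.
    master = [(1, "alice"), (2, "bob")]
    secondary = [(2, "bob"), (1, "alice")]
    print(backends_agree(master, secondary))
```

A comparison like this only covers the tables and columns actually queried; it says nothing about OID counters or other server-internal state, which is the part of the thread that makes running pg_dump through pgpool awkward.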