Re: pg_dump and pgpool - Mailing list pgsql-general
From | Tatsuo Ishii |
---|---|
Subject | Re: pg_dump and pgpool |
Date | |
Msg-id | 20041231.004629.57439590.t-ishii@sra.co.jp |
In response to | Re: pg_dump and pgpool (Scott Marlowe <smarlowe@g2switchworks.com>) |
Responses | Re: pg_dump and pgpool (Scott Marlowe <smarlowe@g2switchworks.com>) |
List | pgsql-general |
> On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> > >> No, we'd be throwing more, and more complex, queries. Instead of a
> > >> simple lookup there would be some kind of join, or at least a lookup
> > >> that uses a multicolumn key.
> >
> > > I'm willing to bet the performance difference is less than noise.
> >
> > [ shrug... ] I don't have a good handle on that, and neither do you.
> > What I am quite sure about though is that pg_dump would become internally
> > a great deal messier and harder to maintain if it couldn't use OIDs.
> > Look at the DumpableObject manipulations and ask yourself what you're
> > going to do instead if you have to use a primary key that is of a
> > different kind (different numbers of columns and datatypes) for each
> > system catalog. Ugh.
>
> Wait, do you mean it's impossible to throw a single SQL query with a
> proper join clause that USES OIDs but doesn't return them? Or that it's
> impossible to throw a single query without joining on OIDs? I don't
> mind joining on OIDs, I just don't want them crossing the connection is
> all. And yes, it might be ugly, but I can't imagine it being
> unmaintainable for some reason.
>
> > I don't think it's worth that price to support a fundamentally bogus
> > approach to backup.
>
> But it's not bogus. It allows me to compare two databases running under
> a pgpool synchronous cluster and KNOW if there are inconsistencies in
> data between them, so it is quite useful to me.
>
> > IMHO you don't want extra layers of software in
> > between pg_dump and the database --- each one just introduces another
> > risk of getting a wrong backup. You've yet to explain what the
> > *benefit* of putting pgpool in there is for this problem.
>
> Actually, it ensures that I get the right backup, because pgpool will
> cause the backup to fail if there are any differences between the two
> backend servers, thus telling me that I have an inconsistency.
>
> That's the primary reason I want this. The secondary reason, which I
> can work around, is that I'm running the individual databases on
> machines that only answer the pgpool machine's IP, so
> remote backups aren't possible, and only the pgpool machine would be
> capable of doing the backups, but we have (like so many other companies)
> a centralized backup server. I can always allow that machine to connect
> to the database(s) to do backup, but my fear is that by allowing
> anything other than pgpool to hit those backend databases they could be
> placed out of sync with each other. Admittedly, a backup process
> shouldn't be updating the database, so this, as I said, isn't really a
> big deal. More of a mild kink really. As long as all access is
> happening through pgpool, they should stay coherent to each other.

Pgpool could be modified so that it has a "no SELECT replication mode",
where pgpool runs SELECT only on the master server. I could do this if
you think it's useful.

However, the problem is that pg_dump does not only run SELECT; it also
modifies the database (advancing the OID counter), i.e. it creates
temporary tables. Is this a problem for you?
--
Tatsuo Ishii
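
To make the "no SELECT replication mode" idea above concrete, here is a minimal Python sketch of the routing decision it implies. This is not pgpool code: the function name `route_statement`, the backend labels, and the naive first-keyword check are all assumptions made for illustration. It only shows why the mode would not fully cover pg_dump: a statement such as CREATE TEMP TABLE is not a SELECT, so it would still go to both servers.

```python
# Hypothetical sketch of a "no SELECT replication mode" routing rule.
# Nothing here is taken from pgpool's source; names are illustrative only.
from typing import List

MASTER = "master"        # assumed label for the master backend
SECONDARY = "secondary"  # assumed label for the second backend


def route_statement(sql: str, no_select_replication: bool = True) -> List[str]:
    """Return the backends a statement would be sent to.

    The check is deliberately naive: it looks only at the first keyword,
    so a SELECT with side effects (for example one that calls a
    sequence function, or SELECT ... INTO) would be misrouted.
    """
    words = sql.lstrip().split(None, 1)
    first_word = words[0].upper() if words else ""
    if no_select_replication and first_word == "SELECT":
        # Read-only query: run on the master only.
        return [MASTER]
    # Everything else is replicated to both backends.
    return [MASTER, SECONDARY]


if __name__ == "__main__":
    for stmt in ("SELECT relname FROM pg_class",
                 "CREATE TEMP TABLE t (id int)"):
        print(stmt, "->", route_statement(stmt))
```

Under this rule the SELECTs issued by pg_dump would reach only the master, while its temporary-table creation would still be replicated, which is the concern raised in the message.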