Re: pg_dump and pgpool - Mailing list pgsql-general
From | Tatsuo Ishii
---|---
Subject | Re: pg_dump and pgpool
Date |
Msg-id | 20041231.155537.42772569.t-ishii@sra.co.jp
In response to | Re: pg_dump and pgpool (Scott Marlowe <smarlowe@g2switchworks.com>)
List | pgsql-general
> On Thu, 2004-12-30 at 09:20, Tom Lane wrote:
> > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > >> I don't think it's worth that price to support a fundamentally bogus
> > >> approach to backup.
> > >
> > > But it's not bogus.  It allows me to compare two databases running under
> > > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > > data between them, so it is quite useful to me.
> >
> > As a data comparison tool it is certainly bogus.  What about different
> > row ordering between the two databases, for instance?
>
> Apparently pgpool knows that different order is ok.

I think pgpool actually behaves differently from what you expect. pgpool
just ignores the content of the data. Let me show you an example.

on the master:

test=# select * from t1;
 i
---
 1
 2
(2 rows)

on the secondary:

test=# select * from t1;
 i
---
 1
 3
(2 rows)

result from pgpool:

test=# select * from t1;
 i
---
 1
 2
(2 rows)

However, it does check the packet length. Here is another example.

on the master:

test=# select * from t2;
  t
-----
 abc
(1 row)

on the secondary:

test=# select * from t2;
  t
------
 abcd
(1 row)

result from pgpool:

test=# select * from t2;
  t
-----
 abc
(1 row)

LOG:  pid 1093: SimpleForwardToFrontend: length does not match between backends master(13) secondary(14) kind:(D)

> Having three psql's open, one to the front end pgpool, one to each of
> the backends, I can insert data in different orders on each backend,
> select it on each, and get a different order, but from the front end it
> works:
>
> on the MASTER database:
>
> test=# \d
>         List of relations
>  Schema | Name | Type  |  Owner
> --------+------+-------+----------
>  public | test | table | postgres
> (1 row)
>
> test=# insert into test values (2);
> INSERT 11839388 1
> test=# insert into test values (1);
> INSERT 11839389 1
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> on the SLAVE database:
>
> test=# \d
>         List of relations
>  Schema | Name | Type  |  Owner
> --------+------+-------+----------
>  public | test | table | postgres
> (1 row)
>
> test=# insert into test values (1);
> INSERT 13612414 1
> test=# insert into test values (2);
> INSERT 13612415 1
> test=# select * from test;
>  id
> ----
>   1
>   2
> (2 rows)
>
> On the front end:
>
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> Now I add a wrong row to the slave database:
>
> test=# insert into test values (3);
> INSERT 13612416 1
>
> and I get this error from the front end:
>
> test=# select * from test;
> ERROR:  kind mismatch between backends
> HINT:  check data consistency between master and secondary
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> After deleting the row, things return to normal:
>
> test=# delete from test where id=3;
> DELETE 1
>
> From the front end I get:
>
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> > AFAICS this could only work if you were doing physical rather than
> > logical replication (eg, shipping WAL logs) in which case the OIDs
> > would be just as much in sync as everything else.
>
> So, for me, the OIDs are the ONLY problem I'm getting here.  Note that
> the application we're running on the front end only connects to the
> database with a single thread, and serializes in the intermediate layer
> (not my choice, but it seems to work pretty well so far...) so sequences
> also aren't an issue, as all the queries will go in one at a time.
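(Just as an illustration -- this is not something pgpool does for you, and
it is not from the exchange above: if the goal is only to know whether the
two backends hold the same data, a comparison that does not depend on
physical row order can be made by running an ordered, aggregated query
against each backend directly and comparing the output, e.g. with the
"test" table from the example:

test=# select id, count(*) from test group by id order by id;
 id | count
----+-------
  1 |     1
  2 |     1
(2 rows)

Because the result is ordered by the key and counts duplicates, identical
data produces identical output on both backends no matter what order the
rows were inserted in.)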
I think in this case the row ordering problem will not hurt you.

> > Basically my point is that you are proposing to do a lot of work in
> > order to solve the first problem you are running up against, but that
> > will only get you to the next problem.  I'm not prepared to accept a
> > significant increase in complexity and loss of maintainability in
> > pg_dump in order to move one step closer to the dead end that you will
> > certainly hit.
>
> I'm certainly willing to do the vast majority of the work.  As Greg, I
> think, mentioned, maybe a fresh start using the information_schema would
> make sense as a sort of non-PG-specific backup tool or something.
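To illustrate the information_schema idea (only a sketch; a real dump tool
would also have to handle constraints, defaults, indexes and quoting), the
column definitions such a tool would start from can be read with standard
queries like:

    select table_name, column_name, data_type, is_nullable
    from information_schema.columns
    where table_schema = 'public'
    order by table_name, ordinal_position;

Since information_schema is defined by the SQL standard, the same query
would work against other databases as well, which is what would make such
a tool non-PG-specific.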