Re: pg_dump and pgpool - Mailing list pgsql-general

From Scott Marlowe
Subject Re: pg_dump and pgpool
Date
Msg-id 1104430532.5893.98.camel@state.g2switchworks.com
Whole thread Raw
In response to Re: pg_dump and pgpool  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_dump and pgpool  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Re: pg_dump and pgpool  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Thu, 2004-12-30 at 09:20, Tom Lane wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
> >> I don't think it's worth that price to support a fundamentally bogus
> >> approach to backup.
>
> > But it's not bogus.  IT allows me to compare two databases running under
> > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > data between them, so it is quite useful to me.
>
> As a data comparison tool it is certainly bogus.  What about different
> row ordering between the two databases, for instance?

Apparently pgpool knows that different order is ok.  Having three psql's
open, one to the front end pgpool, one to each of the backends, I can
insert data in different orders on each backend, select it on each, and
get a different order, but from the front end it works:

on the MASTER database:
test=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | test | table | postgres
(1 row)

test=# insert into test values (2);
INSERT 11839388 1
test=# insert into test values (1);
INSERT 11839389 1
test=# select * from test;
 id
----
  2
  1
(2 rows)

on the SLAVE database:
test=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | test | table | postgres
(1 row)

test=# insert into test values (1);
INSERT 13612414 1
test=# insert into test values (2);
INSERT 13612415 1
test=# select * from test;
 id
----
  1
  2
(2 rows)

On the front end:
test=# select * from test;
 id
----
  2
  1
(2 rows)

Now I add a wrong row to the slave database:

test=# insert into test values (3);
INSERT 13612416 1

and I get this error from the front end:
test=# select * from test;
ERROR:  kind mismatch between backends
HINT:  check data consistency between master and secondary
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
After deleting the row, things return to normal:
test=# delete from test where id=3;
DELETE 1
From the front end I get:
test=# select * from test;
 id
----
  2
  1
(2 rows)

> AFAICS this could only work if you were doing physical rather than
> logical replication (eg, shipping WAL logs) in which case the OIDs would
> be just as much in sync as everything else.

So, for me, the OIDs are the ONLY problem I'm getting here.  Note that
the application we're running on the front end only connects to the
database with a single thread, and serializes in the intermediate layer
(not my choice, but it seems to work pretty well so far...) so sequences
also aren't an issue, as all the queries will go in one at a time.

> Basically my point is that you are proposing to do a lot of work in
> order to solve the first problem you are running up against, but that
> will only get you to the next problem.  I'm not prepared to accept a
> significant increase in complexity and loss of maintainability in
> pg_dump in order to move one step closer to the dead end that you will
> certainly hit.

I'm certainly willing to do the vast majority of the work.  As Greg I
think mentioned, maybe a fresh start using the information_schema would
make sense as a sort of non-pg specific backup tool or something.

pgsql-general by date:

Previous
From: Pierre-Frédéric Caillaud
Date:
Subject: Re: Update rule
Next
From: Miles Keaton
Date:
Subject: possible to DELETE CASCADE?