Re: pg_dump and pgpool - Mailing list pgsql-general

From: Tatsuo Ishii
Subject: Re: pg_dump and pgpool
Date: 2004-12-31 15:55:37
Msg-id: 20041231.155537.42772569.t-ishii@sra.co.jp
In response to: Re: pg_dump and pgpool (Scott Marlowe <smarlowe@g2switchworks.com>)
List: pgsql-general
> On Thu, 2004-12-30 at 09:20, Tom Lane wrote:
> > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > >> I don't think it's worth that price to support a fundamentally bogus
> > >> approach to backup.
> >
> > > But it's not bogus.  It allows me to compare two databases running under
> > > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > > data between them, so it is quite useful to me.
> >
> > As a data comparison tool it is certainly bogus.  What about different
> > row ordering between the two databases, for instance?
>
> Apparently pgpool knows that different order is ok.

I think pgpool actually behaves differently from what you expect:
pgpool simply ignores the contents of the data rows. Let me show you
an example.

on the master:

test=# select * from t1;
 i
---
 1
 2
(2 rows)

on the secondary:

test=# select * from t1;
 i
---
 1
 3
(2 rows)

result from pgpool:

test=# select * from t1;
 i
---
 1
 2
(2 rows)

However, pgpool does check the packet length of each response. Here is
another example.

on the master:

test=# select * from t2;
  t
-----
 abc
(1 row)

on the secondary:

test=# select * from t2;
  t
------
 abcd
(1 row)

result from pgpool:

test=# select * from t2;
  t
-----
 abc
(1 row)

LOG: pid 1093: SimpleForwardToFrontend: length does not match between backends master(13) secondary(14) kind:(D)
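
So pgpool only notices the divergence when the byte lengths differ;
data that differs but happens to have the same length (as in the
first example) slips through silently. If you want to verify the
actual contents, here is a minimal sketch (it uses the t1 table from
above; adjust the key column to your schema): run the same ordered
query, or a cheap order-independent fingerprint, directly against
each backend and compare the output.

-- run against each backend separately (not through pgpool)
-- and diff the two outputs by hand
SELECT i FROM t1 ORDER BY i;

-- or compare a cheap, order-independent fingerprint
SELECT count(*), min(i), max(i), sum(i) FROM t1;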

>  Having three psql sessions
> open, one to the front-end pgpool and one to each of the backends, I can
> insert data in a different order on each backend, select it on each and
> get a different order, but from the front end it works:
>
> on the MASTER database:
> test=# \d
>         List of relations
>  Schema | Name | Type  |  Owner
> --------+------+-------+----------
>  public | test | table | postgres
> (1 row)
>
> test=# insert into test values (2);
> INSERT 11839388 1
> test=# insert into test values (1);
> INSERT 11839389 1
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> on the SLAVE database:
> test=# \d
>         List of relations
>  Schema | Name | Type  |  Owner
> --------+------+-------+----------
>  public | test | table | postgres
> (1 row)
>
> test=# insert into test values (1);
> INSERT 13612414 1
> test=# insert into test values (2);
> INSERT 13612415 1
> test=# select * from test;
>  id
> ----
>   1
>   2
> (2 rows)
>
> On the front end:
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> Now I add a row to the slave database that the master doesn't have:
>
> test=# insert into test values (3);
> INSERT 13612416 1
>
> and I get this error from the front end:
> test=# select * from test;
> ERROR:  kind mismatch between backends
> HINT:  check data consistency between master and secondary
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> After deleting the row, things return to normal:
> test=# delete from test where id=3;
> DELETE 1
>
> From the front end I get:
> test=# select * from test;
>  id
> ----
>   2
>   1
> (2 rows)
>
> > AFAICS this could only work if you were doing physical rather than
> > logical replication (eg, shipping WAL logs) in which case the OIDs would
> > be just as much in sync as everything else.
>
> So, for me, the OIDs are the ONLY problem I'm getting here.  Note that
> the application we're running on the front end only connects to the
> database with a single thread, and serializes in the intermediate layer
> (not my choice, but it seems to work pretty well so far...) so sequences
> also aren't an issue, as all the queries will go in one at a time.

I think in this case the row ordering problem will not hurt you.
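
As for the OIDs: if what bites you is the user-table OIDs (the
differing "INSERT nnn 1" command tags in your example above), one
possible workaround is to create the tables WITHOUT OIDS. This is
only a sketch, and it does not help with the system catalog OIDs
that pg_dump itself reads:

-- with no per-row OID, both backends return the identical
-- command tag "INSERT 0 1"
CREATE TABLE test (
    id integer
) WITHOUT OIDS;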

> > Basically my point is that you are proposing to do a lot of work in
> > order to solve the first problem you are running up against, but that
> > will only get you to the next problem.  I'm not prepared to accept a
> > significant increase in complexity and loss of maintainability in
> > pg_dump in order to move one step closer to the dead end that you will
> > certainly hit.
>
> I'm certainly willing to do the vast majority of the work.  As I think
> Greg mentioned, maybe a fresh start using the information_schema would
> make sense as a sort of non-PG-specific backup tool or something.
>
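
Just to illustrate the information_schema idea (only a sketch of
where such a tool might start, not a design): much of the column
metadata is already available there in a portable form.

SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;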
