Re: Two 'identical' DB's not acting identically - Mailing list pgsql-general

From Jean-Christian Imbeault
Subject Re: Two 'identical' DB's not acting identically
Date
Msg-id 3E9CCD7D.80008@mega-bucks.co.jp
Whole thread Raw
In response to Two 'identical' DB's not acting identically  (Jean-Christian Imbeault <jc@mega-bucks.co.jp>)
List pgsql-general
Tom Lane wrote:
>
> You know, this could be a matter of a data discrepancy, like you somehow
> loaded two copies of data into some table.  It'd be worth doing a
> "select count(*)" on each table in each database to see if they agree.

That's exactly what it looks like. I had this error message once before
on the live DB and it was caused by inserting data twice into a table.

I compared the counts of all the tables and they are identical. I
checked the counts of the the tables involved in the query that gives an
error (just to be sure) and they are identical.

The database versions for both are the same (both installed from source,
but the source files where downloaded at different times):

$ psql --version
psql (PostgreSQL) 7.3.2

An explain (after a vacuum full analyze) gave this difference:

$ diff tal.explain test.explain
22,23c22,23
<                                        ->  Materialize
(cost=63.99..63.99 rows=1 width=4)
<                                              ->  Seq Scan on wards
(cost=0.00..63.99 rows=1 width=4)
---
 >                                        ->  Materialize
(cost=63.99..63.99 rows=5 width=4)
 >                                              ->  Seq Scan on wards
(cost=0.00..63.99 rows=5 width=4)


This leaves me to believe that the problem is related to a character
encoding problem? invoices.ward and wards.ward are text fields
containing EUC_JP. Looks like the TEST db cannot properly do a text
comparison on EUC_JP text?

I did some more testing and:

TAL=# select invoices.ward, wards.ward, wards.id from wards where
ward=invoices.ward;
NOTICE:  Adding missing FROM-clause entry for table "invoices"
   ward  |  ward  |  id
--------+--------+------
  ?????| ?????| 1016
  ?????| ?????| 1023
  ?????| ?????| 1023
  ??????| ??????| 1026
  ?????? | ?????? | 1038
(5 rows)


TEST=# select invoices.ward, wards.ward, wards.id from wards where
ward=invoices.ward;
NOTICE:  Adding missing FROM-clause entry for table "invoices"
   ward  |  ward  |  id
--------+--------+------
  ?????| ?????| 1016
  ?????| ?????| 1023
  ?????| ?????| 1023
  ??????| ??????| 1026
  ?????? | ?????? | 1038
  ?????| ?????? | 2065
(6 rows)

The last row does *not* contain a match yet it is returned as a matching
row ....

But both databases have the same encoding:

    Name    |  Owner   | Encoding
-----------+----------+----------
  TAL       | postgres | EUC_JP


    Name    |  Owner   | Encoding
-----------+----------+----------
  TEST      | postgres | EUC_JP


Hope some of this helps in figuring out why the two databases are
different ...

Thanks!

Jean-Christian Imbeault


pgsql-general by date:

Previous
From: The Hermit Hacker
Date:
Subject: Re: OT: mail server blocked
Next
From: Robert Treat
Date:
Subject: Re: Postgres Compare