Thread: Replica string comparsion issue

Replica string comparsion issue

From
Andrey Lizenko
Date:
Hello,
I'm observing strange behaviour on comparing ::text field with string while quering replica.

Here is the table structure:

=# \dS+ raw.symbols_aggregates
                                                        Table "raw.symbols_aggregates"
       Column       |  Type   | Collation | Nullable |                    Default                     | Storage  | Stats target | Description
--------------------+---------+-----------+----------+------------------------------------------------+----------+--------------+-------------
 id                 | bigint  |           | not null | nextval('symbols_aggregates_id_seq'::regclass) | plain    |              |
 symbol             | text    |           | not null |                                                | extended |              |
 id_type            | bigint  |           | not null |                                                | plain    |              |
 id_aggregates_list | integer |           | not null |                                                | plain    |              |
 id_regionals_list  | integer |           | not null |                                                | plain    |              |
 date_started       | date    |           | not null |                                                | plain    |              |
Indexes:
    "symbols_aggregates_pkey" PRIMARY KEY, btree (id)
    "symbols_aggregates_uniq" UNIQUE CONSTRAINT, btree (symbol, id_type, id_aggregates_list, id_regionals_list)
Foreign-key constraints:
    "symbols_aggregates_id_aggregates_list_fkey" FOREIGN KEY (id_aggregates_list) REFERENCES aggregates_list(id)
    "symbols_aggregates_id_regionals_list_fkey" FOREIGN KEY (id_regionals_list) REFERENCES regionals_list(id)
    "symbols_aggregates_id_type_fkey" FOREIGN KEY (id_type) REFERENCES types_list(id)


Simple query on master works as expected:
=# select symbol  from raw.symbols_aggregates where symbol='RUT';
 symbol
--------
 RUT
 RUT
 RUT
(3 rows)

The same query on replica works only after casting to varchar, using trim(), or something else:
(3 rows expected here)
=# select symbol from raw.symbols_aggregates where symbol='RUT';
 symbol
--------
(0 rows)

with ::varchar(50) result looks correct:

=# select symbol from raw.symbols_aggregates where symbol::varchar(50)='RUT';
 symbol
--------
 RUT
 RUT
 RUT


There is no hidden characters, it looks identical with ::bytea 

=# select symbol::bytea, 'RUT'::bytea, (symbol::varchar(50))::bytea from raw.symbols_aggregates where symbol::varchar(50)='RUT';
  symbol  |  bytea   |  symbol
----------+----------+----------
 \x525554 | \x525554 | \x525554
 \x525554 | \x525554 | \x525554
 \x525554 | \x525554 | \x525554


PostgreSQL server version is 10.3. It might be important, that master is running on Ubuntu 16.04.4 LTS and replica on Solaris 11.3.
PostgreSQL installed from binaries, not from sources. All locales are en_US.UTF-8

My suggestion it is a kind of collation issue, but I've no idea why :text is not working in this case.


--
Regards, Andrei Lizenko

Re: Replica string comparsion issue

From
Peter Geoghegan
Date:
On Wed, Jun 13, 2018 at 1:10 PM, Andrey Lizenko <lizenko79@gmail.com> wrote:
> I'm observing strange behaviour on comparing ::text field with string while
> quering replica.

The first thing to do here is to install amcheck on the replica, and
verify that the indexes are consistent with the replica's own notion
of how text sorts for the collation that is implicitly in use (the
database default):

https://www.postgresql.org/docs/current/static/amcheck.html

You can modify the example query to check the indexes that you're
interested in. I think that there is a very strong chance that the
replica has incompatible collation rules, given that it uses a totally
different OS.

-- 
Peter Geoghegan


Re: Replica string comparsion issue

From
Tom Lane
Date:
Andrey Lizenko <lizenko79@gmail.com> writes:
> I'm observing strange behaviour on comparing ::text field with string while
> quering replica.

These symptoms seem consistent with the theory that the replica's index
for that column (symbols_aggregates_uniq) is corrupt.  I think your
casts etc are just serving to defeat selection of an indexscan.

Why it's corrupt, I dunno, but you might try forcing a reindex on the
master.

            regards, tom lane


Re: Replica string comparsion issue

From
Andrey Lizenko
Date:
Re-indexing didn't help. Will check via amcheck.

Another point regarding collations: query with ORDER BY returns different record set on master and on replica if more than one column used:

Master:
=# select date_started, symbol from raw.symbols_aggregates order by 1,2 limit 5;
 date_started | symbol
--------------+---------
 1970-01-01   | AADR.EU
 1970-01-01   | AADR.IV
 1970-01-01   | AADR.NV
 1970-01-01   | AADR.SO
 1970-01-01   | AADR.TC

Replica: 
=# select date_started, symbol from raw.symbols_aggregates order by 1,2 limit 5;
 date_started |    symbol
--------------+--------------
 1970-01-01   | /EPRH18:XMON
 1970-01-01   | /EPRM18:XMON
 1970-01-01   | /FCEF18:XMON
 1970-01-01   | /FCEG18:XMON
 1970-01-01   | /FCEH18:XMON


On Wed, Jun 13, 2018 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrey Lizenko <lizenko79@gmail.com> writes:
> I'm observing strange behaviour on comparing ::text field with string while
> quering replica.

These symptoms seem consistent with the theory that the replica's index
for that column (symbols_aggregates_uniq) is corrupt.  I think your
casts etc are just serving to defeat selection of an indexscan.

Why it's corrupt, I dunno, but you might try forcing a reindex on the
master.

                        regards, tom lane


--
Regards, Andrei Lizenko

Re: Replica string comparsion issue

From
Tom Lane
Date:
Andrey Lizenko <lizenko79@gmail.com> writes:
> Another point regarding collations: query with ORDER BY returns different
> record set on master and on replica if more than one column used:

When I saw Peter's reply, I figured he'd nailed it, and this seems to
be more evidence pointing in that direction.  If the standby's OS
sorts strings differently than the master does, the index will seem
corrupt to it, even if it's fine on the master.

            regards, tom lane