Thread: Replica string comparsion issue
Hello,
--
I'm observing strange behaviour on comparing ::text field with string while quering replica.
Here is the table structure:
=# \dS+ raw.symbols_aggregates
Table "raw.symbols_aggregates"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------------------+---------+-----------+----------+------------------------------------------------+----------+--------------+-------------
id | bigint | | not null | nextval('symbols_aggregates_id_seq'::regclass) | plain | |
symbol | text | | not null | | extended | |
id_type | bigint | | not null | | plain | |
id_aggregates_list | integer | | not null | | plain | |
id_regionals_list | integer | | not null | | plain | |
date_started | date | | not null | | plain | |
Indexes:
"symbols_aggregates_pkey" PRIMARY KEY, btree (id)
"symbols_aggregates_uniq" UNIQUE CONSTRAINT, btree (symbol, id_type, id_aggregates_list, id_regionals_list)
Foreign-key constraints:
"symbols_aggregates_id_aggregates_list_fkey" FOREIGN KEY (id_aggregates_list) REFERENCES aggregates_list(id)
"symbols_aggregates_id_regionals_list_fkey" FOREIGN KEY (id_regionals_list) REFERENCES regionals_list(id)
"symbols_aggregates_id_type_fkey" FOREIGN KEY (id_type) REFERENCES types_list(id)
Simple query on master works as expected:
=# select symbol from raw.symbols_aggregates where symbol='RUT';
symbol
--------
RUT
RUT
RUT
(3 rows)
The same query on replica works only after casting to varchar, using trim(), or something else:
(3 rows expected here)
=# select symbol from raw.symbols_aggregates where symbol='RUT';
symbol
--------
(0 rows)
with ::varchar(50) result looks correct:
=# select symbol from raw.symbols_aggregates where symbol::varchar(50)='RUT';
symbol
--------
RUT
RUT
RUT
There is no hidden characters, it looks identical with ::bytea
=# select symbol::bytea, 'RUT'::bytea, (symbol::varchar(50))::bytea from raw.symbols_aggregates where symbol::varchar(50)='RUT';
symbol | bytea | symbol
----------+----------+----------
\x525554 | \x525554 | \x525554
\x525554 | \x525554 | \x525554
\x525554 | \x525554 | \x525554
PostgreSQL server version is 10.3. It might be important, that master is running on Ubuntu 16.04.4 LTS and replica on Solaris 11.3.
PostgreSQL installed from binaries, not from sources. All locales are en_US.UTF-8
My suggestion it is a kind of collation issue, but I've no idea why :text is not working in this case.
Regards, Andrei Lizenko
On Wed, Jun 13, 2018 at 1:10 PM, Andrey Lizenko <lizenko79@gmail.com> wrote: > I'm observing strange behaviour on comparing ::text field with string while > quering replica. The first thing to do here is to install amcheck on the replica, and verify that the indexes are consistent with the replica's own notion of how text sorts for the collation that is implicitly in use (the database default): https://www.postgresql.org/docs/current/static/amcheck.html You can modify the example query to check the indexes that you're interested in. I think that there is a very strong chance that the replica has incompatible collation rules, given that it uses a totally different OS. -- Peter Geoghegan
Andrey Lizenko <lizenko79@gmail.com> writes: > I'm observing strange behaviour on comparing ::text field with string while > quering replica. These symptoms seem consistent with the theory that the replica's index for that column (symbols_aggregates_uniq) is corrupt. I think your casts etc are just serving to defeat selection of an indexscan. Why it's corrupt, I dunno, but you might try forcing a reindex on the master. regards, tom lane
Re-indexing didn't help. Will check via amcheck.
Another point regarding collations: query with ORDER BY returns different record set on master and on replica if more than one column used:
Master:
=# select date_started, symbol from raw.symbols_aggregates order by 1,2 limit 5;
date_started | symbol
--------------+---------
1970-01-01 | AADR.EU
1970-01-01 | AADR.IV
1970-01-01 | AADR.NV
1970-01-01 | AADR.SO
1970-01-01 | AADR.TC
Replica:
=# select date_started, symbol from raw.symbols_aggregates order by 1,2 limit 5;
date_started | symbol
--------------+--------------
1970-01-01 | /EPRH18:XMON
1970-01-01 | /EPRM18:XMON
1970-01-01 | /FCEF18:XMON
1970-01-01 | /FCEG18:XMON
1970-01-01 | /FCEH18:XMON
On Wed, Jun 13, 2018 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrey Lizenko <lizenko79@gmail.com> writes:
> I'm observing strange behaviour on comparing ::text field with string while
> quering replica.
These symptoms seem consistent with the theory that the replica's index
for that column (symbols_aggregates_uniq) is corrupt. I think your
casts etc are just serving to defeat selection of an indexscan.
Why it's corrupt, I dunno, but you might try forcing a reindex on the
master.
regards, tom lane
Regards, Andrei Lizenko
Andrey Lizenko <lizenko79@gmail.com> writes: > Another point regarding collations: query with ORDER BY returns different > record set on master and on replica if more than one column used: When I saw Peter's reply, I figured he'd nailed it, and this seems to be more evidence pointing in that direction. If the standby's OS sorts strings differently than the master does, the index will seem corrupt to it, even if it's fine on the master. regards, tom lane