Strangeness with UNIQUE indexes and UTF-8 - Mailing list pgsql-hackers

From Omar Kilani
Subject Strangeness with UNIQUE indexes and UTF-8
Date
Msg-id CA+8F9hj+ARAwzVzT=vdEykHajQLogixcbNDiaSBDB33sZxGA-Q@mail.gmail.com
Whole thread Raw
Responses Re: Strangeness with UNIQUE indexes and UTF-8  (Laurenz Albe <laurenz.albe@cybertec.at>)
Re: Strangeness with UNIQUE indexes and UTF-8  (David Rowley <dgrowleyml@gmail.com>)
Re: Strangeness with UNIQUE indexes and UTF-8  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Hi,

There seems to be a weird bug in Postgres (last tested 11.12) where it
allows an INSERT into a table with a UNIQUE / UNIQUE CONSTRAINT index
on a TEXT/VARCHAR when there's already a value present in that index,
but only for UTF-8 input.

I just had this happen on our user table and it somehow made it so
that Postgres returned no results for *any* SELECT ... FROM x WHERE
unique_col = 'x', which unfortunately meant no one could login to our
service.

I had to:

SET enable_indexscan = off;
SET enable_bitmapscan = off;

And then the data was returned properly. I thought maybe the index was
corrupt somehow, so I tried to reindex the unique index, which failed
because "nur" was present twice.

I modified the value in that column by the primary key (which is an
integer), and that allowed me to reindex, after which queries against
the column started working properly again.

My collation settings:

 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

I've had this happen before on a different table with cyerrlic UTF-8
input, but didn't really have much to go on debugging wise.

What I sort of don't get is... before we insert anything into these
tables, we always check to see if a value already exists. And Postgres
must be returning no results for some reason. So it goes to insert a
duplicate value which somehow succeeds despite the unique index, but
then a reindex says it's a duplicate. Pretty weird.

Regards,
Omar



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: SQL-standard function body
Next
From: Laurenz Albe
Date:
Subject: Re: Strangeness with UNIQUE indexes and UTF-8