LIKE predicate and ERROR: 42P22: could not determine which collation to use for string comparison - HINT: Use the COLLATE clause ... - Mailing list pgsql-bugs

From John Lumby
Subject LIKE predicate and ERROR: 42P22: could not determine which collation to use for string comparison - HINT: Use the COLLATE clause ...
Date
Msg-id COL116-W333FEC5259FA21D7EF20F3A3B90@phx.gbl
Whole thread Raw
Responses Re: LIKE predicate and ERROR: 42P22: could not determine which collation to use for string comparison - HINT: Use the COLLATE clause ...
List pgsql-bugs
postgresql 9.1.2 running on linux debian squeeze on 32-bit x86

I have encountered behaviour relating to subject line which I'm fairly sure=
 is a bug,=A0 although I'm not sure exactly what.

My database was created with LC_LOCALE C,=A0=A0 and loaded using Pg-9.1.1. =
=A0 it looks like

psql -d proteusdb -c "select datname, datdba, encoding, datcollate, datctyp=
e, datistemplate, datallowconn, datconnlimit, datlastsysoid, datfrozenxid, =
dattablespace from pg_database where datname =3D 'proteusdb'";
=A0 datname=A0 | datdba | encoding | datcollate | datctype | datistemplate =
| datallowconn | datconnlimit | datlastsysoid | datfrozenxid | dattablespac=
e=20
-----------+--------+----------+------------+----------+---------------+---=
-----------+--------------+---------------+--------------+---------------
=A0proteusdb |=A0=A0=A0=A0 10 |=A0=A0=A0=A0=A0=A0=A0 6 | C=A0=A0=A0=A0=A0=
=A0=A0=A0=A0 | C=A0=A0=A0=A0=A0=A0=A0 | f=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0 | t=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 |=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 -1=
 |=A0=A0=A0=A0=A0=A0=A0=A0 11910 |=A0=A0=A0=A0=A0=A0=A0=A0=A0 670 |=A0=A0=
=A0=A0=A0=A0=A0=A0=A0 1663
(1 row)


=A0I have a table which as a column named discriminator :

CREATE TABLE entity (
=A0=A0=A0 id bigint DEFAULT nextval('entity_id_seq'::regclass) NOT NULL,
=A0=A0=A0 discriminator character varying(255) NOT NULL,
=A0=A0=A0 version integer NOT NULL,
=A0=A0=A0 inherit_right boolean,
=A0=A0=A0 name text,
=A0=A0=A0 parent_id bigint,
=A0=A0=A0 association_id bigint,
=A0=A0=A0 association2_id bigint,
=A0=A0=A0 long1 bigint,
=A0=A0=A0 long2 bigint
);

which shows up looking like this in the catalog

psql -d proteusdb -c "select N.nspname , C.relname, A.attrelid , A.attname =
, A.atttypid , T.typname , L.collname from pg_attribute A, pg_type T, pg_cl=
ass C , pg_namespace N , pg_collation L where C.oid =3D A.attrelid and T.oi=
d =3D A.atttypid and A.attcollation =3D L.oid and C.relnamespace =3D N.oid =
and N.nspname =3D 'public' and C.relname =3D 'entity' and A.attname =3D 'di=
scriminator'";
=A0nspname | relname | attrelid |=A0=A0=A0 attname=A0=A0=A0 | atttypid | ty=
pname | collname=20
---------+---------+----------+---------------+----------+---------+-------=
---
=A0 public=A0 | entity=A0 |=A0=A0=A0 25349 | discriminator |=A0=A0=A0=A0 10=
43 | varchar | default
(1 row)

and a query like so :

psql -d proteusdb -c "select count(*) from entity=A0 as e1 where ( e1.assoc=
iation_id IN ( select id from entity as e2 where e2.parent_id=3D3043508)=A0=
=A0 OR e1.association_id=3D3043508 ) and e1.discriminator LIKE 'DEPLOY%'";
ERROR:=A0 could not determine which collation to use for string comparison
HINT:=A0 Use the COLLATE clause to set the collation explicitly.

explain gives a clue what is going on :
psql -d proteusdb -c "explain select count(*) from entity=A0 as e1 where ( =
e1.association_id IN ( select id from entity as e2 where e2.parent_id=3D304=
3508)=A0=A0 OR e1.association_id=3D3043508 ) and e1.discriminator LIKE 'DEP=
LOY%'";
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0 QUERY PLAN=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=20
---------------------------------------------------------------------------=
------------------------------------------
=A0Aggregate=A0 (cost=3D85045.70..85045.71 rows=3D1 width=3D0)
=A0=A0 ->=A0 Index Scan using entity_id_association_id_discriminator on ent=
ity e1=A0 (cost=3D523.55..85045.61 rows=3D36 width=3D0)
=A0=A0=A0=A0=A0=A0=A0=A0 Index Cond: (((discriminator)::text >=3D 'DEPLOY':=
:text) AND ((discriminator)::text < 'DEPLOZ'::text))
=A0=A0=A0=A0=A0=A0=A0=A0 Filter: (((discriminator)::text ~~ 'DEPLOY%'::text=
) AND ((hashed SubPlan 1) OR (association_id =3D 3043508)))
=A0=A0=A0=A0=A0=A0=A0=A0 SubPlan 1
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 ->=A0 Index Scan using entity_parent_id on e=
ntity e2=A0 (cost=3D0.00..522.84 rows=3D281 width=3D8)
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 Index Cond: (parent_id =3D=
 3043508)
(7 rows)


So Pg has rewritten my LIKE pred to add the pair of range predicates so as =
to be able to exploit my index ... Good.
However=A0 ---=A0=A0 It has then discovered that there is a mismatch betwee=
n the collation of the column (default) and the constant (I assume C,=A0 in=
herited from the database).

Well=A0=A0 --=A0=A0=A0 I am not an expert on collation,=A0=A0 but I am fair=
ly sure that the semantics of the LIKE predicate do not involve order,=A0=
=A0 so an error relating to ordering or collation should be impossible.=A0=
=A0=A0=A0 LIKE involves only exact (equality) matching of parts of the colu=
mn value against the pattern.=A0=A0=A0=A0 If the query rewriter has satisif=
ied itself that it is a correct transformation of the query to add the two =
range predicates,=A0=A0 then it must=A0 (should?)=A0 have checked that any =
mismatch of collation does not affect correctness, and therefore no other p=
art of postgres (planner,=A0 runtime) should then object on those grounds.

So=A0 - I *think* there is a=A0 bug,=A0=A0 either that the query should not=
 have been rewritten=A0 (if collation does indeed make the rewrite incorrec=
t),=A0 or else it should have planned and executed the rewritten query.

I re-ran this on PG 9.1.2 to check and the same happens there as well.

Hoping that some expert can make a ruling on this.

John Lumby
=20=09=09=20=09=20=20=20=09=09=20=20

pgsql-bugs by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Security definer "generated column" function used in index
Next
From: John R Pierce
Date:
Subject: Re: Postgres DB creation error WINXP pro