Thread: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator
The following bug has been logged on the website: Bug reference: 18560 Logged by: Hang ammmkilo Email address: ammmkilo@163.com PostgreSQL version: 16.3 Operating system: ubuntu 20.04 Description: ```create table DROP TABLE IF EXISTS t0; CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE); INSERT INTO t0 VALUES (0, '1'); INSERT INTO t0 VALUES (0, ''); ``` ---sql1 select ('' like ''),('1' like '1'); ?column? | ?column? ----------+---------- t | t (1 row) ---sql2 select c1,(c1 like c1) from t0; c1 | ?column? ----+---------- 1 | t | f (2 rows) In sql1, the result of the (" like ") query returns true. In sql2 (c1 like c1) the return value is false when c1 is ". I think this is contradictory and there may be some bug between the 'like' and the query column. ---select version(): version --------------------------------------------------------------------------------------------------------------------- PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit (1 row)
Hi, > In sql1, the result of the (" like ") query returns true. In sql2 (c1 like > c1) the return value is false when c1 is ". > I think this is contradictory and there may be some bug between the 'like' > and the query column. Actually the result is consistent: ``` SELECT c1, c1 = ' ' FROM t0; c1 | ?column? ----+---------- 1 | f | t (2 rows) select ' '::char(1) like ' '::char(1); ?column? ---------- f (1 row) ``` Although I find it puzzling too that LIKE operators for CHAR(1) and TEXT work differently. Not 100% sure if this is intended. Thoughts? -- Best regards, Aleksander Alekseev
Re: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator
From
hubert depesz lubaczewski
Date:
On Wed, Jul 31, 2024 at 04:47:23AM +0000, PG Bug reporting form wrote: > CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE); I bet the thing is simply one more edge case that proves that one shouldn't be using char(n) datatype: https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_char.28n.29 Best regards, depesz
On Wed, Jul 31, 2024 at 5:20 PM Aleksander Alekseev <aleksander@timescale.com> wrote: > > Hi, > > > In sql1, the result of the (" like ") query returns true. In sql2 (c1 like > > c1) the return value is false when c1 is ". > > I think this is contradictory and there may be some bug between the 'like' > > and the query column. > > Actually the result is consistent: > > ``` > SELECT c1, c1 = ' ' FROM t0; > c1 | ?column? > ----+---------- > 1 | f > | t > (2 rows) > > select ' '::char(1) like ' '::char(1); > ?column? > ---------- > f > (1 row) > ``` > > Although I find it puzzling too that LIKE operators for CHAR(1) and > TEXT work differently. Not 100% sure if this is intended. > > Thoughts? The reason for *select ' '::char(1) like ' '::char(1)* returns false is that the pattern got trimmed(see rtrim1) to length 0, and the first ' '::char(1) has length 1, so it doesn't match. But for *select ' '::text like ' '::text*, the pattern is not trimmed. The rtrim1 function is in a file named oracle_compat.c, can anybody verify how oracle behaves in this case? > > -- > Best regards, > Aleksander Alekseev > > -- Regards Junwang Zhao
hubert depesz lubaczewski <depesz@depesz.com> writes: > On Wed, Jul 31, 2024 at 04:47:23AM +0000, PG Bug reporting form wrote: >> CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE); > I bet the thing is simply one more edge case that proves that one > shouldn't be using char(n) datatype: > https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_char.28n.29 Yeah, exactly. Some surprising choices were made way-back-when about which char(N) operations ought to ignore trailing blanks and which shouldn't. In the case at hand, a closer look shows the problem: =# explain verbose select c1,(c1 like c1) from t0; QUERY PLAN ------------------------------------------------------------- Seq Scan on public.t0 (cost=0.00..43.00 rows=2200 width=6) Output: c1, (c1 ~~ (c1)::text) There is a "char ~~ text" operator but no "char ~~ char" operator, so the parser coerces the right-hand "c1" to text --- which causes stripping of its trailing blank --- and then applies ~~, which treats the trailing blank in its left-hand argument as significant. Yes, this is arguably inconsistent, but so are a lot of other char(N) behaviors. Proposals to change anything about it have generally failed on the grounds that (1) it's not very clear what would work better and not just move the surprises around; (2) we risk breaking applications that are expecting the current behaviors; and (3) char(N) is a deprecated backwater that we shouldn't be putting any effort into. regards, tom lane