make_greater_string() does not return a string in some cases - Mailing list pgsql-bugs

From Tatsuhito Kasahara
Subject make_greater_string() does not return a string in some cases
Date
Msg-id 4C1F389B.1010606@oss.ntt.co.jp
Whole thread Raw
Responses Re: make_greater_string() does not return a string in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi !

make_greater_string() does not return a string when some UTF8 strings
set to str_const.
# Especially UTF8 strings which contains 'BF' in last byte.

Because make_greater_string() only try incrementing the last byte of
the string, and not try same test for upper bytes.

Therefore, some queries which contains "LIKE '<contains 'BF' in last byte>%'"
can not perform (Btree's) index-scan.
# Or may be nearly full-index-scan.

# See follwing example.
===============================================================================
'西' (Japanese Letter) : 0xE8A5BF

[client : UTF8 ⇔ server : EUC_JP]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using test2_name on test2  (cost=0.00..8.28 rows=1 width=3) (actual time=0.077..0.078 rows=1 loops=1)
  Index Cond: ((name >= '西'::text) AND (name < '誠'::text))  <-- Index-scan is chosen
  Filter: (name ~~ '西%'::text)
 Total runtime: 0.110 ms
(4 rows)

[client : UTF8 ⇔ server : UTF8]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1693.01 rows=1 width=4) (actual time=22.598..22.599 rows=1 loops=1)
  Filter: (name ~~ '西%'::text)  <-- Seq-scan is chosen !
 Total runtime: 22.626 ms
(3 rows)
===============================================================================

Attached patch solve above problem.

Best regards,

--
NTT OSS Center
Tatsuhito Kasahara



diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fc3c5b0..fdf58cf 100644
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** make_greater_string(const Const *str_con
*** 5542,5552 ****
          *lastchar = savelastchar;

          /*
!          * Truncate off the last character, which might be more than 1 byte,
!          * depending on the character encoding.
           */
          if (datatype != BYTEAOID && pg_database_encoding_max_length() > 1)
!             len = pg_mbcliplen(workstr, len, len - 1);
          else
              len -= 1;

--- 5542,5567 ----
          *lastchar = savelastchar;

          /*
!          * Increment the previous character, or truncate off the last character,
!          * which might be more than 1 byte, depending on the character encoding.
           */
          if (datatype != BYTEAOID && pg_database_encoding_max_length() > 1)
!         {
!             int        i;
!             int        cliplen = pg_mbcliplen(workstr, len, len - 1);
!
!             for (i = len - 1; i > cliplen; i--)
!             {
!                 if ((unsigned char) workstr[i] < (unsigned char) 255)
!                 {
!                     workstr[i]++;
!                     memset(workstr + i + 1, 1 /* or 0? */, len - i);
!                     break;
!                 }
!             }
!             if (i <= cliplen)
!                 len = cliplen;
!         }
          else
              len -= 1;


pgsql-bugs by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Bug in PL/pgSQL FOR cursor variant
Next
From: "Yuhui"
Date:
Subject: BUG #5516: Memory grows up problem