Thread: v7.4.1 text_position() patch

v7.4.1 text_position() patch

From
"Korea PostgreSQL Users' Group"
Date:
In src/backend/utils/adt/varlena.c, the statement at line 766 must also
exist in the 'else if (eml > 1)' block.

Otherwise strpos() returns a wrong result for multibyte strings.
At line 796:
------------
                ps1 = p1 = (pg_wchar *) palloc((len1 + 1) * sizeof(pg_wchar));
                (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t1), p1, len1);
                len1 = pg_wchar_strlen(p1);
                ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar));
                (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2);
                len2 = pg_wchar_strlen(p2);

                /*** recalculate px ****/
                px = (len1 - len2);

                for (p = 0; p <= px; p++)
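
A minimal standalone sketch of the recalculated bound, in plain C rather than PostgreSQL's own code. `wchar_position` and its parameters are hypothetical names, `uint32_t` stands in for pg_wchar, and the result for an empty search string is an assumption. The point it illustrates: once both strings have been converted, px has to come from character counts, so that no comparison starts past the last offset where the search string still fits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the 1-based character position of needle in hay, or 0 if absent.
 * hay_len and needle_len are character counts, not byte counts.
 */
static int wchar_position(const uint32_t *hay, size_t hay_len,
                          const uint32_t *needle, size_t needle_len)
{
    if (needle_len == 0)
        return 1;               /* assumption: empty needle matches at 1 */
    if (needle_len > hay_len)
        return 0;               /* needle cannot fit anywhere */

    /* Recalculate the bound from character counts. */
    size_t px = hay_len - needle_len;

    for (size_t p = 0; p <= px; p++) {
        size_t i = 0;

        /* Compare at most needle_len characters starting at offset p. */
        while (i < needle_len && hay[p + i] == needle[i])
            i++;
        if (i == needle_len)
            return (int) (p + 1);
    }
    return 0;
}
```

Called with a five-character string whose third character is a space and a one-character search string holding a space, it returns 3, and it never reads past index hay_len - 1.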


Re: v7.4.1 text_position() patch

From
Tom Lane
Date:
"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:
> In src/backend/utils/adt/varlena.c, the statement at line 766 must also
> exist in the 'else if (eml > 1)' block.
> Otherwise strpos() returns a wrong result for multibyte strings.

Hm.  I don't think it can actually fail, because the wchar strings are
zero-terminated.  But it does look like there's a missed speedup here.
(Tatsuo, do you agree?)

Thanks for the report!

            regards, tom lane

Re: v7.4.1 text_position() patch

From
"Korea PostgreSQL Users' Group"
Date:
The strpos() function (internally text_position()) has a bug in a Unicode database.

dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id = 3927;
  id  |   subject   | strpos
------+-------------+--------
 3927 | 안녕하세요~ |      0
(1 row)

Time: 1.619 ms
dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id between 3925 and 3927;
  id  |                 subject                 | strpos
------+-----------------------------------------+--------
 3925 | 대구에 DB 스터디 관련한...곳..없는가요? |      4
 3927 | 안녕하세요~                             |     11
(2 rows)

Time: 2.490 ms
----
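Sorry, the text above is in Korean.
strpos() returns the wrong result: the subject of row 3927 contains no space,
yet the second query reports position 11 for it instead of 0.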
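
The numbers above can be checked with a little arithmetic. The sketch below assumes the text is stored as UTF-8 and that the subject is exactly the string shown; `utf8_char_count` and `search_bound` are hypothetical helpers, not PostgreSQL functions. For "안녕하세요~" it gives 16 bytes but 6 characters, so a bound taken from byte lengths is 15 while the bound that fits the converted string is 5. A reported position of 11 therefore lies beyond the end of the string.

```c
#include <stddef.h>
#include <string.h>

/* Count characters in a UTF-8 string by skipping continuation bytes (10xxxxxx). */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Last 0-based offset worth testing: string length minus search-string length. */
static long search_bound(size_t str_len, size_t search_len)
{
    return (long) str_len - (long) search_len;
}
```

Here `search_bound(strlen(subject), 1)` is the byte-based value and `search_bound(utf8_char_count(subject), 1)` is the character-based one.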





----- Original Message ----- 
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr>
Cc: <pgsql-patches@postgresql.org>; "Tatsuo Ishii" <t-ishii@sra.co.jp>
Sent: Saturday, January 31, 2004 12:31 AM
Subject: Re: [PATCHES] v7.4.1 text_position() patch 


> Hm.  I don't think it can actually fail, because the wchar strings are
> zero-terminated.  But it does look like there's a missed speedup here.
> (Tatsuo, do you agree?)
> 
> Thanks for the report!
> 
> regards, tom lane
>

Re: v7.4.1 text_position() patch

From
Tom Lane
Date:
"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:
>> Hm.  I don't think it can actually fail, because the wchar strings are
>> zero-terminated.

> [ yes it can ]

You're right.  I was confused at first because I couldn't reproduce the
problem, but then I realized it's because I'm running with
CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string
won't match the other string.

Will commit the patch.  Thanks.

            regards, tom lane
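
A hedged simulation of that effect. It assumes the buffer used for the second row reuses memory that still holds the first row's converted characters; the thread does not show the allocator's behaviour, so this is only an assumption. All names are hypothetical, `uint32_t` stands in for pg_wchar, the input is assumed to be valid UTF-8 shorter than 128 bytes, and the search string is fixed to a single space as in the report.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SLOTS 128   /* both strings must be shorter than this many bytes */

/* Decode valid UTF-8 into dst, zero-terminate it, and return the character count. */
static size_t decode_utf8(const char *src, uint32_t *dst)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t n = 0;

    while (*s != 0) {
        uint32_t c;
        int extra;              /* number of continuation bytes to read */

        if (*s < 0x80)      { c = *s;        extra = 0; }
        else if (*s < 0xE0) { c = *s & 0x1F; extra = 1; }
        else if (*s < 0xF0) { c = *s & 0x0F; extra = 2; }
        else                { c = *s & 0x07; extra = 3; }
        s++;
        while (extra-- > 0)
            c = (c << 6) | (uint32_t) (*s++ & 0x3F);
        dst[n++] = c;
    }
    dst[n] = 0;
    return n;
}

/* Test offsets 0..px for ch; return the 1-based position, or 0 if not found. */
static int find_char(const uint32_t *buf, uint32_t ch, long px)
{
    for (long p = 0; p <= px; p++)
        if (buf[p] == ch)
            return (int) (p + 1);
    return 0;
}

/*
 * Search text for a space in a buffer that still holds the characters of
 * stale beyond the new terminator. use_char_bound selects how px is computed.
 */
static int strpos_in_reused_buffer(const char *stale, const char *text,
                                   int use_char_bound)
{
    uint32_t buf[BUF_SLOTS] = {0};
    size_t nchars;
    long px;

    decode_utf8(stale, buf);            /* earlier row leaves its data behind */
    nchars = decode_utf8(text, buf);    /* overwrites only nchars + 1 slots */

    px = use_char_bound ? (long) nchars - 1     /* bound from characters */
                        : (long) strlen(text) - 1;  /* bound from bytes */
    return find_char(buf, ' ', px);
}
```

With the two subjects from the report, the byte-based bound returns 11 and the character-based bound returns 0, which is consistent with the output shown earlier in the thread. The zero terminator sits at index 6, but the loop keeps scanning up to index 15 and finds the space left at index 10 by the longer string. Memory that is clobbered when freed would hold no space there, which fits the observation that the problem did not reproduce with CLOBBER_FREED_MEMORY.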

Re: v7.4.1 text_position() patch

From
Tatsuo Ishii
Date:
> "Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:
> >> Hm.  I don't think it can actually fail, because the wchar strings are
> >> zero-terminated.
>
> > [ yes it can ]
>
> You're right.  I was confused at first because I couldn't reproduce the
> problem, but then I realized it's because I'm running with
> CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string
> won't match the other string.
>
> Will commit the patch.  Thanks.
>
>             regards, tom lane

It's surprising that nobody noticed the bug until now. It seems it has
been there since 7.3 days. I would like to make a back patch for
7.3-stable if nobody objects.

BTW, I'm interested in the Korea PostgreSQL Users' Group since I myself am
a board member of the Japan PostgreSQL Users' Group. Please let me know if
the two users' groups could collaborate, for example by holding a
seminar in Korea or in Japan.
--
Tatsuo Ishii

Re: [HACKERS] v7.4.1 text_position() patch

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> It's surprising that nobody noticed the bug until now. It seems it has
> been there since 7.3 days. I would like to make a back patch for
> 7.3-stable if nobody objects.

No objection here.  Note that I applied a minimal patch to the 7.4
branch, but a more extensive one with some cosmetic changes in HEAD.
You probably want to copy the 7.4 change to 7.3.

            regards, tom lane

Re: [HACKERS] v7.4.1 text_position() patch

From
Joe Conway
Date:
Tatsuo Ishii wrote:
> It's surprising that nobody noticed the bug until now. It seems it has
> been there since 7.3 days. I would like to make a back patch for
> 7.3-stable if nobody objects.

It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby
advice. I'll apply if you'd like.

Joe

Index: src/backend/utils/adt/varlena.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/adt/varlena.c,v
retrieving revision 1.92.2.2
diff -c -r1.92.2.2 varlena.c
*** src/backend/utils/adt/varlena.c    30 Nov 2003 20:52:37 -0000    1.92.2.2
--- src/backend/utils/adt/varlena.c    31 Jan 2004 16:50:37 -0000
***************
*** 665,673 ****
      len1 = (VARSIZE(t1) - VARHDRSZ);
      len2 = (VARSIZE(t2) - VARHDRSZ);

-     /* no use in searching str past point where search_str will fit */
-     px = (len1 - len2);
-
      if (eml == 1)                /* simple case - single byte encoding */
      {
          char       *p1,
--- 665,670 ----
***************
*** 676,681 ****
--- 673,681 ----
          p1 = VARDATA(t1);
          p2 = VARDATA(t2);

+         /* no use in searching str past point where search_str will fit */
+         px = (len1 - len2);
+
          for (p = 0; p <= px; p++)
          {
              if ((*p2 == *p1) && (strncmp(p1, p2, len2) == 0))
***************
*** 702,707 ****
--- 702,710 ----
          ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar));
          (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2);
          len2 = pg_wchar_strlen(p2);
+
+         /* no use in searching str past point where search_str will fit */
+         px = (len1 - len2);

          for (p = 0; p <= px; p++)
          {

Re: [HACKERS] v7.4.1 text_position() patch

From
Tatsuo Ishii
Date:
> Tatsuo Ishii wrote:
> > It's surprising that nobody noticed the bug until now. It seems it has
> > been there since 7.3 days. I would like to make a back patch for
> > 7.3-stable if nobody objects.
>
> It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby
> advice. I'll apply if you'd like.
>
> Joe

Thanks. Please apply it.
--
Tatsuo Ishii

Re: [HACKERS] v7.4.1 text_position() patch

From
Joe Conway
Date:
Tatsuo Ishii wrote:
>
> Thanks. Please apply it.

Applied to REL7_3_STABLE.

Thanks,

Joe