Thread: v7.4.1 text_position() patch
In src/backend/utils/adt/varlena.c, 766 line must be exits in block of 'else if (elm >1)' too. Because, strpos() function make a wrong result in multibyte string. line 796 ------------ ps1 = p1 = (pg_wchar *) palloc((len1 + 1) * sizeof(pg_wchar)); (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t1), p1, len1); len1 = pg_wchar_strlen(p1); ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar)); (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2); len2 = pg_wchar_strlen(p2); /*** recalculate px ****/ px = (len1 - len2); for (p = 0; p <= px; p++)
"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes: > In src/backend/utils/adt/varlena.c, > 766 line must be exits in block of 'else if (elm >1)' too. > Because, strpos() function make a wrong result in multibyte string. Hm. I don't think it can actually fail, because the wchar strings are zero-terminated. But it does look like there's a missed speedup here. (Tatsuo, do you agree?) Thanks for the report! regards, tom lane
strpos() function ( internal text_postion()) had a bug in unicode database. dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id = 3927; id | subject | strpos ------+-------------+-------- 3927 | 안녕하세요~ | 0 (1 row) Time: 1.619 ms dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id between 3925 and 3927; id | subject | strpos ------+-----------------------------------------+-------- 3925 | 대구에 DB 스터디 관련한...곳..없는가요? | 4 3927 | 안녕하세요~ | 11 (2 rows) Time: 2.490 ms ---- Sorry, above text is korean language. strpos returns wrong result. ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> Cc: <pgsql-patches@postgresql.org>; "Tatsuo Ishii" <t-ishii@sra.co.jp> Sent: Saturday, January 31, 2004 12:31 AM Subject: Re: [PATCHES] v7.4.1 text_position() patch > Hm. I don't think it can actually fail, because the wchar strings are > zero-terminated. But it does look like there's a missed speedup here. > (Tatsuo, do you agree?) > > Thanks for the report! > > regards, tom lane >
"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes: >> Hm. I don't think it can actually fail, because the wchar strings are >> zero-terminated. > [ yes it can ] You're right. I was confused at first because I couldn't reproduce the problem, but then I realized it's because I'm running with CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string won't match the other string. Will commit the patch. Thanks. regards, tom lane
> "Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes: > >> Hm. I don't think it can actually fail, because the wchar strings are > >> zero-terminated. > > > [ yes it can ] > > You're right. I was confused at first because I couldn't reproduce the > problem, but then I realized it's because I'm running with > CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string > won't match the other string. > > Will commit the patch. Thanks. > > regards, tom lane It's surprising that nobody noticed the bug until now. It seems it has been there since 7.3 days. I would like to make a back patch for 7.3-stable if nobody objects. BTW, I'm interested in Korea PostgreSQL Users' Group since I myself am a bord member of Japan PostgreSQL Users' Group. Please let me know if both users group could make some collaboration, such as having a seminar in Korea or in Japan. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > It's surprising that nobody noticed the bug until now. It seems it has > been there since 7.3 days. I would like to make a back patch for > 7.3-stable if nobody objects. No objection here. Note that I applied a minimal patch to the 7.4 branch, but a more extensive one with some cosmetic changes in HEAD. You probably want to copy the 7.4 change to 7.3. regards, tom lane
Tatsuo Ishii wrote: > It's surprising that nobody noticed the bug until now. It seems it has > been there since 7.3 days. I would like to make a back patch for > 7.3-stable if nobody objects. It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby advice. I'll apply if you'd like. Joe Index: src/backend/utils/adt/varlena.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/utils/adt/varlena.c,v retrieving revision 1.92.2.2 diff -c -r1.92.2.2 varlena.c *** src/backend/utils/adt/varlena.c 30 Nov 2003 20:52:37 -0000 1.92.2.2 --- src/backend/utils/adt/varlena.c 31 Jan 2004 16:50:37 -0000 *************** *** 665,673 **** len1 = (VARSIZE(t1) - VARHDRSZ); len2 = (VARSIZE(t2) - VARHDRSZ); - /* no use in searching str past point where search_str will fit */ - px = (len1 - len2); - if (eml == 1) /* simple case - single byte encoding */ { char *p1, --- 665,670 ---- *************** *** 676,681 **** --- 673,681 ---- p1 = VARDATA(t1); p2 = VARDATA(t2); + /* no use in searching str past point where search_str will fit */ + px = (len1 - len2); + for (p = 0; p <= px; p++) { if ((*p2 == *p1) && (strncmp(p1, p2, len2) == 0)) *************** *** 702,707 **** --- 702,710 ---- ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar)); (void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2); len2 = pg_wchar_strlen(p2); + + /* no use in searching str past point where search_str will fit */ + px = (len1 - len2); for (p = 0; p <= px; p++) {
> Tatsuo Ishii wrote: > > It's surprising that nobody noticed the bug until now. It seems it has > > been there since 7.3 days. I would like to make a back patch for > > 7.3-stable if nobody objects. > > It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby > advice. I'll apply if you'd like. > > Joe Thanks. Please apply it. -- Tatsuo Ishii
Tatsuo Ishii wrote: > > Thanks. Please apply it. Applied to REL7_3_STABLE. Thanks, Joe