Re: [HACKERS] Getting server crash on Windows when using ICU collation - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Getting server crash on Windows when using ICU collation
Date
Msg-id CAA4eK1LVW+cWuVt5yU=ECF+QnkJ6mmkOCjsiqO5M7dScK4EoKQ@mail.gmail.com
In response to Re: [HACKERS] Getting server crash on Windows when using ICU collation  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: [HACKERS] Getting server crash on Windows when using ICU collation
List pgsql-hackers
On Thu, Jun 15, 2017 at 11:18 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Hi,
>
> On Thu, Jun 15, 2017 at 8:36 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> On 6/12/17 00:38, Ashutosh Sharma wrote:
>>> PFA patch that fixes the issue described in the above thread. As mentioned
>>> there, the crash happens in the varstr_cmp() function, and it happens
>>> only on Windows because in varstr_cmp(), if the collation provider is
>>> ICU, we don't even think of calling ICU functions to compare the
>>> strings. In fact, we directly attempt to call the OS function
>>> wcscoll*(), which is not expected. Thanks.
>>
>> Maybe just
>>
>> diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
>> index a0dd391f09..2506f4eeb8 100644
>> --- a/src/backend/utils/adt/varlena.c
>> +++ b/src/backend/utils/adt/varlena.c
>> @@ -1433,7 +1433,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
>>
>>  #ifdef WIN32
>>                 /* Win32 does not have UTF-8, so we need to map to UTF-16 */
>> -               if (GetDatabaseEncoding() == PG_UTF8)
>> +               if (GetDatabaseEncoding() == PG_UTF8 && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
>>                 {
>>                         int                     a1len;
>>                         int                     a2len;
>
> Oh, yes, this looks like the simplest and possibly the ideal way to
> fix the issue. Attached is the patch. Thanks for the inputs.
>

How will this compare UTF-8 strings in a database with UTF-8 encoding?
It seems to me that ideally it should use ucol_strcollUTF8 to compare
them.  However, with the patch, it will always use ucol_strcoll, as we
never define the HAVE_UCOL_STRCOLLUTF8 flag on Windows.  We have some
multi-byte tests in the src/test/mb directory; see if we can use those to
verify these changes.  I admit that I have not tried to execute those on
Windows, so I have no idea if those even work.
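
To make the concern concrete, here is a minimal sketch of the branch
selection being discussed.  It is a simplified model only, not the code in
varlena.c: the types, enum values, and the choose_path() helper are
hypothetical stand-ins, and the compile-time conditions (WIN32,
HAVE_UCOL_STRCOLLUTF8) are modelled as runtime booleans so every
combination can be inspected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the collation provider and locale struct. */
typedef enum { PROVIDER_LIBC, PROVIDER_ICU } provider_t;
typedef struct { provider_t provider; } locale_sketch;

/* Which comparison path the sketch selects. */
typedef enum
{
    PATH_WIN32_UTF16_WCSCOLL,   /* map UTF-8 to UTF-16, use the OS wcscoll family */
    PATH_ICU_STRCOLL_UTF8,      /* ucol_strcollUTF8 directly on the UTF-8 input */
    PATH_ICU_STRCOLL_UTF16,     /* convert to UChar, then ucol_strcoll */
    PATH_OTHER                  /* any remaining libc path, not modelled here */
} cmp_path;

/*
 * Assumption: win32 and have_ucol_strcollutf8 model the WIN32 and
 * HAVE_UCOL_STRCOLLUTF8 preprocessor symbols as plain booleans.
 */
static cmp_path
choose_path(bool win32, bool db_is_utf8, const locale_sketch *loc,
            bool have_ucol_strcollutf8)
{
    /* Condition from the proposed patch: Windows UTF-16 path only for libc. */
    if (win32 && db_is_utf8 &&
        (loc == NULL || loc->provider == PROVIDER_LIBC))
        return PATH_WIN32_UTF16_WCSCOLL;

    if (loc != NULL && loc->provider == PROVIDER_ICU)
    {
        /* The UTF-8 entry point is used only when the flag is defined. */
        if (db_is_utf8 && have_ucol_strcollutf8)
            return PATH_ICU_STRCOLL_UTF8;
        return PATH_ICU_STRCOLL_UTF16;
    }

    return PATH_OTHER;
}
```

In this model, choose_path(true, true, &icu, false) for an ICU locale yields
PATH_ICU_STRCOLL_UTF16, which is the situation described above: with the
flag undefined on Windows, the ucol_strcollUTF8 branch is never taken.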


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


