Re: Clear up strxfrm() in UTF-8 with locale on Windows - Mailing list pgsql-patches

From Magnus Hagander
Subject Re: Clear up strxfrm() in UTF-8 with locale on Windows
Date
Msg-id 4638F7B3.7040602@hagander.net
In response to Clear up strxfrm() in UTF-8 with locale on Windows  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: Clear up strxfrm() in UTF-8 with locale on Windows  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
ITAGAKI Takahiro wrote:
> The attached patch clears up the usage of strxfrm() on Windows. If the
> server encoding is UTF-8 and the locale is not C, we should use wcsxfrm()
> instead of strxfrm() because UTF-8 locales are not supported on Windows.
> We already have a special version of strcoll() for Windows, but the
> usage of strxfrm() was still broken.
>
> When we hit the bug, we see the following error message.
> | ERROR:  invalid memory alloc request size 2147483648
> If the server encoding and the locale are misconfigured relative to
> each other, strxfrm() can fail and return values like INT_MAX or
> (size_t)-1. We passed the result+1 straight to palloc(), so the server
> tried to allocate more than 1GB of memory and gave up.

I was just about to commit this with the following two changes:
* wcsxfrm() sets errno, so you can't use GetLastError() to report problems
* The code added a check for return value >= INT_MAX on Unix as well,
but the spec for strxfrm() says that there is no specific return value
for failure.
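
As a rough illustration of the second point (not part of the patch): since
no return value of strxfrm() is reserved for failure, the usual portable way
to detect an error is to clear errno before the call and inspect it
afterwards. The helper name xfrm_length() below is hypothetical, and whether
a given platform actually sets errno on failure is an assumption to verify
against its C library documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * Stores the length strxfrm() needs for src (excluding the terminating
 * NUL) in *len and returns 1, or returns 0 if errno was set by the call.
 */
static int
xfrm_length(const char *src, size_t *len)
{
    size_t      n;

    assert(src != NULL && len != NULL);

    errno = 0;                  /* no return value is reserved for errors */
    n = strxfrm(NULL, src, 0);  /* size 0: only compute the needed length */
    if (errno != 0)
        return 0;               /* treat a set errno as failure */

    *len = n;
    return 1;
}
```

A caller would allocate len + 1 bytes only when the helper returns 1.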

I'm listing those here for reference. But I also recalled a previous
discussion, and found this:
http://archives.postgresql.org/pgsql-hackers/2005-08/msg00760.php

Given this, perhaps the proper approach should instead be to just check
the return value, and go from there? Should be a simple enough patch,
something like the attached.

Tom, can you comment?

Takahiro, can you test if this patch fixes your problem?

//Magnus
Index: src/backend/utils/adt/selfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.233
diff -c -r1.233 selfuncs.c
*** src/backend/utils/adt/selfuncs.c    21 Apr 2007 21:01:45 -0000    1.233
--- src/backend/utils/adt/selfuncs.c    2 May 2007 20:38:58 -0000
***************
*** 3152,3157 ****
--- 3152,3165 ----
  #else
          xfrmlen = strxfrm(NULL, val, 0);
  #endif
+ #ifdef WIN32
+         /* On win32, if strxfrm fails (for example in UTF8 encoding, since
+          * it's not properly supported), return the original string instead
+          * of trying to allocate 2GB of memory.
+          */
+         if (xfrmlen >= INT_MAX)
+             return val;
+ #endif
          xfrmstr = (char *) palloc(xfrmlen + 1);
          xfrmlen2 = strxfrm(xfrmstr, val, xfrmlen + 1);
          Assert(xfrmlen2 <= xfrmlen);
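
For anyone who wants to try the idea outside the backend, here is a minimal
standalone sketch of the same two-pass pattern. It is only an analogue of
the patched code, under several assumptions: xfrm_or_copy() is a
hypothetical name, malloc() stands in for palloc(), the length check is
applied on every platform rather than only under WIN32, and the fallback
returns a copy of the input so the caller can always free() the result.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone analogue of the patched code path.
 *
 * Returns a malloc'd transformed copy of val, a malloc'd plain copy of val
 * if strxfrm() reports an implausibly large length, or NULL if malloc()
 * fails.
 */
static char *
xfrm_or_copy(const char *val)
{
    size_t      xfrmlen = strxfrm(NULL, val, 0);
    size_t      xfrmlen2;
    char       *out;

    if (xfrmlen >= (size_t) INT_MAX)
    {
        /* Looks like a failure sentinel: fall back to the original string */
        size_t      len = strlen(val);

        out = malloc(len + 1);
        if (out != NULL)
            memcpy(out, val, len + 1);
        return out;
    }

    out = malloc(xfrmlen + 1);
    if (out == NULL)
        return NULL;

    xfrmlen2 = strxfrm(out, val, xfrmlen + 1);
    assert(xfrmlen2 <= xfrmlen);    /* mirrors the Assert in selfuncs.c */
    (void) xfrmlen2;                /* silence unused warning with NDEBUG */
    return out;
}
```

The point of the guard is simply that the length is never passed to the
allocator when it looks like an error value.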
