Patch: add conversion from pg_wchar to multibyte - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Patch: add conversion from pg_wchar to multibyte
Date	April 23, 2012 05:48:58
Msg-id	CAPpHfdshcHe1ZPQhyd2xhAKnNu0VpdMPuGFtvribqJcnH0K2Ew@mail.gmail.com
Responses	Re: Patch: add conversion from pg_wchar to multibyte
List	pgsql-hackers

Hackers,

The attached patch adds conversion from a pg_wchar string to a multibyte string.

This functionality is needed for my patch adding index support for regular expression search: http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php

Analyzing the existing conversions from multibyte to pg_wchar, I found the following types:

1) Trivial conversion for single-byte encodings. Each byte is simply zero-extended to the width of pg_wchar.

2) Conversion from UTF-8 to Unicode code points.

3) Conversions from the euc* encodings. They write the bytes of a character into pg_wchar in inverse order, starting from the lowest byte (this explanation assumes a little-endian system).

4) Conversion from the mule encoding. This conversion is unclear to me and also seems to be lossy.
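To make types 1 and 3 concrete, here is a minimal sketch of the forward direction. It is an illustration written for this message, not code taken from the PostgreSQL sources or from the patch: the names wchar_sketch, sb_to_wchar and pack_char are hypothetical, and the assumption that the first byte of a character ends up in the most significant used position of the word is a reading of the description in item 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_wchar: a 32-bit unsigned word. */
typedef uint32_t wchar_sketch;

/*
 * Type 1: single-byte encoding. Each input byte is zero-extended into
 * one wide character. The output is zero-terminated, so "to" must have
 * room for len + 1 elements. Returns the number of characters written.
 */
static size_t
sb_to_wchar(const unsigned char *from, size_t len, wchar_sketch *to)
{
    size_t i;

    for (i = 0; i < len; i++)
        to[i] = from[i];
    to[len] = 0;
    return len;
}

/*
 * Type 3 (assumed layout): pack the nbytes bytes of one multibyte
 * character into a single word, first byte in the highest used position.
 * On a little-endian machine the bytes therefore appear in memory in
 * inverse order, starting from the lowest byte.
 */
static wchar_sketch
pack_char(const unsigned char *from, int nbytes)
{
    wchar_sketch wc = 0;
    int i;

    for (i = 0; i < nbytes; i++)
        wc = (wc << 8) | from[i];
    return wc;
}
```

With this layout, the two-byte sequence 0xA4 0xA2 becomes the word 0x0000A4A2, and no information is lost as long as the character has at most four bytes.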

It was easy to write the inverse conversions for 1-3. I've changed conversion 4 to behave like 3. I'm not sure my change is OK, because I didn't understand the original conversion.
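As a rough illustration of why the inverse is straightforward for types 2 and 3, the sketch below encodes one Unicode code point as UTF-8 and unpacks a byte-packed word back into its bytes. The function names codepoint_to_utf8 and unpack_char are hypothetical and are not the names used in the attached patch; the byte layout in unpack_char is an assumption (first byte in the highest used position), and the UTF-8 encoder is the standard algorithm without surrogate checks.

```c
#include <stdint.h>

/*
 * Type 2 inverse: encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1-4), or 0 if the value is above
 * U+10FFFF. Simplification: surrogate code points are not rejected.
 */
static int
codepoint_to_utf8(uint32_t c, unsigned char *out)
{
    if (c <= 0x7F)
    {
        out[0] = (unsigned char) c;
        return 1;
    }
    if (c <= 0x7FF)
    {
        out[0] = (unsigned char) (0xC0 | (c >> 6));
        out[1] = (unsigned char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c <= 0xFFFF)
    {
        out[0] = (unsigned char) (0xE0 | (c >> 12));
        out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (c >> 18));
        out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}

/*
 * Type 3 inverse (assumed layout): emit the bytes of a packed word,
 * most significant first, skipping leading zero bytes. The lowest byte
 * is always emitted. Returns the number of bytes written (1-4).
 */
static int
unpack_char(uint32_t wc, unsigned char *out)
{
    int n = 0;
    int shift;

    for (shift = 24; shift >= 0; shift -= 8)
    {
        unsigned char b = (unsigned char) ((wc >> shift) & 0xFF);

        if (b != 0 || n > 0 || shift == 0)
            out[n++] = b;
    }
    return n;
}
```

The unpacking step only round-trips if the forward conversion kept every byte of the character, which is why a lossy forward conversion, as type 4 appears to be, cannot be inverted this way.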

------
With best regards,
Alexander Korotkov.

Attachment

wchar2mb-0.1.patch
