Where are we on this?
---------------------------------------------------------------------------
Tom Lane wrote:
> I just rearranged the code in mbutils.c a little bit to make it more
> robust if conversion of an over-length string is attempted, and noted
> this comment:
>
> /*
> * When converting strings between different encodings, we assume that the
> * converted result grows by at most 4-to-1 in the worst case. The growth
> * rates for currently supported encoding pairs are within 3 (SJIS JIS X0201
> * half-width kana -> UTF8 is the worst case). So "4" should be enough for
> * the moment.
> *
> * Note that this is not the same as the maximum character width in any
> * particular encoding.
> */
> #define MAX_CONVERSION_GROWTH 4
>
> It strikes me that this is overly pessimistic, since we do not support
> 5- or 6-byte UTF8 characters, and AFAICS there are no 1-byte characters
> in any supported encoding that require 4 bytes in another. Could we
> reduce the multiplier to 3? Or even 2? This has a direct impact on the
> longest COPY lines we can support, so I'd like it not to be larger than
> necessary.
>
> regards, tom lane
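
For anyone picking this up, here is a rough sketch of the two numbers in play. It is not the mbutils.c code; every name starting with "sketch_" is made up for illustration, and the 1 GB - 1 allocation ceiling is an assumption on my part, not something stated in the message above. The first function shows the worst case the comment cites: a single-byte JIS X0201 half-width kana in SJIS (0xA1..0xDF) maps to U+FF61..U+FF9F, which takes 3 bytes in UTF8. The second shows how the multiplier caps the longest input we could safely convert into one buffer.

```c
#include <stddef.h>

/* Assumed ceiling for a single allocation (1 GB - 1). Illustrative only. */
#define SKETCH_MAX_ALLOC ((size_t) 0x3fffffff)

/*
 * Encode one SJIS half-width kana byte (0xA1..0xDF) as UTF-8.
 * Returns the number of bytes written (always 3 for this range),
 * or 0 if the byte is outside the half-width kana range.
 */
static size_t
sketch_sjis_halfwidth_kana_to_utf8(unsigned char sjis, unsigned char out[3])
{
    unsigned int cp;

    if (sjis < 0xA1 || sjis > 0xDF)
        return 0;

    /* JIS X0201 half-width kana map linearly onto U+FF61..U+FF9F. */
    cp = 0xFF61u + (unsigned int) (sjis - 0xA1);

    /* Standard 3-byte UTF-8 encoding for code points U+0800..U+FFFF. */
    out[0] = (unsigned char) (0xE0 | (cp >> 12));
    out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cp & 0x3F));
    return 3;
}

/*
 * Longest input (in bytes) whose worst-case converted form, plus a
 * terminating NUL, still fits under SKETCH_MAX_ALLOC for a given
 * growth multiplier.
 */
static size_t
sketch_max_input_len(size_t growth)
{
    return (SKETCH_MAX_ALLOC - 1) / growth;
}
```

Under those assumptions, sketch_max_input_len(4) is about 256 MB of input and sketch_max_input_len(3) is about 341 MB, so dropping the multiplier from 4 to 3 would raise the longest convertible line by roughly a third.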
-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB
http://www.enterprisedb.com