Re: What is the maximum encoding-conversion growth rate, anyway? - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: What is the maximum encoding-conversion growth rate, anyway?
Date
Msg-id 20070718.184824.56046464.t-ishii@sraoss.co.jp
In response to Re: What is the maximum encoding-conversion growth rate, anyway?  (Bruce Momjian <bruce@momjian.us>)
Responses Re: What is the maximum encoding-conversion growth rate, anyway?  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
The conclusion of the discussion appears to be that we could safely
reduce MAX_CONVERSION_GROWTH from 4 to 3 for all existing built-in
conversions.
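
As a rough check of the worst case named in the mbutils.c comment quoted
below (SJIS JIS X0201 half-width kana -> UTF8), here is a minimal sketch.
It is not PostgreSQL code: the helper names are made up for illustration,
and it assumes the standard mapping of the single bytes 0xA1..0xDF to
U+FF61..U+FF9F.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes UTF-8 needs for one code point (4-byte maximum, no 5/6-byte forms). */
static size_t
utf8_len(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;
    return 4;
}

/*
 * Hypothetical helper: map a JIS X0201 half-width kana byte (0xA1..0xDF)
 * to its Unicode code point (U+FF61..U+FF9F).
 */
static uint32_t
halfwidth_kana_to_ucs(unsigned char b)
{
    assert(b >= 0xA1 && b <= 0xDF);
    return 0xFF61u + (uint32_t) (b - 0xA1);
}

/* Largest output-bytes-per-input-byte ratio over the whole kana range. */
static size_t
worst_kana_growth(void)
{
    size_t      worst = 0;

    for (unsigned b = 0xA1; b <= 0xDF; b++)
    {
        size_t      n = utf8_len(halfwidth_kana_to_ucs((unsigned char) b));

        if (n > worst)
            worst = n;
    }
    return worst;
}
```

Every byte in that range becomes a 3-byte UTF-8 sequence, which is where
the factor of 3 comes from.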

However, since user-defined conversions could have an arbitrary growth
rate, it would probably be better to leave it as it is now.

For 8.4, maybe we could change the conversion function signature so
that we don't need a fixed conversion growth rate, as Tom suggested.
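
To illustrate the idea only: the exact signature has not been decided, so
the following is one possible shape and purely an assumption. The
function name and parameters are hypothetical, and ISO-8859-1 to UTF-8 is
used just because it is short. The function reports how many bytes the
result needs, so the caller can size the buffer exactly instead of
multiplying by a fixed constant.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical conversion function: ISO-8859-1 -> UTF-8.
 *
 * Returns the number of bytes the complete result needs.  Output is
 * written only while it fits in dstlen, so calling with dst = NULL and
 * dstlen = 0 just measures the required size.
 */
static size_t
latin1_to_utf8(const unsigned char *src, size_t srclen,
               unsigned char *dst, size_t dstlen)
{
    size_t      need = 0;

    assert(src != NULL || srclen == 0);

    for (size_t i = 0; i < srclen; i++)
    {
        unsigned char c = src[i];

        if (c < 0x80)
        {
            /* ASCII is copied unchanged: 1 byte. */
            if (need + 1 <= dstlen)
                dst[need] = c;
            need += 1;
        }
        else
        {
            /* 0x80..0xFF become a 2-byte UTF-8 sequence. */
            if (need + 2 <= dstlen)
            {
                dst[need] = (unsigned char) (0xC0 | (c >> 6));
                dst[need + 1] = (unsigned char) (0x80 | (c & 0x3F));
            }
            need += 2;
        }
    }
    return need;
}
```

A caller would measure first, allocate exactly that many bytes, and then
call again to convert. That costs a second pass over the input but
removes the need for a worst-case multiplier.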
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Where are we on this?
> 
> ---------------------------------------------------------------------------
> 
> Tom Lane wrote:
> > I just rearranged the code in mbutils.c a little bit to make it more
> > robust if conversion of an over-length string is attempted, and noted
> > this comment:
> > 
> > /*
> >  * When converting strings between different encodings, we assume that space
> >  * for converted result is 4-to-1 growth in the worst case. The rate for
> >  * currently supported encoding pairs are within 3 (SJIS JIS X0201 half width
> >  * kanna -> UTF8 is the worst case).  So "4" should be enough for the moment.
> >  *
> >  * Note that this is not the same as the maximum character width in any
> >  * particular encoding.
> >  */
> > #define MAX_CONVERSION_GROWTH  4
> > 
> > It strikes me that this is overly pessimistic, since we do not support
> > 5- or 6-byte UTF8 characters, and AFAICS there are no 1-byte characters
> > in any supported encoding that require 4 bytes in another.  Could we
> > reduce the multiplier to 3?  Or even 2?  This has a direct impact on the
> > longest COPY lines we can support, so I'd like it not to be larger than
> > necessary.
> > 
> >             regards, tom lane
> 
> -- 
>   Bruce Momjian  <bruce@momjian.us>          http://momjian.us
>   EnterpriseDB                               http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +
