Thread: multibyte support by default
In my understanding, our consensus was enabling multibyte support by default for 7.3. Any objection? -- Tatsuo Ishii
Tatsuo Ishii writes:
> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?

It was my understanding (or if I was mistaken, then it is my suggestion) that the build-time option would be removed altogether and certain performance-critical places (if any) would be wrapped into

    if (encoding_is_single_byte(current_encoding)) { }

That's basically what I did with the locale support.

-- Peter Eisentraut peter_e@gmx.net
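[Editor's sketch] The runtime check suggested above could look roughly like the following. This is a minimal illustration, not PostgreSQL code: the `encoding_id` enum, the body of `encoding_is_single_byte`, and the helpers `utf8_char_len` and `char_count` are all hypothetical names made up for the example, and UTF-8 stands in for "some multibyte encoding".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical encoding IDs, for illustration only. */
typedef enum { ENC_LATIN1, ENC_UTF8 } encoding_id;

/* Hypothetical predicate named after the one in the message above. */
static int encoding_is_single_byte(encoding_id enc)
{
    return enc == ENC_LATIN1;
}

/* Byte length of a UTF-8 character, judged from its lead byte only
 * (simplified: assumes well-formed input). */
static size_t utf8_char_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;               /* invalid lead byte: count it as one character */
}

/* Count characters in s. Single-byte encodings take the cheap path;
 * only multibyte encodings pay for the per-character scan. */
size_t char_count(const char *s, encoding_id enc)
{
    size_t bytes = strlen(s);
    size_t i = 0;
    size_t n = 0;

    if (encoding_is_single_byte(enc))
        return bytes;       /* fast path: one byte per character */

    while (i < bytes)
    {
        size_t len = utf8_char_len((unsigned char) s[i]);

        if (len > bytes - i)
            len = bytes - i;    /* truncated final character */
        i += len;
        n++;
    }
    return n;
}
```

With this shape, a single-byte database never enters the slow loop, so the cost of always building in multibyte support is limited to one branch per call.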
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?

Uh, was it? I don't recall that. Do we have any numbers on the performance overhead?

regards, tom lane
> > In my understanding, our consensus was enabling multibyte support by
> > default for 7.3. Any objection?
>
> Uh, was it? I don't recall that. Do we have any numbers on the
> performance overhead?
>
> regards, tom lane

See below.

Subject: Re: [HACKERS] Unicode combining characters
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Tatsuo Ishii <t-ishii@sra.co.jp>
cc: ZeugswetterA@spardat.at, pgman@candle.pha.pa.us, phede-ml@islande.org, pgsql-hackers@postgresql.org
Date: Wed, 03 Oct 2001 23:05:16 -0400
Comments: In-reply-to Tatsuo Ishii <t-ishii@sra.co.jp> message dated "Thu, 04 Oct 2001 11:16:42 +0900"

Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> To accomplish this, I moved MatchText etc. to a separate file and now
> like.c includes it *twice* (similar technique used in regexec()). This
> makes like.o a little bit larger, but I believe this is worth for the
> optimization.

That sounds great. What's your feeling now about the original question: whether to enable multibyte by default now, or not? I'm still thinking that Peter's counsel is the wisest: plan to do it in 7.3, not today. But this fix seems to eliminate the only hard reason we have not to do it today ...

regards, tom lane
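[Editor's sketch] The forwarded message describes compiling the body of MatchText twice, once per character-stepping rule, by including a separate file two times. To stay in a single file, the sketch below swaps in a macro that stamps out the same function body twice; that is a stand-in for the double include, not the actual like.c code. All names here (`DEFINE_LIKE_MATCH`, `like_match_sb`, `like_match_mb`, `one_byte_len`, `utf8_len`) are hypothetical, and the matcher is a simplified LIKE that handles only `%` and `_`, without escapes.

```c
#include <stddef.h>
#include <string.h>

/* Single-byte stepping: every character is exactly one byte. */
static size_t one_byte_len(const char *p)
{
    (void) p;
    return 1;
}

/* UTF-8 stepping: a character is its first byte plus up to three
 * continuation bytes (10xxxxxx). Simplified: assumes well-formed input.
 * The NUL terminator is never a continuation byte, so this cannot run
 * past the end of the string. */
static size_t utf8_len(const char *p)
{
    size_t n = 1;

    while (n < 4 && ((unsigned char) p[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* One matcher body, instantiated once per stepping rule. The real patch
 * gets the same effect by #including a shared file twice with different
 * macro definitions in force. */
#define DEFINE_LIKE_MATCH(NAME, CHARLEN)                               \
static int NAME(const char *t, const char *p)                          \
{                                                                      \
    while (*p != '\0')                                                 \
    {                                                                  \
        if (*p == '%')                                                 \
        {                                                              \
            p++;                                                       \
            for (;;)                                                   \
            {                                                          \
                if (NAME(t, p)) return 1;                              \
                if (*t == '\0') return 0;                              \
                t += CHARLEN(t);                                       \
            }                                                          \
        }                                                              \
        if (*t == '\0') return 0;                                      \
        if (*p == '_')                                                 \
        {                                                              \
            t += CHARLEN(t);                                           \
            p++;                                                       \
        }                                                              \
        else                                                           \
        {                                                              \
            size_t n = CHARLEN(p);                                     \
            if (CHARLEN(t) != n || memcmp(t, p, n) != 0) return 0;     \
            t += n;                                                    \
            p += n;                                                    \
        }                                                              \
    }                                                                  \
    return *t == '\0';                                                 \
}

DEFINE_LIKE_MATCH(like_match_sb, one_byte_len)  /* single-byte variant */
DEFINE_LIKE_MATCH(like_match_mb, utf8_len)      /* multibyte variant */
```

A caller would pick `like_match_sb` or `like_match_mb` once, based on the database encoding, so the single-byte path carries no per-character encoding checks. The price is the one mentioned in the message: the object file holds two copies of the matcher.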
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?
>>
>> Uh, was it? I don't recall that. Do we have any numbers on the
>> performance overhead?

> See below.

Oh, okay, now I recall that thread. You're right, we did agree.

regards, tom lane
On Tue, 2002-04-16 at 03:20, Tatsuo Ishii wrote:
> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?

Is there currently some agreed plan for introducing standard NCHAR/NVARCHAR types?

What does ISO/ANSI say about multibyteness of simple CHAR types?

--------------
Hannu
> On Tue, 2002-04-16 at 03:20, Tatsuo Ishii wrote:
> > In my understanding, our consensus was enabling multibyte support by
> > default for 7.3. Any objection?
>
> Is there currently some agreed plan for introducing standard
> NCHAR/NVARCHAR types?

I have such a *personal* plan, maybe for 7.4, not for 7.3, due to the limits of my free time. BTW, NCHAR/NVARCHAR is just an abbreviation of "CHAR(n) CHARACTER SET foo" (where foo is an implementation-defined charset). So I'm not too impressed by the idea of implementing NCHAR/NVARCHAR alone.

> What does ISO/ANSI say about multibyteness of simple CHAR types?

There's no such idea as "multibyteness" in the standard. In my understanding the standard does not restrict "normal" CHAR types to ASCII only (more precisely, "SQL_CHARACTER"). Moreover, CHAR types without a CHARSET specification will have a default charset of SQL_TEXT, and its actual charset will be defined by the implementation. In summary, allowing any characters, including multibyte ones, in CHAR types is not against the standard at all, IMO.

-- Tatsuo Ishii