Re: encoding names v2. - Mailing list pgsql-patches
From | Peter Eisentraut |
---|---|
Subject | Re: encoding names v2. |
Date | |
Msg-id | Pine.LNX.4.30.0108222124120.679-100000@peter.localdomain |
In response to | encoding names v2. (Karel Zak <zakkr@zf.jcu.cz>) |
Responses | Re: encoding names v2.; Re: encoding names v2. |
List | pgsql-patches |
Okay, here is some bad news: I just looked into the SQL99 standard for
the names of the predefined character sets, and here is the list:

SQL_CHARACTER
GRAPHIC_IRV or ASCII_GRAPHIC
LATIN1  <==== !!!
ISO8BIT or ASCII_FULL
UTF16
UTF8
UCS2
SQL_TEXT
SQL_IDENTIFIER

So perhaps we should keep the LATIN1 thing after all? I don't like it,
but the rules... Comments?

Karel Zak writes:

>  - getdatabaseencoding() is compatible with old versions, but
>    in the code is commented as deprecated.
>
>  - getdbencoding() is new function that return correct encoding names

See my other message about this. I don't think this is a good choice of
names.

>  - all encoding names use '-'. I hope we will never see a problem with
>    it and some operator. Encoding names must be used as quoted string.

For SQL compliance we will need to access charset names as identifiers in
the future. So the name normalization should take effect wherever a
charset name is expected. I suppose this is what you did.

>    Only for SQL_ASCII is used '_', because I see that JDBC has hardcoded
>    "pg_encoding_to_char(1) = 'SQL_ASCII'" :-(((

This is okay, look at the list above for precedent.

>  - the ./configure.in:
>     * use new encoding names too for --enable-multibyte
>     * define MULTIBYTE that handle default encoding id

Where is this needed?

>     * define MULTIBYTE_NAME that handle default encoding name (neeful
>       for initdb)

Can you rename this to something like DEFAULT_CHARACTER_SET? There is
really nothing "multibyte" here.

>  - 'initdb' check if default template encoding is correct for backend DB.
>
>    In the old code it's in initdb very hardcoded. I add to pg_encoding
>    option '-b' that check if encoding is correct for backend DB (means
>    encoding is not client only). It's better than
>       if [ $MULTIBYTEID -gt 31 ]
>                         ^^^^^^
>    in scripts.

Good.

> src/utils/mb/Unicode/KOI8_to_utf8.map --> src/utils/mb/Unicode/KOI8R_to_utf8.map
> src/utils/mb/Unicode/WIN_to_utf8.map --> src/utils/mb/Unicode/WIN1251_to_utf8.map
> src/utils/mb/Unicode/utf8_to_KOI8.map --> src/utils/mb/Unicode/utf8_to_KOI8R.map
> src/utils/mb/Unicode/utf8_to_WIN.map --> src/utils/mb/Unicode/utf8_to_WIN1251.map

Can you introduce some uniform capitalization (e.g., all lower case)?

> Thanks for all suggestion.
>
> New comments?

Don't worry, we'll get there. ;-)

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
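
Editorial sketch of the name normalization discussed in the message: one way to make a charset name match wherever it is expected, whether it arrives as a quoted string or as an identifier, is to reduce every spelling to a key that ignores case and separators such as '-' and '_'. The helper names and the small table below are hypothetical and are not taken from the patch; this only illustrates the idea under those assumptions.

```python
def normalize_charset_name(name: str) -> str:
    """Reduce a charset name to a lookup key (hypothetical helper).

    Only ASCII letters and digits are kept, folded to lower case, so
    'LATIN1', 'latin-1' and 'Latin_1' all map to the same key.
    """
    return "".join(ch.lower() for ch in name if ch.isascii() and ch.isalnum())


# Illustrative table only (an assumption for this sketch); the
# authoritative list of encodings lives in the backend sources.
KNOWN_CHARSETS = {
    normalize_charset_name(canonical): canonical
    for canonical in ("SQL_ASCII", "LATIN1", "UTF8", "KOI8R", "WIN1251")
}


def lookup_charset(name: str):
    """Return the canonical spelling for a user-supplied name, or None."""
    return KNOWN_CHARSETS.get(normalize_charset_name(name))
```

With such a key, `lookup_charset("latin-1")` and `lookup_charset("LATIN1")` resolve to the same entry, and the underscore in SQL_ASCII needs no special case because separators are dropped before comparison.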