Re: Suggestion for Encodings table - Mailing list pgsql-docs

From Bruce Momjian
Subject Re: Suggestion for Encodings table
Msg-id 200503120628.j2C6S5M07431@candle.pha.pa.us
In response to Suggestion for Encodings table  (Preston Landers <planders@journyx.com>)
List pgsql-docs
Thanks for the ideas.  I have applied the following patch, which
documents all our encodings.  I also added a URL to a very extensive
collection of documents about character sets, encodings, and code pages.

---------------------------------------------------------------------------

Preston Landers wrote:
>
> http://www.postgresql.org/docs/8.0/interactive/multibyte.html#CHARSET-TABLE
>
> I would humbly suggest a few improvements to that Encodings table to
> improve the clarity.
>
> Many of the entries clearly indicate the language or writing system, such
> as WIN1256 = "Windows CP1256 (Arabic)"
>
> I would suggest that every single entry should be described that way with
> the common language or writing system name.  Even Unicode could say "All
> languages".
>
> In particular, the "WIN" encoding just says "CP1251" -- this is Cyrillic
> (Russian) but some people might just see the WIN and assume it's the
> character set that Western/US Windows uses (CP 1252).
>
> It's an easy mistake to make and one I see repeated frequently on other
> web pages (calling the Windows "Western" code page CP 1251).  Someone reading English
> language docs and seeing a "WIN" character set might naturally assume that
> it is the English Windows character set.  (Which BTW is not currently
> supported by PG for conversions.)
>
> Some more examples that might improve clarity:
>
>  LATIN5 should say "Turkish"
>
>  LATIN6 should say "Nordic"
>
>  ALT and KOI8 should say "Cyrillic"   (or Russian)
>

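The CP1251/CP1252 confusion described in the quoted message can be
illustrated with a minimal sketch.  It assumes Python's codec names
"cp1251" and "cp1252" (these are Python's names, not PostgreSQL encoding
names) and a made-up sample string; it only shows that the same bytes
read very differently under the two code pages.

```python
# Hypothetical sample text: a Russian word encoded the way a CP1251
# (Cyrillic Windows) client would send it.
raw = "Привет".encode("cp1251")

# Correct interpretation: the bytes are Cyrillic (CP1251).
as_cyrillic = raw.decode("cp1251")

# Mistaken interpretation: assuming "WIN" means the Western/US
# Windows code page (CP1252) yields accented Latin letters instead.
as_western = raw.decode("cp1252")

print(as_cyrillic)
print(as_western)
```

The two printed strings differ even though the underlying bytes are
identical, which is why labelling each encoding with its language or
writing system helps.
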
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/charset.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v
retrieving revision 2.49
diff -c -c -r2.49 charset.sgml
*** doc/src/sgml/charset.sgml    7 Mar 2005 04:30:48 -0000    2.49
--- doc/src/sgml/charset.sgml    12 Mar 2005 06:24:51 -0000
***************
*** 344,390 ****
          </row>
          <row>
           <entry><literal>MULE_INTERNAL</literal></entry>
!          <entry>Mule internal code</entry>
          </row>
          <row>
           <entry><literal>LATIN1</literal></entry>
!          <entry>ISO 8859-1/<acronym>ECMA</> 94 (Latin alphabet no.1)</entry>
          </row>
          <row>
           <entry><literal>LATIN2</literal></entry>
!          <entry>ISO 8859-2/<acronym>ECMA</> 94 (Latin alphabet no.2)</entry>
          </row>
          <row>
           <entry><literal>LATIN3</literal></entry>
!          <entry>ISO 8859-3/<acronym>ECMA</> 94 (Latin alphabet no.3)</entry>
          </row>
          <row>
           <entry><literal>LATIN4</literal></entry>
!          <entry>ISO 8859-4/<acronym>ECMA</> 94 (Latin alphabet no.4)</entry>
          </row>
          <row>
           <entry><literal>LATIN5</literal></entry>
!          <entry>ISO 8859-9/<acronym>ECMA</> 128 (Latin alphabet no.5)</entry>
          </row>
          <row>
           <entry><literal>LATIN6</literal></entry>
!          <entry>ISO 8859-10/<acronym>ECMA</> 144 (Latin alphabet no.6)</entry>
          </row>
          <row>
           <entry><literal>LATIN7</literal></entry>
!          <entry>ISO 8859-13 (Latin alphabet no.7)</entry>
          </row>
          <row>
           <entry><literal>LATIN8</literal></entry>
!          <entry>ISO 8859-14 (Latin alphabet no.8)</entry>
          </row>
          <row>
           <entry><literal>LATIN9</literal></entry>
!          <entry>ISO 8859-15 (Latin alphabet no.9)</entry>
          </row>
          <row>
           <entry><literal>LATIN10</literal></entry>
!          <entry>ISO 8859-16/<acronym>ASRO</> SR 14111 (Latin alphabet no.10)</entry>
          </row>
          <row>
           <entry><literal>ISO_8859_5</literal></entry>
--- 344,390 ----
          </row>
          <row>
           <entry><literal>MULE_INTERNAL</literal></entry>
!          <entry>Mule internal code (Multi-lingual Emacs)</entry>
          </row>
          <row>
           <entry><literal>LATIN1</literal></entry>
!          <entry>ISO 8859-1/<acronym>ECMA</> 94 (Western European)</entry>
          </row>
          <row>
           <entry><literal>LATIN2</literal></entry>
!          <entry>ISO 8859-2/<acronym>ECMA</> 94 (Central European)</entry>
          </row>
          <row>
           <entry><literal>LATIN3</literal></entry>
!          <entry>ISO 8859-3/<acronym>ECMA</> 94 (South European)</entry>
          </row>
          <row>
           <entry><literal>LATIN4</literal></entry>
!          <entry>ISO 8859-4/<acronym>ECMA</> 94 (North European)</entry>
          </row>
          <row>
           <entry><literal>LATIN5</literal></entry>
!          <entry>ISO 8859-9/<acronym>ECMA</> 128 (Turkish)</entry>
          </row>
          <row>
           <entry><literal>LATIN6</literal></entry>
!          <entry>ISO 8859-10/<acronym>ECMA</> 144 (Nordic)</entry>
          </row>
          <row>
           <entry><literal>LATIN7</literal></entry>
!          <entry>ISO 8859-13 (Baltic)</entry>
          </row>
          <row>
           <entry><literal>LATIN8</literal></entry>
!          <entry>ISO 8859-14 (Celtic)</entry>
          </row>
          <row>
           <entry><literal>LATIN9</literal></entry>
!          <entry>ISO 8859-15 (LATIN1 with Euro and accents)</entry>
          </row>
          <row>
           <entry><literal>LATIN10</literal></entry>
!          <entry>ISO 8859-16/<acronym>ASRO</> SR 14111 (Romanian)</entry>
          </row>
          <row>
           <entry><literal>ISO_8859_5</literal></entry>
***************
*** 404,414 ****
          </row>
          <row>
           <entry><literal>KOI8</literal></entry>
!          <entry><acronym>KOI</acronym>8-R(U)</entry>
          </row>
          <row>
           <entry><literal>WIN866</literal></entry>
!          <entry>Windows CP866</entry>
          </row>
          <row>
           <entry><literal>WIN874</literal></entry>
--- 404,414 ----
          </row>
          <row>
           <entry><literal>KOI8</literal></entry>
!          <entry><acronym>KOI</acronym>8-R(U) (Cyrillic)</entry>
          </row>
          <row>
           <entry><literal>WIN866</literal></entry>
!          <entry>Windows CP866 (Cyrillic)</entry>
          </row>
          <row>
           <entry><literal>WIN874</literal></entry>
***************
*** 416,426 ****
          </row>
          <row>
           <entry><literal>WIN1250</literal></entry>
!          <entry>Windows CP1250</entry>
          </row>
          <row>
           <entry><literal>WIN1251</literal></entry>
!          <entry>Windows CP1251</entry>
          </row>
          <row>
           <entry><literal>WIN1256</literal></entry>
--- 416,426 ----
          </row>
          <row>
           <entry><literal>WIN1250</literal></entry>
!          <entry>Windows CP1250 (Central European)</entry>
          </row>
          <row>
           <entry><literal>WIN1251</literal></entry>
!          <entry>Windows CP1251 (Cyrillic)</entry>
          </row>
          <row>
           <entry><literal>WIN1256</literal></entry>
***************
*** 883,888 ****
--- 883,900 ----

       <variablelist>
        <varlistentry>
+        <term><ulink url="http://www.i18ngurus.com/docs/984813247.html"></ulink></term>
+
+        <listitem>
+         <para>
+          An extensive collection of documents about character sets, encodings,
+          and code pages.
+         </para>
+        </listitem>
+       </varlistentry>
+
+      <variablelist>
+       <varlistentry>
         <term><ulink url="ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf"></ulink></term>

         <listitem>
