Tom Lane writes:
> One thing that's really unclear to me is what's the difference between
> a <character translation> and a <form-of-use conversion>, other than
> that they didn't provide a syntax for defining new conversions.
The standard has this messed up. In part 1, a form-of-use and an encoding
are two distinct things that can be applied to a character repertoire (see
clause 4.6.2.1), whereas in part 2 the term encoding is used in the
definition of form-of-use (clause 3.1.5 r).
When I sort it out, however, I think that what Tatsuo was describing is
indeed a form-of-use conversion. Note that part 2, clause 4.2.2.1, says
this about form-of-use conversions:
It is intended, though not enforced by this part of ISO/IEC 9075, that S2 be exactly the same sequence of
characters as S1, but encoded according some different form-of-use. A typical use might be to convert a character
string from two-octet UCS to one-octet Latin1 or vice versa.
This seems to match what we're doing.
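To make that concrete, here is a minimal Python sketch of a form-of-use
conversion. It assumes Python's "utf-16-be" codec is an acceptable stand-in
for two-octet UCS (the two agree for characters in the Basic Multilingual
Plane). The function name convert_form_of_use is made up for illustration
and is not part of the standard or of the proposed command.

```python
def convert_form_of_use(octets: bytes, src: str, dst: str) -> bytes:
    """Re-encode the same character sequence under a different form-of-use.

    Hypothetical helper: src and dst are Python codec names.
    """
    characters = octets.decode(src)   # recover the character sequence S1
    return characters.encode(dst)     # same characters, different octets


s1 = "Gr\u00f6\u00dfe".encode("utf-16-be")          # two octets per character
s2 = convert_form_of_use(s1, "utf-16-be", "latin-1")  # one octet per character

# The octets differ, but the characters are exactly the same.
assert s1 != s2
assert s1.decode("utf-16-be") == s2.decode("latin-1")
assert (len(s1), len(s2)) == (10, 5)
```

The point of the sketch is only that S1 and S2 decode to the same sequence
of characters; nothing is substituted or dropped along the way.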
A character translation does not make this requirement and it explicitly
calls out the possibility of "many-to-one or one-to-one mapping between
two not necessarily distinct character sets". I imagine that what this is
intended to do is to allow the user to create mappings such as ö
-> oe (as is common in German to avoid using characters with diacritic
marks), or ö -> o (as one might do in French to achieve the same). In
fact, it's a glorified sed command.
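By contrast, a character translation changes the characters themselves. A
minimal Python sketch of the two mappings mentioned above, using str.translate
(the table names and the translate helper are made up for illustration; this
is not how CREATE TRANSLATION is specified or implemented):

```python
# Hypothetical translation tables; keys are single characters.
GERMAN_STYLE = str.maketrans({"\u00f6": "oe"})  # o-umlaut -> oe
FRENCH_STYLE = str.maketrans({"\u00f6": "o"})   # o-umlaut -> o


def translate(text: str, table: dict) -> str:
    """Apply a character-by-character mapping; unmapped characters pass through."""
    return text.translate(table)


word = "K\u00f6ln"
assert translate(word, GERMAN_STYLE) == "Koeln"
assert translate(word, FRENCH_STYLE) == "Koln"

# The second mapping is many-to-one: o-umlaut and plain o both end up as o,
# so the original string cannot be recovered from the result.
assert translate("Koln", FRENCH_STYLE) == translate(word, FRENCH_STYLE)
```

Unlike the form-of-use case, the output here is a different sequence of
characters, and in the many-to-one case the operation is not reversible.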
So I withdraw my earlier comment. But perhaps the syntax of the proposed
command could be aligned with the CREATE TRANSLATION command.
--
Peter Eisentraut peter_e@gmx.net