Thread: Re: From latin9 to sql_ascii??

Re: From latin9 to sql_ascii??

From

Jaime Casanova

Date:

17 December 2004, 21:03:35

 --- Jaime Casanova <systemguards@yahoo.com> wrote:

>  --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jaime Casanova <systemguards@yahoo.com> writes:
>>> => select to_ascii('Jiménez');
>>> returns 'Jimenez'; at least it works with the
>>> Latin1 encoding.
>>>
>>> Why doesn't it work with Latin9?
>>
>> Probably because it hasn't got a table for Latin9.
>>
>> Feel free to contribute one --- see
>> src/backend/utils/adt/ascii.c.

This page shows the differences between Latin1 &
Latin9:
http://www.cs.tut.fi/~jkorpela/latin9.html

The diffs are:

164: the euro sign (€)               (sql_ascii = 'E')???
166: capital S with caron (Š)        (sql_ascii = 'S')
168: small s with caron (š)          (sql_ascii = 's')
180: capital Z with caron (Ž)        (sql_ascii = 'Z')
184: small z with caron (ž)          (sql_ascii = 'z')
188: capital OE ligature (Œ)         (sql_ascii =  '')???
189: small oe ligature (œ)           (sql_ascii =  '')???
190: capital Y with diaeresis (Ÿ)    (sql_ascii = 'Y')
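
For reference, the eight code points listed above can be written down as a small C table. This is only a sketch for discussion: the struct and function names are hypothetical and do not come from ascii.c, the character names are the standard Unicode ones, and a value of 0 marks the entries for which no ASCII stand-in has been settled yet.

```c
#include <stddef.h>

/* One row per byte value where ISO-8859-15 (Latin9) differs from
 * ISO-8859-1 (Latin1). Hypothetical type, for illustration only. */
struct latin9_diff
{
    unsigned char code;       /* Latin9 byte value */
    const char   *name;       /* Unicode character name */
    char          proposed;   /* proposed ASCII stand-in, 0 = undecided */
};

static const struct latin9_diff latin9_diffs[] = {
    {164, "EURO SIGN", 'E'},                      /* 'E' is itself in question */
    {166, "LATIN CAPITAL LETTER S WITH CARON", 'S'},
    {168, "LATIN SMALL LETTER S WITH CARON", 's'},
    {180, "LATIN CAPITAL LETTER Z WITH CARON", 'Z'},
    {184, "LATIN SMALL LETTER Z WITH CARON", 'z'},
    {188, "LATIN CAPITAL LIGATURE OE", 0},        /* undecided */
    {189, "LATIN SMALL LIGATURE OE", 0},          /* undecided */
    {190, "LATIN CAPITAL LETTER Y WITH DIAERESIS", 'Y'},
};

/* Returns the proposed stand-in for a Latin9 byte (0 if undecided),
 * or -1 if the byte is not one of the eight that differ from Latin1. */
static int
latin9_proposed(unsigned char c)
{
    size_t n = sizeof(latin9_diffs) / sizeof(latin9_diffs[0]);

    for (size_t i = 0; i < n; i++)
        if (latin9_diffs[i].code == c)
            return latin9_diffs[i].proposed;
    return -1;
}
```

Every other byte in the 160 -- 255 range means the same thing in both encodings, so a Latin9 table can start from the Latin1 one and change only these positions.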

Comments?

regards,
Jaime Casanova


Attachment

ascii.patch

Re: From latin9 to sql_ascii??

From

Tom Lane

Date:

17 December 2004, 21:21:58

Jaime Casanova <systemguards@yahoo.com> writes:
> Why doesn't it work with Latin9?

>> Probably because it hasn't got a table for Latin9.
>>
>> Feel free to contribute one --- see
>> src/backend/utils/adt/ascii.c.

> This page shows the differences between Latin1 &
> Latin9:
> http://www.cs.tut.fi/~jkorpela/latin9.html

> The diffs are:

> 164: the euro symbol.           (sql_ascii = 'E')???
> 166: an S with a symbol above   (sql_ascii = 'S')
> 168: the same but lower case    (sql_ascii = 's')
> 180: an Z with a symbol above   (sql_ascii = 'Z')
> 184: the same but lower case    (sql_ascii = 'z')
> 188: it's an O merge with an E  (sql_ascii =  '')???
> 189: the same but lower case    (sql_ascii =  '')???
> 190: an Y with a 2 points above (sql_ascii = 'Y')

> Comments?

Works for me.  Anyone feel this is too big a change to push into 8.0?
Strictly speaking it's a new feature, but it sure looks harmless from
here.

Personally I'd say that the euro symbol should map to ' ' not 'E',
but am not set on that.

            regards, tom lane

Re: From latin9 to sql_ascii??

From

Peter Eisentraut

Date:

17 December 2004, 22:09:28

Jaime Casanova wrote:
> 188: it's an O merge with an E  (sql_ascii =  '')???
> 189: the same but lower case    (sql_ascii =  '')???

'OE' and 'oe', most likely, but someone more familiar with French
typography might correct me.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: From latin9 to sql_ascii??

From

Jaime Casanova

Date:

17 December 2004, 23:19:12

 --- Peter Eisentraut <peter_e@gmx.net> wrote:
> Jaime Casanova wrote:
> > 188: it's an O merge with an E  (sql_ascii =  '')???
> > 189: the same but lower case    (sql_ascii =  '')???
>
> 'OE' and 'oe', most likely, but someone more familiar with French
> typography might correct me.
>

Something like that; I really don't know how to
convert that to sql_ascii.

Maybe just a blank, as Tom suggested for the euro
symbol.

regards,
Jaime Casanova


Re: From latin9 to sql_ascii??

From

Alvaro Herrera

Date:

17 December 2004, 23:43:38

On Fri, Dec 17, 2004 at 11:09:17PM +0100, Peter Eisentraut wrote:
> Jaime Casanova wrote:
> > 188: it's an O merge with an E  (sql_ascii =  '')???
> > 189: the same but lower case    (sql_ascii =  '')???
>
> 'OE' and 'oe', most likely, but someone more familiar with French
> typography might correct me.

OE and oe would be correct, but we can't do that with the current code.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)

Re: From latin9 to sql_ascii??

From

Jaime Casanova

Date:

18 December 2004, 04:33:56

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jaime Casanova <systemguards@yahoo.com> writes:
> > Why doesn't it work with Latin9?
>
> >> Probably because it hasn't got a table for Latin9.
> >>
> >> Feel free to contribute one --- see
> >> src/backend/utils/adt/ascii.c.
>
> > This page shows the differences between Latin1 &
> > Latin9:
> > http://www.cs.tut.fi/~jkorpela/latin9.html
>
> > The diffs are:
>
> > 164: the euro symbol.           (sql_ascii = 'E')???
> > 166: an S with a symbol above   (sql_ascii = 'S')
> > 168: the same but lower case    (sql_ascii = 's')
> > 180: an Z with a symbol above   (sql_ascii = 'Z')
> > 184: the same but lower case    (sql_ascii = 'z')
> > 188: it's an O merge with an E  (sql_ascii =  '')???
> > 189: the same but lower case    (sql_ascii =  '')???
> > 190: an Y with a 2 points above (sql_ascii = 'Y')
>
> > Comments?
>
> Works for me.  Anyone feel this is too big a change
> to push into 8.0?
> Strictly speaking it's a new feature, but it sure
> looks harmless from here.

You guys have the code, you guys have the power.
I don't think it can cause any problem. :)

>
> Personally I'd say that the euro symbol should map
> to ' ' not 'E', but am not set on that.
>

Maybe someone who uses the euro symbol can comment?
If not, then as you said, we can just map that symbol
to ' '.

Here's the *fixed* patch; it's up to you which one to
use.

regards,
Jaime Casanova



*** src/backend/utils/adt/ascii.c.orig    2004-08-29 00:06:49.000000000 -0500
--- src/backend/utils/adt/ascii.c    2004-12-17 23:02:01.000000000 -0500
***************
*** 53,58 ****
--- 53,66 ----
          ascii = " A L LS \"SSTZ-ZZ a,l'ls ,sstz\"zzRAAAALCCCEEEEIIDDNNOOOOxRUUUUYTBraaaalccceeeeiiddnnoooo/ruuuuyt.";
          range = RANGE_160;
      }
+     else if (enc == PG_LATIN9)
+     {
+         /*
+          * ISO-8859-15 <range: 160 -- 255>
+          */
+         ascii = "  cL YS sCa  -R     Zu .z   EeY?AAAAAAACEEEEIIII NOOOOOxOUUUUYTBaaaaaaaceeeeiiii nooooo/ouuuuyty";
+         range = RANGE_160;
+     }
      else if (enc == PG_WIN1250)
      {
          /*
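
To make the table in the patch easier to check, here is a simplified, standalone sketch of how a string plus a range start can drive the conversion. It assumes that bytes below 128 are copied unchanged, that bytes between 128 and the range start become a space, and that every other byte indexes the table at (byte - range). The function name latin9_to_ascii is hypothetical and the loop is only modeled on the table-plus-range scheme visible in the patch; it is not the code from ascii.c. The table is the string from the patch, split into 16-byte rows for readability.

```c
#include <stddef.h>

#define RANGE_160 160

/* The Latin9 table from the patch: 96 entries for bytes 160..255. */
static const char latin9_ascii[] =
    "  cL YS sCa  -R "    /* 160..175: 164 (euro) maps to ' ' */
    "    Zu .z   EeY?"    /* 176..191: 188/189 (OE/oe) map to 'E'/'e' */
    "AAAAAAACEEEEIIII"    /* 192..207 */
    " NOOOOOxOUUUUYTB"    /* 208..223 */
    "aaaaaaaceeeeiiii"    /* 224..239 */
    " nooooo/ouuuuyty";   /* 240..255 */

/* Converts len bytes of Latin9 text in src to 7-bit ASCII in dest.
 * dest must have room for len + 1 bytes; the output is always exactly
 * as long as the input because each byte maps to exactly one byte. */
static void
latin9_to_ascii(const unsigned char *src, size_t len, char *dest)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = src[i];

        if (c < 128)
            dest[i] = (char) c;                   /* already ASCII */
        else if (c < RANGE_160)
            dest[i] = ' ';                        /* no table entry */
        else
            dest[i] = latin9_ascii[c - RANGE_160];
    }
    dest[len] = '\0';
}
```

With this sketch, the Latin9 bytes for 'Jiménez' (é is 0xE9) come out as 'Jimenez', and the euro sign (0xA4) comes out as a space, which is the mapping in this version of the patch.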

Re: From latin9 to sql_ascii??

From

Tom Lane

Date:

18 December 2004, 05:00:18

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Fri, Dec 17, 2004 at 11:09:17PM +0100, Peter Eisentraut wrote:
>> 'OE' and 'oe', most likely, but someone more familiar with French
>> typography might correct me.

> OE and oe would be correct, but we can't do that with the current code.

More to the point, there are no such characters in 7-bit ASCII.

I think Alvaro might be suggesting that to_ascii() should expand these
to the two-character sequences "OE" and "oe", but ISTM that opens a can
of worms better left sealed.  There are a *lot* of characters that have
translations of differing levels of plausibility into ASCII.  I'm okay
with dropping accent marks but I'm not sure about doing more than that.
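
To illustrate what such an expansion would involve, here is a hypothetical sketch, not a proposal and not anything in PostgreSQL. It assumes only bytes 188 and 189 are expanded, copies ASCII bytes unchanged, and replaces every other high byte with a space. The function name latin9_expand is made up for this example. The point is that the output can be up to twice as long as the input, so the caller has to size the buffer accordingly, which a one-byte-per-byte table never requires.

```c
#include <stddef.h>

/* Hypothetical expansion of Latin9 text into ASCII where 188 and 189
 * become the two-character sequences "OE" and "oe".
 * Returns the output length, or (size_t) -1 if dest is too small.
 * The worst case needs 2 * len + 1 bytes in dest. */
static size_t
latin9_expand(const unsigned char *src, size_t len,
              char *dest, size_t dest_size)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
    {
        char        single[2] = {0, 0};
        const char *rep;

        if (src[i] == 188)
            rep = "OE";
        else if (src[i] == 189)
            rep = "oe";
        else
        {
            /* ASCII is copied; other high bytes are out of scope here */
            single[0] = (src[i] < 128) ? (char) src[i] : ' ';
            rep = single;
        }

        for (; *rep != '\0'; rep++)
        {
            if (n + 1 >= dest_size)
                return (size_t) -1;   /* keep room for the terminator */
            dest[n++] = *rep;
        }
    }

    if (dest_size == 0)
        return (size_t) -1;
    dest[n] = '\0';
    return n;
}
```

A four-byte Latin9 input such as 'c', 0xBD, 'u', 'r' becomes the five-character string "coeur", so the result no longer lines up byte for byte with the input.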

            regards, tom lane

Re: From latin9 to sql_ascii??

From

Tom Lane

Date:

20 December 2004, 19:01:43

Jaime Casanova <systemguards@yahoo.com> writes:
> Here's the *fixed* patch; it's up to you which one to
> use.

I applied this one.

            regards, tom lane