Thread: Re: [GENERAL] ascii() for utf8

Re: [GENERAL] ascii() for utf8

From: Decibel!

Moving to -hackers.

On Jul 27, 2007, at 1:22 PM, Stuart wrote:
> Does PostgreSQL have a function like ascii() that will
> return the Unicode codepoint value for a utf8 character?
> (And symmetrically, the same question for chr(), of course.)
>
> I didn't find anything in the docs so I think the answer
> is no which leads me to ask...  Why not?  (Hard to believe
> lack of need without concluding that either ascii() is
> not needed, or utf8 text is little used.)
>
> Are there technical problems in implementing such a
> function?  Has anyone else already done this (ie, is
> there somewhere I could get it from?)
>
> Is there some other non-obvious way to get the cp value
> for the utf8 character?
>
> I think I could use plperl or plpython for this but
> this seems like an awful lot of overhead for such a
> basic task.
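
For illustration, the mapping being asked about (UTF-8 bytes to a Unicode
code point, and back) can be sketched outside the database. This is a
minimal Python sketch, not PostgreSQL's implementation; the function names
utf8_to_codepoint and codepoint_to_utf8 are hypothetical, and overlong
encodings are not rejected as a real implementation would need to do.

```python
def utf8_to_codepoint(data: bytes) -> int:
    """Decode the first UTF-8 encoded character in data to its code point."""
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:                 # 0xxxxxxx: plain ASCII, one byte
        return lead
    if lead >> 5 == 0b110:          # 110xxxxx: two-byte sequence
        length, cp = 2, lead & 0x1F
    elif lead >> 4 == 0b1110:       # 1110xxxx: three-byte sequence
        length, cp = 3, lead & 0x0F
    elif lead >> 3 == 0b11110:      # 11110xxx: four-byte sequence
        length, cp = 4, lead & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")
    if len(data) < length:
        raise ValueError("truncated UTF-8 sequence")
    for byte in data[1:length]:
        if byte >> 6 != 0b10:       # continuation bytes look like 10xxxxxx
            raise ValueError("invalid UTF-8 continuation byte")
        cp = (cp << 6) | (byte & 0x3F)
    return cp


def codepoint_to_utf8(cp: int) -> bytes:
    """Encode a Unicode code point as UTF-8 bytes (the chr() direction)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

In Python itself the built-ins ord() and chr() already work on code points,
so the hand-written decoding above only serves to show what a code-point
aware ascii()/chr() pair would have to do with the stored bytes.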

I suspect that this is just a matter of no one scratching the itch. I
suspect a patch would be accepted, or you could possibly put
something on pgFoundry. I'd set it up so that ascii() and chr() act
according to the appropriate locale setting (I'm not sure which one
would be appropriate).
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: [GENERAL] ascii() for utf8

From: Alvaro Herrera

Decibel! wrote:
> Moving to -hackers.
>
> On Jul 27, 2007, at 1:22 PM, Stuart wrote:
>> Does PostgreSQL have a function like ascii() that will
>> return the Unicode codepoint value for a utf8 character?
>> (And symmetrically, the same question for chr(), of course.)

> I suspect that this is just a matter of no one scratching the itch. I 
> suspect a patch would be accepted, or you could possibly put something on 
> pgFoundry.

Nay; there were some discussions about this not long ago, and I think
one conclusion you could draw from them is that many people want these
functions in the backend.

> I'd set it up so that ascii() and chr() act according to the 
> appropriate locale setting (I'm not sure which one would be appropriate).

I don't see why any of them would react to the locale, but they surely
must honor client encoding.

-- 
Alvaro Herrera                               http://www.PlanetPostgreSQL.org/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)


Re: [GENERAL] ascii() for utf8

From: Stuart McGraw

Alvaro Herrera wrote:
> Decibel! wrote:
> > Moving to -hackers.
> >
> > On Jul 27, 2007, at 1:22 PM, Stuart wrote:
> >> Does PostgreSQL have a function like ascii() that will
> >> return the Unicode codepoint value for a utf8 character?
> >> (And symmetrically, the same question for chr(), of course.)
> 
> > I suspect that this is just a matter of no one scratching the itch. I 
> > suspect a patch would be accepted, or you could possibly put something on 
> > pgFoundry.
> 
> Nay; there were some discussions about this not long ago, and I think
> one conclusion you could draw from them is that many people want these
> functions in the backend.

That would certainly be my preference.  I will be distributing an
application, the database part of which may (I'm not sure yet) require
this function, to multiple platforms including Windows.  Though I have
never done it, I anticipate that distribution will be significantly
harder if I have to worry about the recipient compiling an external
function, or making sure a DLL goes in the right place, gets updated, etc.

> > I'd set it up so that ascii() and chr() act according to the 
> > appropriate locale setting (I'm not sure which one would be appropriate).
> 
> I don't see why any of them would react to the locale, but they surely
> must honor client encoding.

Wouldn't this be the database encoding?  (I have been using 
strictly utf-8 and admit I am pretty fuzzy on encoding issues.)
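
To make the encoding question concrete: the same character is stored as
different bytes depending on the encoding, so a code-point function has to
know which encoding its input bytes are in. The following Python sketch
only illustrates that dependency; it makes no claim about which encoding
the PostgreSQL backend hands to its functions, and the helper name
first_codepoint is hypothetical.

```python
def first_codepoint(raw: bytes, encoding: str) -> int:
    """Interpret raw bytes in the given encoding; return the first code point."""
    return ord(raw.decode(encoding)[0])


e_acute = "\u00e9"                       # LATIN SMALL LETTER E WITH ACUTE

as_utf8 = e_acute.encode("utf-8")        # two bytes: 0xC3 0xA9
as_latin1 = e_acute.encode("latin-1")    # one byte:  0xE9

# Decoded with the matching encoding, both yield the same code point.
same = first_codepoint(as_utf8, "utf-8") == first_codepoint(as_latin1, "latin-1")

# Decoded with the wrong encoding, the UTF-8 bytes are misread: the lead
# byte 0xC3 is taken as a character of its own.
misread = first_codepoint(as_utf8, "latin-1")
```

Here same is True and misread is 0xC3 rather than 0xE9, which is why the
choice between client encoding and database encoding matters for a
function of this kind.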

If one had written an external function, how much more effort 
would it be to make it acceptable for inclusion in the backend? 



Re: [GENERAL] ascii() for utf8

From: Bruce Momjian

This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold


-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: [GENERAL] ascii() for utf8

From: Andrew Dunstan

Actually, I am working on this as part of the fixes for invalid encoding 
stuff, as recently discussed.

cheers

andrew

Bruce Momjian wrote:
> This has been saved for the 8.4 release:
>
>     http://momjian.postgresql.org/cgi-bin/pgpatches_hold


Re: [GENERAL] ascii() for utf8

From: Bruce Momjian

Andrew Dunstan wrote:
> 
> Actually, I am working on this as part of the fixes for invalid encoding 
> stuff, as recently discussed.

OK, I have moved the item into the 8.3 queue.

