Thread: convert function

convert function

From
"Jan Sunavec"
Date:
Hi all

I have a problem with the "convert" function. The previous behaviour was
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
=======================================
jan

PostgreSQL 8.3 has quite a new behaviour.
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This drives me crazy. I mean, this is not usable for non-English
countries. I don't need conversion to \241 characters. I understand that
someone needs this behaviour, but there should be a way to switch back to
the "normal" behaviour.

   John

Re: convert function

From
"Pavel Stehule"
Date:
Hello

It looks like SQL_ASCII supports diacritic characters now. First you have
to encode from bytea to text:

postgres=# SELECT encode(convert('ján', 'UNICODE', 'SQL_ASCII'),'escape');
 encode
--------
 ján
(1 row)

What you want:
postgres=# SELECT to_ascii(encode(convert_to('ján',
'latin2'),'escape'),'latin2');
 to_ascii
----------
 jan
(1 row)

Regards
Pavel Stehule



convert converts from text to the bytea type. To eliminate diacritics,
use the to_ascii function:

postgres=# select to_ascii(convert('Příliš žlutý kůň' using
utf8_to_iso_8859_2),'latin2');
     to_ascii
------------------
 Prilis zluty kun
(1 row)
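
For readers on 8.3 or later, the conversion functions used above can be looked at one at a time. This is only a sketch, assuming a database whose encoding is UTF8; the literal 'ján' comes from the question. Results are left out on purpose, because how a bytea value is displayed depends on the server version and settings. Note also that the convert(... USING ...) form in the last example appears to be accepted only by releases before 8.3, so it is worth checking the release notes for your version.

```sql
-- Sketch, assuming a UTF8 database on PostgreSQL 8.3 or later.

-- text -> bytea: encode the string in the target encoding.
SELECT convert_to('ján', 'LATIN2');

-- bytea -> text: decode bytes that are known to be LATIN2.
SELECT convert_from(convert_to('ján', 'LATIN2'), 'LATIN2');

-- bytea -> bytea: re-encode bytes from one encoding into another.
SELECT convert(convert_to('ján', 'UTF8'), 'UTF8', 'LATIN2');
```

None of these three calls removes diacritics; they only change how the same characters are stored as bytes. Dropping the accents is the job of to_ascii, as in the replies above.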



Re: convert function

From
"Pavel Stehule"
Date:
On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:
> Thanks a lot
>
> Looks like a nice and easy solution. I am not sure if it is a fast one.
> Many conversions, you know.. :-(
> Thanks a lot anyway.
>

If you do this often, use a functional index.

Pavel
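
A functional (expression) index along those lines might look like the sketch below. The table "people", the column "name" and the function "strip_diacritics" are hypothetical names made up for illustration; the expression inside the function is the one from the earlier reply. The wrapper function is there on the assumption that convert_to is not marked IMMUTABLE in the system catalog, in which case the bare expression would be rejected in an index definition. Declaring the wrapper IMMUTABLE is a promise that the result depends only on the argument, so it should be used with care.

```sql
-- Sketch with hypothetical names (people, name, strip_diacritics).
-- Assumes a UTF8 database and text that can be represented in LATIN2.
CREATE TABLE people (
    id   serial PRIMARY KEY,
    name text
);

-- Wrapper around the expression from the earlier reply, declared
-- IMMUTABLE so that it is allowed in an index definition.
CREATE FUNCTION strip_diacritics(text) RETURNS text AS $$
    SELECT to_ascii(encode(convert_to($1, 'latin2'), 'escape'), 'latin2');
$$ LANGUAGE sql IMMUTABLE STRICT;

-- The conversion is computed when a row is written and stored in the
-- index, instead of being recomputed for every row on every search.
CREATE INDEX people_name_ascii_idx ON people (strip_diacritics(name));

-- A query has to use the same expression for the index to be considered.
SELECT * FROM people WHERE strip_diacritics(name) = 'jan';
```

Whether the planner actually picks the index depends on table size and statistics; EXPLAIN shows what it chose.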


Re: convert function

From
"Jan Sunavec"
Date:
Thanks a lot

Looks like a nice and easy solution. I am not sure if it is a fast one.
Many conversions, you know.. :-(
Thanks a lot anyway.

   John

On Wed, 12 Dec 2007 17:13:01 +0100, Pavel Stehule
<pavel.stehule@gmail.com> wrote:

> postgres=# SELECT to_ascii(encode(convert_to('ján',
> 'latin2'),'escape'),'latin2');
>  to_ascii
> ----------
>  jan
> (1 row)



tsearch2 headline options

From
"Jan Sunavec"
Date:
Hi all

I have the following problem. When I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more', to_tsquery('asd'), '');

I get this

"<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b>
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more
more more"

So the result is shorter than the original text. I tried setting MinWords
and MaxWords, but it doesn't help. So the question is: how can I get the
original text?

Best regards

    John

Re: tsearch2 headline options

From
Oleg Bartunov
Date:
On Wed, 2 Jan 2008, Jan Sunavec wrote:

> So the result is shorter than the original text. I tried setting MinWords
> and MaxWords, but it doesn't help. So the question is: how can I get the
> original text?

Try 'HighlightAll=TRUE':

arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more',
to_tsquery('asd'), 'HighlightAll=TRUE');
                                                                                 headline
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more more more more more more more
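
The third argument is a comma-separated list of option=value pairs, so HighlightAll can be combined with other options. The sketch below assumes the option names listed in the PostgreSQL text search documentation (StartSel, StopSel, MinWords, MaxWords, ShortWord, HighlightAll); the <em> markers are just an illustration. According to that documentation the defaults are MinWords=15 and MaxWords=35, which would fit the truncated result in the question: it has exactly 15 words.

```sql
-- Sketch: headline() is the tsearch2 name of the function; the built-in
-- text search in PostgreSQL 8.3 and later calls it ts_headline().

-- Return the whole document and wrap matches in different markers.
SELECT headline(
    'asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more',
    to_tsquery('asd'),
    'HighlightAll=TRUE, StartSel=<em>, StopSel=</em>'
);
```

With HighlightAll=TRUE the whole document is used as the headline, so the word-count options no longer decide where the text is cut.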


     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: tsearch2 headline options

From
"Jan Sunavec"
Date:
Thanks a lot. It helps.

On Fri, 04 Jan 2008 16:54:32 +0100, Oleg Bartunov <oleg@sai.msu.su> wrote:

> Try 'HighlightAll=TRUE':
>
> arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd
> more more more more more more more', to_tsquery('asd'),
> 'HighlightAll=TRUE');