Thread: Query inside RTF

Query inside RTF

From
"Leonardo M." Ramé
Date:
Hi, in my database I store RTF files inside a Bytea field. Now a
customer is requesting a search inside RTF fields and I'm trying to
implement it by issuing this query:

select
  *
from my_table
where
  cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

Any hint on this?

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979


Re: Query inside RTF

From
Merlin Moncure
Date:
2009/10/8 Leonardo M. <l.rame@griensu.com>:
> Hi, in my database I store RTF files inside a Bytea field. Now a
> customer is requesting a search inside RTF fields and I'm trying to
> implement it by issuing this query:
>
> select
>  *
> from my_table
> where
>  cast(rtf_field as varchar) like '%condition%'
>
> This works ok when the condition doesn't include accented chars. RTF
> escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.
>
> To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
> that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
> doesn't.
>
> Any hint on this?

first, try dollar quoting:
cast(rtf_field as varchar) like $merlin$%condition%$merlin$

:-)

merlin

Re: Query inside RTF

From
"Leonardo M." Ramé
Date:
On jue, 2009-10-08 at 11:28 -0400, Merlin Moncure wrote:
> 2009/10/8 Leonardo M. <l.rame@griensu.com>:
> > Hi, in my database I store RTF files inside a Bytea field. Now a
> > customer is requesting a search inside RTF fields and I'm trying to
> > implement it by issuing this query:
> >
> > select
> >  *
> > from my_table
> > where
> >  cast(rtf_field as varchar) like '%condition%'
> >
> > This works ok when the condition doesn't include accented chars. RTF
> > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.
> >
> > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
> > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
> > doesn't.
> >
> > Any hint on this?
>
> first, try dollar quoting:
> cast(rtf_field as varchar) like $merlin$%condition%$merlin$
>
> :-)
>
> merlin
>

Thanks Merlin, I found the solution, it was related to
"standard_conforming_strings". My query is this now:

set standard_conforming_strings = 1;

SELECT
idturno,
Upper(cast(InfRes as Varchar))
as InfRes
from turno
where
 cast(InfRes as Varchar) like '%diagn\\\\''f3stica%';

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979


Re: Query inside RTF

From
Sam Mason
Date:
On Thu, Oct 08, 2009 at 10:58:05AM -0300, Leonardo M. Rammm wrote:
>   cast(rtf_field as varchar) like '%condition%'

I don't think that cast isn't doing what you expect at all.  As an
example, try doing something like:

  select cast(bytea e'hi\nbye' as varchar);

encode() may be what you want instead.  How do RTF files handle
encoding?  You may have better luck with using a specialized program to
pull the text out of the RTF file and then using the text search stuff
in PG.  LIKE conditions can't be indexed well and hence the above is
going to be slow for any reasonable number of documents.

--
  Sam  http://samason.me.uk/

Re: Query inside RTF

From
Emanuel Calvo Franco
Date:
> select
>  *
> from my_table
> where
>  cast(rtf_field as varchar) like '%condition%'
>
> This works ok when the condition doesn't include accented chars. RTF
> escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.
>
> To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
> that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
> doesn't.
>

I prefer use

select * from table where i_bytea::text ~~ $$%\\row%$$;

Dollar quoting is more clean to put strings inside than ' '. ~~ is
like operator.
And :: operand is more clean to the syntax of the query.
Don't think so? :)


--
              Emanuel Calvo Franco
             DBA at:  www.siu.edu.ar
        www.emanuelcalvofranco.com.ar

Re: Query inside RTF

From
"Leonardo M." Ramé
Date:
On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote:
> > select
> >  *
> > from my_table
> > where
> >  cast(rtf_field as varchar) like '%condition%'
> >
> > This works ok when the condition doesn't include accented chars. RTF
> > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.
> >
> > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
> > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
> > doesn't.
> >
>
> I prefer use
>
> select * from table where i_bytea::text ~~ $$%\\row%$$;
>
> Dollar quoting is more clean to put strings inside than ' '. ~~ is
> like operator.
> And :: operand is more clean to the syntax of the query.
> Don't think so? :)

It looks nice!, the only problem is I'm still forced to set
standard_conforming_strings = 1;


--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979


Re: Query inside RTF

From
Emanuel Calvo Franco
Date:
2009/10/8 Leonardo M. <l.rame@griensu.com>:
> On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote:
>> > select
>> >  *
>> > from my_table
>> > where
>> >  cast(rtf_field as varchar) like '%condition%'
>> >
>> > This works ok when the condition doesn't include accented chars. RTF
>> > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.
>> >
>> > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
>> > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
>> > doesn't.
>> >
>>
>> I prefer use
>>
>> select * from table where i_bytea::text ~~ $$%\\row%$$;
>>
>> Dollar quoting is more clean to put strings inside than ' '. ~~ is
>> like operator.
>> And :: operand is more clean to the syntax of the query.
>> Don't think so? :)
>
> It looks nice!, the only problem is I'm still forced to set
> standard_conforming_strings = 1;
>
>

part=# set standard_conforming_strings = 0;
SET
part=# select * from p  where i::text ~~ $$%\\\\f3%$$;
   i
--------
 \\\\f3
 \\\\f3
(2 rows)

part=# select * from p  where i::text ~~ $$%\\\\"f3%$$;
    i
---------
 \\\\"f3
(1 row)

part=# select * from p  where i::text ~~ $$%\\\\''f3%$$;
    i
----------
 \\\\''f3
(1 row)

I don't have any problem, wich enconding you are using?
(select getdatabaseencoding(); )



--
              Emanuel Calvo Franco
             DBA at:  www.siu.edu.ar
        www.emanuelcalvofranco.com.ar

Re: Query inside RTF

From
"Leonardo M." Ramé
Date:
On jue, 2009-10-08 at 17:51 -0300, Emanuel Calvo Franco wrote:

> part=# select * from p  where i::text ~~ $$%\\\\''f3%$$;
>     i
> ----------
>  \\\\''f3
> (1 row)
>
> I don't have any problem, wich enconding you are using?
> (select getdatabaseencoding(); )
>

Thanks, now this works:

set standard_conforming_strings = 0;

select
  idturno,
  infres::text
from turno
where
  infres::text ~~ $$%diagn\\\\'f3stico%$$;

This database has WIN1252 encoding.

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979


Re: Query inside RTF

From
Alvaro Herrera
Date:
Leonardo M. Ramé escribió:

> select
>   idturno,
>   infres::text
> from turno
> where
>   infres::text ~~ $$%diagn\\\\'f3stico%$$;

Just make sure you don't have $$ in the search pattern ...

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support