Thread: Query inside RTF
Hi, in my database I store RTF files inside a Bytea field. Now a customer is requesting a search inside RTF fields and I'm trying to implement it by issuing this query: select * from my_table where cast(rtf_field as varchar) like '%condition%' This works ok when the condition doesn't include accented chars. RTF escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it doesn't. Any hint on this? -- Leonardo M. Ramé Griensu S.A. - Medical IT Córdoba Tel.: 0351-4247979
2009/10/8 Leonardo M. <l.rame@griensu.com>: > Hi, in my database I store RTF files inside a Bytea field. Now a > customer is requesting a search inside RTF fields and I'm trying to > implement it by issuing this query: > > select > * > from my_table > where > cast(rtf_field as varchar) like '%condition%' > > This works ok when the condition doesn't include accented chars. RTF > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. > > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it > doesn't. > > Any hint on this? first, try dollar quoting: cast(rtf_field as varchar) like $merlin$%condition%$merlin$ :-) merlin
On jue, 2009-10-08 at 11:28 -0400, Merlin Moncure wrote: > 2009/10/8 Leonardo M. <l.rame@griensu.com>: > > Hi, in my database I store RTF files inside a Bytea field. Now a > > customer is requesting a search inside RTF fields and I'm trying to > > implement it by issuing this query: > > > > select > > * > > from my_table > > where > > cast(rtf_field as varchar) like '%condition%' > > > > This works ok when the condition doesn't include accented chars. RTF > > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. > > > > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth > > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it > > doesn't. > > > > Any hint on this? > > first, try dollar quoting: > cast(rtf_field as varchar) like $merlin$%condition%$merlin$ > > :-) > > merlin > Thanks Merlin, I found the solution, it was related to "standard_conforming_strings". My query is this now: set standard_conforming_strings = 1; SELECT idturno, Upper(cast(InfRes as Varchar)) as InfRes from turno where cast(InfRes as Varchar) like '%diagn\\\\''f3stica%'; -- Leonardo M. Ramé Griensu S.A. - Medical IT Córdoba Tel.: 0351-4247979
On Thu, Oct 08, 2009 at 10:58:05AM -0300, Leonardo M. Rammm wrote: > cast(rtf_field as varchar) like '%condition%' I don't think that cast isn't doing what you expect at all. As an example, try doing something like: select cast(bytea e'hi\nbye' as varchar); encode() may be what you want instead. How do RTF files handle encoding? You may have better luck with using a specialized program to pull the text out of the RTF file and then using the text search stuff in PG. LIKE conditions can't be indexed well and hence the above is going to be slow for any reasonable number of documents. -- Sam http://samason.me.uk/
> select > * > from my_table > where > cast(rtf_field as varchar) like '%condition%' > > This works ok when the condition doesn't include accented chars. RTF > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. > > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it > doesn't. > I prefer use select * from table where i_bytea::text ~~ $$%\\row%$$; Dollar quoting is more clean to put strings inside than ' '. ~~ is like operator. And :: operand is more clean to the syntax of the query. Don't think so? :) -- Emanuel Calvo Franco DBA at: www.siu.edu.ar www.emanuelcalvofranco.com.ar
On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote: > > select > > * > > from my_table > > where > > cast(rtf_field as varchar) like '%condition%' > > > > This works ok when the condition doesn't include accented chars. RTF > > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. > > > > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth > > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it > > doesn't. > > > > I prefer use > > select * from table where i_bytea::text ~~ $$%\\row%$$; > > Dollar quoting is more clean to put strings inside than ' '. ~~ is > like operator. > And :: operand is more clean to the syntax of the query. > Don't think so? :) It looks nice!, the only problem is I'm still forced to set standard_conforming_strings = 1; -- Leonardo M. Ramé Griensu S.A. - Medical IT Córdoba Tel.: 0351-4247979
2009/10/8 Leonardo M. <l.rame@griensu.com>: > On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote: >> > select >> > * >> > from my_table >> > where >> > cast(rtf_field as varchar) like '%condition%' >> > >> > This works ok when the condition doesn't include accented chars. RTF >> > escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on. >> > >> > To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth >> > that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it >> > doesn't. >> > >> >> I prefer use >> >> select * from table where i_bytea::text ~~ $$%\\row%$$; >> >> Dollar quoting is more clean to put strings inside than ' '. ~~ is >> like operator. >> And :: operand is more clean to the syntax of the query. >> Don't think so? :) > > It looks nice!, the only problem is I'm still forced to set > standard_conforming_strings = 1; > > part=# set standard_conforming_strings = 0; SET part=# select * from p where i::text ~~ $$%\\\\f3%$$; i -------- \\\\f3 \\\\f3 (2 rows) part=# select * from p where i::text ~~ $$%\\\\"f3%$$; i --------- \\\\"f3 (1 row) part=# select * from p where i::text ~~ $$%\\\\''f3%$$; i ---------- \\\\''f3 (1 row) I don't have any problem, wich enconding you are using? (select getdatabaseencoding(); ) -- Emanuel Calvo Franco DBA at: www.siu.edu.ar www.emanuelcalvofranco.com.ar
On jue, 2009-10-08 at 17:51 -0300, Emanuel Calvo Franco wrote: > part=# select * from p where i::text ~~ $$%\\\\''f3%$$; > i > ---------- > \\\\''f3 > (1 row) > > I don't have any problem, wich enconding you are using? > (select getdatabaseencoding(); ) > Thanks, now this works: set standard_conforming_strings = 0; select idturno, infres::text from turno where infres::text ~~ $$%diagn\\\\'f3stico%$$; This database has WIN1252 encoding. -- Leonardo M. Ramé Griensu S.A. - Medical IT Córdoba Tel.: 0351-4247979
Leonardo M. Ramé escribió: > select > idturno, > infres::text > from turno > where > infres::text ~~ $$%diagn\\\\'f3stico%$$; Just make sure you don't have $$ in the search pattern ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support