Thread: Search content within a bytea field
Hi all, I'm trying to search for some content within a bytea field. My PreparedStatement looks like this: SELECT id FROM mytable WHERE myfield LIKE ? Now, when I execute a ps.setString(1, "%" + value + "%") where 'value' is the value I want to search for, the PS becomes SELECT id FROM mytable WHERE myfield LIKE %VALUE% and the query clearly fails ("operator does not exist: bytea ~~ character varying"). Now, my question is....why? What am I doing wrong? Thanks for you help!
Hi Damiano, It seems to me that the problem is comming from the BYTEA type, not from the prepared statement by itself: As far as I know the only operator supported by BYTEA is concatenation. You can't even compare two BYTEA for (in)equality - even less using the LIKE operator. However, you can convert BYTEA to an other type using CAST(). Something like that might do the trick (I think): SELECT id FROM mytable WHERE CAST(myfield AS TEXT) LIKE ? Be warned that performances (and possibly memory consummation) of such a request is certainly far from ideal! Hope this helps, Sylvain. > Hi all, > I'm trying to search for some content within a bytea field. My > PreparedStatement looks like this: > > SELECT id FROM mytable WHERE myfield LIKE ? > > Now, when I execute a > > ps.setString(1, "%" + value + "%") > > where 'value' is the value I want to search for, the PS becomes > > SELECT id FROM mytable WHERE myfield LIKE %VALUE% > > and the query clearly fails ("operator does not exist: bytea ~~ > character varying"). Now, my question is....why? What am I doing > wrong? > > Thanks for you help! > -- Website: http://www.chicoree.fr
Sylvain Leroux <sl20@wanadoo.fr> writes: > It seems to me that the problem is comming from the BYTEA type, not from > the prepared statement by itself: As far as I know the only operator > supported by BYTEA is concatenation. You can't even compare two BYTEA > for (in)equality - even less using the LIKE operator. Nonsense ... regression=# select oid::regoperator from pg_operator where oprleft = 'bytea'::regtype or oprright = 'bytea'::regtype; oid ------------------ =(bytea,bytea) <>(bytea,bytea) <(bytea,bytea) <=(bytea,bytea) >(bytea,bytea) >=(bytea,bytea) ~~(bytea,bytea) !~~(bytea,bytea) ||(bytea,bytea) (9 rows) But this does point out the problem: LIKE (a/k/a ~~) on a bytea wants a bytea on the righthand side, not text or varchar. So setString is the wrong thing to use. regards, tom lane
Hi, And sorry for that: Tom Lane a écrit : > Sylvain Leroux <sl20@wanadoo.fr> writes: >> It seems to me that the problem is comming from the BYTEA type, not from >> the prepared statement by itself: As far as I know the only operator >> supported by BYTEA is concatenation. You can't even compare two BYTEA >> for (in)equality - even less using the LIKE operator. > > Nonsense ... But I was quite sure to have read that somewhere -- and indeed: (in "PostgreSQL - second edition / ISBN:978-0-672-32756-8) "PostgreSQL offers a single BYTEA operator: concatenation. [...] Note that you can't compare two BYTEA values, even for equality/inequality. You can of course convert a BYTEA value into another value using CAST() operator, and that opens up other operators." I remember having found that annoying at the time - that's why I remembered it. But it appears this statement was outdated -- or simple false all along? Anyway once again sorry for the misinformation. Sylvain. -- Website: http://www.chicoree.fr
> But this does point out the problem: LIKE (a/k/a ~~) on a bytea wants > a bytea on the righthand side, not text or varchar. So setString is > the wrong thing to use. Perhaps something like: setBytes(myString.getBytes()) ?
> It seems to me that the problem is comming from the BYTEA type, not from the > prepared statement by itself: As far as I know the only operator supported > by BYTEA is concatenation. You can't even compare two BYTEA for (in)equality > - even less using the LIKE operator. Right, I suspected that too, as I've been reading a little bit around. But...why if I manually query with a SQL client (AquaStudio) as follows: SELECT myfield FROM mytable WHERE myfield LIKE '%hello%' and 'myfield' is a bytea field I get the result I expected? What's the "trick" the client uses behind the curtains?
> > >Hi all, >I'm trying to search for some content within a bytea field. My >PreparedStatement looks like this: > >SELECT id FROM mytable WHERE myfield LIKE ? > >Now, when I execute a > >ps.setString(1, "%" + value + "%") > >where 'value' is the value I want to search for, the PS becomes > >SELECT id FROM mytable WHERE myfield LIKE %VALUE% > >and the query clearly fails ("operator does not exist: bytea ~~ >character varying"). Now, my question is....why? What am I doing >wrong? > >Thanks for you help! > The problem I believe is in your prepare statement ps.setString(). The statement using a string, but your field is bytea, try a cast. >SELECT myfield FROM mytable WHERE myfield LIKE '%hello%' > >and 'myfield' is a bytea field I get the result I expected? What's the >"trick" the client uses behind the curtains? > Of course this work, example given below, because you are not using a prepare statement. PostgreSQL expects the exact type to be used in prepare statements. These fields can be searched, but if the content is binary I don't think this general approach is going to work. The content below for the dump is octal I believe, but does contain the text "find" in the id = 2. The select statement for search in MyJSQLView also shown, that does successfully find the row. danap. -- -- MyJSQLView SQL Dump -- Version: 3.03 -- WebSite: http://myjsqlview.org -- -- Host: 127.0.0.1 -- Generated On: 2009.08.31 AD at 07:20:14 MDT -- SQL version: PostgreSQL 8.3.3 -- Database: postgresql_datatypes -- -- ------------------------------------------ -- -- Table structure for table "public"."bytea_types" -- DROP TABLE IF EXISTS "public"."bytea_types"; CREATE TABLE "public"."bytea_types" ( "data_type_id" serial NOT NULL, "bytea_type" bytea DEFAULT NULL, PRIMARY KEY ("data_type_id") ); -- -- Dumping data for table "public"."bytea_types" -- LOCK TABLE "public"."bytea_types"; INSERT INTO "public"."bytea_types" ("data_type_id", "bytea_type") VALUES('1', E'\\054\\054\\041\\041\\041\\043\\043'); INSERT INTO "public"."bytea_types" ("data_type_id", "bytea_type") VALUES('2', E'\\152\\165\\163\\164\\040\\163\\157\\155\\145\\040\\155\\157\\162\\145\\040\\146\\151\\156\\144\\040\\150\\145\\154\\154\\157'); TableTabPanel_PostgreSQL actionPerformed() Connection Created SELECT "data_type_id" FROM "public"."bytea_types" WHERE "bytea_type"::TEXT LIKE '%find%' ORDER BY "data_type_id" ASC LIMIT 50 OFFSET 0 TableTabPanel_PostgreSQL actionPerformed() Connection Closed
Damiano Bolzoni <damiano.bolzoni@gmail.com> writes: > But...why if I manually query with a SQL client (AquaStudio) as > follows: > SELECT myfield FROM mytable WHERE myfield LIKE '%hello%' > and 'myfield' is a bytea field I get the result I expected? The unknown-type literal is assumed to be of type bytea in this case. In general, if the parser finds "known_type_value operator unmarked_literal" then it preferentially selects an operator with both input types the same, ie, it prefers to believe the unmarked_literal is the same data type as the other input. The JDBC environment behaves differently because setString and so on do not produce unmarked literals --- there's an implicit cast to some datatype. So you have to make sure you've picked a parameter-setting function that is reasonably compatible with the intended operation. This is a good thing, really; it tends to keep the system from choosing an unexpected operation. regards, tom lane
Damiano Bolzoni, 31.08.2009 08:06: > Right, I suspected that too, as I've been reading a little bit around. > But...why if I manually query with a SQL client (AquaStudio) as > follows: > > SELECT myfield FROM mytable WHERE myfield LIKE '%hello%' > > and 'myfield' is a bytea field I get the result I expected? What's the > "trick" the client uses behind the curtains? This looks _very_ strange (if not even wrong) to me. If you are storing character data, why don't you use the text datatype? Thomas
* Tom Lane: > The JDBC environment behaves differently because setString and so on > do not produce unmarked literals --- there's an implicit cast to > some datatype. So you have to make sure you've picked a > parameter-setting function that is reasonably compatible with the > intended operation. You can append "?stringtype=unspecified" to the JDBC URL, and you get bevahior which matches more closely what other (non-Java) interfaces do. > This is a good thing, really; it tends to keep the system from > choosing an unexpected operation. It more or less prevents use of fancy types. There doesn't seem to be to set an unspecified type on a per-parameter basis, unfortunately. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
On Thu, 3 Sep 2009, Florian Weimer wrote: > It more or less prevents use of fancy types. There doesn't seem to be > to set an unspecified type on a per-parameter basis, unfortunately. > Use setObject(int, object, Types.OTHER); Kris Jurka
* Kris Jurka: > On Thu, 3 Sep 2009, Florian Weimer wrote: > >> It more or less prevents use of fancy types. There doesn't seem to be >> to set an unspecified type on a per-parameter basis, unfortunately. >> > > Use setObject(int, object, Types.OTHER); I get a PSQLException with "Cannot cast an instance of java.lang.String to type Types.OTHER". Other types (such as Integer) don't work, either. I look rather hard at Debian's 8.2-504-3 source but couldn't find a straightforward way to make this work, beyond specifying stringtype=unspecified for the whole connection. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Florian Weimer wrote: > * Kris Jurka: > >> Use setObject(int, object, Types.OTHER); > > I get a PSQLException with "Cannot cast an instance of > java.lang.String to type Types.OTHER". Other types (such as Integer) > don't work, either. > > I look rather hard at Debian's 8.2-504-3 source but couldn't find a > straightforward way to make this work, beyond specifying > stringtype=unspecified for the whole connection. > This is new functionality in 8.3 and newer drivers. Kris Jurka