>> We are not storing bytea, a customer is. We are trying to work around
>> customer requirements. The data that is being stored is not always text,
>> sometimes it is binary (a flash file or jpeg). We are using escaped text
>> to be able to search the string contents of that file.
>
> Hmm, have you tried to create a functional trigram index on the
> equivalent of "strings(bytea_column)" or something like that?
I did consider that. I wonder what sizes we would end up dealing with,
though. Part of the problem is that some of the data we are storing is
quite large.
>
> I imagine strings(bytea) would be a function that returns the
> concatenation of all pure (7 bit) ASCII strings in the byte sequence.
>
> On the other hand, based on Teodor's comment on pg_trgm, maybe this
> won't be possible at all.
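For reference, here is a rough sketch of what such a strings() function
might do, written in Python rather than as a server-side function. The
name strings, the min_len cutoff of 4, and the single-space separator are
all assumptions for illustration; nothing like this ships with PostgreSQL
as far as I know.

```python
import re


def strings(data: bytes, min_len: int = 4) -> str:
    """Return the runs of printable 7-bit ASCII found in data.

    Hypothetical helper: min_len and the space separator are
    illustrative choices, not part of any existing API.
    """
    # Printable ASCII is 0x20 (space) through 0x7e (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Join the runs with a space so adjacent runs do not merge into
    # words that never appeared in the original bytes.
    return " ".join(m.decode("ascii") for m in pattern.findall(data))


if __name__ == "__main__":
    sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01Exif\x00\x00ab\x00hello world\xfe"
    print(strings(sample))
```

The output can never be longer than the input, but for a large row that
is mostly text it will be nearly as large, so this does not by itself
address the size question.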
>> Yes, we do (and can) expect to find text among the bytes. We have
>> searches running, we are just running into the maximum size issues for
>> certain rows.
>
> Do you mean you actually find stuff based on text attributes in JPEG
> images and the like? I thought those were compressed ...
Well, a JPEG is probably a bad example, but yes, they do search JPEGs, I am
guessing mostly for header information. Better examples would be
PostScript files, Flash files, and of course large amounts of text + HTML.

Sincerely,
Joshua D. Drake
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate