I've written a c-language extension to postgresql to implement a
chemical search of a varchar column (named smiles, typically).
It might be called as:
oe_matches(smiles,'COCC') where 'COCC' is a typical search string.
This uses 3rd party functions to parse the varchar inputs into c++
objects. In order to optimize this search, I'd like to parse the whole
table containing smiles just once, store it and use the parsed data
instead of the varchar smiles.
So, I could create another column, say smiles_ob and have the
oe_matches function take that column. I'd have to be sure the smiles_ob
column was related (by a trigger?) to the smiles column.
But I was thinking I might be able to hide these parsed objects from the
user by somehow incoporating the parsed objects into a type of index.
I'd like also to use additional columns (like molecular formula) in
the match function to "triage" the table to quickly rule out impossible
matches, before doing a full search/match.
Am I way off the track here? Is it a bad idea to clutter the index
with things like this? Is it possible? Is there another
approach that could hide some of these details from the user - meaning
they would not have to create/update these additional columns?
Thanks,
TJ O'Donnell