Re: how to optimize my c-extension functions - Mailing list pgsql-general

From TJ O'Donnell
Subject Re: how to optimize my c-extension functions
Date
Msg-id 41E2CDFD.7030900@acm.org
Whole thread Raw
In response to Re: how to optimize my c-extension functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I was not hoping that indexing, per se, would help me.
In fact, indexing smiles would be of virtually no use
to me, except for exact matches, e.g. where smiles = 'CCCOC';
I was only trying to subvert the use of indexing for
my own purposes, to store the parsed smiles somewhere
automatic for the sql user, yet transparently available to my
functions for quick searches.

I think I've thought about this enough and gotten enough advice
to realize I should do this the straightforward way.
I should store the parsed smiles in a separate column,
have a trigger to keep it up to date, and require the
user to pass me the parsed_smiles column for quick searches.
And the user could maintain the parsed_smiles in a separate
table, if he so desired, with foreign key relations.

Thanks to everyone for all your advice.  This is my first
postgresql project and I'm liking what I've seen so far.

TJ

Tom Lane wrote:
> "TJ O'Donnell" <tjo@acm.org> writes:
>
>>The only type of search will be of the type:
>
>
>>Select smiles,id from structure where  oe_matches(smiles,'c1ccccc1C(=O)N');
>
>
> You haven't really said much about how you expect an index to be able to
> help you with this, but I think if any index type can help you it will
> be GiST.  What you would do is define an operator on top of the
> oe_matches function, so that the above query is written say
>
> Select smiles,id from structure where smiles ~~ 'c1ccccc1C(=O)N';
>
> and then construct a GiST operator class that accepts ~~ as an
> indexable operator.  There's not a huge amount of
> plain-old-documentation about GiST but there are quite a few examples
> available in the contrib/ tree.
>
> I don't think you can completely hide the existence of the parsed
> version of the smiles data.  The easiest way to go at it would be to
> write the queries like
>
> Select smiles,id from structure where smiles_parsed ~~ 'c1ccccc1C(=O)N';
>
> where smiles_parsed is the extra column holding the parsed data, and
> the ~~ operator is grabbed by a GiST index over that column.
>
> Plan B would be to construct the index as a functional index and write
>
> Select smiles,id from structure where parsed(smiles) ~~ 'c1ccccc1C(=O)N';
>
> However plan B doesn't readily support applying any other operations to
> the parsed data, since it doesn't exist anywhere except inside the
> index.  Since you mentioned having other things you wanted to do with it,
> I think you'll end up wanting the separate column.
>
>             regards, tom lane

pgsql-general by date:

Previous
From: Derik Barclay
Date:
Subject: Re: SELECT from multiple tables (not join though)
Next
From: Bruno Wolff III
Date:
Subject: Re: SELECT from multiple tables (not join though)