Re: Designing an extension for feature-space similarity search - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Designing an extension for feature-space similarity search
Date
Msg-id 23545.1329344981@sss.pgh.pa.us
In response to Designing an extension for feature-space similarity search  (Jay Levitt <jay.levitt@gmail.com>)
Responses Re: Designing an extension for feature-space similarity search  (Jay Levitt <jay.levitt@gmail.com>)
List pgsql-hackers
Jay Levitt <jay.levitt@gmail.com> writes:
> - I'm not sure how to represent arbitrary column-like features without 
> reinventing the wheel and putting a database in the database.

ISTM you could define a composite type and then create operators and an
operator class over that type.  If you were trying to make a btree
opclass there might be a conflict with the built-in record_ops opclass,
but since you're only interested in GiST, I don't see any real
roadblocks.  The main potential disadvantage of this is that you'd have
the standard tuple header as overhead in index entries --- but maybe the
entries are large enough that that doesn't matter, and in any case you
could probably make use of the GiST "compress" method to get rid of most
of the header.  Maybe convert to MinimalTuple, for instance, if you want
to still be able to leverage existing support code for field extraction.

> - Can domains have operators, or are operators defined on types?

I think the current state of play is that you can have such things but
the system will only consider them for exact type matches, so you might
need more explicit casts than you ordinarily would.  However, we only
support domains over base types, not composites, so this isn't really
going to be a profitable direction for you anyway.

> - Does KNN-GiST run into problems when <-> returns values that don't "make 
> sense" in the physical world?

Wouldn't surprise me.  In general, non-strict index operators are a bad
idea.  However, if the indexed entities are records, it would be
entirely your own business how you handled individual fields being NULL.
        regards, tom lane
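
As an illustration of the last point, here is a minimal sketch (in Python rather than as a real PostgreSQL operator) of a distance function over records whose individual fields may be missing. The name feature_distance, the missing_penalty parameter, and the policy for missing fields are all assumptions made for this example; nothing here comes from the thread or from PostgreSQL itself. The idea it shows is only that the function always returns a finite, non-negative number for two non-NULL records, so NULL handling stays inside the function instead of leaking out as a NULL distance.

```python
import math
from typing import Optional, Sequence


def feature_distance(
    a: Sequence[Optional[float]],
    b: Sequence[Optional[float]],
    missing_penalty: float = 1.0,
) -> float:
    """Euclidean-style distance between two feature records.

    None stands in for a NULL field. Hypothetical policy (an assumption
    for this sketch):
      - both fields missing: contributes 0, so distance(x, x) == 0
      - exactly one missing: contributes missing_penalty
      - both present:        contributes the absolute difference
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of fields")

    total = 0.0
    for x, y in zip(a, b):
        if x is None and y is None:
            continue  # nothing known about either side; no contribution
        if x is None or y is None:
            total += missing_penalty ** 2
        else:
            total += (x - y) ** 2
    return math.sqrt(total)
```

With this policy the result is symmetric and zero for identical records. Whether it also behaves well enough for a KNN-GiST ordering operator (for example, consistency with the index's bounding logic) is not something this sketch establishes.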

