Thread: Generalized Inverted Generalized Search Tree

Generalized Inverted Generalized Search Tree

From
Takahiro Itagaki
Date:
We can index multiple scalar values per row with GIN access method,
and also can index single vector value per row with GiST AM.

Is it worth having a new AM to index multiple vector values per row?
It will be an AM for the missing feature in below:
               | scalar | vector |
----------------+--------+--------+single per row | btree  | gist   |multi per row  | gin    | *HERE* |

We can call the new AM "gigist". Or, there might be another idea
to support expression indexes for SRF functions, like =# CREATE TABLE tbl (c circle[]); =# CREATE INDEX ON tbl USING
gist(unnest(c));
 

Comments and ideas welcome.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



Re: Generalized Inverted Generalized Search Tree

From
Robert Haas
Date:
On Fri, May 14, 2010 at 12:03 AM, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:
> We can index multiple scalar values per row with GIN access method,
> and also can index single vector value per row with GiST AM.
>
> Is it worth having a new AM to index multiple vector values per row?
> It will be an AM for the missing feature in below:
>
>                | scalar | vector |
> ----------------+--------+--------+
>  single per row | btree  | gist   |
>  multi per row  | gin    | *HERE* |
>
> We can call the new AM "gigist". Or, there might be another idea
> to support expression indexes for SRF functions, like
>  =# CREATE TABLE tbl (c circle[]);
>  =# CREATE INDEX ON tbl USING gist (unnest(c));
>
> Comments and ideas welcome.

I'm not sure I agree with your characterization of the purpose of
GIST.  I think of GIST as a mechanism to support index lookups on
types that are not totally ordered.  Still, I guess I understand the
point - I think what you are trying to do is optimize queries of the
form, e.g.:

SELECT * FROM tbl WHERE [some circle in c overlaps a given box]

I don't personally want to do that :-) but I can easily imagine that
someone else might.  As for adding a new AM, we could certainly do it
that way, but we should at least consider the possibility of trying to
modify the existing GIST code to handle this case in addition to the
things it already does.  If that's not possible, then we should think
about how to minimize code duplication.  I fear that if we're not
careful we might end up having to fix a lot of bugs in two places;
also, any code that is specific to gigist will certainly get less
real-world use than gist itself, so sharing code will help keep our
bug counts down.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company