Re: R-tree and start/end queries - Mailing list pgsql-sql
From | Chris Mungall |
---|---|
Subject | Re: R-tree and start/end queries |
Date | |
Msg-id | Pine.OSX.4.58.0509211204420.24815@skerryvore.dhcp.lbl.gov Whole thread Raw |
In response to | R-tree and start/end queries (Sean Davis <sdavis2@mail.nih.gov>) |
List | pgsql-sql |
On Wed, 21 Sep 2005, Sean Davis wrote: > I have a table like: > > Create table gf ( > pk serial, > start int, > end int, > gf varchar > ); > > I want to do queries along the lines of: > > "find all gf that overlap with (10000,20000)" or > "find all gf that overlap with each other" > > And others. I have read over the documentation, but I still remain unclear > about how to implement R-tree indexing in this situation. Any suggestions? Hi Sean I'm guessing that this is for some kind of genome database, yep? You may want to look at the chado database which has a growing library of functions for this sort of thing; www.gmod.org/schema Here is the code for doing range interval functions; our featureloc is equivalent to your "gf" (though we separate the entity from the entity being located). Our fmin and fmax may be equivalent to your start and end above (unless you indicate directionality with start>end in which case the intersection functions get a bit trickier). Our feature_id is probably equivalent to your gf column. We use the builtin pg types "point" and "box" and make an RTREE index over this: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- -- functions operating on featureloc ranges -- -- create a point CREATE OR REPLACE FUNCTION create_point (int, int) RETURNS point AS'SELECT point ($1, $2)' LANGUAGE 'sql'; -- create a range box -- (make this immutable so we can index it) CREATE OR REPLACE FUNCTION boxrange (int, int) RETURNS box AS'SELECT box (create_point(0, $1), create_point($2,500000000))' LANGUAGE 'sql' IMMUTABLE; -- create a query box CREATE OR REPLACE FUNCTION boxquery (int, int) RETURNS box AS'SELECT box (create_point($1, $2), create_point($1, $2))' LANGUAGE 'sql' IMMUTABLE; --functional index that depends on the above functions CREATE INDEX binloc_boxrange ON featureloc USING RTREE (boxrange(fmin, fmax)); CREATE OR REPLACE FUNCTION featureloc_slice(int, int) RETURNS setof featureloc AS 'SELECT * from featureloc where boxquery($1,$2) @ boxrange(fmin,fmax)' LANGUAGE 'sql'; CREATE OR REPLACE FUNCTION featureloc_slice(varchar, int, int) RETURNS setof featureloc AS 'SELECT featureloc.* FROM featureloc INNER JOIN feature AS srcf ON (srcf.feature_id = featureloc.srcfeature_id) WHERE boxquery($2, $3) @ boxrange(fmin,fmax) AND srcf.name = $1 ' LANGUAGE 'sql'; CREATE OR REPLACE FUNCTION featureloc_slice(int, int, int) RETURNS setof featureloc AS 'SELECT * FROM featureloc WHEREboxquery($2, $3) @ boxrange(fmin,fmax) AND srcfeature_id = $1 ' LANGUAGE 'sql'; -- can we not just do these as views? CREATE OR REPLACE FUNCTION feature_overlaps(int)RETURNS setof feature AS'SELECT feature.* FROM feature INNER JOIN featurelocAS x ON (x.feature_id=feature.feature_id) INNER JOIN featureloc AS y ON (y.feature_id=$1) WHERE x.srcfeature_id= y.srcfeature_id AND ( x.fmax >= y.fmin AND x.fmin <= y.fmax ) ' LANGUAGE 'sql'; CREATE OR REPLACE FUNCTION feature_disjoint_from(int)RETURNS setof feature AS'SELECT feature.* FROM feature INNER JOIN featurelocAS x ON (x.feature_id=feature.feature_id) INNER JOIN featureloc AS y ON (y.feature_id=$1) WHERE x.srcfeature_id= y.srcfeature_id AND ( x.fmax < y.fmin OR x.fmin > y.fmax ) ' LANGUAGE 'sql'; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cheers Chris > Thanks, > Sean > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >