Re: find close (duplicate) points + create index - Mailing list pgsql-novice

From Elinor Medezinski
Subject Re: find close (duplicate) points + create index
Date
Msg-id 200403101122.47748.elinor@bellatrix.tau.ac.il
Whole thread Raw
In response to Re: find close (duplicate) points + create index  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: find close (duplicate) points + create index
Re: find close (duplicate) points + create index
List pgsql-novice
You suggested:

>     select * from pointtable a, pointtable b
>     where (a.point <-> b.point) <= 1;

Thanks, Thats what I'll do.

> Making it fast is a more difficult problem :-( ... if you write the
> above query as-is then the system will sit there and compare each row of
> pointtable to each other row, looking for pairs of rows that match the
> where-clause.  Okay if you just have some thousands of rows, but on a
> big table this will take longer than you want to wait.

This query will only work on a few thousand lines, but I will have other
queries on type point that will require comparing tables with millions of
rows. Therefore I must learn how to build indexes on points.

> A btree index on a point column would be quite useless, since btree
> understands only a one-dimensional continuum with less-than, equal,
> greater-than relationships.  But I think you might be able to do
> something with an rtree index.

That much I know. I didn't find how I can use rtree to build an index on
points, seeing how when I tried the following:
"create INDEX Phot_point_a ON Phot USING RTREE (point_a);"
I got this error:
"ERROR:  data type point has no default operator class for access method
"rtree"
HINT:  You must specify an operator class for the index or define a default
operator class for the data type."

And then I found out that in postgres the only operator classes defined for
rtree indexes are: bigbox_ops, box_ops and poly_ops. Neither of which works
with points, only with type box and polygon. Therefore I also have to create
an operator class. I didn't understand how to do that. Do you know how?


> I'd look at making an rtree index on
> the unit box around each point, and then using an "overlaps" test as
> an indexable coarse filter before the exact distance check.

I didn't understand this either.

Thanks,
Elinor

pgsql-novice by date:

Previous
From: Ron St-Pierre
Date:
Subject: Re: Finding slow queries?
Next
From: "Ryan Vaughan"
Date:
Subject: composite primary key to foreign key