Thread: RE: [HACKERS] Indexing for geographic objects?

RE: [HACKERS] Indexing for geographic objects?

From
Michael Ansley
Date:

Hi,

Remember also that the GiST library has been integrated into PG, (my brother is doing some thesis workon that at the moment), and you can create new index types relatively quickly (assuming that you understand the indexing theory ;-) using this mechanism.  Run a web search on GiST for more info.

Currently GiST has support for btree and rtree indexes, and possibly r+ or * trees, I can't remember which, if any, and IIRC, at least a couple more.  However, if you have a requirement or 3d indexing, and you have the knowledge available, you should be able to hack a few 3d indexes together quite quickly.

Cheers...
 

-----Original Message-----
From: Tom Lane
To: Franck Martin
Cc: pgsql-general; pgsql-hackers
Sent: 11-26-00 4:35 AM
Subject: Re: [HACKERS] Indexing for geographic objects?

Franck Martin <franck@sopac.org> writes:
> I would greatly appreciate if someone could guide me through the
> methodology to build an index for a custom type or point me to some
> readings where the algorithm is explained (web, book, etc...).

The Programmer's Guide chapter "Interfacing Extensions To Indices"
outlines the procedure for making a new datatype indexable.  It
only discusses the case of adding btree support for a new type,
though.  For other index classes such as R-tree there are different
sets of required operators, which are not as well documented but
you can find out by looking at code for the already-supported
datatypes.

> I plan to use 3D geographical objects...

That's a bit hard since we don't have any indexes suitable for 3-D
coordinates --- the existing R-tree type is for 2-D objects.  What's
more it assumes that coordinates are Euclidean, which is probably
not the model you want for geographical coordinates.

In theory you could build a new index type suitable for indexing
3-D points, using the R-tree code as a starting point.  I wouldn't
class it as a project suitable for a newbie however :-(.

Depending on what your needs are, you might be able to get by with
projecting your objects into a flat 2-D coordinate system and using
an R-tree index in that space.  It'd just be approximate but that
might be close enough for index purposes.

                        regards, tom lane

**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

www.mimesweeper.com
**********************************************************************

RE: [HACKERS] Indexing for geographic objects?

From
Oleg Bartunov
Date:
I'm also interested in GiST and would be happy if somebody could provide
workable example. I have an idea to use GiST indices for our fulltextsearch
system.

    Regards,

            Oleg
On Sun, 26 Nov 2000, Michael Ansley wrote:

> Date: Sun, 26 Nov 2000 11:34:16 -0000
> From: Michael Ansley <Michael.Ansley@intec-telecom-systems.com>
> To: 'Tom Lane ' <tgl@sss.pgh.pa.us>, 'Franck Martin ' <franck@sopac.org>
> Cc: 'pgsql-general ' <pgsql-general@postgresql.org>,
>     'pgsql-hackers ' <pgsql-hackers@postgresql.org>,
>     "'t.h.p.ansley@durham.co.uk'" <t.h.p.ansley@durham.co.uk>
> Subject: RE: [HACKERS] Indexing for geographic objects?
>
> Hi,
>
> Remember also that the GiST library has been integrated into PG, (my brother
> is doing some thesis workon that at the moment), and you can create new
> index types relatively quickly (assuming that you understand the indexing
> theory ;-) using this mechanism.  Run a web search on GiST for more info.
>
> Currently GiST has support for btree and rtree indexes, and possibly r+ or *
> trees, I can't remember which, if any, and IIRC, at least a couple more.
> However, if you have a requirement or 3d indexing, and you have the
> knowledge available, you should be able to hack a few 3d indexes together
> quite quickly.
>
>
> Cheers...
>
>
> -----Original Message-----
> From: Tom Lane
> To: Franck Martin
> Cc: pgsql-general; pgsql-hackers
> Sent: 11-26-00 4:35 AM
> Subject: Re: [HACKERS] Indexing for geographic objects?
>
> Franck Martin <franck@sopac.org> writes:
> > I would greatly appreciate if someone could guide me through the
> > methodology to build an index for a custom type or point me to some
> > readings where the algorithm is explained (web, book, etc...).
>
> The Programmer's Guide chapter "Interfacing Extensions To Indices"
> outlines the procedure for making a new datatype indexable.  It
> only discusses the case of adding btree support for a new type,
> though.  For other index classes such as R-tree there are different
> sets of required operators, which are not as well documented but
> you can find out by looking at code for the already-supported
> datatypes.
>
> > I plan to use 3D geographical objects...
>
> That's a bit hard since we don't have any indexes suitable for 3-D
> coordinates --- the existing R-tree type is for 2-D objects.  What's
> more it assumes that coordinates are Euclidean, which is probably
> not the model you want for geographical coordinates.
>
> In theory you could build a new index type suitable for indexing
> 3-D points, using the R-tree code as a starting point.  I wouldn't
> class it as a project suitable for a newbie however :-(.
>
> Depending on what your needs are, you might be able to get by with
> projecting your objects into a flat 2-D coordinate system and using
> an R-tree index in that space.  It'd just be approximate but that
> might be close enough for index purposes.
>
>             regards, tom lane
>
>
> **********************************************************************
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you have received this email in error please notify
> the system manager.
>
> This footnote also confirms that this email message has been swept by
> MIMEsweeper for the presence of computer viruses.
>
> www.mimesweeper.com
> **********************************************************************
>

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [HACKERS] Indexing for geographic objects?

From
selkovjr@mcs.anl.gov
Date:
Oleg Bartunov <oleg@sai.msu.su> wrote:
>
> I'm also interested in GiST and would be happy if somebody could provide
> workable example. I have an idea to use GiST indices for our fulltextsearch
> system.
>

I have recently replied to Franck Martin in regards to this indexing
question, but I didn't think the subject was popular enough for me to
contaminate the list(s). You prove me wrong. Here goes:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To: Franck Martin <franck@sopac.org>
From: selkovjr@mcs.anl.gov
Reply-to: selkovjr@mcs.anl.gov
Subject: Re: [HACKERS] Indexing for geographic objects?
In-reply-to: <3A1EE0F4.3DC4161B@sopac.org>
Comments: In-reply-to Franck Martin <franck@sopac.org>
   message dated "Sat, 25 Nov 2000 10:43:16 +1300."
Mime-Version: 1.0 (generated by tm-edit 7.108)
Date: Sat, 25 Nov 2000 02:56:03 -0600

It is probably possible to hook up an extension directly with the
R-tree methods available in postgres -- if you stare at the code long
enough and figure how to use the correct strategies. I chose an easier
path years ago and I am still satisfied with the results. Check out
the GiST -- a general access method built on top of R-tree to provide
a user-friendly interface to it and to allow indexing of more abstract
types, for which straight R-tree is not directly applicable.

I have a small set of complete data types, of which a couple
illustrate the use of GiST indexing with the geometrical objects, in:

http://wit.mcs.anl.gov/~selkovjr/pg_extensions/

If you are using a pre-7.0 postrgres, grab the file contrib.tgz,
otherwise take contrib-7.0.tgz. The difference is insignificant, but
the pre-7.0 version will not fit the current schema. Unpack the source
into postgresql-*/contrib and follow instructions in the README
files. The types of interest for you will be seg and cube. You will
find pointers to the original sources and docs in the CREDITS section
of the README file. I also have a version of the original example code
in pggist-patched.tgz, but I did not check if it works with current
postgres. It should not be difficult to fix it if it doesn't -- the
recent development in the optimizer area made certain things
unnecessary.

You might want to check out a working example of the segment data type at:

http://wit.mcs.anl.gov/EMP/indexing.html

(search the page for 'KM')

I will be glad to help, but I would also recommend to send more
sophisticated questions to Joe Hellerstein, the leader of the original
postgres team that developed GiST. He was very helpful whenever I
turned to him during the early stages of my data type project.

--Gene


Re: [HACKERS] Indexing for geographic objects?

From
Tom Lane
Date:
Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
> Remember also that the GiST library has been integrated into PG, (my brother
> is doing some thesis workon that at the moment),

Yeah?  Does it still work?

Since the GIST code is not tested by any standard regress test, and is
so poorly documented that hardly anyone can be using it, I've always
assumed that it is probably suffering from a severe case of bit-rot.

I'd love to see someone contribute documentation and regression test
cases for it --- it's a great feature, if it works.

            regards, tom lane

Re: [HACKERS] Indexing for geographic objects?

From
Tom Lane
Date:
selkovjr@mcs.anl.gov writes:
> Tom Lane wrote:
>> Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
>>>> Remember also that the GiST library has been integrated into PG, (my brother
>>>> is doing some thesis workon that at the moment),
>>
>> Yeah?  Does it still work?

> You bet. One would otherwise be hearing from me. I depend on it quite
> heavily and I am checking with almost every release.

That's very good to hear!  I was not aware that anyone was banging on it.

It seems like it would be a fine idea to adopt your stuff at least into
the contrib part of the distribution, maybe even (or eventually) into
the main release.  I think we could probably make it part of the regress
tests even if it's contrib --- there's precedent, as regress already
uses some contrib stuff.

Do you have any problem with releasing your stuff under the Postgres
distribution terms (BSD license)?

            regards, tom lane

Re: [HACKERS] Indexing for geographic objects?

From
selkovjr@mcs.anl.gov
Date:
Tom Lane wrote:

> Do you have any problem with releasing your stuff under the Postgres
> distribution terms (BSD license)?

No, I don't see any problem with the BSD license, or any other
license, for that matter. I just had some reservations about releasing
stuff that was far from perfect, and it took me a while to realize
it could be useful as it is for some, and serve as a good starting
point for others. Now I wonder, what does it take to be in contrib?

> there's precedent, as regress already
> uses some contrib stuff.

I'd be curious to find out what that stuff is and how it's done.

--Gene

Re: [HACKERS] Indexing for geographic objects?

From
Tom Lane
Date:
selkovjr@mcs.anl.gov writes:
> Tom Lane wrote:
>> Do you have any problem with releasing your stuff under the Postgres
>> distribution terms (BSD license)?

> No, I don't see any problem with the BSD license, or any other
> license, for that matter. I just had some reservations about releasing
> stuff that was far from perfect, and it took me a while to realize
> it could be useful as it is for some, and serve as a good starting
> point for others. Now I wonder, what does it take to be in contrib?

Just contributing it ;-), which I take the above as permission to do.
When I come up for air from the IPC-hacking I'm doing, I'll grab your
tarball and see about adding it as a contrib module.

Many thanks!

            regards, tom lane

Re: [HACKERS] Indexing for geographic objects?

From
selkovjr@mcs.anl.gov
Date:
Tom Lane wrote:
> Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
> > Remember also that the GiST library has been integrated into PG, (my brother
> > is doing some thesis workon that at the moment),
>
> Yeah?  Does it still work?

You bet. One would otherwise be hearing from me. I depend on it quite
heavily and I am checking with almost every release. I am now current
with 7.0.2 -- this time it required some change, although not in the c
code. And that's pretty amazing: I was only screwed once since
postgres95 -- by a beta version I don't remember now; then I
complained and the problem was fixed. I don't even know whom I owe
thanks for that.

> Since the GIST code is not tested by any standard regress test, and is
> so poorly documented that hardly anyone can be using it,
I've always
> assumed that it is probably suffering from a severe case of bit-rot.
>
> I'd love to see someone contribute documentation and regression test
> cases for it --- it's a great feature, if it works.

The bit rot fortunately did not happen, but the documentation I
promised Bruce many months ago is still "in the works" -- meaning,
something interfered and I haven't had a chance to start. Like a
friend of mine muses all the time, "Promise doesn't mean
marriage". Boy, do I feel guilty.

It's a bit better with the testing. I am not sure how to test the
GiST directly, but I have adapted the current version of regression
tests for the data types that depend on it. One can find them in my
contrib directory, under test/ (again, it's
http://wit.mcs.anl.gov/~selkovjr/pg_extensions/contrib.tgz)

It would be nice if at least one of the GiST types became a built-in
(that would provide for a more intensive testing), but I can also
think of the contrib code being (optionally) included into the main
build and regression test trees. The top-level makefile can have a
couple of special targets to build and test the contribs. I believe my
version of the tests can be a useful example to other contributors
whose code is already in the source tree.

--Gene