Hannu Krosing <hannu@trust.ee> wrote:
> "Gene Selkov, Jr." <selkovjr@mcs.anl.gov> wrote:
> >
> > I think I have another problem similar to what I just reported. This time it is in GiST. It does not
> > complain when it builds the index but any attempt to use the table with exesisting GiST indices causes
> > this error:
>
> You seem to be the only one who actually uses GiST <grin>
>
> During last 2-3 years I have posted the question about the usability
> of GiST indexes to various Postgres lists about 3 times with absolutely
> no reaction, so assumed that they didn't work at all ;(
Maybe I came to know postgres too late, but I got the same feeling shortly after I subscribed to the lists. I had a few
peopleask about it but I doubt they made it to the point where it becomes useful. Maybe because it is not exactly a
plugand play thing.
> Could you point me to any information (FAQs, TFMs, ...) about their
> usage ?
If only you and me ask about it, there would hardly be any FAQ. However, please refer to my earlier postings on the
subject.I did make some changes to the code but I believe the description of my experience with GiST is up to date.
Notsure how to quote the exact hyperlink (because of frames), but you can search the old pgsql-questions list for
'selkovgist'. The first two messages that come up are the most relevant (those dated Thu, 19 Feb 1998 13:40:18 and Wed,
08Apr 1998 10:25:11)
> I have been under an impression that the easiest way of adding new
> indexing strategies (I personally need full-text) to postgres would
> be thru GiST, but as I have had no luck in getting them to work as
> they were, I assumed that they were in fact unsupported remnants of
> a long-forgotten project.
This is an abandoned project, but I would be happy to have it preserved in at least the state it was in 6.3.2 and
before.It appears screwed up in 6.4.x
To put the long story short, GiST uses the strategies of R-tree, and it is, in fact, a version of R-tree. Its current
implementationdoes not actually allow you to add new strategies, but it helps you reuse those defined for R-tree with
variousdata types, unlike the postgres R-tree itself, which can be only be used used with built-ins, such as 2D geo
types(boxes, polygons, etc.). There is an example of a GiST over text in Joe Hellerstein's source,
http://selkov-7.mcs.anl.gov/pggist-patched.tgz,it might be close to what you need.
I asked Joe about further development and he told me part of it moved to the project referred to as PREDATOR,
http://simon.cs.cornell.edu/Info/Projects/PREDATOR/predator.html
I did not yet look into it, but I was told that PREDATOR is a practical test bed for the most advanced indexing
technologies.It is also an open source ORDBMS software. I have no idea what it's worth as a database server, but I was
advisedthat it can be used as a development platform for new types and indexing algorithms (even those developed for
othersystems, such as postgres).
As to the GiST in postgres, we're on our own here. It is possible to get help from the original developers (in the form
ofquestions and answers), but they are unlikely to do work on it actively.
> I feel happy and revitalised to know the contrary.
>
> --------------
> Hannu Krosing
Likewise, I am pleased to know someone else is thinking about it. Although I am amazed at the rate of progress
postgreSQLis making, I wish it remained as science-oriented as it originally was. I believe the extensibility continues
tobe its major virtue. I witnessed numerous infertile attempts to use commercial business-oriented software for
scientificdatabasing. There is very little you can do with money, int, float, date and text. It is the extensibility of
typesand access methods that makes any real-world database a gold mine for a researcher. See, for example, this site
(http://wit.mcs.anl.gov/EMP/),where I am trying to put together a retrieval interface to the enzymology database, EMP.
Inparticular, this example illustrates the use of extensions indexed with GiST:
http://wit.mcs.anl.gov/EMP/select_emp_advanced.cgi?E1.ec_code=ec&E1.ec_code.op=%7E%09is+in+range&E1.ec_code.patt=2.1&ec_code.count=1&T1.text=tax&T1.text.op=%7E*%09matches+regex.%2C+case-insensitive&T1.tex!
t.patt=mammalia%7Crodent%7Cprimat%7Caves&T2.text=phd&T2.text.op=%7E%09matches+regex.%2C+case-sensitive&T2.text.patt=v%7CVM%7CMA%7CKC&T3.text=sl&T3.text.op=%7E*%09matches+regex.%2C+case-insensitive&T3.text.patt=cytosol&text.count=1&N1.seg=pho&N1.seg.op=%7E%09contained+in&N1.seg.patt=7+..+7.5&seg.count=1&constraint=%28N1+%26%26+T2%29+%26+E1+%26+T1+%26+T3&do=Run+the+query
Although this is still a very young project (as far as databasing goes), it is considered to be a unique achievement.
Currently,my life depends on it: postgres and extensions are the only tools in their kind that allow me to accomplish
myjob before I am fired. You don't have to be familiar with enzymology to figure out that this kind of data can't be
successfullyused with Oracle or Sybase and clones.
Hope this does not scare you off...
--Gene