Re: OK, does anyone have any better ideas? - Mailing list pgsql-hackers
From | Oleg Bartunov |
---|---|
Subject | Re: OK, does anyone have any better ideas? |
Date | |
Msg-id | Pine.GSO.3.96.SK.1001209114358.4174j-100000@ra Whole thread Raw |
In response to | Re: OK, does anyone have any better ideas? (mlw <markw@mohawksoft.com>) |
Responses |
...
|
List | pgsql-hackers |
We need multi-key B-tree like index for such problem. Our full text search engine is based on arrays and we need to find quickly is some number exists in array - some kind of index over int array. We're currently testing GiST approach and seems will have some conclusions soon. I think multi-key B-tree like index would be better in my opinion, but this requires to much work and knowledge of postgres's internals. Yesterday I read about UBTree, seems like it's good for index and query sets. Currently postgres has no set specific methods. Regards, Oleg On Fri, 8 Dec 2000, mlw wrote: > Date: Fri, 08 Dec 2000 20:17:34 -0500 > From: mlw <markw@mohawksoft.com> > To: Tom Lane <tgl@sss.pgh.pa.us> > Cc: Hackers List <pgsql-hackers@postgresql.org> > Subject: Re: [HACKERS] OK, does anyone have any better ideas? > > Tom Lane wrote: > > > > mlw <markw@mohawksoft.com> writes: > > > I have a working version of a text search engine. I want to make it work > > > for Postgres (I will be releasing it GPL). It can literally find the > > > occurrence of a string of words within 5 million records in a few > > > milliseconds. > > > > Where are the records coming from? Are they inside the database? > > (If not, why do you care about integrating this with Postgres?) > > > > It seems like the right way to integrate this sort of functionality > > is to turn it into a kind of index, so that you can do > > > > SELECT * FROM mytable WHERE keyfield ~~~ 'search string'; > > > > where ~~~ is the name of some operator that is associated with the > > index. The temporary-table approach you are taking seems inherently > > klugy, and would still be awkward even if we had functions returning > > recordsets... > > Oh! Another method I tried and just could not get working was returning > an array of integers. I as thinking about "select * from table where > key_field in ( textsearch('bla bla') ), but I haven't been able to get > that to work, and as per a previous post and belatedly reading a FAQ, > this would probably still force a full table scan. > > Another method I thought about was having a table with some maximum > number of zero initialized records, and trying something like: > > create table temp_table as select * from ts_template limit > textsearch('bla bla', 10); > > select filltable(temp_table, 10); > > select * from table where key_field = temp_table.key; > > As you can see, all of these ideas are heinous hacks, there has to be a > better way. Surely someone has a better idea. > > -- > http://www.mohawksoft.com > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
pgsql-hackers by date: