Re: OK, does anyone have any better ideas? - Mailing list pgsql-hackers

From mlw
Subject Re: OK, does anyone have any better ideas?
Msg-id 3A324195.1278B162@mohawksoft.com
In response to Re: OK, does anyone have any better ideas?  (Oleg Bartunov <oleg@sai.msu.su>)
Responses RE: OK, does anyone have any better ideas?  ("Edmar Wiggers" <edmar@brasmap.com>)
List pgsql-hackers
Oleg Bartunov wrote:
> postgres....... It would be great.
> 
> Gotcha. It's impossible to return a set from a function, so the only
> way is to use perl to parse your bitmap. We did (in one project) external
> search using suffix arrays, which are incredibly fast, and used postgres to
> return results to perl for processing.

Here's a question; I simply do not know enough about the internals
of postgres to answer it myself. I had a brainstorm last night and
thought of a method.

Is it possible to call "SPI_exec" in a C function to create a table,
like this:

"create temp table fubar as select ts_key(10) as 'key', ts_rank(10) as
'rank' from textsearch_template where ts_placeholder(10) limit
ts_count(10)"

In the above example, which function would be called first? I assume the
count would be called first, but I'm probably wrong. Whichever function
is called first would execute the query. textsearch_template would
be a bogus table with 1000 or so zeros.

So, in a query one does this:

select ts_search('fubar', 'bla bla');

select * from table, fubar where table.field_key = fubar.key;
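
To make the pattern above concrete, here is a minimal sketch of "materialize the search hits into a temp table, then join". It rests on assumptions: it uses Python's sqlite3 as a stand-in for postgres, the table name "songs" and its rows are made up, and run_text_search() is a hypothetical stub for the external text search engine that returns fixed (key, rank) pairs. It says nothing about whether SPI_exec can do the same from inside a C function.

```python
import sqlite3


def run_text_search(query):
    """Hypothetical stub for the external search engine: fixed (key, rank) hits."""
    return [(1, 90), (3, 40)]


def ts_search(conn, result_table, query):
    """Store the hits for `query` in a temp table named `result_table`."""
    hits = run_text_search(query)
    conn.execute(f'DROP TABLE IF EXISTS "{result_table}"')
    conn.execute(
        f'CREATE TEMP TABLE "{result_table}" ("key" INTEGER, "rank" INTEGER)'
    )
    conn.executemany(f'INSERT INTO "{result_table}" VALUES (?, ?)', hits)
    return len(hits)


conn = sqlite3.connect(":memory:")
# "songs" plays the role of "table" in the query above (assumed name and data).
conn.execute("CREATE TABLE songs (field_key INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?)",
    [(1, "Pigs"), (2, "Dogs"), (3, "Sheep")],
)

hit_count = ts_search(conn, "fubar", "bla bla")

# Join the real table against the temp table of hits, best rank first.
rows = conn.execute(
    'SELECT songs.title, fubar."rank" FROM songs, fubar '
    'WHERE songs.field_key = fubar."key" ORDER BY fubar."rank" DESC'
).fetchall()
print(rows)
```

The point of the temp table is that the search runs once and the join sees an ordinary relation, so nothing has to return a set from a function.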

How about this: Is there a construct in Postgres that represents a row
ID, so a row can be found quickly without using an index? I tried oid
but that didn't seem fast at all.

P.S. If you want to see the system working, I have a test fixture
running on "http://gateway.mohawksoft.com/music.php3". It calls the text
search daemon from PHP, and the text search daemon executes one SQL query
per result (PQExec). Look for a popular song and press "search."

A good example is to look for "pink floyd pigs," then try "pink floyd pigs
-box." (It is running slowly because it has debugging code, but it is
still pretty fast.) This index has been metaphoned, so something like
"penk floid" will work too.

The "+" operator is "requires"; this is the default. The "-" operator is
"must not have" and the "?" operator is "may have" (the "?" operator is
a big performance hit because it increases the selection size.)
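
As a rough illustration of the three operators, here is a small Python sketch. It is not the text search daemon's code, and the matching rule is an assumption: a title is rejected if it contains any "-" term, and otherwise accepted if it contains every "+" term or at least one "?" term. That reading is one way "?" could increase the selection size. The function names parse_query and matches are made up, and matching is on whole lowercase words with no metaphone step.

```python
def parse_query(query):
    """Split a query into (required, excluded, optional) sets of terms."""
    required, excluded, optional = set(), set(), set()
    buckets = {"+": required, "-": excluded, "?": optional}
    for token in query.lower().split():
        if token[0] in buckets:
            bucket, term = buckets[token[0]], token[1:]
        else:
            bucket, term = required, token  # "+" is the default
        if term:  # ignore a bare operator with no term attached
            bucket.add(term)
    return required, excluded, optional


def matches(title, query):
    """Apply the assumed +/-/? rule to one title."""
    words = set(title.lower().split())
    required, excluded, optional = parse_query(query)
    if words & excluded:
        return False
    has_required = bool(required) and required <= words
    has_optional = bool(words & optional)
    if required or optional:
        return has_required or has_optional
    return True  # the query contained only exclusions


titles = ["Pink Floyd Pigs", "Pink Floyd Pigs Box Set", "Pink Floyd Dogs"]
plain = [t for t in titles if matches(t, "pink floyd pigs")]
no_box = [t for t in titles if matches(t, "pink floyd pigs -box")]
wider = [t for t in titles if matches(t, "pink floyd pigs ?dogs")]
print(plain, no_box, wider)
```

Under this rule each "?" term can only add titles to the result, which is why it would cost more than "+" or "-".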

I think if you try it, you'll see why I want to be able to get it deep
into postgres, and what the possibilities are.

-- 
http://www.mohawksoft.com

