Re: fulltext searching via a custom index type - Mailing list pgsql-hackers

From Eric Ridge
Subject Re: fulltext searching via a custom index type
Msg-id C9103CEC-37EB-11D8-A406-000A95BB5944@tcdi.com
In response to Re: fulltext searching via a custom index type  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: fulltext searching via a custom index type  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Dec 26, 2003, at 4:04 PM, Tom Lane wrote:

> Eric Ridge <ebr@tcdi.com> writes:
>> Xapian has its own storage subsystem, and that's what I'm using to
>> store the index... not using anything internal to postgres (although
>> this could change).
>
> I would say you have absolutely zero chance of making it work that way.

Thanks for the encouragement!  :)

> You will not be able to get it to interoperate reliably with
> transactions, checkpointing, or WAL replay; to say nothing of features
> we might add in the future, such as tablespaces and point-in-time 
> recovery.
> You need to migrate all the data into the Postgres storage mechanism.

And these are the things I'm struggling with now.  The basic indexing 
and searching currently works flawlessly, but the moment another user 
connects up, everything goes to hell.

> It might be worth pointing out here that an index AM is not bound to use
> exactly the typical Postgres page layout.  I think you probably do have
> to use the standard page header, but the page contents don't have to
> look like tuples if you don't want 'em to.  For precedent see the hash
> index AM, which stores ordinary index tuples on some index pages but
> uses other pages for its own purposes.
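
To make the layout described above concrete, here is a minimal standalone
sketch in C.  It assumes a fixed 8 KB page that begins with a small header
and ends with an AM-private area, with the space in between holding a plain
array of document IDs rather than index tuples.  Every name in it
(SketchPage, SketchPageHeader, SketchPostingMeta, the sketch_* functions) is
hypothetical and invented for illustration; this is not PostgreSQL's actual
page header or page API, and it says nothing about transactions, WAL, or
checkpointing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192u  /* assumed page size for this sketch */

/* Hypothetical stand-in for a standard page header (NOT PageHeaderData). */
typedef struct SketchPageHeader {
    uint16_t lower;    /* offset of the first free byte */
    uint16_t upper;    /* offset where free space ends */
    uint16_t special;  /* offset of the AM-private area at the page end */
} SketchPageHeader;

/* Hypothetical AM-private area: describes the posting list on this page. */
typedef struct SketchPostingMeta {
    uint32_t term_id;  /* which term this page's doc IDs belong to */
    uint32_t ndocs;    /* number of doc IDs stored so far */
} SketchPostingMeta;

typedef struct SketchPage {
    unsigned char bytes[SKETCH_PAGE_SIZE];
} SketchPage;

/* All access goes through memcpy to avoid alignment and aliasing issues. */
static SketchPageHeader sketch_hdr_get(const SketchPage *p) {
    SketchPageHeader h;
    memcpy(&h, p->bytes, sizeof h);
    return h;
}

static void sketch_hdr_set(SketchPage *p, SketchPageHeader h) {
    memcpy(p->bytes, &h, sizeof h);
}

static SketchPostingMeta sketch_meta_get(const SketchPage *p) {
    SketchPostingMeta m;
    memcpy(&m, p->bytes + sketch_hdr_get(p).special, sizeof m);
    return m;
}

static void sketch_meta_set(SketchPage *p, SketchPostingMeta m) {
    memcpy(p->bytes + sketch_hdr_get(p).special, &m, sizeof m);
}

/* Lay out an empty page: header first, private area last, free space between. */
void sketch_page_init(SketchPage *p, uint32_t term_id) {
    SketchPageHeader h;
    SketchPostingMeta m = { term_id, 0 };

    memset(p->bytes, 0, sizeof p->bytes);
    h.lower = (uint16_t) sizeof(SketchPageHeader);
    h.special = (uint16_t) (SKETCH_PAGE_SIZE - sizeof(SketchPostingMeta));
    h.upper = h.special;
    sketch_hdr_set(p, h);
    sketch_meta_set(p, m);
}

/* Append one doc ID to the page body; returns 1 on success, 0 if full. */
int sketch_page_add_doc(SketchPage *p, uint32_t doc_id) {
    SketchPageHeader h = sketch_hdr_get(p);
    SketchPostingMeta m = sketch_meta_get(p);

    if ((size_t) h.lower + sizeof doc_id > h.upper)
        return 0;
    memcpy(p->bytes + h.lower, &doc_id, sizeof doc_id);
    h.lower = (uint16_t) (h.lower + sizeof doc_id);
    m.ndocs++;
    sketch_hdr_set(p, h);
    sketch_meta_set(p, m);
    return 1;
}

uint32_t sketch_page_ndocs(const SketchPage *p) {
    return sketch_meta_get(p).ndocs;
}

/* Read back the i-th doc ID stored in the page body. */
uint32_t sketch_page_get_doc(const SketchPage *p, uint32_t i) {
    uint32_t doc_id;

    assert(i < sketch_page_ndocs(p));
    memcpy(&doc_id,
           p->bytes + sizeof(SketchPageHeader) + (size_t) i * sizeof doc_id,
           sizeof doc_id);
    return doc_id;
}
```

A caller would run sketch_page_init() once per page, then call
sketch_page_add_doc() until it returns 0, at which point a new page is
needed.  The only point of the sketch is that the bytes between the header
and the private area can be laid out however the AM likes.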

That's useful information.  Thanks.  I've been using the hash AM as my 
guide because it's not as complex as the other index types (at least on 
the public interface side).  Obviously, I'm trying to save the time and 
energy of re-inventing the wheel when it comes to full-text indexing and 
searching.  Xapian is an awesome standalone engine (and it's amazingly 
fast too!), so it seemed like a good place to start.  Its backend 
storage subsystem is pluggable, and after our little exchange here 
today, I'm now considering writing a Postgres backend for Xapian.

I assume the doc chapter on Page Files and the various storage-related 
README files are good places for more information.  Any other tips or 
pointers?

eric


