Re: Speaking of Indexing... (Text indexing) - Mailing list pgsql-hackers

From Andrew McMillan
Subject Re: Speaking of Indexing... (Text indexing)
Date
Msg-id 3AD3ED25.6EAAE750@catalyst.net.nz
Whole thread Raw
In response to Speaking of Indexing... (Text indexing)  (Poet/Joshua Drake <poet@linuxports.com>)
List pgsql-hackers
Poet/Joshua Drake wrote:
> 
> Good day,
> 
> I've been experimenting a bit with Full Text Indexing in PostgreSQL. I
> have found several conflicting sites various places on the net pertaining
> to whether or not PostgreSQL supports FTI, and I was hoping I could find
> an authoritative answer here - I tried searching the website's archives,
> but the search seems to be having some problems.
> 
> At any rate, I am running a CVS snapshot of 7.1, and I have been trying to
> create a full text index on a series of resumes. Some of these exceed 8k
> in size, which is no longer a storage problem of course with 7.1, but I
> seem to have run into the wicked 8k once again. Specifically:
> 
> ERROR:  index_formtuple: data takes 9344 bytes, max is 8191
> 
> Furthermore, after trying to just index on a 8191-character long substring
> of the resume, I run into the following:
> 
> ERROR:  btree: index item size 3948 exceeds maximum 2713
> 
> The only way I could actually get the index created was to substring the
> body of the resumes down to 2k. I also later tried using HASH rather than
> BTREE, which worked, but none of these solutions really appreciably
> increased performance in the way we were hoping.
> 
> Are these known and accepted limitations of the current 7.1
> implementation, or am I doing something terribly wrong? ;)
> On Tue, 10 Apr 2001, Thomas Lockhart wrote:

You need to use the 'contrib' code for full-text indexing.  The indexing you are
trying to do with that is just using the whole content of the string as the index
value.  Close to useless.

The contrib code is in contrib/fulltextindex.

I have a hacked version of that which changes it to keyword indexing, if you're
interested.

Regards,                Andrew.
-- 
_____________________________________________________________________          Andrew McMillan, e-mail:
Andrew@catalyst.net.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267


pgsql-hackers by date:

Previous
From: Andrew McMillan
Date:
Subject: Re: "--tuning" compile and runtime option (?)
Next
From: Zeugswetter Andreas SB
Date:
Subject: AW: AW: Truncation of char, varchar types