Home > mailing lists

Re: Identifying which column matches a full text search - Mailing list pgsql-sql

From	Richard Huxton
Subject	Re: Identifying which column matches a full text search
Date	July 30, 2008 15:33:54
Msg-id	4890B406.8080307@archonet.com Whole thread Raw
In response to	Re: Identifying which column matches a full text search (Ryan Wallace <rywall@interchange.ubc.ca>)
List	pgsql-sql

Tree view

Ryan Wallace wrote:
> Richard Huxton wrote:
>> Failing that, where I've had many (a dozen) different sources but want 
>> to search them all I've built a textsearch_blocks table with columns to 
>> identify the source and have triggers that keep it up to date.
> 
> Once you've built the text search blocks table, how do you search it? Do you
> perform
> twelve separate queries or can you just do one?

OK, you have a table something like:

fulltext_blocks (  section  varchar(32),  itemid   int4,  words    tsvector,them  PRIMARY KEY (section, itemid)
)

Now assume two of the things I search are "news" and "faqs". I'm 
assuming they've both got a simple serial pkey - if not, "itemid" above 
needs to be text and you'll have to cast.

For each target table (news, faqs) add a trigger that updates 
fulltext_blocks appropriately. This can include weighting title and body 
of a news article.

Then, search the fulltext_blocks table, optionally filtering by section. 
If you're going to have lots of results put the ids into a (perhapd 
temporary) results-table. Then join your results back to the original 
tables with the appropriate UNION (if you need to - it might be you 
fetch results one at a time elsewhere in your app).

SELECT n.id, n.title, n.body
FROM news n JOIN results r ON n.id=r.id
WHERE r.section='news'
UNION ALL
SELECT f.id, f,question, f.answer
FROM faqs f JOIN results r ON f.id=r.id
WHERE r.section='faqs'
;

You'll probably want to set ownership/permissions on the triggers / 
fulltext_blocks table so you can't accidentally update it directly.

In mine I even had a "documents" section which relied on an external 
cron-driven script to strip the first 32k of text out of uploaded 
documents (pdf,word) in addition to user-supplied metadata (title, summary).

Note - this is basically simulating what we could do if you could index 
a view. The fulltext_blocks table is nothing more than a materialised view.

HTH

--   Richard Huxton  Archonet Ltd

pgsql-sql by date:

From: Ryan Wallace
Date: 30 July 2008, 14:53:22
Subject: Re: Identifying which column matches a full text search

From: "Marcin Stępnicki"
Date: 31 July 2008, 10:03:35
Subject: Function returning setof taking parameters from another table

Re: Identifying which column matches a full text search - Mailing list pgsql-sql

Previous

Next