Re: tsearch2 in PostgreSQL 8.3? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: tsearch2 in PostgreSQL 8.3?
Date
Msg-id 200708171731.l7HHVeK24797@momjian.us
In response to Re: tsearch2 in PostgreSQL 8.3?  (Josh Berkus <josh@agliodbs.com>)
Responses Re: tsearch2 in PostgreSQL 8.3?
List pgsql-hackers
Josh Berkus wrote:
> Folks,
> 
> Here's something not to forget in this whole business: the present TSearch2 
> implementation permits you to have a different tsvector configuration for 
> each *row*, not just each column.  That is, applications can be built with 
> "per-cell" configs.
> 
> I know of at least one out there: Ubuntu's Rosetta.  I'm sure there are 
> others.
> 
> Therefore there are two cases we're trying to solve:
> 
> (1) The simple case: someone wants to build a database with text search 
> entirely in one UTF8 language.  All vectors are in that language, and so are 
> all queries.  The user wants the simplest syntax possible.
> 
> (2) The Rosetta case: different configs are used for each cell and all 
> searches have to be language-qualified.
> 
> In both cases, the databases need to backup and restore cleanly.
> 
> From this, I'd first of all say that I don't see the point of a Superuser 
> default_tsvector_search_config.  There are too many failure conditions with 
> the default once you get away from the simplest case, so I don't see how 
> setting it to Superuser-only protects anything.  Might as well make it a 
> userset and then it will be more useful.

Per my email yesterday, default_tsvector_search_config is _not_
super-user-only:
 o  default_text_search_config stays, not super-user-only, not set in
    pg_dump output
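
To make the two cases above concrete, here is a toy model in Python. It is
not PostgreSQL code: the CONFIGS table and the to_vector() and matches()
helpers are hypothetical, and a "config" is reduced to a stop-word list.
It only sketches how a session default differs from a per-row config, and
how a default can silently mismatch a vector built under another config.

```python
# Toy model only: real text search configs normalize words through
# dictionaries; here a "config" is just a set of stop words.
CONFIGS = {
    "english": {"the", "a", "of"},
    "french": {"le", "la", "de"},
}

default_config = "english"  # stands in for default_text_search_config


def to_vector(text, config=None):
    """Normalize text into a set of lexemes under the given (or default) config."""
    stop_words = CONFIGS[config or default_config]
    return {w for w in text.lower().split() if w not in stop_words}


def matches(vector, query, config=None):
    """True if every lexeme of the normalized query appears in the vector."""
    return to_vector(query, config) <= vector


# Case 1: one language everywhere, so the config is never spelled out.
doc = to_vector("The history of Paris")
simple_hit = matches(doc, "history")

# Case 2 (Rosetta-style): each row records its own config, and every
# search names the config it wants.
rows = [
    ("english", "The history of Paris"),
    ("french", "Le guide de la ville"),
]
indexed = [(cfg, to_vector(text, cfg)) for cfg, text in rows]
french_hits = [cfg for cfg, vec in indexed
               if cfg == "french" and matches(vec, "la ville", "french")]

# Failure condition: querying a French vector under the English default
# keeps "la" as a lexeme, so the match is lost.
mismatched = matches(indexed[1][1], "la ville")
```

In this sketch the qualified French search finds its row, while the same
query run under the English default does not.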

> Unfortunately, the way I see it the only permanent solution for this is to 
> alter the TSvector structure to include a config OID at the beginning of it.  
> That doesn't sound like it's doable in time for 8.3, though; is there a way 
> we could work around that until 8.4?

Oh, so you want the config inside each tsvector value.  Interesting
idea.
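
As a rough illustration of that idea, here is a toy Python model in which
every value carries its own config tag. The TaggedVector class and the
make_vector() and search() helpers are hypothetical, the config name
stands in for a config OID, and the normalization is a placeholder. This
is a sketch of the concept, not a proposal for the on-disk format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedVector:
    config: str          # stands in for a config OID stored at the front
    lexemes: frozenset   # the normalized words


def make_vector(config, text):
    # Placeholder normalization: lowercase and split on whitespace.
    return TaggedVector(config, frozenset(text.lower().split()))


def search(vector, query_text):
    # The query is normalized with the config carried by the value itself,
    # so the caller never names it and no session default is consulted.
    query = make_vector(vector.config, query_text)
    return query.lexemes <= vector.lexemes


cell = make_vector("french", "Le guide de la ville")
found = search(cell, "guide ville")
```

Because the tag travels with the value, a dump and restore of such values
would not depend on whatever default happens to be set in the session.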

> And why does this sound exactly like the issues we've had with per-column 
> encodings and the currency type?

Yes, this is a very similar issue except we are trying to allow multiple
encodings.

-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

