Re: tsearch2 in PostgreSQL 8.3? - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: tsearch2 in PostgreSQL 8.3?
Date
Msg-id 200708171005.11427.josh@agliodbs.com
In response to Re: tsearch2 in PostgreSQL 8.3?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: tsearch2 in PostgreSQL 8.3?
Re: tsearch2 in PostgreSQL 8.3?
List pgsql-hackers
Folks,

Here's something not to forget in this whole business: the present TSearch2
implementation permits you to have a different tsvector configuration for
each *row*, not just each column.  That is, applications can be built with
"per-cell" configs.

I know of at least one out there: Ubuntu's Rosetta.  I'm sure there are
others.

Therefore there are two cases we're trying to solve:

(1) The simple case: someone wants to build a database with text search
entirely in one UTF8 language.  All vectors are in that language, and so are
all queries.  The user wants the simplest syntax possible.

(2) The Rosetta case: different configs are used for each cell and all
searches have to be language-qualified.

In both cases, the databases need to back up and restore cleanly.
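
To make the two cases concrete, here is a rough sketch of what each might
look like in SQL.  It is an illustration only, not settled syntax: the table
and column names are hypothetical, and it assumes the built-in spelling that
eventually shipped (a regconfig type, a two-argument to_tsvector(config, text),
and a default setting named default_text_search_config rather than the
default_tsvector_search_config discussed in this thread).

```sql
-- Hypothetical table: every row records the configuration used for its vector.
CREATE TABLE messages (
    id        serial PRIMARY KEY,
    cfg       regconfig NOT NULL,   -- assumed per-row config, e.g. 'english'
    body      text NOT NULL,
    body_tsv  tsvector
);

INSERT INTO messages (cfg, body) VALUES
    ('english', 'The quick brown foxes jumped'),
    ('french',  'Les renards bruns sautaient');

-- Build each vector with that row's own configuration.
UPDATE messages SET body_tsv = to_tsvector(cfg, body);

-- Case (2), the Rosetta case: the search is language-qualified, so the
-- query is parsed with the same configuration as the rows it is matched to.
SELECT id
FROM messages
WHERE cfg = 'english'::regconfig
  AND body_tsv @@ to_tsquery('english', 'fox & jump');

-- Case (1), the simple case: a single default configuration, no qualifiers.
SET default_text_search_config = 'english';
SELECT id
FROM messages
WHERE to_tsvector(body) @@ to_tsquery('fox');
```

Note that nothing in body_tsv itself says which configuration produced it;
the cfg column has to carry that, which is where the dump and restore worry
comes from.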

From this, I'd first of all say that I don't see the point of a
superuser-only default_tsvector_search_config.  There are too many failure
conditions with the default once you get away from the simplest case, so I
don't see how making it superuser-only protects anything.  Might as well make
it user-settable (USERSET), and then it will be more useful.

Unfortunately, the way I see it, the only permanent solution for this is to
alter the tsvector structure to include a config OID at the beginning of it.
That doesn't sound like it's doable in time for 8.3, though; is there a way
we could work around that until 8.4?

And why does this sound exactly like the issues we've had with per-column
encodings and the currency type?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

