Re: default_text_search_config and expression indexes - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: default_text_search_config and expression indexes
Date
Msg-id 200708140230.l7E2Ucv04362@momjian.us
In response to Re: default_text_search_config and expression indexes  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
Heikki Linnakangas wrote:
> Oleg Bartunov wrote:
> > On Wed, 8 Aug 2007, Bruce Momjian wrote:
> >> Heikki Linnakangas wrote:
> >>> If I understood correctly, the basic issue is that a tsvector datum
> >>> created using configuration A is incompatible with a tsquery datum
> >>> created using configuration B, in the sense that you won't get
> >>> reasonable results if you use the tsquery to search the tsvector, or do
> >>> ranking or highlighting. If the configurations happen to be similar
> >>> enough, it can work, but not in general.
> >>
> >> Right.
> > 
> > not fair. There are many cases when one can intentionally use different
> > configurations. But I agree, this is not for beginners.
> 
> Can you give an example of that?
> 
> I certainly can see the need to use different configurations in one
> database, but what's the use case for comparing a tsvector created with
> configuration A against a tsquery created with configuration B?

I assume you could have two configurations that differ in their stop
words or synonyms, and compare a tsvector built with one against a
tsquery built with the other.
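
A minimal sketch of what mixing configurations looks like, assuming the
built-in 'english' and 'simple' configurations are installed; the sample
text is made up for illustration. With 'english' the document is stemmed
and stop words are dropped, while 'simple' leaves the query word as
typed, so the first comparison would be expected not to match and the
second to match:

```sql
-- Document parsed with 'english', query parsed with 'simple':
-- the stemmed lexemes in the tsvector need not line up with the
-- unstemmed lexeme in the tsquery.
SELECT to_tsvector('english', 'The cats are running')
       @@ to_tsquery('simple', 'cats') AS mixed_configs;

-- Same configuration on both sides: document and query are
-- normalized the same way.
SELECT to_tsvector('english', 'The cats are running')
       @@ to_tsquery('english', 'cats') AS same_config;
```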

> >>> - using an expression index instead of a tsvector-field, and always
> >>> explicitly specifying the configuration, you can avoid that problem (a
> >>> query with a different configuration won't use the index). But an
> >>> expression index, without explicitly specifying the configuration, will
> >>> get corrupted if you change the default configuration.
> >>
> >> Right.
> > 
> > the same problem if you drop a constraint from a table (accidentally) and then
> > get surprised by select results.
> 
> The difference is that if you change the default configuration, you
> won't expect that your queries start to return funny results. It looks
> harmless, like changing the date style. If you drop a constraint, it's
> much more obvious what the consequences are.
> 
> > We should agree that all you describe is only for DUMMY users. From
> > an author's point of view I dislike your approach to treat text searching as
> > a very limited tool. But I understand that we should preserve people
> > from stupid errors.
> > 
> > I want for beginners easy setup and error-prone functionality,
> > but leaving experienced users to develop complex search engines.
> > Can we have separate safe interface for text searching and explicitly
> > recommend it for beginners ?
> 
> I don't see how any of the suggestions limits what you can do with it.
> If we remove the default configuration parameter, you just have to be
> explicit. If we go with the type-system I suggested, you could still add
> casts and conversion functions between different tsvector types, where
> it makes sense.

I don't think the type system is workable given the ability to create
new configurations on the fly.  I think the configuration must be
specified each time.
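
As a rough sketch of "specified each time", here is an expression index
and a matching query that both name the configuration explicitly. The
table, column, and index names are hypothetical, and 'english' is
assumed to be an installed configuration:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE docs (id serial PRIMARY KEY, body text);

-- Expression index with the configuration named explicitly, so the
-- indexed values do not depend on default_text_search_config.
CREATE INDEX docs_body_fts_idx ON docs
    USING gin (to_tsvector('english', body));

-- The query repeats the same expression, configuration included;
-- a query written with a different configuration would not match
-- the index expression and so would not use this index.
SELECT id
FROM docs
WHERE to_tsvector('english', body)
      @@ to_tsquery('english', 'index & configuration');
```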

At this point, if we keep discussing the tsearch2 API we are not going
to have this in 8.3.

-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

