Thread: Poorly designed tsearch NOTICEs

Poorly designed tsearch NOTICEs

From
Tom Lane
Date:
regression=# SELECT plainto_tsquery('the any');
NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored
 plainto_tsquery 
-----------------
(1 row)

regression=# select ''::tsquery;
NOTICE:  tsearch query doesn't contain lexeme(s): ""
 tsquery 
---------
(1 row)

IMHO, it's really bad design to have this sort of NOTICE emitted by
tsquery input.  Even if an application uses numnode() or querytree() or
something similar to detect bogus queries, it's going to have its logs
cluttered with these notices.
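
For illustration, the sort of application-side check I have in mind could
look roughly like the following.  This is only a sketch: it assumes the
'english' text search configuration, and the column alias is_empty is just
a name chosen for the example.

```sql
-- Sketch only; assumes the 'english' text search configuration.
-- numnode() counts the lexemes and operators in a tsquery, so a result
-- of 0 means the query is empty (for instance, it was all stopwords).
SELECT numnode(plainto_tsquery('english', 'the any')) = 0 AS is_empty;
```

Note that with the current behavior, running this check still produces the
NOTICE, since it comes from building the tsquery.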

I could see having the @@ operator emit the notice if the query is
actually used for searching --- though I'm not quite sure how to get it
to come out only once per query ... maybe we could put it into the index
consistent() functions somehow?
        regards, tom lane


Re: Poorly designed tsearch NOTICEs

From
Tom Lane
Date:
Last month I complained:
> regression=# SELECT plainto_tsquery('the any'); 
> NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored
>  plainto_tsquery 
> -----------------
> (1 row)

> regression=# select ''::tsquery;
> NOTICE:  tsearch query doesn't contain lexeme(s): ""
>  tsquery 
> ---------
> (1 row)

> IMHO, it's really bad design to have this sort of NOTICE emitted by
> tsquery input.  Even if an application uses numnode() or querytree() or
> something similar to detect bogus queries, it's going to have its logs
> cluttered with these notices.

> I could see having the @@ operator emit the notice if the query is
> actually used for searching --- though I'm not quite sure how to get it
> to come out only once per query ... maybe we could put it into the index
> consistent() functions somehow?

I experimented with this and found out that it works all right for GIN
indexes, if the NOTICE is put into gin_extract_query(); that seems to be
called just once per GIN index search.  However, the only possible place
to put it in GIST tsearch support would be in the consistent() routines,
and that's no good because those will be called once per entry on the
index's root page --- so you get multiple copies of the NOTICE.

So it seems that the practical alternatives are:

1. Leave these notices where they are.  Expect complaints from people
who would rather not have their logs cluttered with 'em.

2. Remove the notices altogether.  Expect complaints from people who
get no matches on queries that they don't realize are all-stopwords.

3. Remove the notices from the input routines, and put one into
gin_extract_query only.  We'll still get complaints as in #2, but
only from people using GIST indexes or no index at all for searching.

None of these are really terribly attractive, but I'm kinda leaning
to #2 myself.  I'm not convinced that it's the province of the DB to be
issuing messages like this.  In a lot of common scenarios, NOTICEs
aren't going to be seen by the actual person entering the query anyway,
because there are layers of software between him and the DB.  All they
will accomplish is to bloat some logs somewhere.

Comments?
        regards, tom lane


Re: Poorly designed tsearch NOTICEs

From
Robert Treat
Date:
On Tuesday 27 November 2007 19:03, Tom Lane wrote:
> Last month I complained:
> > regression=# SELECT plainto_tsquery('the any');
> > NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored
> >  plainto_tsquery
> > -----------------
> >
> > (1 row)
> >
> > regression=# select ''::tsquery;
> > NOTICE:  tsearch query doesn't contain lexeme(s): ""
> >  tsquery
> > ---------
> >
> > (1 row)
> >
> > IMHO, it's really bad design to have this sort of NOTICE emitted by
> > tsquery input.  Even if an application uses numnode() or querytree() or
> > something similar to detect bogus queries, it's going to have its logs
> > cluttered with these notices.
> >
> > I could see having the @@ operator emit the notice if the query is
> > actually used for searching --- though I'm not quite sure how to get it
> > to come out only once per query ... maybe we could put it into the index
> > consistent() functions somehow?
>
> I experimented with this and found out that it works all right for GIN
> indexes, if the NOTICE is put into gin_extract_query(); that seems to be
> called just once per GIN index search.  However, the only possible place
> to put it in GIST tsearch support would be in the consistent() routines,
> and that's no good because those will be called once per entry on the
> index's root page --- so you get multiple copies of the NOTICE.
>
> So it seems that the practical alternatives are:
>
> 1. Leave these notices where they are.  Expect complaints from people
> who would rather not have their logs cluttered with 'em.
>
> 2. Remove the notices altogether.  Expect complaints from people who
> get no matches on queries that they don't realize are all-stopwords.
>
> 3. Remove the notices from the input routines, and put one into
> gin_extract_query only.  We'll still get complaints as in #2, but
> only from people using GIST indexes or no index at all for searching.
>
> None of these are really terribly attractive, but I'm kinda leaning
> to #2 myself.  I'm not convinced that it's the province of the DB to be
> issuing messages like this.  In a lot of common scenarios, NOTICEs
> aren't going to be seen by the actual person entering the query anyway,
> because there are layers of software between him and the DB.  All they
> will accomplish is to bloat some logs somewhere.
>
> Comments?

I would lean toward #1 since it seems to be closest to the behavior from 
previous releases. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL