Re: Empty string in lexeme for tsvector - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Empty string in lexeme for tsvector
Msg-id 2997142.1632944171@sss.pgh.pa.us
In response to Re: Empty string in lexeme for tsvector  (Jean-Christophe Arnu <jcarnu@gmail.com>)
Responses Re: Empty string in lexeme for tsvector
List pgsql-hackers
Jean-Christophe Arnu <jcarnu@gmail.com> writes:
> [ empty_string_in_tsvector_v4.patch ]

I looked through this patch a bit.  I don't agree with adding
these new error conditions to tsvector_setweight_by_filter and
tsvector_delete_arr.  Those don't prevent bad lexemes from being
added to tsvectors, so AFAICS they can have no effect other than
breaking existing applications.  In fact, tsvector_delete_arr is
one thing you could use to fix existing bad tsvectors, so making
it throw an error seems actually counterproductive.

(By the same token, I think there's a good argument for
tsvector_delete_arr to just ignore nulls, not throw an error.
That's a somewhat orthogonal issue, though.)
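
To make the reasoning above concrete, here is a toy Python model. It is not PostgreSQL code: a "tsvector" is reduced to a sorted tuple of unique lexeme strings, the function names are borrowed from the discussion only for readability, and the `reject_empty` flag is a hypothetical switch used to simulate data created before any check existed. The sketch assumes the check belongs at the point where lexemes enter a vector, while the delete path stays permissive so it can be used for cleanup.

```python
# Toy model only, NOT PostgreSQL code.
# A "tsvector" here is just a sorted tuple of unique lexeme strings.


def array_to_tsvector(lexemes, reject_empty=True):
    """Build a toy tsvector from a list of lexemes.

    This is the entry point where an empty lexeme could be injected,
    so this is where validation is useful. `reject_empty` is a
    hypothetical flag that simulates the old, unchecked behavior.
    """
    for lex in lexemes:
        if lex is None:
            raise ValueError("lexeme array may not contain nulls")
        if reject_empty and lex == "":
            raise ValueError("lexeme array may not contain empty strings")
    return tuple(sorted(set(lexemes)))


def tsvector_delete_arr(vector, lexemes):
    """Remove the given lexemes from a toy tsvector.

    Empty strings are accepted and nulls are ignored, so the function
    can strip bad lexemes from vectors that already contain them.
    """
    to_drop = {lex for lex in lexemes if lex is not None}
    return tuple(lex for lex in vector if lex not in to_drop)


# A vector built before the check existed (simulated).
legacy = array_to_tsvector(["base", "", "hidden"], reject_empty=False)

# Deleting the empty lexeme repairs it; the None entry is skipped.
cleaned = tsvector_delete_arr(legacy, ["", None])
print(cleaned)  # ('base', 'hidden')
```

If `tsvector_delete_arr` raised an error on `""` in this model, the last step would fail and the legacy vector could not be repaired with it, which is the concern raised above.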

What I'm wondering about more than that is whether array_to_tsvector
is the only place that can inject an empty lexeme ... don't we have
anything else that can add lexemes without going through the parser?

            regards, tom lane


