Re: websearch_to_tsquery() and apostrophe inside double quotes - Mailing list pgsql-general

From Tom Lane
Subject Re: websearch_to_tsquery() and apostrophe inside double quotes
Date
Msg-id 21225.1570714506@sss.pgh.pa.us
In response to websearch_to_tsquery() and apostrophe inside double quotes  (Alastair McKinley <a.mckinley@analyticsengines.com>)
Responses Re: websearch_to_tsquery() and apostrophe inside double quotes
List pgsql-general
Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> I am a little confused about what is being generated by websearch_to_tsquery() in the case of an apostrophe inside
> double quotes.
> ...

> select websearch_to_tsquery('"peter o''toole"');
>      websearch_to_tsquery
> ------------------------------
>  'peter' <-> ( 'o' & 'tool' )
> (1 row)

> I am not quite sure what text this will actually match?

I believe it's impossible for that to match anything :-(.
It would require 'o' and 'tool' to both match the same lexeme
(the one immediately after a 'peter'), which of course is impossible.

The underlying tsvector type seems to treat the apostrophe the
same as whitespace; it separates 'o' and 'toole' into
distinct words:

# select to_tsvector('peter o''toole');
       to_tsvector
--------------------------
 'o':2 'peter':1 'tool':3
(1 row)

So it seems to me that this is a bug: websearch_to_tsquery
should also treat "'" like whitespace.  There's certainly
not anything in its documentation that suggests it should
treat "'" specially.  If it didn't, you'd get

# select websearch_to_tsquery('"peter o toole"');
    websearch_to_tsquery
----------------------------
 'peter' <-> 'o' <-> 'tool'
(1 row)

which would match this tsvector.
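
For anyone who wants to check this directly, here is a rough sketch using
the @@ match operator.  It assumes PostgreSQL 11 or later and names the
'english' configuration explicitly (an assumption; the examples above use
whatever default_text_search_config is set to).  The column aliases and
the replace() stopgap are illustrative only, not a recommended fix.

```sql
-- Sketch only: assumes PostgreSQL 11+ and the 'english' configuration.

-- The query shape reported above, written as a tsquery literal so the
-- check does not depend on how websearch_to_tsquery() parses the quote.
select to_tsvector('english', 'peter o''toole')
       @@ $q$'peter' <-> ( 'o' & 'tool' )$q$::tsquery as reported_form_matches;

-- The form expected if the apostrophe were treated like whitespace.
select to_tsvector('english', 'peter o''toole')
       @@ websearch_to_tsquery('english', '"peter o toole"') as expected_form_matches;

-- Possible stopgap: turn apostrophes into spaces before parsing the query.
select websearch_to_tsquery('english', replace('"peter o''toole"', '''', ' '));
```

If the reasoning above holds, the first query should report no match and
the second should report a match; the third just shows what the query
looks like once the apostrophe has been replaced by a space.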

            regards, tom lane


