query1 followed by query2 at maximum distance vs current fixed distance - Mailing list pgsql-hackers

From Wh isere
Subject query1 followed by query2 at maximum distance vs current fixed distance
Date
Msg-id CAK3r-hOJbZqbQhu5ZQHkhY7j-23Mh9qz_6FDpkq_aWOJ=91DUw@mail.gmail.com
In response to Re: new function for tsquery creartion  (Where is Where <whisere@gmail.com>)
Responses Re: query1 followed by query2 at maximum distance vs current fixed distance
List pgsql-hackers
Is this possible with the current websearch_to_tsquery function?

Thanks.

Hello everyone, I am wondering whether AROUND(N) or <N, M> is still
possible. I found the thread below, and the original post
https://www.postgresql.org/message-id/fe931111ff7e9ad79196486ada79e268%40postgrespro.ru
mentioned the proposed feature: 'New operator AROUND(N). It matches if the
distance between words (or maybe phrases) is less than or equal to N.'

Currently, tsquery_phrase(query1 tsquery, query2 tsquery, distance integer)
matches only at a fixed distance. Is there a way to search with a maximum
distance, so that the search returns query1 followed by query2 at any
distance up to a given limit, like the AROUND(N) or <N, M> mentioned in the
thread?
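
To make the question concrete, here is a minimal sketch of the only workaround I am aware of with the existing functions: OR together one fixed-distance phrase query per allowed distance. The sample text and the upper bound of 3 are assumptions chosen for illustration, and this is not a built-in maximum-distance operator.

```sql
-- Illustrative only: the sample text and the upper bound of 3 are assumptions.
-- Each tsquery_phrase() call matches one exact distance; the tsquery ||
-- operator combines them with OR, so the combined query accepts 'fat'
-- followed by 'cat' at distance 1, 2, or 3.
SELECT to_tsvector('simple', 'fat black cat') @@ (
       tsquery_phrase('fat'::tsquery, 'cat'::tsquery, 1)
    || tsquery_phrase('fat'::tsquery, 'cat'::tsquery, 2)
    || tsquery_phrase('fat'::tsquery, 'cat'::tsquery, 3)
) AS matches_within_three;
```

The query grows by one branch per allowed distance, so this is only practical for small limits, which is why a real AROUND(N) or <N, M> would be preferable.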

Thank you!



On Mon, Jul 22, 2019 at 9:13 AM Dmitry Ivanov <d.ivanov@postgrespro.ru> wrote:
Hi everyone,

I'd like to share some intermediate results. Here's what has changed:


1. OR operator is now case-insensitive. Moreover, trailing whitespace is
no longer used to identify it:

select websearch_to_tsquery('simple', 'abc or');
  websearch_to_tsquery
----------------------
  'abc' & 'or'
(1 row)

select websearch_to_tsquery('simple', 'abc or(def)');
  websearch_to_tsquery
----------------------
  'abc' | 'def'
(1 row)

select websearch_to_tsquery('simple', 'abc or!def');
  websearch_to_tsquery
----------------------
  'abc' | 'def'
(1 row)


2. AROUND(N) has been dropped. I hope that <N, M> operator will allow us
to implement it with a few lines of code.

3. websearch_to_tsquery() now tolerates various syntax errors, for
instance:

Misused operators:

'abc &'
'| abc'
'<- def'

Missing parentheses:

'abc & (def <-> (cat or rat'

Other sorts of nonsense:

'abc &--|| def'  =>  'abc' & !!'def'
'abc:def'  =>  'abc':D & 'ef'

This, however, doesn't mean that the result will always be adequate (who
would have thought?). Overall, the current implementation follows the GIGO
principle. In theory, this would allow us to use user-supplied websearch
strings (but see gotchas), even if they don't make much sense. Better
than nothing, right?

4. A small refactoring: I've replaced all WAIT* macros with an enum for
better debugging (names look much nicer in GDB). Hope this is
acceptable.

5. Finally, I've added a few more comments and tests. I haven't checked
the code coverage, though.


A few gotchas:

I haven't touched gettoken_tsvector() yet. As a result, the following
queries produce errors:

select websearch_to_tsquery('simple', '''');
ERROR:  syntax error in tsquery: "'"

select websearch_to_tsquery('simple', '\');
ERROR:  there is no escaped character: "\"

Maybe there's more. The question is: should we fix those, or is it fine
as it is? I don't have a strong opinion about this.

--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
