Re: PG 14 release notes, first draft - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: PG 14 release notes, first draft
Date
Msg-id 20210511203141.GX6088@momjian.us
In response to Re: PG 14 release notes, first draft  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: PG 14 release notes, first draft
List pgsql-hackers
On Tue, May 11, 2021 at 01:16:38PM +0300, Alexander Korotkov wrote:
> > OK, what symbols trigger this change?  Underscore?  What else?
> 
> Any symbol, which is recognized as a separator by full-text parser,
> but not tsquery parser.  Fulltext search is extensible and allowing
> pluggable parsers.  In principle, we could dig the exact set of
> symbols, but I'm not sure this worth the effort.
> 
> >  You are
> > saying the previous code allowed 'pg' and 'class' anywhere in the
> > string, while the new code requires them to be adjacent, which more
> > closely matches the pattern.
> 
> Yes, that's it.
> 
> > >  * Fix extra distance in phrase operators for quoted text in
> > > websearch_to_tsquery() (Alexander Korotkov)
> > > For example, websearch_to_tsquery('english', '"aaa: bbb"') becomes
> > > 'aaa <> bbb' instead of  'aaa <2> bbb'.
> >
> > So colon and space were considered to be two tokens between 'aaa' and
> > 'bbb', while is really only one because both tokens are discarded?  Is
> > this true of any discarded tokens, e.g. ''"aaa ?:, bbb"'?
> 
> Yes, that's true for any discarded tokens.

I came up with this text for these two items.  I think it still needs to
be more specific:

    <listitem>
    <!--
    Author: Alexander Korotkov <akorotkov@postgresql.org>
    2021-01-31 [0c4f355c6] Fix parsing of complex morphs to tsquery
    -->
    
    <para>
    Fix to_tsquery() and websearch_to_tsquery() to properly parse
    certain discarded tokens in quotes (Alexander Korotkov)
    </para>
    
    <para>
    Certain discarded tokens, like underscore, caused these functions
    to produce incorrect tsquery output, e.g.,
    websearch_to_tsquery('"pg_class pg"') used to output '( pg &
    class ) <-> pg', but now outputs 'pg <-> class <-> pg'.
    </para>
    </listitem>
    
    <listitem>
    <!--
    Author: Alexander Korotkov <akorotkov@postgresql.org>
    2021-05-03 [eb086056f] Make websearch_to_tsquery() parse text in quotes as a si
    -->
    
    <para>
    Fix websearch_to_tsquery() to properly parse multiple adjacent
    discarded tokens in quotes (Alexander Korotkov)
    </para>
    
    <para>
    Previously, multiple adjacent discarded tokens in quoted text
    were counted as separate tokens, causing incorrect tsquery
    output, e.g., websearch_to_tsquery('"aaa: bbb"') used to output
    'aaa <2> bbb', but now outputs 'aaa <-> bbb'.
    </para>
    </listitem>
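
For anyone who wants to check the two items against a build, here is a
minimal sketch of the queries.  The explicit 'english' configuration is
my assumption; the item text omits it, so default_text_search_config
would apply there.  The comments only repeat what the draft text above
claims, and the last query is the untested case raised earlier in the
thread.

```sql
-- Item 1: a discarded token (underscore) inside quoted text.
-- Draft text says: used to output '( pg & class ) <-> pg',
-- now outputs 'pg <-> class <-> pg'.
SELECT websearch_to_tsquery('english', '"pg_class pg"');

-- Item 2: multiple adjacent discarded tokens (colon, space) in quotes.
-- Draft text says: used to output 'aaa <2> bbb',
-- now outputs 'aaa <-> bbb'.
SELECT websearch_to_tsquery('english', '"aaa: bbb"');

-- Longer run of discarded tokens, from the question upthread;
-- no expected output is given because none was posted.
SELECT websearch_to_tsquery('english', '"aaa ?:, bbb"');
```

Running the same three queries on PG 13 and on PG 14 should show
whether the "used to output" halves of the examples are accurate.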

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



