Re: prefix search in tsearch - Mailing list pgsql-docs

From Bruce Momjian
Subject Re: prefix search in tsearch
Date
Msg-id 201102190340.p1J3eEj21966@momjian.us
In response to Re: prefix search in tsearch  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-docs
I applied a modified documentation patch (attached) that includes Oleg's
suggestions.
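
For anyone who wants to check the behaviour the patch describes, here is a
minimal sketch (not part of the patch). It assumes the stock "english" and
"simple" configurations, without the my_synonym mapping from Oleg's example
below. The idea is to look at what the prefix turns into after configuration
processing, and to compare that with a plain cast, which skips the processing:

```sql
-- Show the prefix query after configuration processing (stemming here).
SELECT to_tsquery('english', 'postgres:*');
-- 'postgr':*

-- A plain cast to tsquery does no configuration processing, so the prefix
-- stays 'postgres' and no longer matches the stemmed lexeme of the document.
SELECT to_tsvector('english', 'postgraduate') @@ 'postgres:*'::tsquery;
-- f

-- The simple configuration does not stem on either side.
SELECT to_tsvector('simple', 'postgraduate')
       @@ to_tsquery('simple', 'postgres:*');
-- f
```

Naming the configuration explicitly keeps the result independent of
default_text_search_config.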

---------------------------------------------------------------------------

Oleg Bartunov wrote:
> Erik,
>
> I think it would be clearer to say not 'stemmed' but 'processed
> according to the configuration'. Here is an example:
>
> $SHAREDIR/tsearch_data/my_synonyms.syn  contains one line:
> one 1
>
>
> CREATE TEXT SEARCH DICTIONARY my_synonym (
>      TEMPLATE = synonym,
>      SYNONYMS = my_synonyms
> );
>
> ALTER TEXT SEARCH CONFIGURATION english
>      ALTER MAPPING FOR asciiword
>      WITH my_synonym, english_stem;
>
>
> test=# select 'one'::tsvector @@ to_tsquery('english','one:*');
>   ?column?
> ----------
>   f
> (1 row)
>
> because 'one' in the query was processed by the my_synonym dictionary,
> which replaces it with '1', so the prefix no longer matches 'one'.
>
> test=# select ts_debug('english','one');
>                                     ts_debug
> ------------------------------------------------------------------------------
>   (asciiword,"Word, all ASCII",one,"{my_synonym,english_stem}",my_synonym,{1})
> (1 row)
>
>
>
> On Tue, 31 Aug 2010, Erik Rijkers wrote:
>
> > [docs from cvs HEAD]
> >
> > I found the text-search documentation a little unclear about 'prefix search'; specifically, the
> > examples do not show that the so-called 'prefix' is stemmed first, before it is used as a prefix.
> >
> > For instance, the following can be a little surprising:
> >
> > SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
> > ?column?
> > ----------
> > t
> > (1 row)
> >
> > Because prefix search is such important functionality, I think this should be better explained,
> > which I hope the attached doc-patch does.
> >
> > (textsearch.sgml has another mention and example of prefix search; perhaps it should be extended a
> > little there too, which I'm happy to do as well, but I first wanted to see whether you agree that it
> > is a little too obscure as it stands.)
> >
> >
> > Erik Rijkers
> >
>
>      Regards,
>          Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 2bf411d..10f0e59 100644
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SELECT 'super:*'::tsquery;
*** 3847,3853 ****
   'super':*
  </programlisting>
       This query will match any word in a <type>tsvector</> that begins
!      with <quote>super</>.
      </para>

      <para>
--- 3847,3874 ----
   'super':*
  </programlisting>
       This query will match any word in a <type>tsvector</> that begins
!      with <quote>super</>.
!     </para>
!
!     <para>
!      Note that text search configuration processing happens before
!      comparisons, which means this comparison returns <literal>true</>:
! <programlisting>
! SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
!  ?column?
! ----------
!  t
! (1 row)
! </programlisting>
!      because <literal>postgres</> gets stemmed to <literal>postgr</>:
! <programlisting>
! SELECT to_tsquery('postgres:*');
!  to_tsquery
! ------------
!  'postgr':*
! (1 row)
! </programlisting>
!      which then matches <literal>postgraduate</>.
      </para>

      <para>
