Re: prefix search in tsearch - Mailing list pgsql-docs
| From | Bruce Momjian |
|---|---|
| Subject | Re: prefix search in tsearch |
| Date | |
| Msg-id | 201102190340.p1J3eEj21966@momjian.us |
| In response to | Re: prefix search in tsearch (Oleg Bartunov <oleg@sai.msu.su>) |
| List | pgsql-docs |
I applied a modified documentation patch (attached) that includes Oleg's
suggestions.
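
A minimal sketch for reproducing the behaviour locally, assuming a stock
installation where the built-in english configuration and english_stem
dictionary are unmodified (that is, without the my_synonym mapping from
Oleg's example below). The explicit 'english' argument is an addition
here, so that the results do not depend on default_text_search_config:

```sql
-- Show the lexeme the stemming dictionary produces for the query term.
SELECT ts_lexize('english_stem', 'postgres');

-- Show the normalized prefix query; the patch reports 'postgr':* for
-- the same call made under the default configuration.
SELECT to_tsquery('english', 'postgres:*');

-- Show the lexemes stored for the document side of the comparison.
SELECT to_tsvector('english', 'postgraduate');

-- The match from Erik's report, with the configuration spelled out.
SELECT to_tsvector('english', 'postgraduate') @@ to_tsquery('english', 'postgres:*');

-- Casting straight to tsquery skips configuration processing,
-- so the prefix stays 'postgres' instead of being stemmed.
SELECT to_tsvector('english', 'postgraduate') @@ 'postgres:*'::tsquery;
```

Comparing the last two queries should show whether normalization, rather
than the prefix operator itself, is responsible for a surprising match.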
---------------------------------------------------------------------------
Oleg Bartunov wrote:
> Erik,
>
> I think it would be clearer to say not 'stemmed' but 'processed
> according to the configuration'. Here is an example:
>
> $SHAREDIR/tsearch_data/my_synonyms.syn contains one line:
> one 1
>
>
> CREATE TEXT SEARCH DICTIONARY my_synonym (
> TEMPLATE = synonym,
> SYNONYMS = my_synonyms
> );
>
> ALTER TEXT SEARCH CONFIGURATION english
> ALTER MAPPING FOR asciiword
> WITH my_synonym, english_stem;
>
>
> test=# select 'one'::tsvector @@ to_tsquery('english','one:*');
> ?column?
> ----------
> f
> (1 row)
>
> because 'one' was processed by my_synonym dictionary.
>
> test=# select ts_debug('english','one');
> ts_debug
> ------------------------------------------------------------------------------
> (asciiword,"Word, all ASCII",one,"{my_synonym,english_stem}",my_synonym,{1})
> (1 row)
>
>
>
> On Tue, 31 Aug 2010, Erik Rijkers wrote:
>
> > [docs from cvs HEAD]
> >
> > I found the text-search documentation a little unclear about 'prefix search'; specifically, the
> > examples do not show that the so-called 'prefix' is first stemmed, before it is used as prefix.
> >
> > For instance, the following can be a little surprising:
> >
> > SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
> > ?column?
> > ----------
> > t
> > (1 row)
> >
> > Because prefix search is such an important functionality I think this should be better explained,
> > which I hope the attached doc-patch does.
> >
> > (textsearch.sgml contains another mention and example of prefix search; perhaps it should be
> > extended a little there too, which I'm happy to do as well, but I first wanted to see whether
> > you agree that it is a little too obscure as it stands.)
> >
> >
> > Erik Rijkers
> >
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
> --
> Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-docs
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 2bf411d..10f0e59 100644
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SELECT 'super:*'::tsquery;
*** 3847,3853 ****
'super':*
</programlisting>
This query will match any word in a <type>tsvector</> that begins
! with <quote>super</>.
</para>
<para>
--- 3847,3874 ----
'super':*
</programlisting>
This query will match any word in a <type>tsvector</> that begins
! with <quote>super</>.
! </para>
!
! <para>
! Note that text search configuration processing happens before
! comparisons, which means this comparison returns <literal>true</>:
! <programlisting>
! SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
! ?column?
! ----------
! t
! (1 row)
! </programlisting>
! because <literal>postgres</> gets stemmed to <literal>postgr</>:
! <programlisting>
! SELECT to_tsquery('postgres:*');
! to_tsquery
! ------------
! 'postgr':*
! (1 row)
! </programlisting>
! which then matches <literal>postgraduate</>.
</para>
<para>