Re: question about to_tsvector and to_tsquery - Mailing list pgsql-sql

From Tom Lane
Subject Re: question about to_tsvector and to_tsquery
Date
Msg-id 334015.1629814371@sss.pgh.pa.us
In response to question about to_tsvector and to_tsquery  (Martin Norbäck Olivers <martin@norpan.org>)
List pgsql-sql
Martin Norbäck Olivers <martin@norpan.org> writes:
> Is there any more information on exactly how to_tsquery and to_tsvector are
> supposed to work?

> select to_tsvector('simple', '1.b') gives '1':1 'b':2
> but
> select to_tsvector('simple', '1.bb') gives '1.bb':1

ts_debug gives a little bit of insight:

postgres=# select * from ts_debug('simple', '1.b');
   alias   |   description    | token | dictionaries | dictionary | lexemes 
-----------+------------------+-------+--------------+------------+---------
 uint      | Unsigned integer | 1     | {simple}     | simple     | {1}
 blank     | Space symbols    | .     | {}           |            | 
 asciiword | Word, all ASCII  | b     | {simple}     | simple     | {b}
(3 rows)

postgres=# select * from ts_debug('simple', '1.bb');
 alias | description | token | dictionaries | dictionary | lexemes 
-------+-------------+-------+--------------+------------+---------
 host  | Host        | 1.bb  | {simple}     | simple     | {1.bb}
(1 row)

I don't know the exact rules that cause classification of something
as a "host" token.  It does seem a little weird that length matters.

            regards, tom lane


