9.13. Text Search Functions and Operators

Table 9.41, Table 9.42 and Table 9.43 summarize the functions and operators that are provided for full text searching. See Chapter 12 for a detailed explanation of PostgreSQL's text search facility.

Table 9.41. Text Search Operators

OperatorReturn TypeDescriptionExampleResult
@@booleantsvector matches tsquery ?to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')t
@@@booleandeprecated synonym for @@to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat')t
||tsvectorconcatenate tsvectors'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector'a':1 'b':2,5 'c':3 'd':4
&&tsqueryAND tsquerys together'fat | rat'::tsquery && 'cat'::tsquery( 'fat' | 'rat' ) & 'cat'
||tsqueryOR tsquerys together'fat | rat'::tsquery || 'cat'::tsquery( 'fat' | 'rat' ) | 'cat'
!!tsquerynegate a tsquery!! 'cat'::tsquery!'cat'
<->tsquerytsquery followed by tsqueryto_tsquery('fat') <-> to_tsquery('rat')'fat' <-> 'rat'
@>booleantsquery contains another ?'cat'::tsquery @> 'cat & rat'::tsqueryf
<@booleantsquery is contained in ?'cat'::tsquery <@ 'cat & rat'::tsqueryt

Note

The tsquery containment operators consider only the lexemes listed in the two queries, ignoring the combining operators.

In addition to the operators shown in the table, the ordinary B-tree comparison operators (=, <, etc) are defined for types tsvector and tsquery. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.

Table 9.42. Text Search Functions

FunctionReturn TypeDescriptionExampleResult
array_to_tsvector(text[])tsvectorconvert array of lexemes to tsvectorarray_to_tsvector('{fat,cat,rat}'::text[])'cat' 'fat' 'rat'
get_current_ts_config()regconfigget default text search configurationget_current_ts_config()english
length(tsvector)integernumber of lexemes in tsvectorlength('fat:2,4 cat:3 rat:5A'::tsvector)3
numnode(tsquery)integernumber of lexemes plus operators in tsquery numnode('(fat & rat) | cat'::tsquery)5
plainto_tsquery([ config regconfig , ] query text)tsqueryproduce tsquery ignoring punctuationplainto_tsquery('english', 'The Fat Rats')'fat' & 'rat'
phraseto_tsquery([ config regconfig , ] query text)tsqueryproduce tsquery that searches for a phrase, ignoring punctuationphraseto_tsquery('english', 'The Fat Rats')'fat' <-> 'rat'
websearch_to_tsquery([ config regconfig , ] query text)tsqueryproduce tsquery from a web search style querywebsearch_to_tsquery('english', '"fat rat" or rat')'fat' <-> 'rat' | 'rat'
querytree(query tsquery)textget indexable part of a tsqueryquerytree('foo & ! bar'::tsquery)'foo'
setweight(vector tsvector, weight "char")tsvectorassign weight to each element of vectorsetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')'cat':3A 'fat':2A,4A 'rat':5A
setweight(vector tsvector, weight "char", lexemes text[])tsvectorassign weight to elements of vector that are listed in lexemessetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')'cat':3A 'fat':2,4 'rat':5A
strip(tsvector)tsvectorremove positions and weights from tsvectorstrip('fat:2,4 cat:3 rat:5A'::tsvector)'cat' 'fat' 'rat'
to_tsquery([ config regconfig , ] query text)tsquerynormalize words and convert to tsqueryto_tsquery('english', 'The & Fat & Rats')'fat' & 'rat'
to_tsvector([ config regconfig , ] document text)tsvectorreduce document text to tsvectorto_tsvector('english', 'The Fat Rats')'fat':2 'rat':3
to_tsvector([ config regconfig , ] document json(b))tsvector reduce each string value in the document to a tsvector, and then concatenate those in document order to produce a single tsvectorto_tsvector('english', '{"a": "The Fat Rats"}'::json)'fat':2 'rat':3
json(b)_to_tsvector([ config regconfig, ] document json(b), filter json(b))tsvector reduce each value in the document, specified by filter to a tsvector, and then concatenate those in document order to produce a single tsvector. filter is a jsonb array, that enumerates what kind of elements need to be included into the resulting tsvector. Possible values for filter are "string" (to include all string values), "numeric" (to include all numeric values in the string format), "boolean" (to include all Boolean values in the string format "true"/"false"), "key" (to include all keys) or "all" (to include all above). These values can be combined together to include, e.g., all string and numeric values. json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')'123':5 'fat':2 'rat':3
ts_delete(vector tsvector, lexeme text)tsvectorremove given lexeme from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')'cat':3 'rat':5A
ts_delete(vector tsvector, lexemes text[])tsvectorremove any occurrence of lexemes in lexemes from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])'cat':3
ts_filter(vector tsvector, weights "char"[])tsvectorselect only elements with given weights from vectorts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}')'cat':3B 'rat':5A
ts_headline([ config regconfig, ] document text, query tsquery [, options text ])textdisplay a query matchts_headline('x y z', 'z'::tsquery)x y <b>z</b>
ts_headline([ config regconfig, ] document json(b), query tsquery [, options text ])textdisplay a query matchts_headline('{"a":"x y z"}'::json, 'z'::tsquery){"a":"x y <b>z</b>"}
ts_rank([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])float4rank document for queryts_rank(textsearch, query)0.818
ts_rank_cd([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])float4rank document for query using cover densityts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query)2.01317
ts_rewrite(query tsquery, target tsquery, substitute tsquery)tsqueryreplace target with substitute within queryts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)'b' & ( 'foo' | 'bar' )
ts_rewrite(query tsquery, select text)tsqueryreplace using targets and substitutes from a SELECT commandSELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')'b' & ( 'foo' | 'bar' )
tsquery_phrase(query1 tsquery, query2 tsquery)tsquerymake query that searches for query1 followed by query2 (same as <-> operator)tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))'fat' <-> 'cat'
tsquery_phrase(query1 tsquery, query2 tsquery, distance integer)tsquerymake query that searches for query1 followed by query2 at distance distancetsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)'fat' <10> 'cat'
tsvector_to_array(tsvector)text[]convert tsvector to array of lexemestsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector){cat,fat,rat}
tsvector_update_trigger()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body)
tsvector_update_trigger_column()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body)
unnest(tsvector, OUT lexeme text, OUT positions smallint[], OUT weights text)setof recordexpand a tsvector to a set of rowsunnest('fat:2,4 cat:3 rat:5A'::tsvector)(cat,{3},{D}) ...

Note

All the text search functions that accept an optional regconfig argument will use the configuration specified by default_text_search_config when that argument is omitted.

The functions in Table 9.43 are listed separately because they are not usually used in everyday text searching operations. They are helpful for development and debugging of new text search configurations.

Table 9.43. Text Search Debugging Functions

FunctionReturn TypeDescriptionExampleResult
ts_debug([ config regconfig, ] document text, OUT alias text, OUT description text, OUT token text, OUT dictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[])setof recordtest a configurationts_debug('english', 'The Brightest supernovaes')(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...
ts_lexize(dict regdictionary, token text)text[]test a dictionaryts_lexize('english_stem', 'stars'){star}
ts_parse(parser_name text, document text, OUT tokid integer, OUT token text)setof recordtest a parserts_parse('default', 'foo - bar')(1,foo) ...
ts_parse(parser_oid oid, document text, OUT tokid integer, OUT token text)setof recordtest a parserts_parse(3722, 'foo - bar')(1,foo) ...
ts_token_type(parser_name text, OUT tokid integer, OUT alias text, OUT description text)setof recordget token types defined by parserts_token_type('default')(1,asciiword,"Word, all ASCII") ...
ts_token_type(parser_oid oid, OUT tokid integer, OUT alias text, OUT description text)setof recordget token types defined by parserts_token_type(3722)(1,asciiword,"Word, all ASCII") ...
ts_stat(sqlquery text, [ weights text, ] OUT word text, OUT ndoc integer, OUT nentry integer)setof recordget statistics of a tsvector columnts_stat('SELECT vector from apod')(foo,10,15) ...