Home > mailing lists

Re: Full text search tsv column aproach vs concat confusion - Mailing list pgsql-general

From	Artur Zakirov
Subject	Re: Full text search tsv column aproach vs concat confusion
Date	November 18, 2016 09:10:53
Msg-id	d9a43710-b53d-9ef9-4a52-ade354223fda@postgrespro.ru Whole thread Raw
In response to	Full text search tsv column aproach vs concat confusion (cen <imbacen@gmail.com>)
List	pgsql-general

Tree view

Hi,

On 16.11.2016 11:54, cen wrote:
> Hi
>
> I am seeking some clarification in regard to full text search across
> multiple tables and what the best approach is. Documentation talks about
> two approaches when it comes to building a document: on-the-fly concat
> of columns and a dedicated tsv column approach. Let's say I want to
> perform a search for |"txt1 & txt2 & txt3" on columns table1.col1,
> table1.col2 and table2.col1. I see the following solutions:|
>
> |1. Concat all three into a document and perform a FTS.|
>
> |SELECT * FROM (
> |
>
> |SELECTto_tsvector(table1.col1)||to_tsvector(table1.col2)||to_tsvector(table2.col1)asdocument
> F|||ROM table1 LEFTJOINtable2 ONtable1.table2_id=table2.id| ) subquery
|||WHEREsubquery.document@@to_tsquery(unaccent(?));|| 
>
> |2. Create a tsv column in each table, concat tsv columns and perform
> FTS on that.|
>
> |SELECT*FROMtable1 LEFTJOINtable2 ONtable1.table2_id=table2.id
> WHEREtable1.tsv ||tale2.tsv @@to_tsquery(unaccent(?));|
>
> |3. Have a tsv column only in table1 and insert table2.col1 to the tsv
> via triggers. Works but seems very hacky.|
>
> |
> |
>
> |It seems to me that option #2 is fast and easy to implement but I am
> not sure what the concat of tsvs really means from index usage and
> performance standpoint. Option #1 is the most flexible and I'd use that
> all the time if it was not THAT much slower than tsv column approacj.
> Documentation on TSV columns states: "||Another advantage is that
> searches will be faster, since it will not be necessary to redo the
> to_tsvector calls to verify index matches."
> |
>
> The question is, how much faster are tsv columns really? Are there any
> benchmarks about this? If the performance difference is negligible I'd
> advocate that using tsv columns is a waste of time and space in most
> general cases. But since there is no information on how much faster it's
> hard to decide.
>

I haven't any such benchmarks. But if you have a real database, you can
perform tests using it on your solutions. Because it depends on your
task and what you need.

By the way, I suppose it is better to use COALESCE() function if your
columns could have NULL value:

SELECT * FROM (

SELECT to_tsvector(coalesce(table1.col1,'')) ||
        to_tsvector(coalesce(table1.col2,'')) ||
        to_tsvector(coalesce(table2.col1,'')) as document FROM table1
LEFT JOIN table2 ON table1.table2_id=table2.id

) subquery WHERE subquery.document @@ to_tsquery(unaccent(?));

And specifying a text search configuration makes queries a little bit
faster:

... to_tsvector('english', coalesce(table1.col1,'')) ...

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

pgsql-general by date:

From: Mark Anns
Date: 18 November 2016, 06:29:51
Subject: Re: How the Planner in PGStrom differs from PostgreSQL?

From: otheus uibk
Date: 18 November 2016, 13:01:15
Subject: Feature request: separate logging

Re: Full text search tsv column aproach vs concat confusion - Mailing list pgsql-general

Previous

Next