Re: [BUGS] pg_trgm word_similarity inconsistencies or bug - Mailing list pgsql-bugs

From Arthur Zakirov
Subject Re: [BUGS] pg_trgm word_similarity inconsistencies or bug
Date
Msg-id 20171028082225.GA2157@arthur.localdomain
In response to [BUGS] pg_trgm word_similarity inconsistencies or bug  (Cristiano Coelho <cristianocca@hotmail.com>)
Responses Re: [BUGS] pg_trgm word_similarity inconsistencies or bug  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-bugs
On Fri, Oct 27, 2017 at 06:48:08PM +0000, Cristiano Coelho wrote:
> Hello all, this is related to postgres 9.6 (9.6.4) and a good description can be found here
> https://stackoverflow.com/questions/46966360/postgres-word-similarity-not-comparing-words
>
> But in summary, word_similarity doesn’t seem to do exactly what the docs say, since it will match trigrams from
> multiple words rather than doing a word by word comparison.
>
> Below is a table with output and expected output, thanks to klin from stackoverflow for providing it.
>

Interesting. klin's answer on stackoverflow.com is right.

The initial example can be reduced to the following:

=# select word_similarity('sage', 'age sag');
 word_similarity 
-----------------
               1

It computes the maximum similarity over the closest matching trigrams, without
considering the order of the 'sage' trigrams. As a result, it determines that
all trigrams from 'sage' match trigrams from 'age sag'.

Initial order of 'age sag' trigrams:
'  a', ' ag', 'age', 'ge ', '  s', ' sa', 'sag', 'ag '
               ^                           ^
               |from                       |to
 
Sorted 'sage' trigrams (all of them occur within a continuous range of the
'age sag' trigrams):
'  s', ' sa', 'age', 'ge ', 'sag'
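
The effect can be reproduced outside the database with a small brute-force
sketch in Python. This is not the pg_trgm source code: the function names
trigrams() and window_similarity() are made up for illustration, word
splitting is simplified to ASCII letters and digits, and the search simply
tries every continuous range of the second string's trigrams and scores it
against the set of the first string's trigrams.

```python
import re


def trigrams(text):
    """Padded trigrams of each word, in order of appearance.

    Assumption: words are runs of ASCII letters/digits, lowercased and
    padded with two leading spaces and one trailing space.
    """
    result = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        result.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def window_similarity(needle, haystack):
    """Best score over all continuous ranges of the haystack trigrams.

    The needle trigrams are treated as an unordered set, so their
    original order plays no role in the score.
    Returns (score, from_index, to_index).
    """
    needle_set = set(trigrams(needle))
    hay = trigrams(haystack)
    best = (0.0, None, None)
    for lo in range(len(hay)):
        for hi in range(lo, len(hay)):
            window = set(hay[lo:hi + 1])
            shared = len(needle_set & window)
            score = shared / len(needle_set | window)
            if score > best[0]:
                best = (score, lo, hi)
    return best


print(trigrams("age sag"))
print(sorted(set(trigrams("sage"))))
print(window_similarity("sage", "age sag"))  # (1.0, 2, 6)
```

Indexes 2 and 6 are the 'age' and 'sag' positions marked from/to above: that
range spans both words of 'age sag' and contains exactly the five 'sage'
trigrams, hence the score of 1.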

Maybe the problem should be solved by taking the initial order of the 'sage'
trigrams into account.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

