OK, it does add complexity.
Here is a functional md5 index on the whole string:
drop table if exists text_table;
create table text_table
(
    mystring text
);
-- Expression index on the md5 hash of the whole string
create index text_md5 on text_table (md5(mystring));
insert into text_table (mystring) values
    ('John Smith'), ('John Smith'), ('John Smith'), ('John Smith'),
    ('Ian Smith'), ('Ian Smith'), ('Ian Smith'), ('Ian Smith'),
    ('Jim Smith'), ('J Smith');
-- List each hash that occurs more than once, with its row count
select md5, count from
    ( select md5(mystring) md5, count(*) count from text_table group by md5(mystring) ) subq
where count > 1;
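If the subquery feels like too much, the same duplicate check can probably be written with HAVING. This is only a sketch against the text_table above; the aliases md5_hash and copies are names I made up for illustration. With the sample rows it should return two groups (the 'John Smith' and 'Ian Smith' hashes, four rows each).

```sql
-- Same duplicate check as the subquery version, filtered with HAVING.
-- md5_hash and copies are illustrative aliases, not part of the original query.
select md5(mystring) as md5_hash, count(*) as copies
from text_table
group by md5(mystring)
having count(*) > 1;
```

Grouping on the same expression that the index is built on, md5(mystring), gives the planner the option of using text_md5, though on a table this small it will most likely just scan it.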
From: "Shmagi Kavtaradze" <kavtaradze.s@gmail.com>
To: "tim child" <tim.child@comcast.net>
Cc: pgsql-novice@postgresql.org
Sent: Friday, November 20, 2015 8:13:15 AM
Subject: Re: [NOVICE] Combine Top-k with similarity search extensions
It will add complexity, and I also have no idea how to do it. Is there any alternative?