Re: Proposal: q-gram GIN and GiST indexes - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Proposal: q-gram GIN and GiST indexes
Date
Msg-id BANLkTi=vY-MG_eyyONEqPj4uy3fOY=P0qg@mail.gmail.com
In response to Re: Proposal: q-gram GIN and GiST indexes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal: q-gram GIN and GiST indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Apr 5, 2011 at 5:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I am probably being stupid here, but doesn't the number of links to
> rows grow proportionately to the number of n-grams?
The number of links to rows grows in proportion to the total number of extracted q-grams, not to the number of unique q-grams. If the same q-gram occurs more than once within a single indexed value, the number of links is reduced, but that is rare.
Let's consider a simple example. Two rows contain the strings 'aaa' and 'aaab'. We extract the 3-gram 'aaa' from the first string and the 3-grams 'aaa' and 'aab' from the second string (for simplicity, there is no padding here). The GIN index will contain a structure that can be represented like this:
'aaa' => 1, 2
'aab' => 2
We can see that there are 2 unique 3-grams, but 3 links to rows.
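
The same counting can be reproduced with a small sketch. This is only an illustration of the structure described above, not how GIN is implemented: the function names `extract_qgrams` and `build_index` are hypothetical, the index is a plain in-memory dictionary, and no padding is applied, as in the example.

```python
from collections import defaultdict


def extract_qgrams(text, q=3):
    """Return the distinct q-grams of text (no padding).

    A set is used because a q-gram repeated inside one value
    still produces only one link to that row.
    """
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def build_index(rows, q=3):
    """Map each q-gram to the sorted list of row ids containing it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for gram in extract_qgrams(text, q):
            index[gram].add(row_id)
    return {gram: sorted(ids) for gram, ids in index.items()}


rows = {1: "aaa", 2: "aaab"}
index = build_index(rows)

unique_qgrams = len(index)                       # number of index keys
links = sum(len(ids) for ids in index.values())  # number of (q-gram, row) pairs

for gram in sorted(index):
    print(repr(gram), "=>", index[gram])
print("unique q-grams:", unique_qgrams, "links:", links)
```

Adding a row such as 'aaaa' shows the rare case mentioned above: 'aaa' is extracted twice from that value, but the row contributes only one link.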

----
With best regards,
Alexander Korotkov.
