Term positions in GIN fulltext index - Mailing list pgsql-hackers

Hello,
I'm using a GIN index on a text column of a big table. I use it to rank
the rows, but I also need the positions of one or more terms in each document
of a subset of documents. I suppose these positions are stored in the index,
since to_tsvector shows them as 'lexeme':{positions}.

I've searched and asked on the general PostgreSQL mailing list, and I gather
there is no simple way to get these term positions.

For example, take 2 rows of a 'docs' table with a text column 'text' (indexed with GIN):
'I get lexemes and I get term positions.'
'Did you get the positions ?'

I'd need a function like this:

select id_doc, term_positions(text, 'get') as positions from docs;

 id_doc | positions
--------+-----------
      1 | {2,6}
      2 | {3}
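
As an illustration of the behaviour described above, here is a minimal sketch of such a function. It is not an existing PostgreSQL function: the name term_positions and its signature are just the ones proposed in this message. It assumes PostgreSQL 9.6 or later, where unnest(tsvector) and tsvector_to_array() are available, and it recomputes the tsvector from the text column with the default text search configuration rather than reading anything from the GIN index.

```sql
-- Hypothetical helper, not part of PostgreSQL.
-- Assumes PostgreSQL 9.6+ for unnest(tsvector) and tsvector_to_array().
CREATE FUNCTION term_positions(doc text, term text)
RETURNS smallint[]
LANGUAGE sql
STABLE
AS $$
    -- Parse the document into lexemes with their positions,
    -- then keep the row whose lexeme matches the normalised term.
    SELECT u.positions
    FROM unnest(to_tsvector(doc)) AS u
    -- Normalise the search term the same way as the document,
    -- so that stemming is applied consistently on both sides.
    WHERE u.lexeme = ANY (tsvector_to_array(to_tsvector(term)))
$$;

-- Usage against the example table from this message:
SELECT id_doc, term_positions(text, 'get') AS positions FROM docs;
```

The sketch returns NULL when the term does not occur in the document or is a stop word, and only the positions of the first matching lexeme if the term normalises to several lexemes. Because it parses the text again for every row, it is only suitable for a small subset of documents.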

I'd like to add this function to my database, for experimental purposes.
I had a look at the source code but didn't find any example code that uses the GIN index;
I cannot figure out where the GIN index is read as a tsvector,
or where the '@@' operator gets the matching tsvectors for the terms of the tsquery.

Any help about where to start reading would be very welcome :)

Regards,
Yoann Moreau


