Re: access to lexems or access to parsed elements - Mailing list pgsql-general

From Sushant Sinha
Subject Re: access to lexems or access to parsed elements
Date
Msg-id 1314290476.1846.13.camel@dragflick
Whole thread Raw
In response to access to lexems or access to parsed elements  ("Massa, Harald Armin" <harald@2ndQuadrant.de>)
Responses Re: access to lexems or access to parsed elements  ("Massa, Harald Armin" <harald@2ndQuadrant.de>)
List pgsql-general
Can this fit?

select plainto_tsquery('english', 'the quick brown fox jumped over the
lazy fox');
                   plainto_tsquery
-----------------------------------------------------
 'quick' & 'brown' & 'fox' & 'jump' & 'lazi' & 'fox'

-Sushant.

On Thu, 2011-08-25 at 18:21 +0200, Massa, Harald Armin wrote:
> I want to access the single words in a text. Better yet: the relevant
> words (i.e. without stop words) in a text.
>
>
> to_tsvector or casting gets me the lexems as a tsvector:
>
>
> select to_tsvector('the quick brown fox jumped over the lazy fox')
> ''brown':3 'fox':4,9 'jump':5 'lazi':8 'quick':2'
>
>
> And I would like to access "brown", "fox", "jump", "lazi" and "quick"
> as single values that I insert into another table.
>
>
> But: no luck with any tries to convert to records, arrays or similiar.
>
>
> Next step, the lesser-known-fts-functions:
>
>
> select ts_parse('default','the quick brown fox jumped over the lazy
> fox')
>
>
> (1,the)
> (12," ")
> (1,quick)
> [...]
> (1,fox)
>
>
> is a set-returning-function, giving me 17 records of type
> pseudo-record. Stopwords still in there, so what. But: No chance of
> accessing the second field in that record.
>
>
> Of course, there is allways:
>
>
> select  substr(what::text,position(',' in
> what::text)+1,char_length(what::text)-position(',' in what::text)-1)
> from
> (
> select ts_parse('default','the quick brown fox jumped over the lazy
> fox') as what
> )x
>
>
> but, comeon: having a two-field-record, casting it to one field of
> text, searching for the "," that separates the two fields and then
> split the one-field into two fields by substring?
>
>
> So, is there a better way to access
>
>
> a) the lexems of a tsvector
> b) the (unnamed) fields of a set-of-record-returning function
>
>
> ?
> Harald
>
>
> --
> Harald Armin Massa     www.2ndQuadrant.de
> PostgreSQL  Training, Services  and Support
>
> 2ndQuadrant Deutschland GmbH
> GF: Harald Armin Massa
> Amtsgericht Stuttgart, HRB 736399



pgsql-general by date:

Previous
From: "Massa, Harald Armin"
Date:
Subject: access to lexems or access to parsed elements
Next
From: Scott Marlowe
Date:
Subject: Re: rollback doubt and connection to remoteDB