Re: syntax for reaching into records, specifically ts_stat results - Mailing list pgsql-hackers

From Oleg Bartunov
Subject Re: syntax for reaching into records, specifically ts_stat results
Date
Msg-id Pine.LNX.4.64.0812092354550.28443@sn.sai.msu.ru
Whole thread Raw
In response to Re: syntax for reaching into records, specifically ts_stat results  (Dan Chak <chak@MIT.EDU>)
Responses Re: syntax for reaching into records, specifically ts_stat results
List pgsql-hackers
On Tue, 9 Dec 2008, Dan Chak wrote:

> Oleg,
>
> This syntax works fine until I also want to get the "sentence_id" column in 
> there as well, so that I can differentiate one set of ts_stat results from 
> another.  With the syntax where ts_stat is treated like a table, it isn't 
> possible to run ts_stat separately on multiple tsvectors as I'm doing below.
>
> Is there some generic record access syntax that I can use?

write function

>
> Thanks,
> Dan
>
> On Dec 9, 2008, at 3:04 PM, Oleg Bartunov wrote:
>
>> try select * from ts_stat(....)
>> btw, performance of ts_stat() was greatly improved in 8.4.
>> 
>> Oleg
>> On Tue, 9 Dec 2008, Dan Chak wrote:
>> 
>>> Dear Postgres Folk,
>>> 
>>> In working with tsvectors (added in 8.3), I've come to a place where my 
>>> syntax-fu has failed me.  I've resorted to turning a result set of records 
>>> into strings so that I can regexp out the record fields I need, rather 
>>> than access them directly, as I'm sure it's possible to do with the right 
>>> syntactic formulation.  Although my solution works, I'm sure it's much 
>>> less efficient than it could be, and hope someone on the list can help do 
>>> this the right way.
>>> 
>>> Basically, I would like to transpose a series of tsvectors (one per row) 
>>> into columns.  E.g., as tsvects, I have this:
>>> 
>>> test=# select * from tsvects;
>>> sentence_id |            tsvect
>>> -------------+------------------------------
>>>        1 | 'fox':3 'brown':2 'quick':1
>>>        2 | 'lazi':1 'eleph':3 'green':2
>>> 
>>> Instead I want this:
>>> 
>>> sentence_id | word  | freq
>>> -------------+-------+------
>>>        1 | fox   | 1
>>>        1 | brown | 1
>>>        1 | quick | 1
>>>        2 | lazi  | 1
>>>        2 | eleph | 1
>>>        2 | green | 1
>>> 
>>> I am able to generate this with the following view, but the problem is 
>>> that to create it, I must first cast the ts_stat results to a string, and 
>>> then regexp out the pertinent pieces:
>>> 
>>> create or replace view words as
>>> select sentence_id,
>>>    substring(stat from '^\\(([^,]+),') as word,
>>>    substring(stat from ',([^,]+)\\)$') as freq
>>> from (select sentence_id,
>>>            ts_stat('select tsvect from tsvects where sentence_id = ' ||
>>>               tsvects.sentence_id)::text as stat
>>>       from tsvects
>>>    ) as foo;
>>> 
>>> It seems like there should be a way to access fields in the records 
>>> returned from ts_stat directly, but I can't figure out how.  Here's the 
>>> result of the subquery:
>>> 
>>> test=# select sentence_id,
>>>            ts_stat('select tsvect from tsvects where sentence_id = ' ||
>>>               tsvects.sentence_id)::text as stat
>>>       from tsvects;
>>> sentence_id |    stat
>>> -------------+-------------
>>>        1 | (fox,1,1)
>>>        1 | (brown,1,1)
>>>        1 | (quick,1,1)
>>>        2 | (lazi,1,1)
>>>        2 | (eleph,1,1)
>>>        2 | (green,1,1)
>>> (6 rows)
>>> 
>>> If I try to get at the elements (which I believe are named 'word', 'ndoc', 
>>> 'nentry'), I get a variety of syntax errors:
>>> 
>>> test=# select sentence_id,
>>> test-#        stat['word'],
>>> test-#        stat['nentry']
>>> test-#   from (select sentence_id,
>>> test(#                ts_stat('select tsvect from tsvects where 
>>> sentence_id = ' ||
>>> test(#                   tsvects.sentence_id) as stat
>>> test(#           from tsvects
>>> test(#        ) as foo;
>>> ERROR:  cannot subscript type record because it is not an array
>>> 
>>> If I say stat.word (instead of subscripting), I get 'missing FROM-clause 
>>> entry for table "stat"'.  If I say foo.stat.word, I get 'ERROR:  schema 
>>> "foo" does not exist'.
>>> 
>>> Any ideas on how to get into these records with resorting to text parsing?
>>> 
>>> Thanks,
>>> Dan
>>> 
>>> 
>>
>>     Regards,
>>         Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>
>
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Quick patch: Display sequence owner
Next
From: Martin Pihlak
Date:
Subject: Re: SQL/MED compatible connection manager