Re: syntax for reaching into records, specifically ts_stat results - Mailing list pgsql-hackers
From | Oleg Bartunov |
---|---|
Subject | Re: syntax for reaching into records, specifically ts_stat results |
Date | |
Msg-id | Pine.LNX.4.64.0812100024260.28443@sn.sai.msu.ru Whole thread Raw |
In response to | Re: syntax for reaching into records, specifically ts_stat results (Oleg Bartunov <oleg@sai.msu.su>) |
List | pgsql-hackers |
ok, here is a function ( credits to Teodor ) CREATE OR REPLACE FUNCTION ts_stat(tsvector, OUT word text, OUT ndoc integer, OUT nentry integer) RETURNS SETOF record AS $$ SELECT ts_stat('SELECT ' || quote_literal( $1::text ) || '::tsvector'); $$ LANGUAGE SQL RETURNS NULL ON NULL INPUT IMMUTABLE; use it like select id, (ts_stat(fts)).* from apod where id=1; Oleg On Tue, 9 Dec 2008, Oleg Bartunov wrote: > On Tue, 9 Dec 2008, Dan Chak wrote: > >> Oleg, >> >> This syntax works fine until I also want to get the "sentence_id" column in >> there as well, so that I can differentiate one set of ts_stat results from >> another. With the syntax where ts_stat is treated like a table, it isn't >> possible to run ts_stat separately on multiple tsvectors as I'm doing >> below. >> >> Is there some generic record access syntax that I can use? > > write function > >> >> Thanks, >> Dan >> >> On Dec 9, 2008, at 3:04 PM, Oleg Bartunov wrote: >> >>> try select * from ts_stat(....) >>> btw, performance of ts_stat() was greatly improved in 8.4. >>> >>> Oleg >>> On Tue, 9 Dec 2008, Dan Chak wrote: >>> >>>> Dear Postgres Folk, >>>> >>>> In working with tsvectors (added in 8.3), I've come to a place where my >>>> syntax-fu has failed me. I've resorted to turning a result set of >>>> records into strings so that I can regexp out the record fields I need, >>>> rather than access them directly, as I'm sure it's possible to do with >>>> the right syntactic formulation. Although my solution works, I'm sure >>>> it's much less efficient than it could be, and hope someone on the list >>>> can help do this the right way. >>>> >>>> Basically, I would like to transpose a series of tsvectors (one per row) >>>> into columns. E.g., as tsvects, I have this: >>>> >>>> test=# select * from tsvects; >>>> sentence_id | tsvect >>>> -------------+------------------------------ >>>> 1 | 'fox':3 'brown':2 'quick':1 >>>> 2 | 'lazi':1 'eleph':3 'green':2 >>>> >>>> Instead I want this: >>>> >>>> sentence_id | word | freq >>>> -------------+-------+------ >>>> 1 | fox | 1 >>>> 1 | brown | 1 >>>> 1 | quick | 1 >>>> 2 | lazi | 1 >>>> 2 | eleph | 1 >>>> 2 | green | 1 >>>> >>>> I am able to generate this with the following view, but the problem is >>>> that to create it, I must first cast the ts_stat results to a string, and >>>> then regexp out the pertinent pieces: >>>> >>>> create or replace view words as >>>> select sentence_id, >>>> substring(stat from '^\\(([^,]+),') as word, >>>> substring(stat from ',([^,]+)\\)$') as freq >>>> from (select sentence_id, >>>> ts_stat('select tsvect from tsvects where sentence_id = ' || >>>> tsvects.sentence_id)::text as stat >>>> from tsvects >>>> ) as foo; >>>> >>>> It seems like there should be a way to access fields in the records >>>> returned from ts_stat directly, but I can't figure out how. Here's the >>>> result of the subquery: >>>> >>>> test=# select sentence_id, >>>> ts_stat('select tsvect from tsvects where sentence_id = ' || >>>> tsvects.sentence_id)::text as stat >>>> from tsvects; >>>> sentence_id | stat >>>> -------------+------------- >>>> 1 | (fox,1,1) >>>> 1 | (brown,1,1) >>>> 1 | (quick,1,1) >>>> 2 | (lazi,1,1) >>>> 2 | (eleph,1,1) >>>> 2 | (green,1,1) >>>> (6 rows) >>>> >>>> If I try to get at the elements (which I believe are named 'word', >>>> 'ndoc', 'nentry'), I get a variety of syntax errors: >>>> >>>> test=# select sentence_id, >>>> test-# stat['word'], >>>> test-# stat['nentry'] >>>> test-# from (select sentence_id, >>>> test(# ts_stat('select tsvect from tsvects where >>>> sentence_id = ' || >>>> test(# tsvects.sentence_id) as stat >>>> test(# from tsvects >>>> test(# ) as foo; >>>> ERROR: cannot subscript type record because it is not an array >>>> >>>> If I say stat.word (instead of subscripting), I get 'missing FROM-clause >>>> entry for table "stat"'. If I say foo.stat.word, I get 'ERROR: schema >>>> "foo" does not exist'. >>>> >>>> Any ideas on how to get into these records with resorting to text >>>> parsing? >>>> >>>> Thanks, >>>> Dan >>>> >>>> >>> >>> Regards, >>> Oleg >>> _____________________________________________________________ >>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), >>> Sternberg Astronomical Institute, Moscow University, Russia >>> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ >>> phone: +007(495)939-16-83, +007(495)939-23-83 >> >> >> > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
pgsql-hackers by date: