Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> True. I'll bet you don't like ts_stat() either.
> It seems the right interface here wouldn't be too different from what's
> there. All we need is a SRF which takes a single tsvector and returns the set
> of words from it.
> Then you could do the aggregates yourself in SQL:
> SELECT count(distinct apodid) as ndoc,
>        count(*) as nentry,
>        element
>   FROM (
>         SELECT apodid, ts_elements(vector) AS element
>           FROM apod
>        ) GROUP BY element
I'm not sure that's so much cleaner than what's there. It's relying on
SRF-in-SELECT-list, which is doable at the C level but it's deprecated;
plus the SRF has to return a record, which makes the notation a bit
klugy --- (element).whatever in the upper select-list, and the GROUP BY
probably won't work the way you show here, either.
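To make the notational point concrete, here is a rough sketch of how the
query would probably have to be spelled. It assumes a hypothetical
ts_elements() that returns a composite value with a field named "word";
neither the function nor the field name exists, both are invented purely
for illustration.

```sql
-- Sketch only: ts_elements() and its "word" field are hypothetical.
SELECT (element).word AS word,          -- field selection needs the parentheses
       count(DISTINCT apodid) AS ndoc,
       count(*) AS nentry
  FROM (
        -- SRF in the select list: one output row per element of each vector
        SELECT apodid, ts_elements(vector) AS element
          FROM apod
       ) AS s                           -- alias for the sub-select
 GROUP BY (element).word;               -- group on the field, not the whole record
```

Grouping on the extracted field rather than on the record itself sidesteps
the question of whether the composite type can be compared at all.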
Another big problem is that the count(distinct) is going to cause the
whole thing to have pretty awful performance.
Not sure about a better way though...
regards, tom lane