Thread: ts_count

ts_count

From
Andrew Dunstan
Date:
One of our PostgreSQL Experts Inc customers wanted a function to count 
all the occurrences of terms in a tsquery in a tsvector. This has been 
written as a loadable module function, and initial testing shows it is 
working well. With the client's permission we are releasing the code - 
it's available at <https://github.com/pgexperts/ts_count>. The actual 
new code involved here is tiny, some of the code is C&P'd from tsrank.c 
and much of the rest is boilerplate.

A snippet from the regression test:

   select ts_count(to_tsvector('managing managers manage peons managerially'),
                   to_tsquery('managers | peon'));
    ts_count
   ----------
           4
 

We'd like to add something like this for 9.2, so I'd like to get the API 
agreed and then I'll prepare a patch and submit it for the next CF.
 

Comments?

cheers

andrew



Re: ts_count

From
Oleg Bartunov
Date:
Well, there are several functions available around tsearch2, so I suggest
that somebody collect all of them and create one extension - ts_addon.
For example, these are what I remember:
1. tsvector2array
2. noccurences(tsvector, tsquery) - like your ts_count
3. nmatches(tsvector, tsquery) - # of matched lexemes in query
Of course, we need to think about better names for functions, since
ts_count is a bit ambiguous.
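
The difference between the two counting functions in the list above can be
sketched in Python. This is only an illustration of the semantics as
described here, not the code of any of these functions: the dict-based model
of a tsvector and the names count_occurrences and count_matches are
assumptions made for the example.

```python
# Toy model (an assumption for illustration): a tsvector is a dict mapping
# each lexeme to the list of positions where it occurs, and a tsquery is
# reduced to the set of lexemes it mentions (operators are ignored).

def count_occurrences(tsvector, query_lexemes):
    """Total number of positions held by query lexemes (the ts_count idea)."""
    return sum(len(tsvector.get(lexeme, [])) for lexeme in query_lexemes)


def count_matches(tsvector, query_lexemes):
    """Number of distinct query lexemes found at least once (the nmatches idea)."""
    return sum(1 for lexeme in query_lexemes if lexeme in tsvector)


doc = {"cat": [1, 4, 7], "dog": [2], "bird": [9]}
query = {"cat", "dog", "fish"}

print(count_occurrences(doc, query))  # 4: three "cat" positions plus one "dog"
print(count_matches(doc, query))      # 2: "cat" and "dog" match, "fish" does not
```

The same document and query give different answers under the two
definitions, which is why a name like ts_count needs to say which one it
means.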


Oleg

On Sat, 4 Jun 2011, Andrew Dunstan wrote:

>
> One of our PostgreSQL Experts Inc customers wanted a function to count all 
> the occurrences of terms in a tsquery in a tsvector. This has been written as 
> a loadable module function, and initial testing shows it is working well. 
> With the client's permission we are releasing the code - it's available at 
> <https://github.com/pgexperts/ts_count>. The actual new code involved here is 
> tiny, some of the code is C&P'd from tsrank.c and much of the rest is 
> boilerplate.
>
> A snippet from the regression test:
>
>
>   select ts_count(to_tsvector('managing managers manage peons managerially'),
>                    to_tsquery('managers | peon'));
>     ts_count
>   ----------
>            4
>
> We'd like to add something like this for 9.2, so I'd like to get the API 
> agreed and then I'll prepare a patch and submit it for the next CF.
>
> Comments? cheers andrew
>
>
>
    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: ts_count

From
Andrew Dunstan
Date:

On 06/04/2011 04:51 PM, Oleg Bartunov wrote:
> Well, there are several functions available around tsearch2. so I suggest
> somebody to collect all of them and create one extension - ts_addon.
> For example, these are what I remember:
> 1. tsvector2array
> 2. noccurences(tsvector, tsquery) - like your ts_count
> 3. nmatches(tsvector, tsquery) - # of matched lexems in query
> Of course, we need to think about better names for functions, since
> ts_count is a bit ambiguous.
>
>

Getting agreed names was one reason for posting. I don't know why these 
need to be an extension. I think they are of sufficiently general 
interest (and sufficiently lightweight) that we could just build them in.

cheers

andrew


Re: ts_count

From
Alvaro Herrera
Date:
Excerpts from Andrew Dunstan's message of sáb jun 04 08:47:02 -0400 2011:

> A snippet from the regression test:
> 
> 
>     select ts_count(to_tsvector('managing managers manage peons managerially'),
>                      to_tsquery('managers | peon'));
>       ts_count
>     ----------
>              4

Err, shouldn't this return 5?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: ts_count

From
Andrew Dunstan
Date:

On 06/04/2011 08:59 PM, Alvaro Herrera wrote:
> Excerpts from Andrew Dunstan's message of sáb jun 04 08:47:02 -0400 2011:
>
>> A snippet from the regression test:
>>
>>
>>      select ts_count(to_tsvector('managing managers manage peons managerially'),
>>                       to_tsquery('managers | peon'));
>>        ts_count
>>      ----------
>>               4
> Err, shouldn't this return 5?

No. 'managerially' doesn't get the same stemming.
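
To spell out the arithmetic, here is a small Python sketch. The stems in it
are hard-coded assumptions about what the English configuration produces for
these words, not output captured from a server, and ts_count_sketch is a
hypothetical name for the example; running to_tsvector on the same string
shows the real lexemes.

```python
# Hypothetical stems, hard-coded for illustration (assumption: this is
# roughly what the English text search configuration yields for these words).
STEMS = {
    "managing": "manag",
    "managers": "manag",
    "manage": "manag",
    "peons": "peon",
    "peon": "peon",
    "managerially": "manageri",  # a different lexeme, so it is not counted
}


def ts_count_sketch(document, query_terms):
    """Count document words whose stem equals the stem of any query term."""
    query_lexemes = {STEMS[term] for term in query_terms}
    return sum(1 for word in document.split() if STEMS[word] in query_lexemes)


result = ts_count_sketch(
    "managing managers manage peons managerially",
    ["managers", "peon"],
)
print(result)  # 4: three words stem to "manag", one to "peon"
```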

cheers

andrew


Re: ts_count

From
Andrew Dunstan
Date:

On 06/04/2011 04:51 PM, Oleg Bartunov wrote:
> Well, there are several functions available around tsearch2. so I suggest
> somebody to collect all of them and create one extension - ts_addon.
> For example, these are what I remember:
> 1. tsvector2array
> 2. noccurences(tsvector, tsquery) - like your ts_count
> 3. nmatches(tsvector, tsquery) - # of matched lexems in query
> Of course, we need to think about better names for functions, since
> ts_count is a bit ambiguous.
>
>
>

Oleg, are you doing this? I'd rather this stuff didn't get dropped on 
the floor.

cheers

andrew