Thread: Proposal: collect frequency statistics for arrays

Proposal: collect frequency statistics for arrays

From
Alexander Korotkov
Date:
Hackers,

I have the following proposal. Currently, the ts_typanalyze function accumulates frequency statistics for tsvector using a lossy counting technique, but no frequency statistics are collected over arrays. I'm going to generalize ts_typanalyze so that it collects statistics for arrays too. ts_typanalyze internally uses lexeme comparison and hashing. I'm going to use functions from the default btree and hash opclasses of the array element type in this capacity. The collected frequency statistics for arrays can be used for selectivity estimation of the && and @> operators.
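
To illustrate the idea, here is a rough sketch of lossy counting applied to array elements, followed by a naive selectivity estimate built on the result. It is written in Python for brevity and is not the PostgreSQL C code; the names lossy_count and estimate_selectivity are hypothetical. Python's dict and set stand in for the hash and equality functions that would come from the element type's opclasses. The selectivity part assumes that elements occur independently and that elements missing from the statistics have frequency zero; both are simplifying assumptions of this sketch, not part of the proposal.

```python
import math


def lossy_count(arrays, epsilon=0.01):
    """Approximate per-row element frequencies using lossy counting.

    Returns (freqs, rows), where freqs maps an element to the fraction of
    arrays that contain it, for the elements that survived pruning.
    """
    width = math.ceil(1 / epsilon)   # bucket width
    table = {}                       # element -> [count, max possible undercount]
    bucket = 1                       # current bucket number
    seen = 0                         # elements processed so far
    rows = 0                         # arrays processed so far
    for arr in arrays:
        rows += 1
        for elem in set(arr):        # count each element once per array
            seen += 1
            if elem in table:
                table[elem][0] += 1
            else:
                table[elem] = [1, bucket - 1]
            if seen % width == 0:    # bucket boundary: drop rare entries
                table = {e: cd for e, cd in table.items()
                         if cd[0] + cd[1] > bucket}
                bucket += 1
    return {e: cd[0] / rows for e, cd in table.items()}, rows


def estimate_selectivity(freqs, query, op):
    """Naive estimate for 'column op query', assuming independent elements."""
    fs = [freqs.get(e, 0.0) for e in set(query)]  # unknown element -> 0 (assumption)
    if op == "@>":                   # row must contain every query element
        return math.prod(fs)
    if op == "&&":                   # row must contain at least one query element
        return 1.0 - math.prod(1.0 - f for f in fs)
    raise ValueError("unsupported operator: " + op)
```

For example, lossy_count([[1, 2], [1, 3], [1, 2, 2]]) yields per-row frequencies for 1, 2 and 3, and estimate_selectivity(freqs, [2, 3], "@>") multiplies the frequencies of 2 and 3.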

------
With best regards,
Alexander Korotkov.

Re: Proposal: collect frequency statistics for arrays

From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> I have the following proposal. Currently, the ts_typanalyze function accumulates
> frequency statistics for tsvector using a lossy counting technique, but no
> frequency statistics are collected over arrays. I'm going to generalize
> ts_typanalyze so that it collects statistics for arrays too. ts_typanalyze
> internally uses lexeme comparison and hashing. I'm going to use functions
> from the default btree and hash opclasses of the array element type in this
> capacity. The collected frequency statistics for arrays can be used for
> selectivity estimation of the && and @> operators.

It'd be better to just make a separate function for arrays, instead of
trying to kluge ts_typanalyze to the point where it'd cover both cases.
        regards, tom lane


Re: Proposal: collect frequency statistics for arrays

From
Alexander Korotkov
Date:
Thanks for the feedback on my proposal.
OK, I'll write it as a separate function. After that, I'll look at whether there is a way to unify the two without a kluge. If I don't find one, I'll propose a patch with the separate function.

------
With best regards,
Alexander Korotkov.