Re: Collect frequency statistics for arrays - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Collect frequency statistics for arrays
Msg-id 11732.1330720601@sss.pgh.pa.us
In response to Re: Collect frequency statistics for arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Collect frequency statistics for arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> ... So my preference is to align the two
> definitions of STATISTIC_KIND_MCELEM by adding a null-element frequency
> to tsvector's usage (where it'll always be zero) and getting rid of the
> average distinct element count here.

Actually, there's a way we can do this without code changes in the
tsvector stuff.  Since the number of MCELEM stanumbers entries that
provide frequencies for stavalues entries is necessarily equal to the
length of stavalues, we could define stanumbers as containing those
matching entries, then the two min/max entries, then an *optional* entry
for the frequency of null elements (with the frequency presumed to be
zero if omitted).  This would be unambiguous given access to stavalues.
I'm not sure, though, whether making the null frequency optional would
introduce complexity elsewhere that outweighs not having to touch the
tsvector code.
        regards, tom lane
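
For illustration only, here is a minimal sketch of how a reader of such a
slot might tell the two layouts apart, using nothing but the length of
stavalues.  The struct and function names are hypothetical and not taken
from the PostgreSQL sources, and the order of the two trailing entries
(minimum first, then maximum) is an assumption; the message only says
"two min/max entries".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decoded view of an MCELEM stanumbers array. */
typedef struct McelemNumbers
{
    const float *elem_freqs;     /* nvalues frequencies, one per stavalues entry */
    float        min_freq;       /* assumed to be stored first of the two */
    float        max_freq;       /* assumed to be stored second */
    float        null_elem_freq; /* 0 when the optional entry is omitted */
} McelemNumbers;

/*
 * Split stanumbers into its parts, given the length of stavalues.
 * Accepts nvalues + 2 entries (no null-element frequency) or
 * nvalues + 3 entries (null-element frequency present).
 * Returns false for any other length.
 */
static bool
decode_mcelem_numbers(const float *numbers, size_t nnumbers,
                      size_t nvalues, McelemNumbers *out)
{
    assert(numbers != NULL && out != NULL);

    if (nnumbers != nvalues + 2 && nnumbers != nvalues + 3)
        return false;

    out->elem_freqs = numbers;
    out->min_freq = numbers[nvalues];
    out->max_freq = numbers[nvalues + 1];

    /* The optional trailing entry defaults to zero when absent. */
    out->null_elem_freq =
        (nnumbers == nvalues + 3) ? numbers[nvalues + 2] : 0.0f;

    return true;
}
```

With two stavalues entries, a five-entry stanumbers array would carry an
explicit null-element frequency, while a four-entry one would be read as
having a null-element frequency of zero.  The length check is the only
extra branch a consumer needs, which gives some feel for the "complexity
elsewhere" being weighed above.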

