Re: gsoc, oprrest function for text search take 2 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: gsoc, oprrest function for text search take 2
Date
Msg-id 6984.1220393343@sss.pgh.pa.us
In response to Re: gsoc, oprrest function for text search take 2  (Jan Urbański <j.urbanski@students.mimuw.edu.pl>)
Responses Re: gsoc, oprrest function for text search take 2  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> Pre-sorting introduced one problem (see XXX in code): it's not easy 
> anymore to get the minimal frequency of MCELEM values. I was using it to 
> assert that the selectivity of a tsquery node containing a lexeme not in 
> MCELEM is no more than min(MCELEM freqs) / 2. That's only significant 
> when the minimum frequency is less than DEFAULT_TS_SEL * 2, so I'm kind 
> of inclined to ignore it and maybe drop a comment in the code that this 
> may be a potential problem.

This is easily fixed: nothing says that a pg_statistic slot must contain
the same number of Values and Numbers.  Give the numbers array one extra
element and store the min frequency there.
Maybe it'd be worth having 2 extra elements and dropping the max in,
as well.  I don't immediately have a use for it, but it'll be a lot
harder to add it later if we don't put it in now.
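
A minimal standalone sketch of that layout, not the actual patch: the
numbers array gets two slots beyond the per-element frequencies, holding
the min and then the max, so the min is still cheap to find after the
values are pre-sorted. The function names fill_numbers and
missing_lexeme_sel are hypothetical, and the value given to DEFAULT_TS_SEL
is an assumption (the thread only names the constant).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the thread does not state it. */
#define DEFAULT_TS_SEL 0.005

/*
 * Fill numbers[] for nvalues elements.  The caller must provide room for
 * nvalues + 2 entries: one frequency per element, followed by the minimum
 * and then the maximum of those frequencies.
 */
void
fill_numbers(const double *freqs, size_t nvalues, double *numbers)
{
    double      minfreq;
    double      maxfreq;
    size_t      i;

    assert(nvalues > 0);
    minfreq = maxfreq = freqs[0];
    for (i = 0; i < nvalues; i++)
    {
        numbers[i] = freqs[i];
        if (freqs[i] < minfreq)
            minfreq = freqs[i];
        if (freqs[i] > maxfreq)
            maxfreq = freqs[i];
    }
    numbers[nvalues] = minfreq;         /* first extra element */
    numbers[nvalues + 1] = maxfreq;     /* second extra element */
}

/*
 * Selectivity guess for a lexeme that is absent from the stored elements:
 * the default, capped at half the minimum stored frequency.  nnumbers is
 * the full array length, including the two extra elements.
 */
double
missing_lexeme_sel(const double *numbers, size_t nnumbers)
{
    double      cap;

    assert(nnumbers >= 3);
    cap = numbers[nnumbers - 2] / 2;    /* min frequency is second-to-last */
    return (DEFAULT_TS_SEL < cap) ? DEFAULT_TS_SEL : cap;
}
```

With this layout the reader of the slot never has to scan the frequencies;
it only needs the array length to locate the two trailing entries.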

> If nothing is fundamentally broken with this, I'll repeat my profiling 
> tests to see if anything has been gained.

I don't have much except minor stylistic gripes (like the ordering of
the functions in ts_selfuncs.c seeming a bit random).  One possibly
performance-relevant point is to use DatumGetTextPP for detoasting;
you've already paid the costs by using VARDATA_ANY etc, so you might
as well get the benefit.

Please fix the above and do the performance testing ...
        regards, tom lane

