Re: gsoc, oprrest function for text search take 2 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: gsoc, oprrest function for text search take 2
Date
Msg-id 489FFB08.6050709@enterprisedb.com
In response to Re: gsoc, oprrest function for text search take 2  (Jan Urbański <j.urbanski@students.mimuw.edu.pl>)
Responses Re: gsoc, oprrest function for text search take 2  (Jan Urbański <j.urbanski@students.mimuw.edu.pl>)
List pgsql-hackers
Jan Urbański wrote:
> Heikki Linnakangas wrote:
>> Jan Urbański wrote:
>>> Another thing are cstring_to_text_with_len calls. I'm doing them so I 
>>> can use bttextcmp in bsearch(). I think I could come up with a 
>>> dedicated function to return text Datums and WordEntries (read: 
>>> non-NULL terminated strings with a given length).
>>
>> Just keep them as cstrings and use strcmp. We're only keeping the 
>> array sorted so that we can binary search it, so we don't need proper 
>> locale-dependent collation. Note that we already assume that two 
>> strings ('text's) are equal if and only if their binary 
>> representations are equal (texteq() uses strcmp).
> 
> OK, I got rid of cstring->text calls and memory contexts as I went 
> through it. The only tiny ugliness is that there's one function used for 
> qsort() and another for bsearch(), because I'm sorting an array of texts 
> (from pg_statistic) and I'm binary searching for a lexeme (non-NULL 
> terminated string with length).

It would be nice to clean that up a bit, so that a single comparison 
function serves both qsort() and bsearch(). I think you could convert the 
lexeme to a TextFreq, or make the TextFreq.element a "text *" instead of 
Datum (i.e., detoast it with PG_DETOAST_DATUM while you build the array 
for qsort).
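
To illustrate the idea of one comparator for both calls, here is a minimal standalone sketch. It does not use the PostgreSQL headers: LexemeFreq is a hypothetical stand-in for the patch's TextFreq, holding a length-counted (not NUL-terminated) string, and compare_lexemes and lookup_frequency are names invented for this example. Because the strings carry a length instead of a terminator, the sketch uses memcmp with a length tie-break in place of strcmp; the ordering is still plain bytewise, with no locale involved.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for TextFreq: a length-counted string that is
 * not necessarily NUL-terminated, plus its frequency.
 */
typedef struct
{
    const char *data;
    size_t      len;
    double      frequency;
} LexemeFreq;

/*
 * Bytewise comparison with no locale awareness. Compares the common
 * prefix with memcmp; if the prefixes are equal, the shorter string
 * sorts first. Usable as the callback for both qsort() and bsearch().
 */
static int
compare_lexemes(const void *a, const void *b)
{
    const LexemeFreq *la = a;
    const LexemeFreq *lb = b;
    size_t      minlen = la->len < lb->len ? la->len : lb->len;
    int         cmp = memcmp(la->data, lb->data, minlen);

    if (cmp != 0)
        return cmp;
    if (la->len == lb->len)
        return 0;
    return la->len < lb->len ? -1 : 1;
}

/*
 * Look up a lexeme in an array previously sorted with compare_lexemes.
 * The search key is wrapped in the same struct type as the array
 * elements, so the same comparator applies. Returns the stored
 * frequency, or -1.0 if the lexeme is not present.
 */
static double
lookup_frequency(const LexemeFreq *sorted, size_t n,
                 const char *lexeme, size_t len)
{
    LexemeFreq  key = {lexeme, len, 0.0};
    const LexemeFreq *hit;

    hit = bsearch(&key, sorted, n, sizeof(LexemeFreq), compare_lexemes);
    return hit ? hit->frequency : -1.0;
}
```

The array would be sorted once with qsort(array, n, sizeof(LexemeFreq), compare_lexemes) and then probed with lookup_frequency for each lexeme of the query. Wrapping the key in the element type is what removes the need for a second comparison function.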

> My mediocre gprof skills got me:
>                 0.00    0.22       5/5           OidFunctionCall4 [37]
> [38]    28.4    0.00    0.22       5         tssel [38]
>                 0.00    0.17       5/5 get_restriction_variable [40]
>                 0.03    0.01       5/10          pg_qsort [60]
>                 0.00    0.00       5/5           get_attstatsslot [139]
> 
> Hopefully that says that the qsort() overhead is small compared to 
> munging through the planner Node.

I'd like to see a little bit more testing of that. I can't read gprof 
myself, so the above doesn't give me much confidence. I use oprofile, 
which I find is much simpler to use.

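I think the worst-case scenario is with statistics_target set to its 
maximum, with the simplest possible query and the simplest possible 
tsquery. That makes the array to sort as large as possible while keeping 
the rest of the planning work as small as possible.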

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

