Re: PATCH: CITEXT 2.0 v2 - Mailing list pgsql-hackers

From David E. Wheeler
Subject Re: PATCH: CITEXT 2.0 v2
Date
Msg-id F8B7A37C-45E7-416A-9AFC-714404A13627@kineticode.com
Whole thread Raw
In response to Re: PATCH: CITEXT 2.0 v2  ("David E. Wheeler" <david@kineticode.com>)
List pgsql-hackers
No, *really*

Sheesh, sorry.

David




On Jul 7, 2008, at 16:26, David E. Wheeler wrote:

> And here is the script. D'oh!
>
> Thanks,
>
> David
>
> <try.sql>
>
>
> On Jul 7, 2008, at 16:24, David E. Wheeler wrote:
>
>> On Jul 7, 2008, at 08:01, Andrew Dunstan wrote:
>>
>>> What does still bother me is its performance. I'd like to know if
>>> any measurement has been done of using citext vs. a functional
>>> index on lower(foo).
>>
>> Okay, here's a start. The attached script inserts random strings of
>> 1-10 space-delimited words into text and citext columns, and then
>> compares the performance of queries with and without indexes. The
>> output for me is as follows:
>>
>> Loading words from dictionary.
>> Inserting into the table.
>>
>> Test =.
>> SELECT * FROM try WHERE LOWER(text) = LOWER('food');
>> Time: 254.254 ms
>> SELECT * FROM try WHERE citext = 'food';
>> Time: 288.535 ms
>>
>> Test LIKE and ILIKE
>> SELECT * FROM try WHERE LOWER(text) LIKE LOWER('C%');
>> Time: 209.385 ms
>> SELECT * FROM try WHERE citext ILIKE 'C%';
>> Time: 236.186 ms
>> SELECT * FROM try WHERE citext LIKE 'C%';
>> Time: 235.818 ms
>>
>> Adding indexes...
>>
>> Test =.
>> SELECT * FROM try WHERE LOWER(text) = LOWER('food');
>> Time: 1.260 ms
>> SELECT * FROM try WHERE citext = 'food';
>> Time: 277.755 ms
>>
>> Test LIKE and ILIKE
>> SELECT * FROM try WHERE LOWER(text) LIKE LOWER('C%');
>> Time: 209.073 ms
>> SELECT * FROM try WHERE citext ILIKE 'C%';
>> Time: 238.430 ms
>> SELECT * FROM try WHERE citext LIKE 'C%';
>> Time: 238.685 ms
>> benedict%
>>
>> So for some reason, after adding the indexes, the queries against
>> the CITEXT column aren't using them. Furthermore, the `lower(text)
>> LIKE lower(?)` query isn't using *its* index. Huh?
>>
>> So this leaves me with two questions:
>>
>> 1. For what reason would the query against the citext column *not*
>> use the index?
>>
>> 2. Is there some way to get the CITEXT index to behave like a
>> LOWER() index, that is, so that its value is stored using the
>> result of the str_tolower() function, thus removing some of the
>> overhead of converting the values for each row fetched from the
>> index? (Does this question make any sense?)
>>
>> Thanks,
>>
>> David
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Attachment

pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: New relkind (was Re: Exposing quals)
Next
From: Tom Lane
Date:
Subject: Re: PATCH: CITEXT 2.0