Re: texteq/byteaeq: avoid detoast [REVIEW] - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: texteq/byteaeq: avoid detoast [REVIEW]
Date
Msg-id 20110119082241.GB11804@svana.org
In response to Re: texteq/byteaeq: avoid detoast [REVIEW]  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Tue, Jan 18, 2011 at 10:03:01AM +0200, Heikki Linnakangas wrote:
>> That isn't ever going to happen, unless you'd like to give up hash joins
>> and hash aggregation on text values.
>
> You could canonicalize the string first in the hash function. I'm not
> sure if we have all the necessary information at hand there, but at
> least with some encoding/locale-specific support functions it'd be
> possible.

This is what strxfrm() was created for.

strcoll(a,b) == strcmp(strxfrm(a),strxfrm(b))

Sure there's a cost, the question is only how much and whether it makes
hash join unfeasible. I doubt it, since by definition it must be faster
than strcoll(). I suppose a test would be interesting.
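As a rough sketch of the idea (not code from any patch), the identity above can be read as "strcoll() and strcmp() on the transformed keys agree in sign", where strxfrm(a) is shorthand for the key that the real three-argument strxfrm(dest, src, n) writes out. The helper names xfrm_key, xfrm_compare and hash_collated, and the choice of FNV-1a as the hash, are assumptions made purely for illustration:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1. */
static int sign_of(int x)
{
    return (x > 0) - (x < 0);
}

/* Hypothetical helper: return a malloc'd strxfrm() key for s, or NULL. */
static char *xfrm_key(const char *s)
{
    /* With n == 0 strxfrm() only reports the length it would need. */
    size_t n = strxfrm(NULL, s, 0) + 1;
    char *key = malloc(n);

    if (key != NULL)
        strxfrm(key, s, n);
    return key;
}

/* Compare two strings through their transformed keys. */
static int xfrm_compare(const char *a, const char *b)
{
    char *ka = xfrm_key(a);
    char *kb = xfrm_key(b);
    int result;

    if (ka != NULL && kb != NULL)
        result = sign_of(strcmp(ka, kb));
    else
        result = sign_of(strcoll(a, b));    /* fallback if malloc failed */

    free(ka);
    free(kb);
    return result;
}

/*
 * Hypothetical hash for collated text: FNV-1a over the transformed key,
 * so two strings with identical keys get identical hash values.
 * Returns 0 if the key could not be allocated.
 */
static uint64_t hash_collated(const char *s)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a 64-bit offset basis */
    char *key = xfrm_key(s);

    if (key == NULL)
        return 0;
    for (const unsigned char *p = (const unsigned char *) key; *p != '\0'; p++)
    {
        h ^= *p;
        h *= 1099511628211ULL;              /* FNV-1a 64-bit prime */
    }
    free(key);
    return h;
}
```

The program has to call setlocale(LC_COLLATE, ...) first for any of this to differ from plain strcmp(). Timing hash_collated() against a plain hash of the raw bytes would be one way to run the test mentioned above.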

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle
