Re: sortsupport for text - Mailing list pgsql-hackers

From Tom Lane
Subject Re: sortsupport for text
Date
Msg-id 28117.1340210503@sss.pgh.pa.us
Whole thread Raw
In response to Re: sortsupport for text  (Peter Geoghegan <peter@2ndquadrant.com>)
Responses Re: sortsupport for text  (Robert Haas <robertmhaas@gmail.com>)
Re: sortsupport for text  (Peter Geoghegan <peter@2ndquadrant.com>)
Re: sortsupport for text  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
Peter Geoghegan <peter@2ndquadrant.com> writes:
> No, I'm suggesting it would probably be at least a bit of a win here
> to cache the constant, and only have to do a strxfrm() + strcmp() per
> comparison.

Um, have you got any hard evidence to support that notion?  The
traditional advice is that strcoll is faster than using strxfrm unless
the same strings are to be compared repeatedly.  I'm not convinced that
saving the strxfrm on just one side will swing the balance.

> The fact is that this is likely to be a fairly significant
> performance win, because strxfrm() is quite simply the way you're
> supposed to do collation-aware sorting, and is documented as such. For
> that reason, C standard library implementations should not be expected
> to emphasize its performance - they assume that you're using strxfrm()
> + their highly optimised strcmp()

Have you got any evidence in support of this claim, or is it just
wishful thinking about what's likely to be inside libc?  I'd also note
that any comparisons you may have seen about this are certainly not
accounting for the effects of data bloat from strxfrm (ie, possible
spill to disk, more merge passes, etc).

In any case, if you have to redefine the meaning of equality in order
to justify a performance patch, I'm prepared to walk away at the start.
The range of likely performance costs/benefits across different locales
and different implementations is so wide that if you can't show it to be
a win even with the strcmp tiebreaker, it's not likely to be a reliable
win without that.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: libpq compression
Next
From: Magnus Hagander
Date:
Subject: Re: libpq compression