Re: sortsupport for text - Mailing list pgsql-hackers

From Tom Lane
Subject Re: sortsupport for text
Date
Msg-id 8191.1332083293@sss.pgh.pa.us
In response to Re: sortsupport for text  (Greg Stark <stark@mit.edu>)
Responses Re: sortsupport for text  (Robert Haas <robertmhaas@gmail.com>)
Re: sortsupport for text  (Peter Geoghegan <peter@2ndquadrant.com>)
Re: sortsupport for text  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
Greg Stark <stark@mit.edu> writes:
> I'm still curious how it would compare to call strxfrm and sort the
> resulting binary blobs.

In principle that should be a win; it's hard to believe that strxfrm
would have gotten into the standards if it were not a win for sorting
applications.

> I don't think the sortsupport stuff actually
> makes this any easier though. Since using it requires storing the
> binary blob somewhere I think the support would have to be baked into
> tuplesort (or hacked into the sortkey as an expr that was evaluated
> earlier somehow).

Well, obviously something has to be done, but I think it might be
possible to express this as another sortsupport API function rather than
doing anything as ugly as hardwiring strxfrm into the callers.

However, it occurred to me that we could pretty easily jury-rig
something that would give us an idea about the actual benefit available
here.  To wit: make a C function that wraps strxfrm, basically
strxfrm(text) returns bytea.  Then compare the performance of
ORDER BY text_col to ORDER BY strxfrm(text_col).

(You would need to have either both or neither of text and bytea
using the sortsupport code paths for this to be a fair comparison.)
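To illustrate the idea outside the backend, here is a minimal standalone sketch in plain C. It is not the proposed strxfrm(text) SQL function and not PostgreSQL code: the names XfrmItem, xfrm_key, cmp_keys and sort_by_xfrm are made up for this example. It computes each strxfrm() blob once, then sorts with plain strcmp() on the blobs, which by the C standard orders strings the same way strcoll() would on the originals in the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One sort entry: the original string plus its strxfrm() key. */
typedef struct
{
    const char *orig;
    char       *key;
} XfrmItem;

/* Return a malloc'd strxfrm() key for src, or NULL if out of memory. */
static char *
xfrm_key(const char *src)
{
    /* With n == 0, strxfrm only reports the key length (excluding NUL). */
    size_t  len = strxfrm(NULL, src, 0);
    char   *key = malloc(len + 1);

    if (key != NULL)
        strxfrm(key, src, len + 1);
    return key;
}

/* qsort comparator: bytewise comparison of the precomputed keys. */
static int
cmp_keys(const void *a, const void *b)
{
    return strcmp(((const XfrmItem *) a)->key,
                  ((const XfrmItem *) b)->key);
}

/*
 * Reorder strs[0..n-1] into the current locale's collation order,
 * transforming each string once instead of once per comparison.
 * Returns 0 on success, -1 on allocation failure (strs is then unchanged).
 */
static int
sort_by_xfrm(const char **strs, size_t n)
{
    XfrmItem *items;
    size_t    i;
    int       rc = 0;

    assert(strs != NULL || n == 0);
    if (n == 0)
        return 0;

    items = calloc(n, sizeof(XfrmItem));
    if (items == NULL)
        return -1;

    for (i = 0; i < n; i++)
    {
        items[i].orig = strs[i];
        items[i].key = xfrm_key(strs[i]);
        if (items[i].key == NULL)
        {
            rc = -1;
            break;
        }
    }

    if (rc == 0)
    {
        qsort(items, n, sizeof(XfrmItem), cmp_keys);
        for (i = 0; i < n; i++)
            strs[i] = items[i].orig;
    }

    /* Keys never assigned are still NULL from calloc, so free() is safe. */
    for (i = 0; i < n; i++)
        free(items[i].key);
    free(items);
    return rc;
}
```

A caller would run setlocale(LC_COLLATE, "") first to pick up the environment's collation; without that the program stays in the "C" locale, where the keys are expected to order the same as the raw bytes. Note that the sketch keeps every transformed key in memory for the duration of the sort, which is the same space-for-comparison-cycles trade-off under discussion.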

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums.  Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around?  I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it.  If you cross your eyes a little bit, this is very much like
the strxfrm question...
        regards, tom lane

