Greg Stark <stark@mit.edu> writes:
> I'm still curious how it would compare to call strxfrm and sort the
> resulting binary blobs.

In principle that should be a win; it's hard to believe that strxfrm
would have gotten into the standards if it were not a win for sorting
applications.

> I don't think the sortsupport stuff actually
> makes this any easier though. Since using it requires storing the
> binary blob somewhere I think the support would have to be baked into
> tuplesort (or hacked into the sortkey as an expr that was evaluated
> earlier somehow).

Well, obviously something has to be done, but I think it might be
possible to express this as another sortsupport API function rather than
doing anything as ugly as hardwiring strxfrm into the callers.

However, it occurred to me that we could pretty easily jury-rig
something that would give us an idea about the actual benefit available
here. To wit: make a C function that wraps strxfrm, basically
strxfrm(text) returns bytea. Then compare the performance of
ORDER BY text_col to ORDER BY strxfrm(text_col).
(You would need to have either both or neither of text and bytea
using the sortsupport code paths for this to be a fair comparison.)
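
For illustration only, here is a minimal standalone sketch of the pattern
that experiment would measure: transform each string once with strxfrm,
then compare the transformed keys bytewise instead of calling strcoll on
every comparison. It is plain C outside PostgreSQL, not the SQL-callable
wrapper itself (that would need the fmgr interface and would return a
bytea). The helper names xfrm_dup and xfrm_cmp are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of src transformed by strxfrm() under the
 * current LC_COLLATE setting, or NULL if allocation fails.
 * The caller frees the result.  (Hypothetical helper, not a
 * PostgreSQL function.)
 */
static char *
xfrm_dup(const char *src)
{
    /* With n == 0, strxfrm only reports the length it would need. */
    size_t      need = strxfrm(NULL, src, 0) + 1;
    char       *buf = malloc(need);

    if (buf != NULL)
        strxfrm(buf, src, need);
    return buf;
}

/*
 * Compare two keys produced by xfrm_dup().  The C standard guarantees
 * that strcmp on transformed strings has the same sign as strcoll on
 * the original strings.
 */
static int
xfrm_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}
```

A sort would call xfrm_dup once per input string and xfrm_cmp once per
comparison, so the locale-aware work is paid N times rather than
O(N log N) times, at the price of storing the transformed keys, which
are typically larger than the originals.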
One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums. Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around? I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it. If you cross your eyes a little bit, this is very much like
the strxfrm question...

regards, tom lane