Comparing toasted data (was improve Chinese locale performance) - Mailing list pgsql-hackers

From Greg Stark
Subject Comparing toasted data (was improve Chinese locale performance)
Date
Msg-id CAM-w4HNRfb6vu6A9VYGoZcvxC7Z5hJBorO6k2Eg4=RK=t0+-OQ@mail.gmail.com
List pgsql-hackers
On Sun, Jul 28, 2013 at 10:39 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> The main issue with strxfrm() is its lame API. If it supported
> returning prefixes you'd be set, but as it is you need >10MB of memory
> just to transform a 10MB string, even if only the first few characters
> would be enough to sort...

It occurs to me that the same issue affects our handling of toast
data. If you compare a toasted bytea (or a string in the C locale) it
would be nice to fetch just the first chunk and start the comparison.
Only if you reach the end of that chunk without finding a difference
should the next chunk be needed. Even compressed data need not be
decompressed past the point where the comparison is decided. If the
other datum is not toasted then you even know up front the worst case
for how much needs to be detoasted: roughly the length of the
untoasted datum.
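
To make the idea concrete, here is a rough sketch in Python rather than
backend C. It is not PostgreSQL code: compare_chunked, iter_chunks and
fetch_chunk are names made up for illustration, and the chunks are
plain byte strings standing in for toast chunks. It only shows the
shape of a memcmp-style comparison that pulls chunks on demand and
stops at the first difference.

```python
from typing import Callable, Iterator


def iter_chunks(fetch_chunk: Callable[[int], bytes], n_chunks: int) -> Iterator[bytes]:
    """Yield chunks lazily; fetch_chunk(i) is only called when chunk i is needed."""
    for i in range(n_chunks):
        yield fetch_chunk(i)


def compare_chunked(chunks: Iterator[bytes], other: bytes) -> int:
    """Bytewise comparison of a chunked datum against an in-memory one.

    Returns -1, 0 or 1, like memcmp followed by a length tiebreak.
    """
    offset = 0
    for chunk in chunks:
        piece = other[offset:offset + len(chunk)]
        common = min(len(chunk), len(piece))
        if chunk[:common] != piece[:common]:
            # Decided inside this chunk; later chunks are never fetched.
            return -1 if chunk[:common] < piece[:common] else 1
        if len(piece) < len(chunk):
            # 'other' ran out first, so it is a proper prefix of the chunked datum.
            return 1
        offset += len(chunk)
    # Every chunk matched; the chunked datum is a prefix of (or equal to) 'other'.
    return 0 if offset == len(other) else -1


# Hypothetical stored value split into three chunks.
stored = [b"aaaa", b"bbbb", b"cccc"]
fetched = []


def fetch_chunk(i: int) -> bytes:
    fetched.append(i)  # record which chunks were actually read
    return stored[i]


result = compare_chunked(iter_chunks(fetch_chunk, len(stored)), b"aaxx")
# result is -1 and fetched is [0]: only the first chunk was needed.
```

Because the loop stops as soon as the untoasted side runs out, the
number of chunks read is bounded by the length of the untoasted datum,
which is the worst case mentioned above.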

It's too bad this wouldn't work for non-C locale strings. The tool to
do it would be strxfrm again but I can't imagine how to store toasted
strxfrm data in addition to the string that wouldn't cost more than it
gained.

-- 
greg


