Re: Optimizer on sort aggregate - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Optimizer on sort aggregate
Date
Msg-id CAM-w4HOSrEvJdvsx=rLL+9Sc2eBPvUKm_rgaE+zAdKHnsqkd4A@mail.gmail.com
Whole thread Raw
In response to Re: Optimizer on sort aggregate  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Optimizer on sort aggregate
List pgsql-hackers
On Sat, Oct 18, 2014 at 3:10 AM, Peter Geoghegan <pg@heroku.com> wrote:
> So the first eight bytes of the first string is 0x131F1F1B2222221E,
> and the second 0x131F1F1B2222220C. The last byte is different.

That's interesting but I think it's mostly a quirk of your example.
Afaics the difference is only that the en_US locale ignores
punctuation like : and /  (or at least treats them as less significant
than alphabetic characters). If you had strings that had less
punctuation or differences that didn't happen to arrive shortly after
the 8-byte boundary then it wouldn't make any difference.

And we still have to run strfrm at least once, write out the whole
binary blob to memory somewhere and if it spills to disk we still have
to write and read much more data. I think recognizing cases where
equality is the only thing we're interested in and locale-sensitive
sorting isn't necessary and using a memcmp would be a clear win.

I'm not immediately clear on what the cleanest way to integrate it
would be. A btree opclass function like the cmp function but that
doesn't need to be consistent with < and >, only = ? Or perhaps a flag
on the btree opclass that indicates that the data types can safely be
compared with memcmp when equality is all that's needed? The latter is
pretty tempting since it would tell code something interesting about
the data type's internal storage that may lead to other optimizations.
On the other hand the former is nice in that the operator could maybe
handle other cases like padding by doing memcmp on only the
significant bits.



-- 
greg



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: pgaudit - an auditing extension for PostgreSQL
Next
From: Nicolas Barbier
Date:
Subject: Re: Materialized views don't show up in information_schema