I would like to re-table my proposal from many moons ago to allow
pattern-matching operations to use indexes under any locale.
The idea is that since pattern matching works on a character-by-character
basis, we also need to interact with the index using operators that do a
comparison strictly on a character-by-character basis. So we define a
separate set of comparison operators and operator classes for each
character type (text, varchar, bpchar, name), teach the optimizer to use
it for suitably anchored pattern-matching operations, and tell users to
create indexes using that special operator class if they want pattern
matching to use indexes.
Here are a couple of details to discuss:
I named the operators #<#, #>=#, etc. If someone can think of better
names, let me know.
Since character-by-character comparison is essentially binary comparison,
I named the operator classes, text_binary_ops, etc. Another idea is to
name them text_like_ops or text_pattern_ops or whatever, so that if some
change in the pattern matching operations would later force us to alter
the behavior of the operators away from being strictly memcmp(), we would
be free to change them.
Should there be a special case for the C locale? Without extra code, if
you use the C locale and would like to use indexes for both equality and
pattern comparisons, you need two indexes with different operator classes.
That might seem wasteful, but then all locales would really work the same.
I'm unclear on how the selectivity estimation should work. The system
doesn't collect statistics based on the new #<#-style operators, so the
estimates are based on the normal comparison, which might have little to
do with reality.
--
Peter Eisentraut peter_e@gmx.net