On Monday 20 July 2009 04:46:53 Robert James wrote:
> I'm storing a lot of words in a database. What's the fastest format for
> finding them? I'm going to be doing a lot of WHERE w LIKE 'marsh%' and
> WHERE w IN ('m', 'ma'). All characters are lowercase a-z, no punctuation,
> no other alphabets. By default I'm using varchar in utf-8 encoding, but
> was wondering if I could specify something else (perhaps 7bit ascii,
> perhaps lowercase only) that would speed things up even further.
If your data is only lowercase a-z, as you say, then the binary representation
will be the same in all server-side encodings, because they are all supersets
of ASCII. Switching to a different encoding would therefore not make the
stored words any smaller.
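
To illustrate the point about ASCII supersets, here is a minimal Python
sketch. It runs outside the database, and the Python codec names are only
stand-ins for the corresponding server encodings; the helper encoded_forms is
made up for this example.

```python
# Python codec names used as stand-ins for server encodings (an assumption
# for illustration; the database itself is not involved here).
ENCODINGS = ["ascii", "latin-1", "utf-8"]

def encoded_forms(word):
    """Return the set of distinct byte strings produced by encoding `word`."""
    return {word.encode(enc) for enc in ENCODINGS}

forms = encoded_forms("marsh")
print(forms)                          # one byte string shared by all codecs
print(len("marsh".encode("utf-8")))   # one byte per character
```

For a-z input, every ASCII-compatible encoding yields the same bytes, one per
character, so there is nothing for a "smaller" encoding to save.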
In any case, the choice of encoding will likely be dominated by the question
of proper indexing and caching.
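
As a rough sketch of why indexing matters for a query like LIKE 'marsh%': on
sorted data, a prefix match is just a range lookup (w >= 'marsh' AND
w < 'marsi'), which an ordered index can answer with two searches instead of
a full scan. The Python below only models that idea with a sorted list; the
function prefix_range and the sample words are hypothetical, and it assumes
a-z words compared byte by byte. In PostgreSQL, a btree index can typically
serve such a LIKE pattern only if the database uses the C locale or the index
is created with the text_pattern_ops operator class.

```python
from bisect import bisect_left

def prefix_range(sorted_words, prefix):
    """Return all words starting with `prefix` via two binary searches."""
    if not prefix:
        return list(sorted_words)
    # Upper bound: bump the last character ('marsh' -> 'marsi'). Every word
    # with the prefix sorts at or after `prefix` and before `upper`.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect_left(sorted_words, prefix)
    hi = bisect_left(sorted_words, upper)
    return sorted_words[lo:hi]

# Hypothetical sample data, kept sorted the way an index would keep it.
words = sorted(["m", "ma", "mars", "marsh", "marshal",
                "marshmallow", "marsupial", "mart"])
print(prefix_range(words, "marsh"))
```

The lookup cost grows with the logarithm of the table size plus the number of
matches, which is why the index usually matters much more than the encoding.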