> Will an index on the 'data_is_active' column be used or work
> as I expect? I'm assuming that I may have a million entries
> sharing the same 'data_lookup_key' and I'll be using that to
> search for the active version of the row.
An index just on a boolean column won't be 'selective enough'. eg. The
index will only be able to choose 50% of the table - since it's faster
to do a full table scan in that case, the index won't get used.
A multi keyed index, however will work a bit better, eg an index over
(data_lookup_key, data_is_active).
That way, the index will first be able to find the correct key (which is
nicely selective) and then will be able to halve the resulting search
space to get the active ones.
BTW, you shouldn't use 'banana', 'pear', etc as the data_lookup_key, you
should make another table like this:
id name
1 banana
2 apple
3 pear
And then replace the data_lookup_key col with a column of integers that
is a foreign key to the names table - waaaaay faster to process.
Chris