Matthew,
> Well, this is a development box. But the live box wouldn't be much more
> than RAID 1 on SCSI 10ks so that should only be a halving of seek time,
> not the 1000 times reduction I'm after!
If you're looking for 1000 times reduction, I think you're going to need
*considerably* beefier hardware. You'd pretty much have to count on the
whole DB being in RAM, and a CPU being always available for incoming queries.
> In fact, now I think about it, I have been testing on a 2.4 kernel on a
> dual HT 3GHz Xeon with SCSI RAID array and the performance is only
> marginally better.
Yes, but HT sucks for databases, so you're probably bottlenecking yourself on
CPU on that machine.
However, if this is the query you really want to optimize for:
select street, locality_1, locality_2, city from address
where (city = 'Nottingham' or locality_2 = 'Nottingham'
or locality_1 = 'Nottingham')
and upper(substring(street from 1 for 1)) = 'A'
group by street, locality_1, locality_2, city
order by street
limit 20 offset 0
... then this is the query you should test on. Although I will say that your
denormalized schema is actually hurting you siginificantly with the above
type of query; indexes aren't going to be possible for it.
--
Josh Berkus
Aglio Database Solutions
San Francisco