On 1-9-2005 19:42, Matthew Sackman wrote:
> Obviously, to me, this is a problem; I need these queries to complete
> in under a second. Is this unreasonable? What can I do to make this "go
> faster"? I've considered normalising the table but I can't work out
> whether the slowness is in dereferencing the pointers from the index
> into the table or in scanning the index in the first place. And
> normalising the table is going to cause much pain when inserting values
> and I'm not entirely sure if I see why normalising it should cause a
> massive performance improvement.
In this case, I think normalising will give a major decrease in the
on-disk size of this large table and of the indexes you have. If so,
that in itself will speed up all I/O-bound queries quite a bit.
locality_1, locality_2, city and county can probably be normalised away
without much trouble, and going from varchars to integers will probably
save you quite a bit of disk space.
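For illustration, a minimal sketch of what the lookup tables could look
like; the names below (cities, cityid, address_new) are assumptions
following your column names, not your actual schema:

    -- One lookup table per normalised column; cities is shown here,
    -- counties and the two localities would follow the same pattern.
    CREATE TABLE cities (
        cityid serial PRIMARY KEY,
        city   varchar NOT NULL UNIQUE
    );

    -- The rewritten address table stores a small integer key
    -- instead of repeating the varchar city in every row.
    CREATE TABLE address_new (
        cityid integer REFERENCES cities (cityid)
        -- ... the other columns, with locality_1, locality_2 and
        -- county likewise replaced by integer keys ...
    );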
But since it won't change the selectivity of the indexes, you won't get
more index scans instead of sequential scans, I suppose.
I think it's not that hard to create a normalised set of tables from
this data set (using "insert into tablename select distinct ... from
address" and such, and "insert into address_new (..., city) select ...
(select cityid from cities where city = address.city) from address").
So it's at least relatively easy to figure out the performance
improvement from normalising the data set a bit.
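Spelled out, the migration for one column could look something like
this (again using the assumed names from the sketch above):

    -- Fill the lookup table with the distinct values that occur.
    INSERT INTO cities (city)
        SELECT DISTINCT city FROM address;

    -- Copy the rows across, replacing each varchar city with its key
    -- (the remaining columns would be listed alongside cityid).
    INSERT INTO address_new (cityid)
        SELECT (SELECT c.cityid FROM cities c WHERE c.city = a.city)
        FROM address a;

Timing your queries against address_new afterwards then tells you what
the normalisation actually buys you.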
If you want to improve your hardware, have a look at the Western Digital
Raptor-series SATA disks; they are fast, SCSI-like SATA drives. You may
also want to look at the amount of memory available, to allow caching
this (entire) table.
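If you do add memory, also make sure PostgreSQL knows about it. A rough
postgresql.conf sketch; the values are illustrative assumptions for a
machine with about 1 GB of RAM, not recommendations for your box:

    # Both settings are counted in 8 kB pages.
    shared_buffers = 25000          # ~200 MB of PostgreSQL's buffer cache
    effective_cache_size = 100000   # ~800 MB the OS is expected to cache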
Best regards,
Arjen