Re: Massive performance issues - Mailing list pgsql-performance

From Arjen van der Meijden
Subject Re: Massive performance issues
Date
Msg-id 43176A95.70506@tweakers.net
Whole thread Raw
In response to Massive performance issues  (Matthew Sackman <matthew@lshift.net>)
Responses Re: Massive performance issues
List pgsql-performance
On 1-9-2005 19:42, Matthew Sackman wrote:
> Obviously, to me, this is a problem, I need these queries to be under a
> second to complete. Is this unreasonable? What can I do to make this "go
> faster"? I've considered normalising the table but I can't work out
> whether the slowness is in dereferencing the pointers from the index
> into the table or in scanning the index in the first place. And
> normalising the table is going to cause much pain when inserting values
> and I'm not entirely sure if I see why normalising it should cause a
> massive performance improvement.

In this case, I think normalising will give a major decrease in on-disk
table-size of this large table and the indexes you have. If that's the
case, that in itself will speed-up all i/o-bound queries quite a bit.

locality_1, _2, city and county can probably be normalised away without
much problem, but going from varchar's to integers will probably safe
you quite a bit of (disk)space.

But since it won't change the selectivity of indexes, so you won't get
more index-scans instead of sequential scans, I suppose.
I think its not that hard to create a normalized set of tables from this
  data-set (using insert into tablename select distinct ... from address
and such, insert into address_new (..., city) select ... (select cityid
from cities where city = address.city) from address)
So its at least relatively easy to figure out the performance
improvement from normalizing the dataset a bit.

If you want to improve your hardware, have a look at the Western Digital
Raptor-series SATA disks, they are fast scsi-like SATA drives. You may
also have a look at the amount of memory available, to allow caching
this (entire) table.

Best regards,

Arjen

pgsql-performance by date:

Previous
From: Sebastian Hennebrueder
Date:
Subject: Re: Massive performance issues
Next
From: Ron
Date:
Subject: Re: Massive performance issues