Home > mailing lists

Re: Massive performance issues - Mailing list pgsql-performance

From	Steinar H. Gunderson
Subject	Re: Massive performance issues
Date	September 1, 2005 17:10:06
Msg-id	20050901200930.GA23339@uio.no Whole thread Raw
In response to	Massive performance issues (Matthew Sackman <matthew@lshift.net>)
Responses	Re: Massive performance issues
List	pgsql-performance

Tree view

On Thu, Sep 01, 2005 at 06:42:31PM +0100, Matthew Sackman wrote:
>  flat_extra           | character varying(100) | not null
>  number               | character varying(100) | not null
>  street               | character varying(100) | not null
>  locality_1           | character varying(100) | not null
>  locality_2           | character varying(100) | not null
>  city                 | character varying(100) | not null
>  county               | character varying(100) | not null

Having these fixed probably won't give you any noticeable improvements;
unless there's something natural about your data setting 100 as a hard limit,
you could just as well drop these.

>     "address_city_index" btree (city)
>     "address_county_index" btree (county)
>     "address_locality_1_index" btree (locality_1)
>     "address_locality_2_index" btree (locality_2)
>     "address_pc_bottom_index" btree (postcode_bottom)
>     "address_pc_middle_index" btree (postcode_middle)
>     "address_pc_top_index" btree (postcode_top)
>     "address_pc_top_middle_bottom_index" btree (postcode_top,
>                              postcode_middle, postcode_bottom)
>     "address_pc_top_middle_index" btree (postcode_top, postcode_middle)
>     "address_postcode_index" btree (postcode)
>     "address_property_type_index" btree (property_type)
>     "address_street_index" btree (street)
>     "street_prefix" btree (lower("substring"((street)::text, 1, 1)))

Wow, that's quite a lof of indexes... but your problem isn't reported as
being in insert/update/delete.

> This is with postgresql 7.4 running on linux 2.6.11 with a 3GHz P4 and a
> SATA harddrive.

8.0 or 8.1 might help you some -- better (and more!) disks will probably help
a _lot_.

> Queries such as:
>
> select locality_2 from address where locality_2 = 'Manchester';
>
> are taking 14 seconds to complete, and this is only 2 years worth of
> data - we will have up to 15 years (so over 15 million rows).

As Tom pointed out; you're effectively doing random searches here, and using
CLUSTER might help. Normalizing your data to get smaller rows (and avoid
possibly costly string comparisons if your strcoll() is slow) will probably
also help.

> I need to get to the stage where I can run queries such as:
> select street, locality_1, locality_2, city from address
> where (city = 'Nottingham' or locality_2 = 'Nottingham'
>        or locality_1 = 'Nottingham')
>   and upper(substring(street from 1 for 1)) = 'A'
> group by street, locality_1, locality_2, city
> order by street
> limit 20 offset 0

This might be a lot quicker than pulling all the records like in your example
queries...

/* Steinar */
--
Homepage: http://www.sesse.net/

pgsql-performance by date:

From: Ron
Date: 01 September 2005, 17:07:48
Subject: Re: Massive performance issues

From: Alvaro Herrera
Date: 01 September 2005, 17:17:57
Subject: Re: Massive performance issues

Re: Massive performance issues - Mailing list pgsql-performance

Previous

Next