Re: Performance Optimization for Dummies 2 - the SQL - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Performance Optimization for Dummies 2 - the SQL
Date
Msg-id b42b73150610050630t76e9eedeh45a382729d35f2f6@mail.gmail.com
Whole thread Raw
In response to Re: Performance Optimization for Dummies 2 - the SQL  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Responses Re: Performance Optimization for Dummies 2 - the SQL  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-performance
On 10/5/06, Carlo Stonebanks <stonec.register@sympatico.ca> wrote:
> > do we have an multi-column index on
> > facility_address(facility_id, address_id)?  did you run analyze?
>
> There is an index on facility_address on facility_id.
>
> I didn't create an index on facility_address.address_id because I expected
> joins to go in the other direction (from facility_address to address).
> Nor did I create a multi-column index on facility_id, address_id because I
> had yet to come up with a query that required that.

right. well, since you are filtering on address, I would consider
added an index on address_id or a multi column on address_id,
facility_id (in addition to facility_id).  also, I'd consider removing
all the explicit joins like this:

explain analyze select
     f.facility_id,
     fa.facility_address_id,
     a.address_id,
     f.facility_type_code,
     f.name,
     a.address,
     a.city,
     a.state_code,
     a.postal_code,
     a.country_code
  from
    mdx_core.facility f,
    mdx_core.facility_address fa,
    mdx_core.address a
  where
    fa.facility_id = f.facility_id and
    a.address_id = fa.address_id and
    a.country_code = 'US' and
    a.state_code = 'IL' and
    a.postal_code like '60640-5759'||'%'
    order by facility_id;

yet another way to write that where clause is:

    (fa_address_id, fa.facility_id) = (a.address_id, f.facility_id)  and
    a.country_code = 'US' and
    a.state_code = 'IL' and
    a.postal_code like '60640-5759'||'%'
    order by facility_id;

I personally only use explicit joins when doing outer joins and even
them push them out as far as possible.

I like the row constructor style better because it shows the key
relationships more clearly.  I don't think it makes a difference in
execution (go ahead and try it).  If you do make a multi column key on
facility_address, though, make sure to put they key fields in left to
right order in the row constructor.   Try adding a multi key on
address_id and facility_id and run it this way.  In a proper design
you would have a primary key on these fields but with imported data
you obviously have to make compromises :).

> However, I still have a lot to learn about how SQL chooses its indexes, how
> multi-column indexes are used, and when to use them (other than the
> obvious - i.e. sort orders or relational expressions which request those
> columns in one search expression)

well, it's kind of black magic but if the database is properly laid
out the function usually follows form pretty well.

> Analyse is actually run every time a page of imported data loads into the
> client program. This is currently set at 500 rows.

ok.

merlin

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Performance Optimization for Dummies 2 - the SQL
Next
From: "Carlo Stonebanks"
Date:
Subject: Re: Performance Optimization for Dummies 2 - the SQL