Re: Performance Optimization for Dummies 2 - the SQL - Mailing list pgsql-performance

From Carlo Stonebanks
Subject Re: Performance Optimization for Dummies 2 - the SQL
Date
Msg-id eg611q$l0m$1@news.hub.org
Whole thread Raw
In response to Performance Optimization for Dummies 2 - the SQL  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Responses Re: Performance Optimization for Dummies 2 - the SQL
List pgsql-performance
This didn't work right away, but DID work after running a VACUUM FULL. In
other words, i was still stuck with a sequential scan until after the
vacuum.

I turned autovacuum off in order to help with the import, but was perfoming
an ANALYZE with every 500 rows imported.

With autovacuum off for imports, how frequently should I VACUUM?



""Merlin Moncure"" <mmoncure@gmail.com> wrote in message
news:b42b73150610050630t76e9eedeh45a382729d35f2f6@mail.gmail.com...
> On 10/5/06, Carlo Stonebanks <stonec.register@sympatico.ca> wrote:
>> > do we have an multi-column index on
>> > facility_address(facility_id, address_id)?  did you run analyze?
>>
>> There is an index on facility_address on facility_id.
>>
>> I didn't create an index on facility_address.address_id because I
>> expected
>> joins to go in the other direction (from facility_address to address).
>> Nor did I create a multi-column index on facility_id, address_id because
>> I
>> had yet to come up with a query that required that.
>
> right. well, since you are filtering on address, I would consider
> added an index on address_id or a multi column on address_id,
> facility_id (in addition to facility_id).  also, I'd consider removing
> all the explicit joins like this:
>
> explain analyze select
>     f.facility_id,
>     fa.facility_address_id,
>     a.address_id,
>     f.facility_type_code,
>     f.name,
>     a.address,
>     a.city,
>     a.state_code,
>     a.postal_code,
>     a.country_code
>  from
>    mdx_core.facility f,
>    mdx_core.facility_address fa,
>    mdx_core.address a
>  where
>    fa.facility_id = f.facility_id and
>    a.address_id = fa.address_id and
>    a.country_code = 'US' and
>    a.state_code = 'IL' and
>    a.postal_code like '60640-5759'||'%'
>    order by facility_id;
>
> yet another way to write that where clause is:
>
>    (fa_address_id, fa.facility_id) = (a.address_id, f.facility_id)  and
>    a.country_code = 'US' and
>    a.state_code = 'IL' and
>    a.postal_code like '60640-5759'||'%'
>    order by facility_id;
>
> I personally only use explicit joins when doing outer joins and even
> them push them out as far as possible.
>
> I like the row constructor style better because it shows the key
> relationships more clearly.  I don't think it makes a difference in
> execution (go ahead and try it).  If you do make a multi column key on
> facility_address, though, make sure to put they key fields in left to
> right order in the row constructor.   Try adding a multi key on
> address_id and facility_id and run it this way.  In a proper design
> you would have a primary key on these fields but with imported data
> you obviously have to make compromises :).
>
>> However, I still have a lot to learn about how SQL chooses its indexes,
>> how
>> multi-column indexes are used, and when to use them (other than the
>> obvious - i.e. sort orders or relational expressions which request those
>> columns in one search expression)
>
> well, it's kind of black magic but if the database is properly laid
> out the function usually follows form pretty well.
>
>> Analyse is actually run every time a page of imported data loads into the
>> client program. This is currently set at 500 rows.
>
> ok.
>
> merlin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>



pgsql-performance by date:

Previous
From: Jim Nasby
Date:
Subject: Re: slow queue-like empty table
Next
From: Scott Marlowe
Date:
Subject: Re: Performance Optimization for Dummies 2 - the SQL