Re: PostgreSQL Query Speed Issues - Mailing list pgsql-novice

From Joseph Pravato
Subject Re: PostgreSQL Query Speed Issues
Date
Msg-id 5126A4EA.6000407@pluto.nomagic.com
Whole thread Raw
In response to Re: PostgreSQL Query Speed Issues  (Douglas J Hunley <doug.hunley@gmail.com>)
List pgsql-novice
On 2/21/2013 3:00 PM, Douglas J Hunley wrote:
> On Thu, Feb 21, 2013 at 3:56 PM, Joseph Pravato
> <joseph.pravato@nomagic.com> wrote:
>> NOTE: All database tests were done without changing or updating any settings after install.
> The defaults are sub-optimal. You really do need to tune them to the
> server in question.
Do you have any recommendations optimized for a database with
600,000 user records and tables with up to 2,000,000 rows?
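For context, these are the settings we were planning to experiment with first. The values below are our own guesses for a machine with a few GB of RAM, not anything we have validated:

```
# postgresql.conf -- starting-point guesses, not tuned values
shared_buffers = 1GB          # default is far lower; commonly sized around 25% of RAM
work_mem = 50MB               # per-sort/per-hash allocation, so kept modest
effective_cache_size = 4GB    # planner hint: roughly OS file cache + shared_buffers
maintenance_work_mem = 256MB  # helps ANALYZE, VACUUM, and index builds
```

Does that look like a sensible starting point?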


>
>> # rows in contact: 574939
>> # rows in contact_address_map: 574924
> After loading this data, did you run an ANALYZE?
No, we did not. Why would this be useful?
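If it matters, we assume you mean something along these lines, run once after the bulk load finished:

```sql
-- refresh the planner's statistics for the tables involved in the query
ANALYZE contact;
ANALYZE contact_address_map;
```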


>
>> select c.*
>> from contact c
>> left join CONTACT_ADDRESS_MAP cam on cam.CONTACT_ID = c.id
>> where cam.ADDRESS_ID is null
> Table definitions would probably help. You might be missing indexes.
Table definitions are here: http://pastebin.com/GyCsYpBn. See the index
comment below.


>
>> Result from an 'explain analyze':
>>
>> QUERY PLAN
>> Merge Left Join  (cost=83512.87..158059.12 rows=1 width=952) (actual time=5224.171..5224.270 rows=15 loops=1)
>>   Merge Cond: (c.id = cam.contact_id)
>>   Filter: (cam.address_id IS NULL)
>>   Rows Removed by Filter: 574924
>>   ->  Index Scan using contact_pkey on contact c  (cost=0.00..63048.48 rows=574917 width=952) (actual time=0.009..852.708 rows=574939 loops=1)
>>   ->  Materialize  (cost=83512.59..86387.17 rows=574917 width=16) (actual time=1407.387..3107.246 rows=574924 loops=1)
>>         ->  Sort  (cost=83512.59..84949.88 rows=574917 width=16) (actual time=1407.379..2027.946 rows=574924 loops=1)
>>               Sort Key: cam.contact_id
>>               Sort Method: external sort  Disk: 14616kB
> disk sorts imply work_mem isn't big enough.
I was surprised that the work_mem default was so low. We changed it from
1MB to 50MB, and this is the `explain analyze' for:

select * from contact
where id not in (select contact_id from contact_address_map)

QUERY PLAN
Seq Scan on contact  (cost=10294.55..31256.01 rows=287458 width=952) (actual time=1555.473..1582.885 rows=16 loops=1)
  Filter: (NOT (hashed SubPlan 1))
  Rows Removed by Filter: 574924
  SubPlan 1
    ->  Seq Scan on contact_address_map  (cost=0.00..8857.24 rows=574924 width=8) (actual time=0.013..586.107 rows=574924 loops=1)
Total runtime: 1597.773 ms

Increasing work_mem did not improve speeds for an unrelated query; even
with work_mem set to 1000MB for that test, it still took 26 minutes to
return. We will try to prepare a simplified version of the query that
does not expose company data and send it to you tomorrow.
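One rewrite we have been meaning to try is expressing the anti-join with NOT EXISTS. This is our own rewrite, not something we have benchmarked yet:

```sql
-- anti-join written as NOT EXISTS; untested on our data
select c.*
from contact c
where not exists (
    select 1
    from contact_address_map cam
    where cam.contact_id = c.id
);
```

We realize this is not strictly equivalent to NOT IN if contact_id can be NULL, but in our schema it cannot.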


>
>>               ->  Seq Scan on contact_address_map cam  (cost=0.00..8857.17 rows=574917 width=16) (actual time=0.018..578.348 rows=574924 loops=1)
> This should probably be using an index
We added indexes to both columns of contact_address_map, but they made
no difference in speed for either query.
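For reference, the index definitions we added were along these lines (the index names are our own):

```sql
-- single-column indexes on the mapping table
CREATE INDEX idx_cam_contact_id ON contact_address_map (contact_id);
CREATE INDEX idx_cam_address_id ON contact_address_map (address_id);
```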

Any additional assistance is appreciated.

