Re: Slow query with joins - Mailing list pgsql-performance

From Bendik Rognlien Johansen
Subject Re: Slow query with joins
Date
Msg-id 65451320-7D72-4544-A8E8-CD93212E4A5D@gmail.com
In response to Re: Slow query with joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow query with joins  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-performance
Yes, the rowcount estimates are close to reality; however, it has been a
long time since the last VACUUM FULL (there is never a good time to run
it). I have since clustered the tables, reindexed, analyzed, and
vacuumed, roughly the sequence sketched below.
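
A sketch of that maintenance pass (which index each table is clustered
on is an assumption; only the join indexes are named in this thread):

-- CLUSTER rewrites the table in index order and takes an exclusive lock
-- (8.x syntax: CLUSTER index ON table). The index choices here are
-- assumptions based on the join columns.
CLUSTER addresses_record_idx ON addresses;
CLUSTER contacts_record_idx ON contacts;
-- No obvious clustering index for people, so just rebuild its indexes.
REINDEX TABLE people;
-- Reclaim dead row versions and refresh planner statistics.
VACUUM ANALYZE people;
VACUUM ANALYZE addresses;
VACUUM ANALYZE contacts;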

The plan now looks like this:


no_people=# EXPLAIN
SELECT r.id AS r_id, r.firstname || ' ' || r.lastname AS r_name,
       ad.id AS ad_id, ad.type AS ad_type, ad.address AS ad_address,
       ad.postalcode AS ad_postalcode, ad.postalsite AS ad_postalsite,
       ad.priority AS ad_priority, ad.position[0] AS ad_lat,
       ad.position[1] AS ad_lon, ad.uncertainty AS ad_uncertainty,
       ad.extra AS ad_extra, ad.deleted AS ad_deleted,
       co.id AS co_id, co.type AS co_type, co.value AS co_value,
       co.description AS co_description, co.priority AS co_priority,
       co.visible AS co_visible, co.searchable AS co_searchable,
       co.deleted AS co_deleted
FROM people r
LEFT OUTER JOIN addresses ad ON (r.id = ad.record)
LEFT OUTER JOIN contacts co ON (r.id = co.record)
WHERE NOT r.deleted AND r.original IS NULL
ORDER BY r.id;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=182866.49..182943.12 rows=30655 width=587)
   Sort Key: r.id
   ->  Nested Loop Left Join  (cost=0.00..170552.10 rows=30655 width=587)
         ->  Nested Loop Left Join  (cost=0.00..75054.96 rows=26325 width=160)
               ->  Index Scan using people_deleted_original_is_null on people r  (cost=0.00..1045.47 rows=23861 width=27)
                     Filter: ((NOT deleted) AND (original IS NULL))
               ->  Index Scan using addresses_record_idx on addresses ad  (cost=0.00..3.05 rows=4 width=137)
                     Index Cond: ("outer".id = ad.record)
         ->  Index Scan using contacts_record_idx on contacts co  (cost=0.00..3.32 rows=24 width=431)
               Index Cond: ("outer".id = co.record)
(10 rows)

It looks faster, but it is still very slow. I added LIMIT 1000 and it
has been running for about 25 minutes now with no output. top shows:


   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29994 postgres  18   0 95768  78m  68m R 17.0  7.7   0:53.27 postmaster

That is unusual; I normally see 99.9% CPU for just about any query,
which leads me to believe this is disk related.
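
If it is disk bound, the plan's estimates are not telling the whole
story. Once a run completes, EXPLAIN ANALYZE reports actual times and
row counts per plan node (unlike plain EXPLAIN, it executes the query,
so it takes as long as the query itself). A sketch with the column list
trimmed:

EXPLAIN ANALYZE
SELECT r.id AS r_id, ad.id AS ad_id, co.id AS co_id  -- columns trimmed for brevity
FROM people r
LEFT OUTER JOIN addresses ad ON (r.id = ad.record)
LEFT OUTER JOIN contacts co ON (r.id = co.record)
WHERE NOT r.deleted AND r.original IS NULL
ORDER BY r.id
LIMIT 1000;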

postgresql.conf:
shared_buffers = 8192
work_mem = 8192
maintenance_work_mem = 524288
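
For reference, with the 8.x units (shared_buffers counts 8 kB pages;
work_mem and maintenance_work_mem are in kB) these settings work out to:

shared_buffers = 8192          # 8192 x 8 kB = 64 MB of shared cache
work_mem = 8192                # 8 MB per sort/hash operation
maintenance_work_mem = 524288  # 512 MB for VACUUM, CREATE INDEX, etc.

64 MB of shared_buffers is on the small side for a machine with 1 GB of
RAM; around a quarter of RAM is a commonly suggested starting point,
though that is a rule of thumb rather than a tuned value.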

Hardware: 2x 2.8 GHz CPUs, 1 GB RAM

Could this be an issue related to lack of VACUUM FULL? The tables get
a lot of updates.
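
A rough way to gauge how bloated the tables are, without the exclusive
lock that VACUUM FULL takes, is to compare row counts to page counts
after a plain VACUUM/ANALYZE (a sketch):

-- Very few rows per 8 kB page suggests a lot of dead space that plain
-- VACUUM has marked reusable but not returned to the OS.
SELECT relname, relpages, reltuples,
       reltuples / NULLIF(relpages, 0)::float8 AS rows_per_page
FROM pg_class
WHERE relname IN ('people', 'addresses', 'contacts');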


Thank you very much so far!

On Jan 11, 2006, at 4:45 PM, Tom Lane wrote:

> Bendik Rognlien Johansen <bendik.johansen@gmail.com> writes:
>> Has anyone got any tips for speeding up this query? It currently
>> takes hours to start.
>
> Are the rowcount estimates close to reality?  The plan doesn't look
> unreasonable to me if they are.  It might help to increase work_mem
> to ensure that the hash tables don't spill to disk.
>
> Indexes:
>      "people_original_is_null" btree (original) WHERE original IS NULL
>
> This index seems poorly designed: the actual index entries are dead
> weight since all of them are necessarily NULL.  You might as well make
> the index carry something that you frequently test in conjunction with
> "original IS NULL".  For instance, if this particular query is a common
> case, you could replace this index with
>
> CREATE INDEX people_deleted_original_is_null ON people(deleted)
>   WHERE original IS NULL;
>
> This index is still perfectly usable for queries that only say
> "original IS NULL", but it can also filter out rows with the wrong
> value of deleted.  Now, if there are hardly any rows with deleted =
> true, maybe this won't help much for your problem.  But in any case
> you ought to consider whether you can make the index entries do
> something useful.
>
>             regards, tom lane
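
To confirm the replacement index gets picked up once it is created, a
minimal check would be something like this (ANALYZE first so the planner
sees fresh statistics):

ANALYZE people;
-- Expect an index scan using people_deleted_original_is_null, as in the
-- plan near the top of this message.
EXPLAIN SELECT id FROM people WHERE NOT deleted AND original IS NULL;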

