Re: Simple queries take forever to run - Mailing list pgsql-performance
From | Michael Guerin |
---|---|
Subject | Re: Simple queries take forever to run |
Date | |
Msg-id | 3F4E6112.2060807@rentec.com Whole thread Raw |
In response to | Re: Simple queries take forever to run (Stephan Szabo <sszabo@megazone.bigpanda.com>) |
List | pgsql-performance |
Stephan Szabo wrote: >On Thu, 28 Aug 2003, Michael Guerin wrote: > > > >>Stephan Szabo wrote: >> >> >> >>>On Thu, 28 Aug 2003, Michael Guerin wrote: >>> >>> >>> >>> >>> >>>>Stephan Szabo wrote: >>>> >>>> >>>> >>>> >>>> >>>>>On Wed, 27 Aug 2003, Michael Guerin wrote: >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>>I'm running into some performance problems trying to execute simple >>>>>>queries. >>>>>> >>>>>>postgresql version 7.3.3 >>>>>>.conf params changed from defaults. >>>>>>shared_buffers = 64000 >>>>>>sort_mem = 64000 >>>>>>fsync = false >>>>>>effective_cache_size = 400000 >>>>>> >>>>>>ex. query: select * from x where id in (select id from y); >>>>>> >>>>>>There's an index on each table for id. SQL Server takes <1s to return, >>>>>>postgresql doesn't return at all, neither does explain analyze. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>IN(subquery) is known to run poorly in 7.3.x and earlier. 7.4 is >>>>>generally much better (for reasonably sized subqueries) but in earlier >>>>>versions you'll probably want to convert into an EXISTS or join form. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>Something else seems to be going on, even switching to an exists clause >>>>gives much better but poor performance. >>>>count(*) where exists clause: Postgresql 19s, SQL Server <1s >>>>count(*) where not exists: 23.3s SQL Server 1.5s >>>> >>>> >>>> >>>> >>>What does explain analyze show for the two queries? >>> >>> >>> >>> >>> >>> >>explain analyze select count(*) from tbltimeseries where exists(select >>uniqid from tblobjectname where timeseriesid = uniqid); >>Aggregate (cost=5681552.18..5681552.18 rows=1 width=0) (actual >>time=22756.64..22756.64 rows=1 loops=1) >> -> Seq Scan on tbltimeseries (cost=0.00..5680051.34 rows=600336 >>width=0) (actual time=22.06..21686.78 rows=1200113 loops=1) >> Filter: (NOT (subplan)) >> SubPlan >> -> Index Scan using idx_objectname on tblobjectname >>(cost=0.00..4.70 rows=1 width=4) (actual time=0.01..0.01 rows=0 >>loops=1200673) >> Index Cond: ($0 = uniqid) >> Total runtime: 22756.83 msec >>(7 rows) >> >> > >Hmm... I'd thought that it had options for a better plan than that. > >What do things like: > >explain analyze select count(distinct timeseriesid) from tbltimeseries, > tblobjectname where timeseriesid=uniquid; > >and > >explain analyze select count(distinct timeseriesid) from > tbltimeseries left outer join tblobjectname on (timeseriesid=uniqid) > where uniqid is null; > >give you? > > > > much better performance: explain analyze select count(distinct timeseriesid) from tbltimeseries, tblobjectname where timeseriesid=uniquid; Aggregate (cost=7384.03..7384.03 rows=1 width=8) (actual time=668.15..668.15 rows=1 loops=1) -> Nested Loop (cost=0.00..7380.83 rows=1282 width=8) (actual time=333.31..666.13 rows=561 loops=1) -> Seq Scan on tblobjectname (cost=0.00..33.82 rows=1282 width=4) (actual time=0.05..4.98 rows=1282 loops=1) -> Index Scan using xx on tbltimeseries (cost=0.00..5.72 rows=1 width=4) (actual time=0.51..0.51 rows=0 loops=1282) Index Cond: (tbltimeseries.timeseriesid = "outer".uniqid) Total runtime: 669.61 msec (6 rows) explain analyze select count(distinct timeseriesid) from tbltimeseries left outer join tblobjectname on (timeseriesid=uniqid) where uniqid is null; Aggregate (cost=59144.19..59144.19 rows=1 width=8) (actual time=12699.47..12699.47 rows=1 loops=1) -> Hash Join (cost=37.02..56142.51 rows=1200673 width=8) (actual time=7.41..6376.12 rows=1200113 loops=1) Hash Cond: ("outer".timeseriesid = "inner".uniqid) Filter: ("inner".uniqid IS NULL) -> Seq Scan on tbltimeseries (cost=0.00..44082.73 rows=1200673 width=4) (actual time=0.01..3561.61 rows=1200673loops=1) -> Hash (cost=33.82..33.82 rows=1282 width=4) (actual time=4.84..4.84 rows=0 loops=1) -> Seq Scan on tblobjectname (cost=0.00..33.82 rows=1282 width=4) (actual time=0.04..2.84 rows=1282 loops=1) Total runtime: 12699.76 msec (8 rows)
pgsql-performance by date: