Query gets slow when where clause increases - Mailing list pgsql-performance

From James Antill
Subject Query gets slow when where clause increases
Date
Msg-id m3oen281uz.fsf@code.and.org
Whole thread Raw
Responses Re: Query gets slow when where clause increases
List pgsql-performance
 I have a problem where a query gets _much_ slower if I add statements
to the where clause than if I just get everything and filter in the
code. I expected the query to be faster as the where clause gets
bigger (and thus the data to return gets lower).

 I've put inline the SQL and explain analyze of both the general and
specific queries. I assume the problem is something to do with the
fact that in the specific query "ticket_groups" is uses an index and
is assumed to return 5 rows, but actually returns 604 and in the
generic case it doesn't use an index and is assumed to return 3.5-4
thousand and does? Is this right, and if so is it possible to get
postgres to re-think using an index in this case (or possible up it's
estimated row count)?
 Any help appreciated.

----------------------- GENERIC -------------------------------------

SELECT   g.assigned_to, g.opened AS escalated, t.tid, t.opened,
         e.userid AS commiter, e.eid, e.performed_on, e.status
  FROM (events e JOIN (ticket_groups g JOIN tickets t USING(tid)) USING(tid))
   WHERE (g.gid = '37' AND
          (e.performed_on <= CAST('1088480270' AS bigint)) AND
          ((t.modified >= CAST('1086875491' AS bigint)) OR
           (t.status != '4' AND
            t.status != '5' AND
            t.status != '9')))
   ORDER BY e.userid, t.tid, e.performed_on;

 Sort  (cost=28017.06..28054.97 rows=15166 width=58) (actual time=2057.25..2076.12 rows=26594 loops=1)
   Sort Key: e.userid, t.tid, e.performed_on
   ->  Merge Join  (cost=1251.59..26963.93 rows=15166 width=58) (actual time=231.81..1972.29 rows=26594 loops=1)
         Merge Cond: ("outer".tid = "inner".tid)
         ->  Index Scan using idx_tid_events on events e  (cost=0.00..18943.61 rows=268784 width=26) (actual
time=11.48..1358.15rows=268803 loops=1) 
               Filter: (performed_on <= 1088480270::bigint)
         ->  Materialize  (cost=7160.00..7160.00 rows=1725 width=32) (actual time=217.14..237.94 rows=26592 loops=1)
               ->  Merge Join  (cost=1251.59..7160.00 rows=1725 width=32) (actual time=63.14..214.75 rows=983 loops=1)
                     Merge Cond: ("outer".tid = "inner".tid)
                     ->  Index Scan using idx_tickets_tid on tickets t  (cost=0.00..5820.13 rows=18823 width=12)
(actualtime=2.97..135.57 rows=6020 loops=1) 
                           Filter: (((status <> 4::smallint) OR (modified >= 1086875491::bigint)) AND ((status <>
5::smallint)OR (modified >= 1086875491::bigint)) AND ((status <> 9::smallint) OR (modified >= 1086875491::bigint))) 
                     ->  Sort  (cost=1251.59..1261.38 rows=3915 width=20) (actual time=60.13..62.96 rows=3699 loops=1)
                           Sort Key: g.tid
                           ->  Seq Scan on ticket_groups g  (cost=0.00..1017.94 rows=3915 width=20) (actual
time=0.05..53.02rows=3699 loops=1) 
                                 Filter: (gid = 37)
 Total runtime: 2100.75 msec

--------------------- SPECIFIC -------------------------------------

SELECT   g.assigned_to, g.opened AS escalated, t.tid, t.opened,
         e.userid AS commiter, e.eid, e.performed_on, e.status
  FROM (events e JOIN (ticket_groups g JOIN tickets t USING(tid)) USING(tid))
   WHERE (g.gid = '37' AND
          (e.performed_on <= CAST('1088480270' AS bigint)) AND
          ((t.modified >= CAST('1086875491' AS bigint)) OR
           (t.status != '4' AND
            t.status != '5' AND
            t.status != '9')) AND (g.assigned_to IS NOT NULL AND g.assigned_to='1540') AND e.userid='1540')
   ORDER BY e.userid, t.tid, e.performed_on;

 Sort  (cost=5079.17..5079.18 rows=1 width=58) (actual time=218121.00..218122.02 rows=1441 loops=1)
   Sort Key: e.userid, t.tid, e.performed_on
   ->  Nested Loop  (cost=0.00..5079.16 rows=1 width=58) (actual time=85.28..218115.36 rows=1441 loops=1)
         Join Filter: ("outer".tid = "inner".tid)
         ->  Nested Loop  (cost=0.00..305.53 rows=1 width=46) (actual time=0.22..261.78 rows=2420 loops=1)
               ->  Index Scan using idx_ticket_groups_assigned on ticket_groups g  (cost=0.00..241.76 rows=5 width=20)
(actualtime=0.13..12.67 rows=604 loops=1) 
                     Index Cond: (assigned_to = 1540)
                     Filter: ((gid = 37) AND (assigned_to IS NOT NULL))
               ->  Index Scan using idx_tid_events on events e  (cost=0.00..12.50 rows=1 width=26) (actual
time=0.11..0.38rows=4 loops=604) 
                     Index Cond: (e.tid = "outer".tid)
                     Filter: ((performed_on <= 1088480270::bigint) AND (userid = 1540))
         ->  Seq Scan on tickets t  (cost=0.00..4538.35 rows=18823 width=12) (actual time=0.16..83.53 rows=6020
loops=2420)
               Filter: (((status <> 4::smallint) OR (modified >= 1086875491::bigint)) AND ((status <> 5::smallint) OR
(modified>= 1086875491::bigint)) AND ((status <> 9::smallint) OR (modified >= 1086875491::bigint))) 
 Total runtime: 218123.24 msec


--
James Antill -- james@and.org
Need an efficient and powerful string library for C?
http://www.and.org/vstr/

pgsql-performance by date:

Previous
From: Litao Wu
Date:
Subject: network address query
Next
From: Dennis Bjorklund
Date:
Subject: Re: Query gets slow when where clause increases