Re: poor execution plan because column dependence - Mailing list pgsql-performance

From Václav Ovsík
Subject Re: poor execution plan because column dependence
Date
Msg-id 20110413082139.GB24050@bobek.localdomain
Whole thread Raw
In response to Re: poor execution plan because column dependence  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: poor execution plan because column dependence
List pgsql-performance
Dear Tom,

On Tue, Apr 12, 2011 at 08:52:15PM -0400, Tom Lane wrote:
>..
> Yeah, that main.EffectiveId = main.id clause is going to be
> underestimated by a factor of about 200, which is most though not all of
> your rowcount error for that table.  Not sure whether you can do much
> about it, if the query is coming from a query generator that you can't
> change.  If you can change it, try replacing main.EffectiveId = main.id
> with the underlying function, eg if they're integers use
> int4eq(main.EffectiveId, main.id).  This will bypass the overoptimistic
> estimator for the "=" operator and get you a default selectivity
> estimate of (IIRC) 0.3333.  Which is still off, but only by 3x not 200x,
> and that should be close enough to get a decent plan.

Great idea!

rt=# EXPLAIN ANALYZE SELECT DISTINCT  main.* FROM Tickets main JOIN Transactions Transactions_1  ON (
Transactions_1.ObjectId= main.id ) JOIN Attachments Attachments_2  ON ( Attachments_2.TransactionId = Transactions_1.id
) WHERE (Transactions_1.ObjectType = 'RT::Ticket') AND (main.Status != 'deleted') AND (main.Status = 'resolved' AND
main.LastUpdated> '2008-12-31 23:00:00' AND main.Created > '2005-12-31 23:00:00' AND main.Queue = '15' AND  (
Attachments_2.trigrams@@ text_to_trgm_tsquery('uir') AND Attachments_2.Content ILIKE '%uir%' ) ) AND (main.Type =
'ticket')AND int4eq(main.EffectiveId, main.id)  ORDER BY main.id ASC; 

                                                         QUERY PLAN

    

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=37504.61..37505.00 rows=6 width=162) (actual time=1377.087..1383.844 rows=649 loops=1)
   ->  Sort  (cost=37504.61..37504.62 rows=6 width=162) (actual time=1377.085..1377.973 rows=5280 loops=1)
         Sort Key: main.id, main.effectiveid, main.issuestatement, main.resolution, main.owner, main.subject,
main.initialpriority,main.finalpriority, main.priority, main.timeestimated, main.timeworked, main.timeleft, main.told,
main.starts,main.started, main.due, main.resolved, main.lastupdatedby, main.lastupdated, main.creator, main.created,
main.disabled
         Sort Method:  quicksort  Memory: 1598kB
         ->  Nested Loop  (cost=7615.47..37504.53 rows=6 width=162) (actual time=13.678..1322.292 rows=5280 loops=1)
               ->  Nested Loop  (cost=7615.47..37179.22 rows=74 width=4) (actual time=5.670..1266.703 rows=15593
loops=1)
                     ->  Bitmap Heap Scan on attachments attachments_2  (cost=7615.47..36550.26 rows=74 width=4)
(actualtime=5.658..1196.160 rows=15593 loops=1) 
                           Recheck Cond: (trigrams @@ '''uir'''::tsquery)
                           Filter: (content ~~* '%uir%'::text)
                           ->  Bitmap Index Scan on attachments_textsearch  (cost=0.00..7615.45 rows=8016 width=0)
(actualtime=3.863..3.863 rows=15972 loops=1) 
                                 Index Cond: (trigrams @@ '''uir'''::tsquery)
                     ->  Index Scan using transactions_pkey on transactions transactions_1  (cost=0.00..8.49 rows=1
width=8)(actual time=0.003..0.003 rows=1 loops=15593) 
                           Index Cond: (transactions_1.id = attachments_2.transactionid)
                           Filter: ((transactions_1.objecttype)::text = 'RT::Ticket'::text)
               ->  Index Scan using tickets5 on tickets main  (cost=0.00..4.38 rows=1 width=162) (actual
time=0.003..0.003rows=0 loops=15593) 
                     Index Cond: (main.id = transactions_1.objectid)
                     Filter: (((main.status)::text <> 'deleted'::text) AND (main.lastupdated > '2008-12-31
23:00:00'::timestampwithout time zone) AND (main.created > '2005-12-31 23:00:00'::timestamp without time zone) AND
int4eq(main.effectiveid,main.id) AND (main.queue = 15) AND ((main.type)::text = 'ticket'::text) AND
((main.status)::text= 'resolved'::text)) 
 Total runtime: 1384.038 ms
(18 rows)

Execution plan desired! :)

Indexes:
    "tickets_pkey" PRIMARY KEY, btree (id)
    "tickets1" btree (queue, status)
    "tickets2" btree (owner)
    "tickets3" btree (effectiveid)
    "tickets4" btree (id, status)
    "tickets5" btree (id, effectiveid)

Interesting the original index tickets5 is still used for
int4eq(main.effectiveid, main.id), no need to build a different.
Great!

I think no problem to do this small hack into the SearchBuilder. I did
already one for full text search so there will be two hacks :).

Thanks very much.
Best Regards
--
Zito

pgsql-performance by date:

Previous
From: Václav Ovsík
Date:
Subject: Re: poor execution plan because column dependence
Next
From: Glyn Astill
Date:
Subject: Re: Linux: more cores = less concurrency.