Re: poor execution plan because column dependence - Mailing list pgsql-performance

From Václav Ovsík
Subject Re: poor execution plan because column dependence
Date
Msg-id 20110414081152.GB11328@bobek.localdomain
Whole thread Raw
In response to Re: poor execution plan because column dependence  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: poor execution plan because column dependence
List pgsql-performance
On Wed, Apr 13, 2011 at 12:24:06PM -0400, Tom Lane wrote:
> > Interesting the original index tickets5 is still used for
> > int4eq(main.effectiveid, main.id), no need to build a different.
>
> Well, no, it won't be.  This hack is entirely dependent on the fact that
> the optimizer mostly works with operator expressions, and is blind to
> the fact that the underlying functions are really the same thing.
> (Which is something I'd like to see fixed someday, but in the meantime
> it gives you an escape hatch.)  If you use the int4eq() construct in a
> context where you'd like to see it transformed into an index qual, it
> won't be.  For this particular case that doesn't matter because there's
> no use in using an index for that clause anyway.  But you'll need to be
> very careful that your changes in the query generator don't result in
> using int4eq() in any contexts other than the "main.EffectiveId=main.id"
> check.

Sorry I'm not certain understand your paragraph completely...

I perfectly understand the fact that change from
    A = B    into    int4eq(A, B)
stopped bad estimate and execution plan is corrected, but that can
change someday in the future.

I'm not certain about your sentence touching int4eq() and index. The
execution plan as show in my previous mail contains information about
using index tickets5:

...
               ->  Index Scan using tickets5 on tickets main  (cost=0.00..4.38 rows=1 width=162) (actual
time=0.006..0.006rows=0 loops=15593) 
                     Index Cond: (main.id = transactions_1.objectid)
                     Filter: (((main.status)::text <> 'deleted'::text) AND (main.lastupdated > '2008-12-31
23:00:00'::timestampwithout time zone) AND (main.created > '2005-12-31 23:00:00'::timestamp without time zone) AND
int4eq(main.effectiveid,main.id) AND (main.queue = 15) AND ((main.type)::text = 'ticket'::text) AND
((main.status)::text= 'resolved'::text)) 
...


Filter condition contains int4eq(main.effectiveid, main.id) and tickets5
is: "tickets5" btree (id, effectiveid)

That means tickets5 index was used for int4eq(main.effectiveid, main.id).
Is it right? Or am I something missing?

Well the index will not be used generally probably, because of
selectivity of int4eq() you mention (33%). The planner thinks it is
better to use seq scan then. I tried this now.

I did hack for this particular case only:


diff --git a/local/lib/DBIx/SearchBuilder.pm b/local/lib/DBIx/SearchBuilder.pm
index f3ee1e1..9e3a6a6 100644
--- a/local/lib/DBIx/SearchBuilder.pm
+++ b/local/lib/DBIx/SearchBuilder.pm
@@ -1040,7 +1040,9 @@ sub _CompileGenericRestrictions {
                 $result .= ' '. $entry . ' ';
             }
             else {
-                $result .= join ' ', @{$entry}{qw(field op value)};
+                my $term = join ' ', @{$entry}{qw(field op value)};
+                $term =~ s/^(main|Tickets_\d+)\.(EffectiveId) = (\1)\.(id)$/int4eq($1.$2, $3.$4)/i;
+                $result .= $term;
             }
         }
         $result .= ')';


It works as expected.
Thanks
Best Regards
--
Zito

pgsql-performance by date:

Previous
From: Claudio Freire
Date:
Subject: Re: Performance
Next
From: tv@fuzzy.cz
Date:
Subject: Re: Performance