Thread: runtime of the same query in function differs on 2 degree!

runtime of the same query in function differs on 2 degree!

From

Andriy Tkachuk

Date:

02 October 2003, 11:24:33

Hi folks.

What's wrong with planner that executes my query in function?:
(i mean no explanation but runtime)


tele=# EXPLAIN analyze select calc_total(6916799, 1062363600, 1064955599);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=36919.37..36919.37 rows=1 loops=1)
 Total runtime: 36919.40 msec
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tele=# \df+ calc_total
...
declare
        usr alias for $1;
        d1 alias for $2;
        d2   alias for $3;
        res integer;
begin
        select sum(cost) into res
                from bills where
                        (parent(user_id) = usr or user_id = usr)
                        and dat >= d1 and dat < d2;
        if res is not null then
                return res;
        else
                return 0;
        end if;
end;

tele=# EXPLAIN analyze select sum(cost) from bills where (parent(user_id) = 6916799 or user_id = 6916799) and dat >=
1062363600and dat < 10649555 
99;
                                                                             QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------
------------------
 Aggregate  (cost=17902.80..17902.80 rows=1 width=4) (actual time=101.04..101.04 rows=1 loops=1)
   ->  Index Scan using bills_parent_user_id_idx, bills_userid_dat_idx on bills  (cost=0.00..17901.11 rows=679 width=4)
(actualtime=101.03..101.0 
3 rows=0 loops=1)
         Index Cond: ((parent(user_id) = 6916799) OR ((user_id = 6916799) AND (dat >= 1062363600) AND (dat <
1064955599)))
         Filter: (((parent(user_id) = 6916799) OR (user_id = 6916799)) AND (dat >= 1062363600) AND (dat < 1064955599))
 Total runtime: 101.14 msec
 ^^^^^^^^^^^^^^^^^^^^^^^^^^

So the query is the same as in calc_total(usr,d1,d2) function,
but execute time extremely differs.

Is it normal?

Thanks,
  Andriy Tkachuk.

Re: runtime of the same query in function differs on 2 degree!

From

Tom Lane

Date:

02 October 2003, 12:31:33

Andriy Tkachuk <ant@imt.com.ua> writes:
> What's wrong with planner that executes my query in function?:

> tele=# EXPLAIN analyze select sum(cost) from bills where (parent(user_id) = 6916799 or user_id = 6916799) and dat >=
1062363600and dat < 10649555 
> 99;

In the function case, the planner will not have access to the specific
values that "dat" is being compared to --- it'll see something like

    ... and dat >= $1 and dat < $2

In this case it has to fall back on a default estimate of how many rows
will be selected, and I suspect it's guessing that a seqscan will be
faster.  The trouble is that for a sufficiently large range of d1/d2,
a seqscan *will* be faster.

You might find that the best solution is to use FOR ... EXECUTE and plug
the parameters into the query string so that the planner can see their
values.  This will mean re-planning on every function call, but the
advantage is the plan will adapt to the actual range of d1/d2.

            regards, tom lane

Re: runtime of the same query in function differs on 2 degree!

From

Gaetano Mendola

Date:

02 October 2003, 20:53:50

Andriy Tkachuk wrote:

> Hi folks.
>
> What's wrong with planner that executes my query in function?:
> (i mean no explanation but runtime)
>
>
> tele=# EXPLAIN analyze select calc_total(6916799, 1062363600, 1064955599);
>                                         QUERY PLAN
> ------------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=36919.37..36919.37 rows=1 loops=1)
>  Total runtime: 36919.40 msec
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> tele=# \df+ calc_total
> ...
> declare
>         usr alias for $1;
>         d1 alias for $2;
>         d2   alias for $3;
>         res integer;
> begin
>         select sum(cost) into res
>                 from bills where
>                         (parent(user_id) = usr or user_id = usr)
>                         and dat >= d1 and dat < d2;
>         if res is not null then
>                 return res;
>         else
>                 return 0;
>         end if;
> end;

You didn't wrote the type of d1 and d2, I had your same problem:

declare
    a_user alias for $1;
    res INTEGER;
begin
          select cost into res
          from my_table
          where login = a_user;

    ......
end;

the problem was that login was a VARCHAR and a_user was a TEXT so
the index was not used, was enough cast a_user::varchar;


I believe that your dat, d1, d2 are not "index" comparable.


Gaetano

Re: runtime of the same query in function differs on 2

From

Andriy Tkachuk

Date:

03 October 2003, 04:03:18

No: the function is calc_total(int,int,int) and the table have the
same types.

As Tom said that my problem is because of planning in pl/pgsql.  As
is written in
http://www.postgresql.org/docs/7.3/static/plpgsql.html#PLPGSQL-OVERVIEW
plans for queries in pl/pgsql are made just once - when they are
first used in function by backend. So AFAICS this planning do not
take into consideration table statistics because it don't see values
of variables in queries (or if see than it must not take them into account,
because they may be changed in future function callings).

I rollback to my previous realization of calc_total() on pl/tcl. I
use there spi_exec - so the query always regards as dynamic - it
always parsed, rewritten, planned but executes fastest much more
:)

On Fri, 3 Oct 2003, Gaetano Mendola wrote:

> Andriy Tkachuk wrote:
>
> > Hi folks.
> >
> > What's wrong with planner that executes my query in function?:
> > (i mean no explanation but runtime)
> >
> >
> > tele=# EXPLAIN analyze select calc_total(6916799, 1062363600, 1064955599);
> >                                         QUERY PLAN
> > ------------------------------------------------------------------------------------------
> >  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=36919.37..36919.37 rows=1 loops=1)
> >  Total runtime: 36919.40 msec
> >  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > tele=# \df+ calc_total
> > ...
> > declare
> >         usr alias for $1;
> >         d1 alias for $2;
> >         d2   alias for $3;
> >         res integer;
> > begin
> >         select sum(cost) into res
> >                 from bills where
> >                         (parent(user_id) = usr or user_id = usr)
> >                         and dat >= d1 and dat < d2;
> >         if res is not null then
> >                 return res;
> >         else
> >                 return 0;
> >         end if;
> > end;
>
> You didn't wrote the type of d1 and d2, I had your same problem:
>
> declare
>     a_user alias for $1;
>     res INTEGER;
> begin
>           select cost into res
>           from my_table
>           where login = a_user;
>
>     ......
> end;
>
> the problem was that login was a VARCHAR and a_user was a TEXT so
> the index was not used, was enough cast a_user::varchar;
>
>
> I believe that your dat, d1, d2 are not "index" comparable.
>
>
> Gaetano
>
>
>
>
>
>
>
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings