Thread: Trigger function, bad performance
Hello, I'm having problems with the following bad performing select-statement in a trigger-function (on update before): ... for c in select id_country, sum(cost) as sum_cost from costs where id_user = p_id_user and id_state = 1 and date(request) between p_begin and p_until group by id_country; loop ... end loop; ... Explain shows that the following existing partial index isn't used: CREATE INDEX ix_costs_user_state_date_0701 ON costs USING btree(id_user, id_state, date(request)) WHERE id_state = 1 AND date(request) >= '2007-01-01'::date AND date(request) <= '2007-01-31'::date; The funny thing is, that while executing the statement with type-casted string-literals the index is used as expected: ... for c in select id_country, sum(cost) as sum_cost from costs where id_user = p_id_user and id_state = 1 and date(request) between '2007-01-01'::date AND '2007-01-31'::date group by id_country; loop ... end loop; ... Any ideas? Best regards Rainer Rogatzki (mailto:rainer.rogatzki@ggrz-hagen.nrw.de)
Rogatzki Rainer wrote: > I'm having problems with the following bad performing select-statement > in a trigger-function (on update before): > > ... > for c in > select id_country, sum(cost) as sum_cost > from costs > where id_user = p_id_user > and id_state = 1 > and date(request) between p_begin and p_until > group by id_country; > loop > ... > end loop; > ... > > Explain shows that the following existing partial index isn't used: > > CREATE INDEX ix_costs_user_state_date_0701 > ON costs > USING btree(id_user, id_state, date(request)) > WHERE id_state = 1 AND date(request) >= '2007-01-01'::date AND > date(request) <= '2007-01-31'::date; > > > The funny thing is, that while executing the statement with > type-casted > string-literals the index is used as expected: > > ... > for c in > select id_country, sum(cost) as sum_cost > from costs > where id_user = p_id_user > and id_state = 1 > and date(request) between '2007-01-01'::date AND > '2007-01-31'::date > group by id_country; > loop > ... > end loop; > ... > > Any ideas? The problem is that "p_begin" and "p_until" are variables. Consequently PostgreSQL, when the function is run the first time, will prepare this statement: select id_country, sum(cost) as sum_cost from costs where id_user = $1 and id_state = 1 and date(request) between $2 and $3 group by id_country; That prepared statement will be reused for subsequent invocations of the trigger function, whiere the parameters will probably have different values. So it cannot use the partial index. If you want the index to be used, don't include "date(request)" in the WHERE clause. Yours, Laurenz Albe
> Rogatzki Rainer wrote: > > I'm having problems with the following bad performing select-statement > > in a trigger-function (on update before): > > > > ... > > for c in > > select id_country, sum(cost) as sum_cost > > from costs > > where id_user = p_id_user > > and id_state = 1 > > and date(request) between p_begin and p_until > > group by id_country; > > loop > > ... > > end loop; > > ... > > > > Explain shows that the following existing partial index isn't used: > > > > CREATE INDEX ix_costs_user_state_date_0701 > > ON costs > > USING btree(id_user, id_state, date(request)) > > WHERE id_state = 1 AND date(request) >= '2007-01-01'::date AND > > date(request) <= '2007-01-31'::date; > > > > > > The funny thing is, that while executing the statement with > > type-casted string-literals the index is used as expected: > > > > ... > > for c in > > select id_country, sum(cost) as sum_cost > > from costs > > where id_user = p_id_user > > and id_state = 1 > > and date(request) between '2007-01-01'::date AND > > '2007-01-31'::date > > group by id_country; > > loop > > ... > > end loop; > > ... > > > > Any ideas? Albe Laurenz wrote: > The problem is that "p_begin" and "p_until" are variables. Consequently PostgreSQL, when the function is run the first time, will prepare this statement: > > select id_country, sum(cost) as sum_cost > from costs > where id_user = $1 > and id_state = 1 > and date(request) between $2 and $3 > group by id_country; > > That prepared statement will be reused for subsequent invocations of the trigger function, whiere the parameters will probably have different values. > > So it cannot use the partial index. > > If you want the index to be used, don't include "date(request)" in the WHERE clause. > > Yours, > Laurenz Albe Hello Laurenz, thank you for your analysis! Unfortunately your proposal is no option for me, since I do have to include the WHERE clause in both index and procedure. By the way: The parameters in the WHERE clause (p_begin,p_until) come from a RECORD which is filled before via SELECT INTO with begin and end tablefields of the type date like the following: my_record RECORD; ... select into my_record p_begin, p_until from accounting_interval where id = 1; I omitted this information in my first posting to make it easier to read. In fact I extracted the bad performing statement to let pgadmin explain and the same effect shows: -- Example with bad performance since index isn't used explain select c.id_country, sum(c.cost) as sum_cost from costs c, accounting_interval a where c.id_user = 123 and c.id_state = 1 and a.id = 1 and date(c.request) between a.p_begin and a.p_until group by id_country; -- Example with invoked index (100 times better performance) explain select c.id_country, sum(c.cost) as sum_cost from costs c where c.id_user = 123 and c.id_state = 1 and date(c.request) between '2007-01-01'::date and '2007-01-31'::date group by id_country; Here I cannot see why statement preparation has an effect at all. Apart from this I don't really understand why statement preparation combined with parameters in functions prevent index invocation. Especially since p_id_user is a parameter as well which doesn't prevent the usage of another existing index on costs.id_user and costs.id_state. So you see me still clueless :O) Best regards Rainer Rogatzki
am Fri, dem 05.12.2008, um 11:41:11 +0100 mailte Rogatzki Rainer folgendes: > -- Example with bad performance since index isn't used > explain > select c.id_country, sum(c.cost) as sum_cost > from costs c, accounting_interval a > where c.id_user = 123 > and c.id_state = 1 > and a.id = 1 > and date(c.request) between a.p_begin and a.p_until > group by id_country; > > -- Example with invoked index (100 times better performance) > explain > select c.id_country, sum(c.cost) as sum_cost > from costs c > where c.id_user = 123 > and c.id_state = 1 > and date(c.request) between '2007-01-01'::date and '2007-01-31'::date > group by id_country; > > Here I cannot see why statement preparation has an effect at all. The planner don't know the parameters at compile-time. Because of this fakt, the planner choose a other plan (a seq-scan). You can try to use execute 'your query'. In this case the planner investigate a new plan, and (maybe) with the index. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
> > Rogatzki wrote: > > -- Example with bad performance since index isn't used > > explain > > select c.id_country, sum(c.cost) as sum_cost > > from costs c, accounting_interval a > > where c.id_user = 123 > > and c.id_state = 1 > > and a.id = 1 > > and date(c.request) between a.p_begin and a.p_until > > group by id_country; > > > > -- Example with invoked index (100 times better performance) > > explain > > select c.id_country, sum(c.cost) as sum_cost > > from costs c > > where c.id_user = 123 > > and c.id_state = 1 > > and date(c.request) between '2007-01-01'::date and '2007-01-31'::date > > group by id_country; > > > > Here I cannot see why statement preparation has an effect at all. > > Andreas Kretschmer wrote: > The planner don't know the parameters at compile-time. Because of this fakt, the planner choose a other plan (a seq-scan). > > You can try to use execute 'your query'. In this case the planner investigate a new plan, and (maybe) with the index. > > > Andreas > -- > Andreas Kretschmer > Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) > GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net Hello Andreas, your hint did the trick - thank you very much! After using "execute 'my query'" the index is used as expected. Though I still wonder why the poor performance occurred since november, without having done any relevant changes to neither postgres nor db model (including index, procedures, ...) as far as I know. Anyway - I'm deeply content with your solution. Best regards Rainer Rogatzki (mailto:rainer.rogatzki@ggrz-hagen.nrw.de)
am Fri, dem 05.12.2008, um 14:23:33 +0100 mailte Rogatzki Rainer folgendes: > Hello Andreas, > > your hint did the trick - thank you very much! glad to help you... Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Rogatzki Rainer wrote: > > > in a trigger-function (on update before): > > > > > > ... > > > for c in > > > select id_country, sum(cost) as sum_cost > > > from costs > > > where id_user = p_id_user > > > and id_state = 1 > > > and date(request) between p_begin and p_until > > > group by id_country; > > > loop > > > ... > > > end loop; > > > ... > > > > > > Explain shows that the following existing partial index isn't used: > > > > > > CREATE INDEX ix_costs_user_state_date_0701 > > > ON costs > > > USING btree(id_user, id_state, date(request)) > > > WHERE id_state = 1 AND date(request) >= '2007-01-01'::date AND > > > date(request) <= '2007-01-31'::date; > > > > > > > The problem is that "p_begin" and "p_until" are variables. > > > > So it cannot use the partial index. > > > > If you want the index to be used, don't include "date(request)" in the > > WHERE clause. > > Unfortunately your proposal is no option for me, since I do have to > include the WHERE clause in both index and procedure. You have been ordered to use a partial index? > Apart from this I don't really understand why statement preparation > combined with parameters in functions prevent index invocation. > Especially since p_id_user is a parameter as well which doesn't prevent > the usage of another existing index on costs.id_user and costs.id_state. The connection with parameters is by chance. The main thing is that both "p_begin" and "p_until" are variables. Andreas Kretschmer gave you the advice you'll want: use dynamic SQL. Yours, Laurenz Albe
Laurenz Albe wrote: > You have been ordered to use a partial index? > > > Apart from this I don't really understand why statement preparation > > combined with parameters in functions prevent index invocation. > > Especially since p_id_user is a parameter as well which doesn't > > prevent the usage of another existing index on costs.id_user and costs.id_state. > > The connection with parameters is by chance. > > The main thing is that both "p_begin" and "p_until" are variables. > > Andreas Kretschmer gave you the advice you'll want: use dynamic SQL. > > Yours, > Laurenz Albe Well it was no 'order' to use a partial index but it was necessary, since the table is vastly filled with log-entries that have to persist over 4 years for documentation. Since old entries are accessed and changed less often we decided to introduce partial indexes for a better performance. This prooved to speed up especially frequently used monthly reports. Now I understand (thanks to Andreas Kretschmer and you) that at execution planning time postgres cannot decide what partial index to use for following procedure calls and thus doesn't invoke it. After implementing Andreas' proposal (execute 'my statement') the expected index is used and performance is 125 time better :O)) Best regards Rainer Rogatzki