Thread: Cursor fetch performance issue
We are migrating our Oracle warehouse to Postgres 9. This function responds well: pg=# select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); getmemberadminprevious_sp2 ---------------------------- <unnamed portal 1> (1 row) Time: 7.549 ms However, when testing, this fetch takes upwards of 38 minutes: BEGIN; select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); FETCH ALL IN "<unnamed portal 2>"; How can I diagnose any performance issues with the fetch in the cursor? Thanks. Tony
Hello 2012/1/24 Tony Capobianco <tcapobianco@prospectiv.com>: > We are migrating our Oracle warehouse to Postgres 9. > > This function responds well: > > pg=# select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); > getmemberadminprevious_sp2 > ---------------------------- > <unnamed portal 1> > (1 row) > > Time: 7.549 ms > > However, when testing, this fetch takes upwards of 38 minutes: > > BEGIN; > select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); > FETCH ALL IN "<unnamed portal 2>"; > > How can I diagnose any performance issues with the fetch in the cursor? > Cursors are optimized to returns small subset of result - if you plan to read complete result, then set set cursor_tuple_fraction to 1.0; this is session config value, you can set it before selected cursors queries Regards Pavel Stehule > Thanks. > Tony > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Running just the sql of the function returns only 10 rows: pg=# SELECT m.memberid, m.websiteid, m.emailaddress, pg-# m.firstname, m.lastname, m.regcomplete, m.emailok pg-# FROM members m pg-# WHERE m.emailaddress LIKE 'test.email@hotmail.com' pg-# AND m.changedate_id < 5868 ORDER BY m.emailaddress, m.websiteid; memberid | websiteid | emailaddress | firstname | lastname | regcomplete | emailok -----------+-----------+------------------------+-----------+----------+-------------+--------- 247815829 | 1 | test.email@hotmail.com | email | test | 1 | 1 300960335 | 62 | test.email@hotmail.com | | | 1 | 1 300959937 | 625 | test.email@hotmail.com | | | 1 | 1 260152830 | 1453 | test.email@hotmail.com | | | 1 | 1 300960163 | 1737 | test.email@hotmail.com | email | test | 1 | 1 300960259 | 1824 | test.email@hotmail.com | email | test | 1 | 1 300959742 | 1928 | test.email@hotmail.com | email | test | 1 | 1 368122699 | 2457 | test.email@hotmail.com | email | test | 1 | 1 403218613 | 2464 | test.email@hotmail.com | email | test | 1 | 0 378951994 | 2656 | test.email@hotmail.com | | | 1 | 1 (10 rows) Time: 132.626 ms So, it would seem that's a small enough number of rows. Unfortunately, issuing: set cursor_tuple_fraction to 1.0; Did not have an effect on performance. Is it common to modify this cursor_tuple_fraction parameter each time we execute the function? On Tue, 2012-01-24 at 21:47 +0100, Pavel Stehule wrote: > Hello > > 2012/1/24 Tony Capobianco <tcapobianco@prospectiv.com>: > > We are migrating our Oracle warehouse to Postgres 9. > > > > This function responds well: > > > > pg=# select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); > > getmemberadminprevious_sp2 > > ---------------------------- > > <unnamed portal 1> > > (1 row) > > > > Time: 7.549 ms > > > > However, when testing, this fetch takes upwards of 38 minutes: > > > > BEGIN; > > select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); > > FETCH ALL IN "<unnamed portal 2>"; > > > > How can I diagnose any performance issues with the fetch in the cursor? > > > > Cursors are optimized to returns small subset of result - if you plan > to read complete result, then set > > set cursor_tuple_fraction to 1.0; > > this is session config value, you can set it before selected cursors queries > > Regards > > Pavel Stehule > > > Thanks. > > Tony > > > > > > -- > > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-performance >
2012/1/24 Tony Capobianco <tcapobianco@prospectiv.com>: > Running just the sql of the function returns only 10 rows: > > pg=# SELECT m.memberid, m.websiteid, m.emailaddress, > pg-# m.firstname, m.lastname, m.regcomplete, m.emailok > pg-# FROM members m > pg-# WHERE m.emailaddress LIKE 'test.email@hotmail.com' > pg-# AND m.changedate_id < 5868 ORDER BY m.emailaddress, m.websiteid; > memberid | websiteid | emailaddress | firstname | lastname | regcomplete | emailok > -----------+-----------+------------------------+-----------+----------+-------------+--------- > 247815829 | 1 | test.email@hotmail.com | email | test | 1 | 1 > 300960335 | 62 | test.email@hotmail.com | | | 1 | 1 > 300959937 | 625 | test.email@hotmail.com | | | 1 | 1 > 260152830 | 1453 | test.email@hotmail.com | | | 1 | 1 > 300960163 | 1737 | test.email@hotmail.com | email | test | 1 | 1 > 300960259 | 1824 | test.email@hotmail.com | email | test | 1 | 1 > 300959742 | 1928 | test.email@hotmail.com | email | test | 1 | 1 > 368122699 | 2457 | test.email@hotmail.com | email | test | 1 | 1 > 403218613 | 2464 | test.email@hotmail.com | email | test | 1 | 0 > 378951994 | 2656 | test.email@hotmail.com | | | 1 | 1 > (10 rows) > > Time: 132.626 ms > > So, it would seem that's a small enough number of rows. Unfortunately, issuing: > > set cursor_tuple_fraction to 1.0; > > Did not have an effect on performance. Is it common to modify this > cursor_tuple_fraction parameter each time we execute the function? > no, usually only before some strange query. Check execution plan, please - but I don't think so your slow query depends on cursor usage. postgres=# set cursor_tuple_fraction TO 1.0; SET postgres=# explain declare x cursor for select * from foo where a % 2 = 0 order by a; QUERY PLAN ──────────────────────────────────────────────────────────────── Sort (cost=19229.19..19241.69 rows=5000 width=4) Sort Key: a -> Seq Scan on foo (cost=0.00..18922.00 rows=5000 width=4) Filter: ((a % 2) = 0) (4 rows) postgres=# set cursor_tuple_fraction TO 1.0; SET postgres=# explain declare x cursor for select * from foo where a % 2 = 0 order by a; QUERY PLAN ──────────────────────────────────────────────────────────────── Sort (cost=19229.19..19241.69 rows=5000 width=4) Sort Key: a -> Seq Scan on foo (cost=0.00..18922.00 rows=5000 width=4) Filter: ((a % 2) = 0) (4 rows) postgres=# set cursor_tuple_fraction TO 0.1; SET postgres=# explain declare x cursor for select * from foo where a % 2 = 0 order by a; QUERY PLAN ─────────────────────────────────────────────────────────────────────────── Index Scan using foo_pkey on foo (cost=0.00..32693.34 rows=5000 width=4) Filter: ((a % 2) = 0) (2 rows) Regards Pavel Stehule > > On Tue, 2012-01-24 at 21:47 +0100, Pavel Stehule wrote: >> Hello >> >> 2012/1/24 Tony Capobianco <tcapobianco@prospectiv.com>: >> > We are migrating our Oracle warehouse to Postgres 9. >> > >> > This function responds well: >> > >> > pg=# select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); >> > getmemberadminprevious_sp2 >> > ---------------------------- >> > <unnamed portal 1> >> > (1 row) >> > >> > Time: 7.549 ms >> > >> > However, when testing, this fetch takes upwards of 38 minutes: >> > >> > BEGIN; >> > select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); >> > FETCH ALL IN "<unnamed portal 2>"; >> > >> > How can I diagnose any performance issues with the fetch in the cursor? >> > >> >> Cursors are optimized to returns small subset of result - if you plan >> to read complete result, then set >> >> set cursor_tuple_fraction to 1.0; >> >> this is session config value, you can set it before selected cursors queries >> >> Regards >> >> Pavel Stehule >> >> > Thanks. >> > Tony >> > >> > >> > -- >> > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) >> > To make changes to your subscription: >> > http://www.postgresql.org/mailpref/pgsql-performance >> > >
> On Tue, 2012-01-24 at 21:47 +0100, Pavel Stehule wrote: >> Hello >> >> 2012/1/24 Tony Capobianco<tcapobianco@prospectiv.com>: >>> We are migrating our Oracle warehouse to Postgres 9. >>> >>> This function responds well: >>> >>> pg=# select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); >>> getmemberadminprevious_sp2 >>> ---------------------------- >>> <unnamed portal 1> >>> (1 row) >>> >>> Time: 7.549 ms >>> >>> However, when testing, this fetch takes upwards of 38 minutes: >>> >>> BEGIN; >>> select public.getMemberAdminPrevious_sp2(247815829, 1,'test.email@hotmail.com', 'email', 'test'); >>> FETCH ALL IN "<unnamed portal 2>"; >>> >>> How can I diagnose any performance issues with the fetch in the cursor? >>> >> >> Cursors are optimized to returns small subset of result - if you plan >> to read complete result, then set >> >> set cursor_tuple_fraction to 1.0; >> >> this is session config value, you can set it before selected cursors queries >> >> Regards >> >> Pavel Stehule >> >>> Thanks. >>> Tony >>> >>> >>> -- >>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) >>> To make changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-performance >> > > > On 1/24/2012 2:57 PM, Tony Capobianco wrote: > Running just the sql of the function returns only 10 rows: > > pg=# SELECT m.memberid, m.websiteid, m.emailaddress, > pg-# m.firstname, m.lastname, m.regcomplete, m.emailok > pg-# FROM members m > pg-# WHERE m.emailaddress LIKE 'test.email@hotmail.com' > pg-# AND m.changedate_id< 5868 ORDER BY m.emailaddress, m.websiteid; > memberid | websiteid | emailaddress | firstname | lastname | regcomplete | emailok > -----------+-----------+------------------------+-----------+----------+-------------+--------- > 247815829 | 1 | test.email@hotmail.com | email | test | 1 | 1 > 300960335 | 62 | test.email@hotmail.com | | | 1 | 1 > 300959937 | 625 | test.email@hotmail.com | | | 1 | 1 > 260152830 | 1453 | test.email@hotmail.com | | | 1 | 1 > 300960163 | 1737 | test.email@hotmail.com | email | test | 1 | 1 > 300960259 | 1824 | test.email@hotmail.com | email | test | 1 | 1 > 300959742 | 1928 | test.email@hotmail.com | email | test | 1 | 1 > 368122699 | 2457 | test.email@hotmail.com | email | test | 1 | 1 > 403218613 | 2464 | test.email@hotmail.com | email | test | 1 | 0 > 378951994 | 2656 | test.email@hotmail.com | | | 1 | 1 > (10 rows) > > Time: 132.626 ms > > So, it would seem that's a small enough number of rows. Unfortunately, issuing: > > set cursor_tuple_fraction to 1.0; > > Did not have an effect on performance. Is it common to modify this > cursor_tuple_fraction parameter each time we execute the function? > > So, is getMemberAdminPrevious_sp2() preparing a statement with wildcards? SELECT m.memberid, m.websiteid, m.emailaddress, m.firstname, m.lastname, m.regcomplete, m.emailok FROM members m WHERE m.emailaddress LIKE $1 AND m.changedate_id < $2 ORDER BY m.emailaddress, m.websiteid; Or is it creating the string and executing it: sql = 'SELECT m.memberid, m.websiteid, m.emailaddress, ' || ' m.firstname, m.lastname, m.regcomplete, m.emailok ' || ' FROM members m || ' WHERE m.emailaddress LIKE ' || arg1 || ' AND m.changedate_id < ' || arg2 || ' ORDER BY m.emailaddress, m.websiteid '; execute(sql); Maybe its the planner doesnt plan so well with $1 arguments vs actual arguments thing. -Andy
Hello > > So, is getMemberAdminPrevious_sp2() preparing a statement with wildcards? > > SELECT m.memberid, m.websiteid, m.emailaddress, > m.firstname, m.lastname, m.regcomplete, m.emailok > FROM members m > WHERE m.emailaddress LIKE $1 > AND m.changedate_id < $2 > ORDER BY m.emailaddress, m.websiteid; > > Or is it creating the string and executing it: > > sql = 'SELECT m.memberid, m.websiteid, m.emailaddress, ' > || ' m.firstname, m.lastname, m.regcomplete, m.emailok ' > || ' FROM members m > || ' WHERE m.emailaddress LIKE ' || arg1 > || ' AND m.changedate_id < ' || arg2 > || ' ORDER BY m.emailaddress, m.websiteid '; > execute(sql); > > Maybe its the planner doesnt plan so well with $1 arguments vs actual > arguments thing. > sure, it could be blind optimization problem in plpgsql. Maybe you have to use a dynamic SQL - OPEN FOR EXECUTE stmt probably http://www.postgresql.org/docs/9.1/interactive/plpgsql-cursors.html Regards Pavel Stehule > -Andy > >
Tony Capobianco <tcapobianco@prospectiv.com> writes: > Running just the sql of the function returns only 10 rows: > pg=# SELECT m.memberid, m.websiteid, m.emailaddress, > pg-# m.firstname, m.lastname, m.regcomplete, m.emailok > pg-# FROM members m > pg-# WHERE m.emailaddress LIKE 'test.email@hotmail.com' > pg-# AND m.changedate_id < 5868 ORDER BY m.emailaddress, m.websiteid; Based on that, I'd bet your problem is that the function is executing WHERE m.emailaddress LIKE $1 (for some spelling of $1) and you are therefore not getting the benefit of the index optimizations that can happen when LIKE's pattern is constant. Do you actually need LIKE rather than just "=" here? regards, tom lane
Here's the explain: pg=# explain select getMemberAdminPrevious_sp(247815829, 1,'test.email@hotmail.com', 'Email', 'Test'); QUERY PLAN ------------------------------------------ Result (cost=0.00..0.26 rows=1 width=0) (1 row) Time: 1.167 ms There was discussion of 'LIKE' v. '=' and wildcard characters are not being entered into the $1 parameter. This is not generating a sql string. I feel it's something to do with the fetch of the refcursor. The cursor is a larger part of a function: CREATE OR REPLACE FUNCTION PUBLIC.GETMEMBERADMINPREVIOUS_SP2 ( p_memberid IN numeric, p_websiteid IN numeric, p_emailaddress IN varchar, p_firstname IN varchar, p_lastname IN varchar) RETURNS refcursor AS $$ DECLARE ref refcursor; l_sysdateid numeric; BEGIN l_sysdateid := sysdateid(); if (p_memberid != 0) then if (p_emailaddress IS NOT NULL) then OPEN ref FOR SELECT m.memberid, m.websiteid, m.emailaddress, m.firstname, m.lastname, m.regcomplete, m.emailok FROM members m WHERE m.emailaddress LIKE p_emailaddress AND m.changedate_id < l_sysdateid ORDER BY m.emailaddress, m.websiteid; end if; end if; Return ref; EXCEPTION WHEN NO_DATA_FOUND THEN Return null; END; $$ LANGUAGE 'plpgsql'; On Tue, 2012-01-24 at 22:17 +0100, Pavel Stehule wrote: > Hello > > > > > So, is getMemberAdminPrevious_sp2() preparing a statement with wildcards? > > > > SELECT m.memberid, m.websiteid, m.emailaddress, > > m.firstname, m.lastname, m.regcomplete, m.emailok > > FROM members m > > WHERE m.emailaddress LIKE $1 > > AND m.changedate_id < $2 > > ORDER BY m.emailaddress, m.websiteid; > > > > Or is it creating the string and executing it: > > > > sql = 'SELECT m.memberid, m.websiteid, m.emailaddress, ' > > || ' m.firstname, m.lastname, m.regcomplete, m.emailok ' > > || ' FROM members m > > || ' WHERE m.emailaddress LIKE ' || arg1 > > || ' AND m.changedate_id < ' || arg2 > > || ' ORDER BY m.emailaddress, m.websiteid '; > > execute(sql); > > > > Maybe its the planner doesnt plan so well with $1 arguments vs actual > > arguments thing. > > > > sure, it could be blind optimization problem in plpgsql. Maybe you > have to use a dynamic SQL - OPEN FOR EXECUTE stmt probably > > http://www.postgresql.org/docs/9.1/interactive/plpgsql-cursors.html > > Regards > > Pavel Stehule > > > -Andy > > > > >
On 24.01.2012 23:34, Tony Capobianco wrote: > Here's the explain: > > pg=# explain select getMemberAdminPrevious_sp(247815829, 1,'test.email@hotmail.com', 'Email', 'Test'); > QUERY PLAN > ------------------------------------------ > Result (cost=0.00..0.26 rows=1 width=0) > (1 row) > > Time: 1.167 ms That's not very helpful. We'd need to see the plan of the query within the function, not the plan on invoking the function. The auto_explain contrib module with auto_explain_log_nested_statements=on might be useful to get that. > There was discussion of 'LIKE' v. '=' and wildcard characters are not > being entered into the $1 parameter. > > This is not generating a sql string. I feel it's something to do with > the fetch of the refcursor. The cursor is a larger part of a function: > > CREATE OR REPLACE FUNCTION PUBLIC.GETMEMBERADMINPREVIOUS_SP2 ( > p_memberid IN numeric, > p_websiteid IN numeric, > p_emailaddress IN varchar, > p_firstname IN varchar, > p_lastname IN varchar) > RETURNS refcursor AS $$ > DECLARE > ref refcursor; > l_sysdateid numeric; > BEGIN > l_sysdateid := sysdateid(); > if (p_memberid != 0) then > if (p_emailaddress IS NOT NULL) then > OPEN ref FOR > SELECT m.memberid, m.websiteid, m.emailaddress, > m.firstname, m.lastname, m.regcomplete, m.emailok > FROM members m > WHERE m.emailaddress LIKE p_emailaddress > AND m.changedate_id< l_sysdateid ORDER BY m.emailaddress, > m.websiteid; > end if; > end if; > Return ref; > EXCEPTION > WHEN NO_DATA_FOUND THEN > Return null; > END; > $$ LANGUAGE 'plpgsql'; The theory that the query takes a long time because "LIKE p_emailaddress" is not optimizeable by the planner seems the most likely to me. If you don't actually use any wildcards in the email, try replacing LIKE with =. If you do, then you can try the "OPEN ref FOR EXECUTE" syntax. That way the query is re-planned every time, and the planner can take advantage of the parameter value. That enables it to use an index on the email address column, when there isn't in fact any wildcards in the value, and also estimate the selectivities better which can lead to a better plan. Like this: CREATE OR REPLACE FUNCTION public.getmemberadminprevious_sp2(p_memberid numeric, p_websiteid numeric, p_emailaddress character varying, p_firstname character varying, p_lastname character varying) RETURNS refcursor LANGUAGE plpgsql AS $function$ DECLARE ref refcursor; l_sysdateid numeric; BEGIN l_sysdateid := sysdateid(); if (p_memberid != 0) then if (p_emailaddress IS NOT NULL) then OPEN ref FOR EXECUTE $query$ SELECT m.memberid, m.websiteid, m.emailaddress, m.firstname, m.lastname, m.regcomplete, m.emailok FROM members m WHERE m.emailaddress LIKE $1 AND m.changedate_id < $2 ORDER BY m.emailaddress, m.websiteid; $query$ USING p_emailaddress, l_sysdateid; end if; end if; Return ref; EXCEPTION WHEN NO_DATA_FOUND THEN Return null; END; $function$ -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com