Thread: Slow planning time for custom function
Hi, I have a table api.issues that has a text column "body" with long texts (1000+ chars). I also wrote a custom function "normalizeBody"with plv8 that is a simple Text -> Text conversion. Now I created an index applying the function to the bodycolumn, so I can quickly run SELECT * FROM api.issues WHERE normalizeBody(body) = normalizeBody($1) The issue is, that the planning time is very slow (1.8 seconds). When I replace "normalizeBody" with "md5", however, I geta planning time of 0.5ms. Please note that row level security is enabled on the api.issues and most other tables. Thanks for your help, Ben Details below: - Managed AWS Postgres with default settings, no replication - PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit - Table api.issues has approx. 40 000 rows. ``` explain (analyze, buffers) select 1 from api.issues where normalizeunidiff(body) = normalizeunidiff(''); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Index Scan using rejectedissues_normalized on issues (cost=0.00..218.80 rows=217 width=4) (actual time=0.160..0.204 rows=3loops=1) Index Cond: (normalizeunidiff(body) = ''::text) Buffers: shared hit=5 Planning time: 1878.783 ms Execution time: 0.230 ms (5 rows) ``` ``` explain (analyze, buffers) select 1 from api.issues where md5(body) = md5(''); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Index Scan using rejectedissues_md5 on issues (cost=0.00..218.80 rows=217 width=4) (actual time=0.016..0.016 rows=0 loops=1) Index Cond: (md5(body) = 'd41d8cd98f00b204e9800998ecf8427e'::text) Buffers: shared hit=2 Planning time: 0.565 ms Execution time: 0.043 ms (5 rows) ``` ``` CREATE OR REPLACE FUNCTION public.normalizeunidiff( unidiff text) RETURNS text LANGUAGE 'plv8' COST 100 IMMUTABLE STRICT PARALLEL SAFE AS $BODY$ return unidiff .replace(/[\s\S]*@@/m, '') // remove header .split('\n') .map(function (line) { return line.trim() }) .filter(function (line) { return line.search(/^[+-]/) >= 0 }) .join('\n') .trim() $BODY$; ``` The indices are created this way where md5 is normalizeunidiff for the second one: ``` CREATE INDEX "rejectedissues_md5" ON api.issues using hash (md5(body)); ```
Hi, On 2018-03-23 21:28:22 +0100, bk@e8s.de wrote: > I have a table api.issues that has a text column "body" with long texts (1000+ chars). I also wrote a custom function "normalizeBody"with plv8 that is a simple Text -> Text conversion. Now I created an index applying the function to the bodycolumn, so I can quickly run > > SELECT * FROM api.issues WHERE normalizeBody(body) = normalizeBody($1) > > The issue is, that the planning time is very slow (1.8 seconds). When I replace "normalizeBody" with "md5", however, Iget a planning time of 0.5ms. How long does planning take if you repeat this? I wonder if a good chunk of those 1.8s is initial loading of plv8. Greetings, Andres Freund
On 24 March 2018 at 14:35, Andres Freund <andres@anarazel.de> wrote: > How long does planning take if you repeat this? I wonder if a good chunk > of those 1.8s is initial loading of plv8. Maybe, but it also could be the execution of the function, after all, the planner does invoke immutable functions: # explain verbose select lower('TEST'); QUERY PLAN ------------------------------------------- Result (cost=0.00..0.01 rows=1 width=32) Output: 'test'::text (2 rows) Would be interesting to see what changes without the IMMUTABLE flag. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi, thanks for your help which already resolved the issue for me. I worked through your replies and it is indeed a startup delayfor the first call to a plv8 function in a session. I pasted the query plans below for comparison. ``` explain analyze select normalizeunidiff('') QUERY PLAN ------------------------------------------------------------------------------------- Result (cost=0.00..0.01 rows=1 width=32) (actual time=0.002..0.002 rows=1 loops=1) Planning time: 1863.782 ms Execution time: 0.022 ms ``` Then I ran again multiple times, to make sure that there was not some kind of startup delay: ``` select normalizeunidiff(''); explain analyze select normalizeunidiff(''); QUERY PLAN ------------------------------------------------------------------------------------- Result (cost=0.00..0.01 rows=1 width=32) (actual time=0.000..0.001 rows=1 loops=1) Planning time: 0.190 ms Execution time: 0.008 ms ``` Thanks again -Ben > On 24. Mar 2018, at 02:52, David Rowley <david.rowley@2ndquadrant.com> wrote: > > On 24 March 2018 at 14:35, Andres Freund <andres@anarazel.de> wrote: >> How long does planning take if you repeat this? I wonder if a good chunk >> of those 1.8s is initial loading of plv8. > > Maybe, but it also could be the execution of the function, after all, > the planner does invoke immutable functions: > > # explain verbose select lower('TEST'); > QUERY PLAN > ------------------------------------------- > Result (cost=0.00..0.01 rows=1 width=32) > Output: 'test'::text > (2 rows) > > Would be interesting to see what changes without the IMMUTABLE flag. > > -- > David Rowley http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services