Home > mailing lists

patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL - Mailing list pgsql-hackers

From	Pavel Stehule
Subject	patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL
Date	November 2, 2020 15:08:19
Msg-id	CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com Whole thread Raw
Responses	Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL (Pavel Stehule <pavel.stehule@gmail.com>) Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL (Pavel Stehule <pavel.stehule@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

here is another patch related to using CALL statement inside PL/pgSQL code.

A repeated using of CALL statement is expensive. How much?

I wrote synthetic test:

CREATE TABLE foo(a int, b int, c int);

CREATE OR REPLACE PROCEDURE public.simple_proc3(a integer, b integer, c integer, cnt int, OUT r boolean)
AS $$
BEGIN
INSERT INTO foo VALUES(a, b, c);
IF cnt % 10000 = 0 THEN
COMMIT;
r := true;
ELSE
r := false;
END IF;
END;
$$ LANGUAGE plpgsql;

DO $$
DECLARE a int; b int; c int; r boolean;
BEGIN
TRUNCATE foo;
FOR i IN 1..10000000
LOOP
a := (random() * 100)::int;
b := (random() * 100)::int;
c := (random() * 100)::int;
CALL simple_proc3(a, b, c, i, r);
IF r THEN
RAISE NOTICE 'committed at % row', i;
END IF;
END LOOP;
END;
$$;

I try to insert 10M rows with commit after inserting 10K rows. Execution time on master is ~ 6 minutes 368251,691 ms (06:08,252)

DO $$
DECLARE a int; b int; c int; r boolean;
BEGIN
TRUNCATE foo;
FOR i IN 1..10000000
LOOP
a := (random() * 100)::int;
b := (random() * 100)::int;
c := (random() * 100)::int;
INSERT INTO foo VALUES(a, b, c);
IF i % 10000 = 0 THEN
COMMIT;
r := true;
ELSE
r := false;
END IF;
IF r THEN
RAISE NOTICE 'committed at % row', i;
END IF;
END LOOP;
END;
$$;

When I try to remove CALL statement then same work needs less to 2 minutes 99109,511 ms (01:39,110). So this code is three times slower with calling one procedure. There are two significant parts of overhead:

a) internal implementation of CALL statement that doesn't use plan cache well, and it does lot of expensive operations over pg_proc catalogue,

b) wrapper in PL/pgSQL that repeatedly repearse expression string.

Overhead of PL/pgSQL can be reduced by using plan cache after fixing issue with resource owner. I did it, and I introduced "local_resowner" for holding references of plans for CALL statement expressions.

After patching the execution time is reduced to 4 minutes Time: 245852,846 ms (04:05,853). Still the overhead is significant, but it is 30% speedup.

The best case for this patch is about 2x times better performance

CREATE OR REPLACE PROCEDURE public.simple_proc2(a integer, b integer, c integer, cnt int, OUT r boolean)
AS $$
BEGIN
END;
$$ LANGUAGE plpgsql;

DO $$
DECLARE a int; r boolean;
BEGIN
FOR i IN 1..10000000
LOOP
CALL simple_proc2((random()*100)::int, (random()*100)::int, (random()*100)::int, i, r);
END LOOP;
END;
$$;

Time: 184667,970 ms (03:04,668), master: Time: 417463,457 ms (06:57,463)

On second hand, the worst case is about 10% (probably this overhead can be reduced by creating "local_resowner" only when it is necessary)

CREATE OR REPLACE FUNCTION simple_fx2(a int)
RETURNS int AS $$
BEGIN
RETURN a + a;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

DO $$
DECLARE a int;
BEGIN
FOR i IN 1..10000000
LOOP
a := simple_fx2(i);
END LOOP;
END;
$$;

Time: 5434,808 ms (00:05,435) , master: Time: 4632,762 ms (00:04,633)

Comments, notes, ideas?

Regards

Pavel

Attachment

plpgsql-using-local-resowner-for-call-plans.patch

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 02 November 2020, 14:52:48
Subject: Re: hash_array_extended() needs to pass down collation

From: Erik Rijkers
Date: 02 November 2020, 15:24:12
Subject: Re: Split copy.c

patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL - Mailing list pgsql-hackers

Attachment

Previous

Next