Thread: Linear slow-down while inserting into a table with an ON INSERT trigger?

From: Tobias Gierke

Hi,

Probably my google-fu is weak today, but I couldn't find any convincing explanation for this.

I'm running PostgreSQL 12.6 on 64-bit Linux (CentOS 7, PostgreSQL compiled from sources) and I'm trying to insert 30k rows into a simple table that has an "ON INSERT .. FOR EACH STATEMENT" trigger.

                                      Table "public.parent_table"
   Column    |          Type          | Collation | Nullable |                  Default                   
-------------+------------------------+-----------+----------+-------------------------------------------- id          | bigint                 |           | not null | nextval('parent_table_id_seq'::regclass) name        | character varying(64)  |           | not null | 
 enabled     | boolean                |           | not null | 
 description | character varying(255) |           |          | 
 deleted     | boolean                |           | not null | false is_default  | boolean                |           | not null | false
Indexes:
    "parent_pkey" PRIMARY KEY, btree (id)
    "uniq_name" UNIQUE, btree (name) WHERE deleted = false
Referenced by:
    TABLE "child" CONSTRAINT "child_fkey" FOREIGN KEY (parent_id) REFERENCES parent_table(id) ON DELETE CASCADE
Triggers:
    parent_changed BEFORE INSERT OR DELETE OR UPDATE OR TRUNCATE ON parent_table FOR EACH STATEMENT EXECUTE FUNCTION parent_table_changed();

This is the trigger function:

CREATE OR REPLACE FUNCTION parent_table_changed() RETURNS trigger LANGUAGE plpgsql
AS $function$
BEGIN
    UPDATE data_sync SET last_parent_table_change=CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$function$


I'm trying to insert 30k rows (inside a single transaction) into the parent table using the following SQL. (Note: I came across this issue while debugging an application-level performance problem; the SQL here is similar to what the application generates.)

BEGIN;
-- ALTER TABLE public.parent_table DISABLE TRIGGER parent_changed; 
PREPARE my_insert (varchar(64), boolean, varchar(255), boolean, boolean) AS INSERT INTO public.parent_table (name,enabled,description,deleted,is_default) VALUES($1, $2, $3, $4, $5);
EXECUTE my_insert ($$035001$$, true, $$null$$, false, false);
EXECUTE my_insert ($$035002$$, true, $$null$$, false, false);
....29998 more lines
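For comparison, a single multi-row INSERT would be a compact equivalent of the
30k EXECUTEs (a sketch; the lpad() formatting is inferred from the sample
values) - and since the trigger is FOR EACH STATEMENT, it would fire only once:

INSERT INTO public.parent_table (name, enabled, description, deleted, is_default)
SELECT lpad((35000 + i)::text, 6, '0'), true, 'null', false, false
FROM generate_series(1, 30000) AS i;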


This is the execution time I get when running the script while the trigger is enabled:

~/tmp$ time psql -q -Upostgres -h dbhost -f inserts.sql test_db
real    0m8,381s
user    0m0,203s
sys     0m0,287s


Running the same SQL script with the trigger disabled shows a ~4x speed-up:


~/tmp$ time psql -q -Upostgres -h dbhost -f inserts.sql test_db

real    0m2,284s
user    0m0,171s
sys     0m0,261s


Defining the trigger as "BEFORE INSERT" or "AFTER INSERT" made no difference.

I then got curious, put a "\timing" at the start of the SQL script, massaged the psql output a bit, and plotted a chart of the statement execution times.
To my surprise, I see a linear increase of the per-INSERT execution times, roughly 4x as well:

[chart attached: per-INSERT execution time with the trigger enabled, increasing linearly]

While the execution time per INSERT remains constant when the trigger is disabled before inserting the data:

[chart attached: per-INSERT execution time with the trigger disabled, flat]

What's causing this slow-down?

Thanks,
Tobias

On Fri, Jul 16, 2021 at 11:27:24PM +0200, Tobias Gierke wrote:
> CREATE OR REPLACE FUNCTION parent_table_changed() RETURNS trigger LANGUAGE plpgsql
> AS $function$
> BEGIN
>     UPDATE data_sync SET last_parent_table_change=CURRENT_TIMESTAMP;
>     RETURN NEW;
> END;
> $function$
> 
> I'm trying to insert 30k rows (inside a single transaction) into the parent

The problem is that you're doing 30k UPDATEs of data_sync within one transaction.
Ideally it starts with 1 tuple in 1 page, but every row updated requires
scanning the previous N row versions, which haven't been vacuumed (and cannot
be while the transaction is open). An UPDATE is essentially a delete+insert,
and the table grows with each update until the transaction ends and it's vacuumed.

        pages: 176 removed, 1 remain, 0 skipped due to pins, 0 skipped frozen
        tuples: 40000 removed, 1 remain, 0 are dead but not yet removable, oldest xmin: 2027
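A quick way to observe this for yourself (a sketch using a throwaway table,
not anything from your schema):

CREATE TABLE t (last_change timestamptz);
INSERT INTO t VALUES (now());

BEGIN;
SELECT pg_relation_size('t');   -- 8192 bytes: one page
DO $$
BEGIN
    FOR i IN 1..30000 LOOP
        UPDATE t SET last_change = clock_timestamp();  -- each UPDATE leaves a dead tuple behind
    END LOOP;
END $$;
SELECT pg_relation_size('t');   -- now hundreds of pages: nothing could be pruned mid-txn
COMMIT;
VACUUM VERBOSE t;               -- only now can the dead versions be removed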

You could run a single UPDATE rather than 30k triggers.
Or switch to an INSERT on the table, with an index on it, and call
max(last_parent_table_change) from whatever needs to ingest it.  And prune the
old entries and vacuum it outside the transaction.  Maybe someone else will
have a better suggestion.
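A sketch of that append-only variant (the change-log table name and pruning
policy here are illustrative, not something you already have):

CREATE TABLE parent_table_changes (
    changed_at timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX ON parent_table_changes (changed_at);

-- The existing statement-level trigger can keep its definition; only the
-- function body changes from UPDATE to INSERT:
CREATE OR REPLACE FUNCTION parent_table_changed() RETURNS trigger LANGUAGE plpgsql
AS $function$
BEGIN
    INSERT INTO parent_table_changes DEFAULT VALUES;  -- append-only: no dead tuples from repeated updates
    RETURN NULL;  -- return value is ignored for statement-level triggers
END;
$function$;

-- The consumer reads the latest change time:
SELECT max(changed_at) FROM parent_table_changes;

-- Periodically, outside the big transaction, prune and vacuum:
DELETE FROM parent_table_changes WHERE changed_at < now() - interval '1 day';
VACUUM parent_table_changes;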

-- 
Justin



Re: Linear slow-down while inserting into a table with an ON INSERT trigger?

From: Benjamin Scherrey
In addition to what Justin said, another possible consideration is the transaction isolation level you're running at. I don't recall offhand what Postgres' default is, but "Serializable" is the strictest and would likely impose some overhead on what you described. See https://www.postgresql.org/docs/12/transaction-iso.html for details. If your use case can loosen some of the strictness in the context of that transaction, it might result in a speed improvement. Just make sure you don't trade data integrity for speed, or you'll get invalid data quickly!
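For what it's worth, checking and setting the level is straightforward (a
sketch; out of the box the default is "read committed"):

SHOW default_transaction_isolation;

BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- ... the 30k INSERTs ...
COMMIT;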

  -- Ben Scherrey

On Sat, 17 Jul 2021 at 16:40, Justin Pryzby <pryzby@telsasoft.com> wrote:
> You could run a single UPDATE rather than 30k triggers.
> Or switch to an INSERT on the table, with an index on it, and call
> max(last_parent_table_change) from whatever needs to ingest it.  And prune the
> old entries and vacuum it outside the transaction.  Maybe someone else will
> have a better suggestion.

Maybe just change the UPDATE statement to:

UPDATE data_sync SET last_parent_table_change=CURRENT_TIMESTAMP WHERE
last_parent_table_change <> CURRENT_TIMESTAMP;

That should reduce the number of actual updates to 1 per transaction.
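This works because CURRENT_TIMESTAMP returns the transaction start time and is
therefore constant for the whole transaction: the first trigger firing performs
the update, and every later firing matches no rows. A sketch of the trigger
function with that change (if the column can be NULL, IS DISTINCT FROM would be
the safer comparison):

CREATE OR REPLACE FUNCTION parent_table_changed() RETURNS trigger LANGUAGE plpgsql
AS $function$
BEGIN
    -- After the first firing in a transaction, the WHERE clause matches no
    -- rows, so the remaining firings update nothing.
    UPDATE data_sync SET last_parent_table_change = CURRENT_TIMESTAMP
    WHERE last_parent_table_change <> CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$function$;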

David



David Rowley <dgrowleyml@gmail.com> writes:

> Maybe just change the UPDATE statement to:
> UPDATE data_sync SET last_parent_table_change=CURRENT_TIMESTAMP WHERE
> last_parent_table_change <> CURRENT_TIMESTAMP;
> That should reduce the number of actual updates to 1 per transaction.

Or, if it's impractical to make the application do that for itself,
this could be a job for suppress_redundant_updates_trigger().

https://www.postgresql.org/docs/current/functions-trigger.html
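A sketch of how that would be attached (it is a per-row BEFORE UPDATE trigger
on the table being updated, i.e. data_sync; the trigger name is illustrative):

CREATE TRIGGER data_sync_suppress_redundant
BEFORE UPDATE ON data_sync
FOR EACH ROW EXECUTE FUNCTION suppress_redundant_updates_trigger();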

            regards, tom lane



How long does the UPDATE statement inside the trigger function take?


Regards,
Ninad Shah


> How long does the UPDATE statement inside the trigger function take?

INSERTs into the parent table take ~0.1 ms each at the beginning of the transaction, increasing to ~0.43 ms after 30k INSERTs. Without the trigger, INSERTs into the parent table consistently take around 0.05 ms.

Regards,
Tobias



Great idea! This brought the time per INSERT into the parent table down
to a consistent ~0.065 ms again (compared to 0.05 ms when completely
removing the trigger, so the penalty for the trigger is roughly ~20%).

> Maybe just change the UPDATE statement to:
>
> UPDATE data_sync SET last_parent_table_change=CURRENT_TIMESTAMP WHERE
> last_parent_table_change <> CURRENT_TIMESTAMP;
>
> That should reduce the number of actual updates to 1 per transaction.
>
> David



Thank you for the detailed explanation! Just one more question: I did
an experiment and reduced the fillfactor on the table updated by the
trigger to 50%, hoping the HOT feature would kick in and each
subsequent INSERT would clean up the "HOT chain" of the previous INSERT
... but execution times did not change at all compared to 100%
fillfactor. Why is this? Does the HOT feature only work if a different
backend accesses the table concurrently?
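For reference, the experiment was along these lines (a sketch, assuming the
sync table is data_sync):

ALTER TABLE data_sync SET (fillfactor = 50);
VACUUM FULL data_sync;  -- rewrite the table so existing pages honor the new fillfactor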

Thanks,
Tobias




On Sun, 2021-07-18 at 09:36 +0200, Tobias Gierke wrote:
> Thank you for the detailed explanation! Just one more question: I did
> an experiment and reduced the fillfactor on the table updated by the
> trigger to 50%, hoping the HOT feature would kick in and each
> subsequent INSERT would clean up the "HOT chain" of the previous INSERT
> ... but execution times did not change at all compared to 100%
> fillfactor. Why is this? Does the HOT feature only work if a different
> backend accesses the table concurrently?

No, but until the transaction is done, the tuples cannot be removed, no
matter whether they are HOT or not: pruning can only reclaim row versions
that are no longer visible to any transaction, and here the transaction
that created them is still open.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com