Thread: How to optimize insert statements ?

How to optimize insert statements ?

From

"Christian Leclerc"

Date:

24 July 2007, 07:37:41

Hello,

I'm encountering a performance issue with insert statements.

I push to Postgres an xml file with 2460 objects representing 2460 insert statements in a single transaction commited when the xml file is totally read.

I don't know how to solve the issue, how to tune/optimize Postgres or my statements. I limited the number of index and commented lots of lines in the PERL trigger attached to my table without any success. Any hints/advises are welcome.

Thanks in advance,

Christian

I have activated the execution time logging. The first insert statements are very fast, but with time and objects inserted, every insert statement becomes slower:

2007-07-24 10:30:39 LOG: duration: 0.000 ms statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

[...]

2007-07-24 10:30:58 LOG: duration: 0.000 ms statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date ) VALUES ( 'FILE_2410', 'jrules50_4.exe', 12, 84, 4, 'Patch', 'JRules 5.0 update 4 - build 44', 314525023, 'patch/jrules/1147/jrules50_4.exe', '2005-06-09 09:31:52.000000', '2005-06-07 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]

2007-07-24 10:32:33 LOG: duration: 47.000 ms statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date ) VALUES ( 'FILE_528', 'gadgets_views40_89.rs6000.tar.gz', 30, 273, 89, 'Patch', NULL, 1224154, 'patch/views/150/gadgets_views40_89.rs6000.tar.gz', '2001-03-19 00:00:00.000000', '2001-03-19 00:00:00.000000', '2001-03-19 00:00:00.000000' ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]

2007-07-24 10:35:31 LOG: duration: 78.000 ms statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date ) VALUES ( 'FILE_1930', 'web_views50_254.hp32_11_3.30.tar.gz', 30, 261, 254, 'Patch', NULL, 562328, 'patch/views/14940/web_views50_254.hp32_11_3.30.tar.gz', '2003-10-03 18:50:37.000000', '2003-10-03 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]

2007-07-24 10:37:50 LOG: duration: 93.999 ms statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date ) VALUES ( 'FILE_1942', 'foundation_views402_196.alpha_4_6.1.tar.gz', 30, 260, 196, 'Patch', NULL, 10765925, 'patch/views/18824/foundation_views402_196.alpha_4_6.1.tar.gz', '2004-03-12 10:56:52.000000', '2004-03-12 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]

2007-07-24 10:53:08 LOG: duration: 171.999 ms statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date ) VALUES ( 'FILE_7099', 'manager_views501_293.hp64_11_3.15.tar.gz', 30, 275, 293, 'Patch', NULL, 731466, 'patch/views/50549/manager_views501_293.hp64_11_3.15.tar.gz', '2007-06-29 14:36:16.000000', '2007-06-27 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'

[...]

2007-07-24 10:53:15 LOG: duration: 233.999 ms statement: COMMIT

Here is the table schema, the index and the trigger code written in PERL:

CREATE TABLE production.product_downloads
(
nid integer NOT NULL,

deploytool_id character varying(64),
name character varying(128),
fk_product integer NOT NULL REFERENCES production.products(nid) ON DELETE RESTRICT,
fk_product_version integer NOT NULL REFERENCES production.product_versions(nid) ON DELETE RESTRICT,
patch_level integer,
category character varying(32),
description text,
size integer,
internal_id character varying(128),
deployment_date timestamp with time zone,
release_date date,
download_update_date timestamp with time zone,

CONSTRAINT product_downloads_pkey PRIMARY KEY (nid),
CONSTRAINT product_downloads_deploytool_id_key UNIQUE (deploytool_id)
)
WITHOUT OIDS;

CREATE INDEX product_downloads_deploytool_id_idx ON production.product_downloads(deploytool_id);

CREATE TRIGGER trigger_product_downloads BEFORE INSERT OR DELETE OR UPDATE
ON production.product_downloads FOR EACH ROW EXECUTE PROCEDURE core.historize_and_notify('production');

CREATE OR REPLACE FUNCTION core.historize_and_notify() RETURNS "trigger" AS $BODY$
my $schemaName = @{$_TD->{args}}[0];
if ($_TD->{event} eq "DELETE") {

       # All the code is commented here
       return;
    }
    elsif ($_TD->{event} eq "INSERT") {
        if (!defined($_TD->{new}{row_id})) {
           my $retrieved = spi_exec_query("SELECT next_id, prefix FROM core.tables WHERE name = '".$schemaName.".".$_TD->{relname}."'");
           $_TD->{new}{nid} = $retrieved->{rows}[0]->{next_id};
           my $incremented = spi_exec_query("UPDATE core.tables SET next_id=".($_TD->{new}{nid}+1)."WHERE name = '".$schemaName.".".$_TD->{relname}."'");
        }
        # All the code is commented from here
        return "MODIFY";
    } elsif ($_TD->{event}eq "UPDATE") {

         # All the code is commented from here
         return "MODIFY";
    }
    return;
$BODY$ LANGUAGE 'plperl' VOLATILE;

CREATE TABLE core.tables
(
name character(64) NOT NULL,
prefix character(4) NOT NULL,

description text,
next_id integer,

CONSTRAINT ods_pkey PRIMARY KEY (name),
CONSTRAINT ods_id_prefix_key UNIQUE (prefix)
)
WITHOUT OIDS;

CREATE INDEX tables_name_idx
ON core.tables USING btree (name);

Re: How to optimize insert statements ?

From

Tom Lane

Date:

24 July 2007, 11:33:46

"Christian Leclerc" <cleclerc@ilog.fr> writes:
> I'm encountering a performance issue with insert statements.

It looks to me like your trigger is the entire cause of the slowness.
I think you would be well advised to get rid of it and use a serial
column (ie a sequence object) instead of a handmade, poorly performing
substitute for sequences.

            regards, tom lane

Re: How to optimize insert statements ?

From

"Christian Leclerc"

Date:

25 July 2007, 07:48:32

Hello Tom,

Thanks for your reply and advise. I understand in your email that the
use of a sequence object will increase the performance of my trigger.
Therefore I'm going to replace that in my code immediately. Anyway, my
problem is not exactly the performances; it's more the insertion time
growing. Indeed I don't understand why the insertion time grows
"exponentially" with my single transaction. If I split my objects
insertion into several transactions (instead of one), the problem seems
to disappear. Would you see any reasons linked to the Postgres
transactions explaining this insertion time growing ?

Regards,
Christian


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 24, 2007 4:33 PM
To: Christian Leclerc
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] How to optimize insert statements ?

"Christian Leclerc" <cleclerc@ilog.fr> writes:
> I'm encountering a performance issue with insert statements.

It looks to me like your trigger is the entire cause of the slowness.
I think you would be well advised to get rid of it and use a serial
column (ie a sequence object) instead of a handmade, poorly performing
substitute for sequences.

            regards, tom lane

Re: How to optimize insert statements ?

From

Sean Davis

Date:

25 July 2007, 08:03:47

Christian Leclerc wrote:
> Hello Tom,
>
> Thanks for your reply and advise. I understand in your email that the
> use of a sequence object will increase the performance of my trigger.
> Therefore I'm going to replace that in my code immediately. Anyway, my
> problem is not exactly the performances; it's more the insertion time
> growing. Indeed I don't understand why the insertion time grows
> "exponentially" with my single transaction. If I split my objects
> insertion into several transactions (instead of one), the problem seems
> to disappear. Would you see any reasons linked to the Postgres
> transactions explaining this insertion time growing ?
>
I think that the insertion time growing is related to the way in which
you were doing your inserts; using a sequence object will likely fix the
issue and then you will have the answer to your question.

Sean

Re: How to optimize insert statements ?

From

"Christian Leclerc"

Date:

25 July 2007, 14:40:27

Thank you. I just brought the modifications to my code.
Using the sequences totally solves my issue. The insertion time is
constant and systematically between 15ms and 33ms.

Christian