Thread: How to optimize insert statements ?

How to optimize insert statements ?

From
"Christian Leclerc"
Date:
Hello,
 
I'm encountering a performance issue with insert statements. 
I push to Postgres an xml file with 2460 objects representing 2460 insert statements in a single transaction commited when the xml file is totally read.
I don't know how to solve the issue, how to tune/optimize Postgres or my statements. I limited the number of index and commented lots of lines in the PERL trigger attached to my table without any success. Any hints/advises are welcome.
 
Thanks in advance,
Christian
 
 
have activated the execution time logging. The first insert statements are very fast, but with time and objects inserted, every insert statement becomes slower:
2007-07-24 10:30:39 LOG:  duration: 0.000 ms  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
[...]
2007-07-24 10:30:58 LOG:  duration: 0.000 ms  statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date )  VALUES ( 'FILE_2410', 'jrules50_4.exe', 12, 84, 4, 'Patch', 'JRules 5.0 update 4 - build 44', 314525023, 'patch/jrules/1147/jrules50_4.exe', '2005-06-09 09:31:52.000000', '2005-06-07 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]
2007-07-24 10:32:33 LOG:  duration: 47.000 ms  statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date )  VALUES ( 'FILE_528', 'gadgets_views40_89.rs6000.tar.gz', 30, 273, 89, 'Patch', NULL, 1224154, 'patch/views/150/gadgets_views40_89.rs6000.tar.gz', '2001-03-19 00:00:00.000000', '2001-03-19 00:00:00.000000', '2001-03-19 00:00:00.000000' ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]
2007-07-24 10:35:31 LOG:  duration: 78.000 ms  statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date )  VALUES ( 'FILE_1930', 'web_views50_254.hp32_11_3.30.tar.gz', 30, 261, 254, 'Patch', NULL, 562328, 'patch/views/14940/web_views50_254.hp32_11_3.30.tar.gz', '2003-10-03 18:50:37.000000', '2003-10-03 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]
2007-07-24 10:37:50 LOG:  duration: 93.999 ms  statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date )  VALUES ( 'FILE_1942', 'foundation_views402_196.alpha_4_6.1.tar.gz', 30, 260, 196, 'Patch', NULL, 10765925, 'patch/views/18824/foundation_views402_196.alpha_4_6.1.tar.gz', '2004-03-12 10:56:52.000000', '2004-03-12 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]
2007-07-24 10:53:08 LOG:  duration: 171.999 ms  statement: INSERT INTO production.product_downloads ( deploytool_id, name, fk_product, fk_product_version, patch_level, category, description, size, internal_id, deployment_date, release_date, download_update_date )  VALUES ( 'FILE_7099', 'manager_views501_293.hp64_11_3.15.tar.gz', 30, 275, 293, 'Patch', NULL, 731466, 'patch/views/50549/manager_views501_293.hp64_11_3.15.tar.gz', '2007-06-29 14:36:16.000000', '2007-06-27 00:00:00.000000', NULL ); SELECT next_id FROM core.tables WHERE name = 'production.product_downloads'
[...]
2007-07-24 10:53:15 LOG:  duration: 233.999 ms  statement: COMMIT
 
Here is the table schema, the index and the trigger code written in PERL:
CREATE TABLE production.product_downloads
(
  nid integer NOT NULL, 
  deploytool_id character varying(64), 
  name character varying(128), 
  fk_product integer NOT NULL REFERENCES production.products(nid) ON DELETE RESTRICT, 
  fk_product_version integer NOT NULL REFERENCES production.product_versions(nid) ON DELETE RESTRICT, 
  patch_level integer, 
  category character varying(32), 
  description text, 
  size integer, 
  internal_id character varying(128), 
  deployment_date timestamp with time zone, 
  release_date date, 
  download_update_date timestamp with time zone,
 
  CONSTRAINT product_downloads_pkey PRIMARY KEY (nid),
  CONSTRAINT product_downloads_deploytool_id_key UNIQUE (deploytool_id)
)
WITHOUT OIDS;
 
CREATE INDEX product_downloads_deploytool_id_idx ON production.product_downloads(deploytool_id);

CREATE TRIGGER trigger_product_downloads BEFORE INSERT OR DELETE OR UPDATE
  ON production.product_downloads FOR EACH ROW EXECUTE PROCEDURE core.historize_and_notify('production');
CREATE OR REPLACE FUNCTION core.historize_and_notify() RETURNS "trigger" AS $BODY$
    my $schemaName = @{$_TD->{args}}[0];
    if ($_TD->{event} eq "DELETE") {
       # All the code is commented here
       return;
    }
    elsif ($_TD->{event} eq "INSERT") {
        if (!defined($_TD->{new}{row_id})) {
           my $retrieved = spi_exec_query("SELECT next_id, prefix FROM core.tables WHERE name = '".$schemaName.".".$_TD->{relname}."'");
           $_TD->{new}{nid} = $retrieved->{rows}[0]->{next_id};       
           my $incremented = spi_exec_query("UPDATE core.tables SET next_id=".($_TD->{new}{nid}+1)."WHERE name = '".$schemaName.".".$_TD->{relname}."'");
        }
        # All the code is commented from here
        return "MODIFY";
    } elsif ($_TD->{event}eq "UPDATE") {
         # All the code is commented from here
         return "MODIFY";
    }
    return;
$BODY$ LANGUAGE 'plperl' VOLATILE;
 
CREATE TABLE core.tables
(
  name character(64) NOT NULL, 
  prefix character(4) NOT NULL, 
  description text, 
  next_id integer, 
  
  CONSTRAINT ods_pkey PRIMARY KEY (name),
  CONSTRAINT ods_id_prefix_key UNIQUE (prefix)
)
WITHOUT OIDS;
 
CREATE INDEX tables_name_idx
  ON core.tables USING btree (name);

Re: How to optimize insert statements ?

From
Tom Lane
Date:
"Christian Leclerc" <cleclerc@ilog.fr> writes:
> I'm encountering a performance issue with insert statements.

It looks to me like your trigger is the entire cause of the slowness.
I think you would be well advised to get rid of it and use a serial
column (ie a sequence object) instead of a handmade, poorly performing
substitute for sequences.

            regards, tom lane

Re: How to optimize insert statements ?

From
"Christian Leclerc"
Date:
Hello Tom,

Thanks for your reply and advise. I understand in your email that the
use of a sequence object will increase the performance of my trigger.
Therefore I'm going to replace that in my code immediately. Anyway, my
problem is not exactly the performances; it's more the insertion time
growing. Indeed I don't understand why the insertion time grows
"exponentially" with my single transaction. If I split my objects
insertion into several transactions (instead of one), the problem seems
to disappear. Would you see any reasons linked to the Postgres
transactions explaining this insertion time growing ?

Regards,
Christian


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 24, 2007 4:33 PM
To: Christian Leclerc
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] How to optimize insert statements ?

"Christian Leclerc" <cleclerc@ilog.fr> writes:
> I'm encountering a performance issue with insert statements.

It looks to me like your trigger is the entire cause of the slowness.
I think you would be well advised to get rid of it and use a serial
column (ie a sequence object) instead of a handmade, poorly performing
substitute for sequences.

            regards, tom lane

Re: How to optimize insert statements ?

From
Sean Davis
Date:
Christian Leclerc wrote:
> Hello Tom,
>
> Thanks for your reply and advise. I understand in your email that the
> use of a sequence object will increase the performance of my trigger.
> Therefore I'm going to replace that in my code immediately. Anyway, my
> problem is not exactly the performances; it's more the insertion time
> growing. Indeed I don't understand why the insertion time grows
> "exponentially" with my single transaction. If I split my objects
> insertion into several transactions (instead of one), the problem seems
> to disappear. Would you see any reasons linked to the Postgres
> transactions explaining this insertion time growing ?
>
I think that the insertion time growing is related to the way in which
you were doing your inserts; using a sequence object will likely fix the
issue and then you will have the answer to your question.

Sean

Re: How to optimize insert statements ?

From
"Christian Leclerc"
Date:
Thank you. I just brought the modifications to my code.
Using the sequences totally solves my issue. The insertion time is
constant and systematically between 15ms and 33ms.

Christian