Thread: slow inserts

slow inserts

From: Jodi Kanter
I am currently using a Perl data loader that was set up to load data into three particular tables; the majority of the data goes into one of them. The scripts obtain the data by parsing it out of an Excel spreadsheet. My loads seem to take a very long time: each file has only about 12,000 rows, yet it takes close to 25 minutes to load one file. I have placed some debugging statements in the code, and it seems that the extra time is related to Postgres; I had originally thought it might have to do with the parsing of the Excel file.

I have tried turning off FSYNC, but there is no difference in load time. I also tried removing the indexes from the table into which most of the data is loaded, but no luck. By the way, is there any way to confirm that the FSYNC option was turned off correctly? A way to view its status or something?

Any thoughts on what might be going on? Another performance tuning trick that I have not thought of?

Thanks for your help,
Jodi Kanter

_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu

Re: slow inserts

From: "Peter Darley"
Jodi,
    I believe DBI defaults to autocommitting queries, which slows things down. If you aren't specifically turning autocommit off, I'd suggest doing so; that makes all the inserts go in one transaction, which is much faster. You'd do something like:
    DBI->connect("dbi:Pg:dbname=dbname;host=hostname", 'user', 'password', {AutoCommit => 0})
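For instance, here is a minimal loader sketch along those lines. The table name, columns, and @rows are placeholders; I'm assuming the rows have already been parsed out of the spreadsheet:

use strict;
use warnings;
use DBI;

# AutoCommit => 0 keeps everything in one open transaction;
# RaiseError => 1 makes any failed insert die loudly.
my $dbh = DBI->connect("dbi:Pg:dbname=dbname;host=hostname", 'user', 'password',
                       {AutoCommit => 0, RaiseError => 1});

# prepare once, execute many times
my $sth = $dbh->prepare('INSERT INTO mytable (a, b, c) VALUES (?, ?, ?)');

for my $row (@rows) {
    $sth->execute(@$row);      # each execute joins the single open transaction
}

$dbh->commit;                  # one commit for the whole load
$dbh->disconnect;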
Thanks,
Peter Darley

Re: slow inserts

From: "Nick Fankhauser"
Jodi-

Have you tried turning autocommit off & doing a single commit after the
load?

-NickF


Re: slow inserts

From: Jodi Kanter
None of the data is actually committed to the database until the scripts
complete, so I believe that autocommit is turned off.
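One way to confirm that, assuming $dbh is the loader's database handle, is to
inspect DBI's AutoCommit attribute directly:

# a false value means autocommit is off and the inserts batch into one transaction
print "AutoCommit: ", ($dbh->{AutoCommit} ? "on" : "off"), "\n";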



Re: slow inserts

From: Rolf Luettecke
Hi Jodi,

> None of the data is actually committed to the database until the scripts
> complete so I believe that autocommit is turned off.
>
What if you try writing the output of your script into a separate file and
piping it to psql as input? What I mean is to strip off the processing time
for the "Excel part". Does it still take 25 minutes to do the job?
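A minimal sketch of the idea, assuming the parser leaves the rows in
@parsed_rows (mytable and the column values are placeholders): write the
inserts to a file as one transaction, then time the load on its own.

use strict;
use warnings;

open my $out, '>', 'inserts.sql' or die "cannot open inserts.sql: $!";
print $out "BEGIN;\n";
for my $row (@parsed_rows) {
    my @vals = @$row;              # copy so we don't clobber the source
    s/'/''/g for @vals;            # naive single-quote escaping, fine for a test
    print $out "INSERT INTO mytable VALUES ("
             . join(', ', map { "'$_'" } @vals) . ");\n";
}
print $out "COMMIT;\n";
close $out;

Loading the file with something like "psql -f inserts.sql dbname" then shows
the database time separately from the parsing time.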

We often insert data in similar amounts (10,000 to 100,000 rows per job)
within a few seconds or minutes. A few months ago I had a similar problem
"writing" a dbf file from Postgres data: the SELECT statement took
milliseconds, but the conversion into dbf format seemed to take forever.

BTW: We also had a table (tens of thousands of rows, vacuumed daily) which
was rather slow during inserts (PostgreSQL 7.1.2). After upgrading to
version 7.1.3 and completely rebuilding the tables, the problem went away.

Hope it helps
R. Luettecke

--
MICHAEL TELECOM AG
Bruchheide 34 - 49163 Bohmte
Fon: +49 5471 806-0
rolf.luettecke@michael-telecom.de
http://www.michael-telecom.de

Re: slow inserts

From: Morten Sickel
Jodi Kanter wrote:

> I am currently using a Perl data loader that was set up to load data into
> three particular tables.
(snip)
> I have placed some debugging statements in the code, and it seems that the
> extra time is related to Postgres; I had originally thought it might have
> to do with the parsing of the Excel file.

You don't mention it, but I assume you are using DBI/Pg. Are you sure you are
setting up your insert query handles once and then reusing them? Setting up a
query handle takes a lot of time.

i.e.

my $dbh = DBI->connect("dbi:Pg:...");              # connection details elided

# prepare the statement handle once, outside the loop
my $insh = $dbh->prepare("INSERT INTO table VALUES (?,?,?)");

foreach my $excelrow (@excelrows) {
    my ($data1, $data2, $data3) = parse($excelrow);    # parse the spreadsheet row
    $insh->execute($data1, $data2, $data3);
}

I have written a few scripts of that kind myself, and I was really surprised
how much it mattered when I managed to move a $dbh->prepare out of the
insert loop.


regards

--
Morten Sickel
Norwegian Radiation Protection Authority

Re: slow inserts

From: Ferdinand Smit
(snip)

> my $dbh = DBI->connect("dbi:Pg:...");
>
> my $insh = $dbh->prepare("INSERT INTO table VALUES (?,?,?)");

$dbh->begin_work;   # begin_work/commit are database-handle ($dbh) methods,
                    # not statement-handle methods; begin_work assumes
                    # AutoCommit is currently on

> foreach my $excelrow (@excelrows) {
>     my ($data1, $data2, $data3) = parse($excelrow);
>     $insh->execute($data1, $data2, $data3);
> }

$dbh->commit;

>
> I have written a few scripts of that kind myself, and I was really
> surprised how much it mattered when I managed to move a $dbh->prepare out
> of the insert loop.

Try to use transactions; that increases the speed too.

Regards,
Ferdinand