Thread: increasingly slow insert/copy performance
howdy, i'm by no means a postgres expert, so i'm probably making a dumb mistake, but this is v. troubling. we have a script that bulk-uploads about 45K rows. we do this via perl DBI using the COPY functionality. the problem is that we do this about once per 5 minutes and the time it takes to do the operation steadily increases (going from about 5s to over 5m over the course of 24 hrs). the table has one primary key (so hence one index), that we drop and recreate during the bulk load. here are the details of the test case: CREATE TABLE test (one int primary key, two int, three int, four int, five int); script: #!/usr/bin/perl -w use DBI; # Load the DBI module my $DBusername = "username"; my $DBpassword = "password"; my $dbin = "Pg:dbname=useradmin;host=databasehost"; my $table = "test"; my $i; my $count; $dbh = DBI->connect("dbi:$dbin",$DBusername,$DBpassword, { PrintError => 1, ### Don't report errors via warn( ) RaiseError => 1, ### Do report errors via die( ) AutoCommit => 0 ### Start a transaction block } ); $dbh->do("LOCK TABLE $table IN SHARE UPDATE EXCLUSIVE MODE") or die "Cannot lock $table...\n"; $dbh->do("ALTER TABLE $table DROP CONSTRAINT ${table}_pkey") or die "Cannot drop constraint ${table}_pkey...\n"; $dbh->do("DELETE FROM $table WHERE one>0") or die "Cannot empty $table...\n"; $dbh->do("COPY $table (one,two,three,four,five) FROM stdin") or die "Cannot COPY to DB ...\n"; for $i (1..45000){ $dbh->func("$i\t$i\t$i\t$i\t$i\n", 'putline') or die "Couldn't update $table for $i\n"; } $dbh->func("\\.\n", 'putline'); $dbh->func('endcopy'); if ($@) { $dbh->rollback; warn "Rolled back transaction: $@\n"; } else { $dbh->commit; } $dbh->do("ALTER TABLE $table add CONSTRAINT ${table}_pkey primary key (one)") or die "Cannot add constraint ${table}_pkey...\n"; $dbh->commit; $dbh->disconnect(); ------------------------------------------------------------- as i said, execution times start at about 2.5-5s and end up at several minutes. dropping and recreating the table fixes the problem. vacuum does not. any thoughts? i searched the list and think we're doing the right stuff (by dropping the constraints and using the COPY function). i'm not on the list at this point, so i'll check the web archives, but replies to me direclty are also appreciated. thanks so much, t. -- _____________________________________________________________________ todd underwood director of operations & security renesys - interdomain intelligence todd@renesys.com www.renesys.com
Try turning the fsync off in your configuration file. Joe, -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Todd Underwood Sent: Sunday, April 17, 2005 9:18 AM To: pgsql-general@postgresql.org Subject: [GENERAL] increasingly slow insert/copy performance howdy, i'm by no means a postgres expert, so i'm probably making a dumb mistake, but this is v. troubling. we have a script that bulk-uploads about 45K rows. we do this via perl DBI using the COPY functionality. the problem is that we do this about once per 5 minutes and the time it takes to do the operation steadily increases (going from about 5s to over 5m over the course of 24 hrs). the table has one primary key (so hence one index), that we drop and recreate during the bulk load. here are the details of the test case: CREATE TABLE test (one int primary key, two int, three int, four int, five int); script: #!/usr/bin/perl -w use DBI; # Load the DBI module my $DBusername = "username"; my $DBpassword = "password"; my $dbin = "Pg:dbname=useradmin;host=databasehost"; my $table = "test"; my $i; my $count; $dbh = DBI->connect("dbi:$dbin",$DBusername,$DBpassword, { PrintError => 1, ### Don't report errors via warn( ) RaiseError => 1, ### Do report errors via die( ) AutoCommit => 0 ### Start a transaction block } ); $dbh->do("LOCK TABLE $table IN SHARE UPDATE EXCLUSIVE MODE") or die "Cannot lock $table...\n"; $dbh->do("ALTER TABLE $table DROP CONSTRAINT ${table}_pkey") or die "Cannot drop constraint ${table}_pkey...\n"; $dbh->do("DELETE FROM $table WHERE one>0") or die "Cannot empty $table...\n"; $dbh->do("COPY $table (one,two,three,four,five) FROM stdin") or die "Cannot COPY to DB ...\n"; for $i (1..45000){ $dbh->func("$i\t$i\t$i\t$i\t$i\n", 'putline') or die "Couldn't update $table for $i\n"; } $dbh->func("\\.\n", 'putline'); $dbh->func('endcopy'); if ($@) { $dbh->rollback; warn "Rolled back transaction: $@\n"; } else { $dbh->commit; } $dbh->do("ALTER TABLE $table add CONSTRAINT ${table}_pkey primary key (one)") or die "Cannot add constraint ${table}_pkey...\n"; $dbh->commit; $dbh->disconnect(); ------------------------------------------------------------- as i said, execution times start at about 2.5-5s and end up at several minutes. dropping and recreating the table fixes the problem. vacuum does not. any thoughts? i searched the list and think we're doing the right stuff (by dropping the constraints and using the COPY function). i'm not on the list at this point, so i'll check the web archives, but replies to me direclty are also appreciated. thanks so much, t. -- _____________________________________________________________________ todd underwood director of operations & security renesys - interdomain intelligence todd@renesys.com www.renesys.com ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
On Sun, 2005-04-17 at 09:17, Todd Underwood wrote: > howdy, > > i'm by no means a postgres expert, so i'm probably making a dumb > mistake, but this is v. troubling. > > we have a script that bulk-uploads about 45K rows. we do this via > perl DBI using the COPY functionality. > > the problem is that we do this about once per 5 minutes and the time > it takes to do the operation steadily increases (going from about 5s > to over 5m over the course of 24 hrs). the table has one primary key > (so hence one index), that we drop and recreate during the bulk load. I'm guessing that either you aren't vacuuming enough, or your fsm settings are too low. What does vacuum verbose say? What are your fsm settings? Does 'vacuum full <tablename>' fix the problem? Do NOT TURN OFF FSYNC.
Todd Underwood <todd@renesys.com> writes: > $dbh->do("LOCK TABLE $table IN SHARE UPDATE EXCLUSIVE MODE") > or die "Cannot lock $table...\n"; > $dbh->do("ALTER TABLE $table DROP CONSTRAINT ${table}_pkey") > or die "Cannot drop constraint ${table}_pkey...\n"; > $dbh->do("DELETE FROM $table WHERE one>0") > or die "Cannot empty $table...\n"; > $dbh->do("COPY $table (one,two,three,four,five) FROM stdin") > or die "Cannot COPY to DB ...\n"; The problem with this is that the DELETE doesn't physically remove the old rows, therefore you are going to be steadily bloating the table. Can you use a TRUNCATE instead? (Not if you need to keep some rows, obviously, but it's not clear whether your WHERE is actually useful.) Can you commit the deletion and then VACUUM before adding the new rows? (Probably not, if you need other transactions to see a valid table all the time --- though if you do, I wonder why you are committing before you add back the index.) If not, probably the best best is to issue a VACUUM just after committing each of these cycles. That will at least hold the table size to twice the theoretical minimum. BTW, the above is deadlock-prone because ALTER TABLE will take ACCESS EXCLUSIVE lock; taking a lesser lock earlier in the transaction is not only useless but counterproductive. regards, tom lane