psql vs perl prepared inserts - Mailing list pgsql-general
From | Matt Van Mater |
---|---|
Subject | psql vs perl prepared inserts |
Date | |
Msg-id | 424db35505041214582ebef95d@mail.gmail.com Whole thread Raw |
Responses |
Re: psql vs perl prepared inserts
Re: psql vs perl prepared inserts |
List | pgsql-general |
I've been experimenting with loading a large amount of data into a fairly simple database using both psql and perl prepared statements. Unfortunately I'm seeing no appreciable differences between the two methods, where I was under the impression that prepared statements should be much faster (in my case, they are slightly slower). I have included a pseudocode/subset of the perl code I use below. You can see the prepare statement outside the loop and the execute statement inside the loop. Alternatively you can see that I write every INSERT statement to a text file which I then load by using `psql dbname -f bulksql_load.sql`. Normally I only have either the prepare or the print-to-file in the loop, but i've included both to make the pseudocode smaller. Using a simple `time` command from my system it looks like the execution time for loading all the data in both scenarios is about 50 seconds. FYI, the infile has 18k lines, and after parsing and the db enforcing uniqueness there are 15.5k rows in the results table of the db. This works out to ~300 inserts per second with on pgsql 7.3.2 with fsync turned off. I think that is a decent performance for this old box, I'm just confused as to why the prepared statements don't seem to give any speed boost as advertised. Could the fact that many of my inserts have 15 columns slow down the prepared statements to the point where they're no longer useful as a speed enhancement? Or perhaps it's because I'm explicitly inserting each field/value pair, even if many of them are null (which I think is generally considered a good habit). Any other suggestions you might have would be welcome. Please forgive me if I should have posted this to pgsql-performance or some perl list, but I think the comparison with psql makes it relevant. Hopefully this won't be too messy: #!/usr/bin/perl open (IN,"scan.nbe"); open (OUT,">bulksql_load.sql"); use DBI; $dbh = DBI->connect("dbi:Pg:dbname=nessus"); $sth_res = $dbh->prepare("INSERT INTO results (scan_id,subnet,host,service,plugin_id,type,description,see_also,solution,risk_factor,cve_id,bid_id,port,proto,service_name) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"); $sth_tim = $dbh->prepare("INSERT INTO timestamps (scan_id,hostip_or_nul,start_or_end,scan_time) VALUES (?,?,?,?)"); while (<IN>){ if (/^timestamps/){ parse_ts(); $sth_tim->execute($scan_id,$hostip_or_nul,$start_or_end,$scan_time); print OUT "INSERT INTO timestamps (scan_id,hostip_or_nul,start_or_end,scan_time) VALUES ($scan_id,$hostip_or_nul,$start_or_end,$scan_time)\n"; } elsif (/^results/) { parse_res(); $sth_res->execute($scan_id,$subnet,$host,$service,$plugin_id,$type,$description,$see_also,$solution,$risk_factor,$cve_id,$bid_id,$port,$proto,$service_name); print OUT "INSERT INTO results (scan_id,subnet,host,service,plugin_id,type,description,see_also,solution,risk_factor,cve_id,bid_id,port,proto,service_name) VALUES ($scan_id,$subnet,$host,$service,$plugin_id,$type,$description,$see_also,$solution,$risk_factor,$cve_id,$bid_id,$port,$proto,$service_name)\n"; } }
pgsql-general by date: