Re: [HACKERS] Postgres Speed or lack thereof - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: [HACKERS] Postgres Speed or lack thereof
Msg-id: 21212.916630479@sss.pgh.pa.us
In response to: Re: [HACKERS] Postgres Speed or lack thereof (Vadim Mikheev <vadim@krs.ru>)
Responses: Re: [HACKERS] Postgres Speed or lack thereof; Re: [HACKERS] Postgres Speed or lack thereof
List: pgsql-hackers
Vadim Mikheev <vadim@krs.ru> writes:
> John Holland wrote:
>> I've been lurking on this list for a little while. I have just done a
>> little tinkering with Postgres 6.4.2, comparing it to Oracle and mySQL on
>> Linux. It would appear that just doing a lot of inserts (ie 40000) in a
>> loop is enormously slower in Postgres in two ways that I tried it.
>> One - using a loop in Java that makes a JDBC call to insert.
>> Two - using plpgsql as a comparison to PL/SQL.
>>
>> I really like the idea of an open source DB and am impressed with a lot I
>> see about PostgreSQL - however the speed difference is pretty bad -
>> 4.5 minutes versus about 20 seconds.

> Try to use BEGIN/END to run all inserts in single transaction
> and please let us know results.

I tried this myself and found that wrapping BEGIN/END around a series of INSERT statements didn't make much difference at all. On an HP 715 (75MHz RISC box, not very heavy iron by current standards), I inserted about 13000 records into an initially-empty table having 38 columns (just because it's data I had handy...). I timed it at:

Individual INSERT statements (as generated by pg_dump -d): 33 inserts/sec
Same with BEGIN/END wrapped around 'em: 34 inserts/sec
Single COPY statement (as generated by pg_dump without -d): 1400 inserts/sec

This was for a simple table with no indexes. In reality, this table type has four b-tree indexes on different columns in our normal usage. The COPY speed dropped to 325 inserts/sec when I had the indexes in place. I didn't bother trying the INSERTs that way. The conventional wisdom is that you should use COPY for bulk loading, and this result supports it...

John didn't say what hardware he's using, so I don't know how comparable my result is to his 150 inserts/sec --- that might have been on a table with many fewer columns, or maybe his machine is just faster.

As for where the time is going: "top" showed that the time for the INSERT ops was almost all going into backend CPU time.
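To make the comparison concrete, here is a minimal sketch of the two bulk-load styles the timings above contrast: per-row INSERT statements as pg_dump -d emits them, versus a single COPY whose rows travel as raw tab-separated data. The table name `t` and its columns are hypothetical; the point is just that the first style produces one SQL statement to parse and plan per row, while the second produces exactly one statement no matter how many rows follow.

```python
# Hypothetical sample rows for an illustrative table t(id, name).
rows = [(1, "alpha"), (2, "beta"), (3, "gamma")]

# Style 1: one SQL statement per row -- each one must be parsed and planned.
inserts = [
    "INSERT INTO t (id, name) VALUES ({}, '{}');".format(i, n)
    for i, n in rows
]

# Style 2: a single COPY statement; the rows after it are plain data lines
# (tab-separated, terminated by \.) that are never parsed as SQL.
copy_stmt = "COPY t (id, name) FROM stdin;\n" + "".join(
    "{}\t{}\n".format(i, n) for i, n in rows
) + "\\.\n"

print("{} statements to parse for INSERT vs 1 for COPY".format(len(inserts)))
```

With 13000 rows, style 1 means 13000 trips through the parser and planner; style 2 means one, which is consistent with the roughly 40x gap measured above.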
My guess is that most of the cost is in parsing and planning the INSERT statements. Pre-planned statement skeletons might help, but I think the real answer will be to find a way to avoid constructing and parsing SQL statements for simple operations like INSERT. (One reason I'm interested in the idea of a CORBA interface is that it might help here.) My guess is that Oracle and mySQL have APIs that allow the construction of an SQL INSERT command to be bypassed, and that's why they show up better on this operation.

			regards, tom lane
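The "pre-planned statement skeleton" idea can be sketched in the PREPARE/EXECUTE form that later PostgreSQL releases adopted (it did not exist at the time of this message). The table name `t` and statement name `ins` are hypothetical; the sketch only generates the SQL text, to show that the parse/plan cost is paid once by PREPARE while each EXECUTE merely binds values into the stored plan.

```python
# Hypothetical rows for an illustrative table t(id int, name text).
rows = [(1, "alpha"), (2, "beta")]

# Parsed and planned once: a named skeleton with typed parameters.
prepare = "PREPARE ins (int, text) AS INSERT INTO t VALUES ($1, $2);"

# Each EXECUTE supplies only the values; no per-row parse/plan of the INSERT.
executes = ["EXECUTE ins ({}, '{}');".format(i, n) for i, n in rows]
```

This is also essentially what a JDBC PreparedStatement exposes on the client side: build the skeleton once, then bind and run it repeatedly.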