> I was doing some testing on "insert" compared to "select into". I
> inserted 100 000 rows (with 8 column values) into a table, which took 14
> seconds, compared to a select into, which took 0.8 seconds.
> (FYI, the inserts were batched, autocommit was turned off and it all
> happened on the local machine)
Did you use prepared statements? Did you use a single INSERT INTO ... VALUES
(),(),... with a long list of row tuples, or 100K individual INSERT
statements?
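For reference, a minimal sketch of the two shapes (the table and column
names here are made up for illustration):

    -- 100K separate statements, each parsed, planned and sent on its own:
    INSERT INTO test_table (a, b) VALUES (1, 'x');
    INSERT INTO test_table (a, b) VALUES (2, 'y');
    -- ... repeated 100 000 times

    -- versus one statement carrying many row tuples:
    INSERT INTO test_table (a, b) VALUES
        (1, 'x'),
        (2, 'y'),
        (3, 'z');  -- and so on, in batches of a few thousand rows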
Most of that time goes into parsing each statement, planning, executing,
round trips with the client, context switches, and the time your client
library spends escaping and encoding the data and postgres spends decoding
it. In a word: OVERHEAD.
By the way, which language and client library are you using?
FYI 14s / 100k = 140 microseconds per individual SQL query. That ain't
slow at all.
> Does the select into translate into a specially optimised function in c
> that can cut corners which an insert can not do (e.g. lazy copying), or
> is it some other reason?
Yeah: instead of your client having to encode 100K * 8 values, send them
over a socket, and postgres having to decode them, SELECT INTO (like
INSERT INTO ... SELECT) just takes the data and writes the data, entirely
inside the server. Same thing as writing a file one byte at a time versus
using a big buffer.
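As a sketch (table names are invented), the whole copy stays inside the
server process, with no per-row protocol traffic:

    -- SELECT INTO creates new_table and fills it in one pass:
    SELECT * INTO new_table FROM old_table;

    -- INSERT ... SELECT does the same for a table that already exists:
    INSERT INTO new_table SELECT * FROM old_table;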
> The reason I am asking is that select into shows that a number of rows
> can be inserted into a table quite a lot faster than one would think was
> possible with ordinary sql. If that is the case, it means that if I
> write a pl-pgsql insert function in C instead of sql, then I can have
> my db perform an order of magnitude faster.
Fortunately this is already done for you: there is the PREPARE statement,
which removes the parsing overhead. If you must insert many rows, use a
single INSERT with multiple row tuples: VALUES (),(),()...
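Roughly like this (the statement name and table are invented; your client
library may also expose PREPARE/EXECUTE through its own prepared-statement
API):

    -- parsed once at PREPARE time:
    PREPARE ins (int, text) AS
        INSERT INTO test_table (a, b) VALUES ($1, $2);
    -- then executed cheaply many times:
    EXECUTE ins (1, 'x');
    EXECUTE ins (2, 'y');

    -- or send many rows in one statement:
    INSERT INTO test_table (a, b) VALUES (1, 'x'), (2, 'y'), (3, 'z');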