On 2/6/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
> > I have a pl/pgsql function that is inserting 200,000 records for
> > testing purposes. What is the expected time frame for this operation
> > on a pc with 1/2 a gig of ram and a 7200 RPM disk? The processor is
> > a 2ghz cpu. So far I've been sitting here for about 2 million ms
> > waiting for it to complete, and I'm not sure how many inserts postgres
> > is doing per second.
>
> That really depends. Doing 200,000 inserts as individual transactions
> will be fairly slow. Since PostgreSQL generally runs in autocommit
> mode, this means that if you didn't expressly begin a transaction, you
> are in fact inserting each row as a transaction. i.e. this:
I think the OP is doing the insertion inside a pl/pgsql loop, so a
single transaction is implied here. For creating test data,
generate_series or insert...select is obviously the way to go (a quick
sketch of both is below). If that's unsuitable for some reason, I would
suggest a RAISE NOTICE every n records so you can monitor progress and
make sure something is not binding up in a lock or something like that.
Be especially wary of performance degrading as the process runs.
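Something along these lines (untested, and the test table and column
names are made up just to illustrate):

-- bulk load with insert ... select over generate_series
insert into test (id, val)
  select g, 'row ' || g
  from generate_series(1, 200000) g;

-- if you really need the pl/pgsql loop, report progress as you go
create or replace function load_test() returns void as $$
begin
  for i in 1..200000 loop
    insert into test (id, val) values (i, 'row ' || i);
    if i % 10000 = 0 then
      raise notice 'inserted % rows', i;
    end if;
  end loop;
end;
$$ language plpgsql;

select load_test();

Watching when the notices slow down (or stop) tells you pretty quickly
whether the problem is overall throughput or a lock.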
Another common cause of poor insert performance is an RI (foreign key)
check against an un-indexed column. In-transaction insert performance
should be between 1k and 10k records/second in normal situations,
meaning that if you haven't inserted 1 million records inside of an
hour, something else is going on.
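For example (hypothetical parent/child tables, just to show the shape
of the problem and the usual fix):

create table parent (id int primary key);
create table child (
  parent_id int references parent (id),
  payload text
);

-- the foreign key does not create an index on child.parent_id;
-- RI checks fired against the un-indexed side (e.g. by updates or
-- deletes on parent) then have to scan child sequentially
create index child_parent_id_idx on child (parent_id);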
Generally, insertion performance from fastest to slowest is:
* insert select generate_series...
* insert select
* copy
* insert ... values (),(),()[...] (at least 10, preferably 100 rows per statement)
* begin, prepare, n prepared inserts executed, commit
* begin, n inserts, commit
* plpgsql loop, single inserts
* n single inserts outside of a transaction
The exact ordering is not set in stone (copy might beat insert ...
select, for example), but the top 4 methods will always be much faster
than the bottom 4.
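For reference, the two statement-level forms from the middle of the
list look roughly like this (same made-up test table as above;
multi-row values needs 8.2 or later):

-- multi-row values: many rows per statement, one round trip each
insert into test (id, val) values
  (1, 'a'),
  (2, 'b'),
  (3, 'c');  -- ...and so on, ideally ~100 rows per statement

-- wrapping plain inserts in one transaction avoids a commit
-- (and therefore a wal flush) per row
begin;
insert into test (id, val) values (4, 'd');
insert into test (id, val) values (5, 'e');
-- ... n inserts ...
commit;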
merlin