Thread: How Many Inserts Per Transaction

How Many Inserts Per Transaction

From: Trevor Astrope
I was wondering if anyone found a sweet spot regarding how many inserts to
do in a single transaction to get the best performance? Is there an
approximate number where there isn't any more performance to be had or
performance may drop off?

It's just a general question...I don't have any specific scenario, other
than there are multiple backends doing many inserts.

Thanks,

Trevor


Re: How Many Inserts Per Transaction

From: "scott.marlowe"
On Tue, 5 Aug 2003, Trevor Astrope wrote:

> I was wondering if anyone found a sweet spot regarding how many inserts to
> do in a single transaction to get the best performance? Is there an
> approximate number where there isn't any more performance to be had or
> performance may drop off?
>
> It's just a general question...I don't have any specific scenario, other
> than there are multiple backends doing many inserts.

I've found that after 1,000 or so inserts, there's no great increase in
speed.
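
A rough way to look for that knee yourself, as a sketch only (it assumes
psycopg2, a scratch database reachable with "dbname=test", and a throwaway
table named insert_test), would be something like:

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # assumed DSN for a scratch db
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS insert_test (id serial PRIMARY KEY, val text)")
    conn.commit()

    TOTAL_ROWS = 100000

    for batch_size in (1, 10, 100, 1000, 10000):
        cur.execute("TRUNCATE insert_test")
        conn.commit()
        start = time.time()
        for i in range(TOTAL_ROWS):
            cur.execute("INSERT INTO insert_test (val) VALUES (%s)", (str(i),))
            if (i + 1) % batch_size == 0:
                conn.commit()          # one transaction per batch_size inserts
        conn.commit()                  # commit any leftover rows
        elapsed = time.time() - start
        print("batch size %6d: %.0f rows/sec" % (batch_size, TOTAL_ROWS / elapsed))

    cur.close()
    conn.close()

Whatever numbers come out will depend heavily on hardware, indexes and
concurrency, so treat the batch sizes above as starting points, not answers.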


Re: How Many Inserts Per Transaction

From: Christopher Browne
Trevor Astrope wrote:
> I was wondering if anyone found a sweet spot regarding how many
> inserts to do in a single transaction to get the best performance?
> Is there an approximate number where there isn't any more
> performance to be had or performance may drop off?
>
> It's just a general question...I don't have any specific scenario,
> other than there are multiple backends doing many inserts.

The ideal is enough inserts that the work involved in establishing the
transaction context becomes a small part of the cost of processing the
queries.

Thus, 2 inserts should be twice as good as 1, by virtue of dividing
the cost of the transaction 2 ways.

Increasing the number of inserts/updates to 10 means splitting the
cost 10 ways.

Increasing the number to 1000 means splitting the cost 1000 ways,
which, while better than merely splitting the cost 10 ways, probably
_isn't_ a stunningly huge further improvement.

The benefit of grouping more together drops off; you'll probably NOT
notice much difference between grouping 10,000 updates into a
transaction and grouping 15,000 updates together.

Fortunately, it doesn't drop off to being downright WORSE.
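
To put rough numbers on that (the constants below are purely illustrative,
not measurements from any real system):

    # Illustrative sketch: c_txn and c_row are made-up cost constants.
    c_txn = 1.0    # fixed cost of establishing/committing a transaction
    c_row = 0.1    # incremental cost of a single insert

    for n in (1, 2, 10, 1000, 10000, 15000):
        per_row = c_txn / n + c_row
        print("%6d inserts/txn -> %.4f cost per insert" % (n, per_row))

    # Going from 1 to 10 inserts per transaction cuts the per-insert cost
    # dramatically; going from 10,000 to 15,000 changes almost nothing.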

On Oracle, I have seen performance Suck Badly when using SQL*Loader; if
I grouped too many updates together, it started blowing up the
"rollback segment," which was a Bad Thing.  And in that kind of
context, there will typically be some "sweet spot" where you want to
commit transactions before they grow too big.

In contrast, pg_dump/pg_restore puts the load of each table into a
single COPY statement, so that if there are 15,000,000 entries in the
table, that gets grouped into a single (rather enormous) transaction.
And doing things that way presents no particular problem.
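
As a minimal client-side sketch of the same idea (assuming psycopg2 and a
hypothetical table big_table with integer and text columns named id and val),
a single COPY pushes all the rows through in one transaction:

    import io
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # assumed DSN
    cur = conn.cursor()

    # Build tab-separated rows, the default text format COPY FROM expects.
    buf = io.StringIO()
    for i in range(15000):
        buf.write("%d\tvalue-%d\n" % (i, i))
    buf.seek(0)

    # One COPY, one commit: no per-row transaction overhead at all.
    cur.copy_from(buf, "big_table", columns=("id", "val"))
    conn.commit()
    cur.close()
    conn.close()
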
--
output = ("cbbrowne" "@" "libertyrms.info")
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: How Many Inserts Per Transaction

From: Hannu Krosing
Trevor Astrope wrote on Tue, 05.08.2003 at 18:59:
> I was wondering if anyone found a sweet spot regarding how many inserts to
> do in a single transaction to get the best performance? Is there an
> approximate number where there isn't any more performance to be had or
> performance may drop off?
>
> It's just a general question...I don't have any specific scenario, other
> than there are multiple backends doing many inserts.

I did a test on a huge (up to 60 million rows) simple table (5 fields with
a primary key) and found that at that size many inserts per transaction
were actually a little slower than single inserts. It probably had to do
with inserting/checking new index entries and moving index pages to/from
disk.

With small table sizes or no index, ~100 inserts/transaction was
significantly faster, though.

I ran several (10-30) backends in parallel.

The computer was a quad Xeon with 2GB RAM and ~50 MB/sec RAID.

------------------
Hannu


Re: How Many Inserts Per Transaction

From: "Shridhar Daithankar"
On 5 Aug 2003 at 12:28, Christopher Browne wrote:
> On Oracle, I have seen performance Suck Badly when using SQL*Loader; if
> I grouped too many updates together, it started blowing up the
> "rollback segment," which was a Bad Thing.  And in that kind of
> context, there will typically be some "sweet spot" where you want to
> commit transactions before they grow too big.
>
> In contrast, pg_dump/pg_restore puts the load of each table into a
> single COPY statement, so that if there are 15,000,000 entries in the
> table, that gets grouped into a single (rather enormous) transaction.
> And doing things that way presents no particular problem.

Well, postgresql recycles WAL files and uses the data files as well to store
uncommitted transactions. Correct me if I am wrong.

Oracle does not do this.

What does this buy? Oracle has constant-time commits. I doubt postgresql has
them with such a design.

For whatever hassle that is worth, I would take postgresql's
expand-on-disk-as-you-go approach rather than spending time designing rollback
segments for each application.

It's not nice when a customer reports rollback segment overflow. Tablespaces
over files are a royal pain when they do not work.

Just a thought..

Bye
 Shridhar

--
boss, n:  According to the Oxford English Dictionary, in the Middle Ages the
words "boss" and "botch" were largely synonymous, except that boss, in
addition to meaning "a supervisor of workers," also meant "an ornamental stud."