Thread: How Many Inserts Per Transaction?
I was wondering if anyone has found a sweet spot for how many inserts to do in a single transaction to get the best performance. Is there an approximate number beyond which there isn't any more performance to be gained, or where performance may drop off?

It's just a general question...I don't have any specific scenario, other than there being multiple backends doing many inserts.

Thanks,
Trevor
On Tue, 5 Aug 2003, Trevor Astrope wrote:

> I was wondering if anyone found a sweet spot regarding how many inserts to
> do in a single transaction to get the best performance? Is there an
> approximate number where there isn't any more performance to be had or
> performance may drop off?
>
> It's just a general question...I don't have any specific scenario, other
> than there are multiple backends doing many inserts.

I've found that beyond 1,000 or so inserts per transaction, there's no great increase in speed.
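As a rough way to check that number on your own hardware, here is a minimal timing sketch in Python with psycopg2. The connection string, the bench_t table (assumed to already exist with a single integer column val), and the batch sizes are all placeholders for the example:

# Measure insert throughput at different transaction (batch) sizes.
# Assumes a throwaway table:  CREATE TABLE bench_t (val integer);
import time
import psycopg2

conn = psycopg2.connect("dbname=test")    # placeholder DSN
cur = conn.cursor()

TOTAL = 20000
for batch_size in (1, 10, 100, 1000, 10000):
    start = time.time()
    done = 0
    while done < TOTAL:
        for _ in range(min(batch_size, TOTAL - done)):
            cur.execute("INSERT INTO bench_t (val) VALUES (%s)", (done,))
            done += 1
        conn.commit()                     # one commit per batch of inserts
    elapsed = time.time() - start
    print("batch=%5d: %8.0f inserts/sec" % (batch_size, TOTAL / elapsed))

cur.close()
conn.close()

With psycopg2's default (non-autocommit) mode, the first INSERT after a commit implicitly opens a new transaction, so each conn.commit() closes one batch-sized transaction.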
Trevor Astrope wrote:

> I was wondering if anyone found a sweet spot regarding how many
> inserts to do in a single transaction to get the best performance?
> Is there an approximate number where there isn't any more
> performance to be had or performance may drop off?
>
> It's just a general question...I don't have any specific scenario,
> other than there are multiple backends doing many inserts.

The ideal should be enough that the work involved in establishing the transaction context is a small part of the cost of processing the queries. Thus, 2 inserts should be twice as good as 1, by virtue of dividing the cost of the transaction 2 ways. Increasing the number of inserts/updates to 10 means splitting the cost 10 ways. Increasing the number to 1000 means splitting the cost 1000 ways, which, while better than merely splitting the cost 10 ways, probably _isn't_ a stunningly huge further improvement.

The benefit of grouping more together drops off; you'll probably NOT notice much difference between grouping 10,000 updates into a transaction and grouping 15,000 together. Fortunately, it doesn't drop off to being downright WORSE.

On Oracle, I have seen performance Suck Badly when using SQL*Loader; if I grouped too many updates together, it started blowing up the "rollback segment," which was a Bad Thing. And in that kind of context, there will typically be some "sweet spot" where you want to commit transactions before they grow too big.

In contrast, pg_dump/pg_restore puts the load of each table into a single COPY statement, so that if there are 15,000,000 entries in the table, that gets grouped into a single (rather enormous) transaction. And doing things that way presents no particular problem.
--
output = ("cbbrowne" "@" "libertyrms.info")
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)
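To illustrate the COPY point, here is a small sketch, again in Python with psycopg2, that loads a whole file in one shot; the input file, table, and column are again made up for the example:

# Bulk load with COPY: however many rows the file holds, the whole load
# runs inside one transaction, much like a pg_dump/pg_restore table load.
# Assumes the same throwaway table:  CREATE TABLE bench_t (val integer);
import psycopg2

conn = psycopg2.connect("dbname=test")    # placeholder DSN
cur = conn.cursor()

with open("rows.tsv") as f:               # tab-separated values, one row per line
    cur.copy_from(f, "bench_t", sep="\t", columns=("val",))

conn.commit()                             # a single commit, regardless of row count
cur.close()
conn.close()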
On Tue, 05.08.2003 at 18:59, Trevor Astrope wrote:

> I was wondering if anyone found a sweet spot regarding how many inserts to
> do in a single transaction to get the best performance? Is there an
> approximate number where there isn't any more performance to be had or
> performance may drop off?
>
> It's just a general question...I don't have any specific scenario, other
> than there are multiple backends doing many inserts.

I tested on a huge (up to 60 million rows) simple table (5 fields with a primary key) and found that at that size, many inserts per transaction were actually a little slower than single inserts. It probably had to do with inserting/checking new index entries and moving index pages from/to disk.

With small table sizes or no index, ~100 inserts/transaction was significantly faster, though. I ran several (10-30) backends in parallel. The computer was a quad Xeon with 2GB RAM and ~50 MB/sec RAID.

------------------
Hannu
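A sketch of that kind of parallel-backend test, with the worker count, batch size, row counts, and table all placeholders, could look like this; each worker opens its own connection and commits every BATCH inserts:

# Several "backends" in parallel, each batching inserts into transactions.
import multiprocessing
import psycopg2

WORKERS = 10
BATCH = 100              # inserts per transaction
ROWS_PER_WORKER = 10000

def worker(wid):
    conn = psycopg2.connect("dbname=test")    # placeholder DSN
    cur = conn.cursor()
    for i in range(ROWS_PER_WORKER):
        cur.execute("INSERT INTO bench_t (val) VALUES (%s)",
                    (wid * ROWS_PER_WORKER + i,))
        if (i + 1) % BATCH == 0:
            conn.commit()
    conn.commit()        # flush any remainder
    cur.close()
    conn.close()

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker, args=(w,))
             for w in range(WORKERS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()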
On 5 Aug 2003 at 12:28, Christopher Browne wrote:

> On Oracle, I have seen performance Suck Badly when using SQL*Loader; if
> I grouped too many updates together, it started blowing up the
> "rollback segment," which was a Bad Thing. And in that kind of
> context, there will typically be some "sweet spot" where you want to
> commit transactions before they grow too big.
>
> In contrast, pg_dump/pg_restore puts the load of each table into a
> single COPY statement, so that if there are 15,000,000 entries in the
> table, that gets grouped into a single (rather enormous) transaction.
> And doing things that way presents no particular problem.

Well, PostgreSQL recycles WAL files and uses the data files themselves to store uncommitted transactions. Correct me if I am wrong. Oracle does not do this.

What does this buy? Oracle has constant-time commits; I doubt PostgreSQL has them with such a design. But for whatever that is worth, I would take PostgreSQL's expand-on-disk-as-you-go approach over spending time designing rollback segments for each application. It's not nice when a customer reports a rollback segment overflow, and tablespaces over files are a royal pain when they do not work.

Just a thought..

Bye
Shridhar

--
boss, n: According to the Oxford English Dictionary, in the Middle Ages the words "boss" and "botch" were largely synonymous, except that boss, in addition to meaning "a supervisor of workers," also meant "an ornamental stud."