Thread: why autocommit mode is slow?
Hi,
this is maybe a stupid question, but I don't know how to explain to my coworkers why performing many inserts in autocommit mode is so much slower than making all of them in one transaction.
Why is that so slow?
regards
Szymon
It has nothing to do with autocommit and everything to do with batching them together.
For instance, if you run ten update queries in autocommit mode I would expect it to take exactly the same time as:
Begin
Exec Query1
Commit
…
Begin
Exec Query10
Commit
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Szymon Guz
Sent: Thursday, April 07, 2011 2:00 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] why autocommit mode is slow?
Hi,
this is maybe a stupid question, but I don't know how to explain to my coworkers why performing many inserts in autocommit mode is so much slower than making all of them in one transaction.
Why is that so slow?
regards
Szymon
On Thursday, April 07, 2011 1:59:50 pm Szymon Guz wrote: > Hi, > this is maybe a stupid question, but I don't know how to explain to my > coworkers why performing many inserts in autocommit mode is so much slower > than making all of them in one transaction. > > Why is that so slow? Because autocommit wraps each statement in a transaction versus wrapping all the statements in one transaction. Transactions have overhead, the more you have the more time it is going to take. > > regards > Szymon -- Adrian Klaver adrian.klaver@gmail.com
On 04/08/2011 04:59 AM, Szymon Guz wrote: > Hi, > this is maybe a stupid question, but I don't know how to explain to my > coworkers why performing many inserts in autocommit mode is so much > slower than making all of them in one transaction. > > Why is that so slow? Unless you have synchronous_commit disabled and/or a commit_delay, each commit requires synchronization of all outstanding writes to the disk before the commit can return success. That stops the OS intelligently reordering and batching the writes of multiple statements for better disk I/O efficiency. It also adds pauses before each statement can return, effectively adding latency between statements. The trade-off you get is that after each statement, you know for certain that the results of that statement are on-disk and safe. That's not the case when you do them all in one transaction, or with synchronous_commit off. BTW, the other area to look at with this is query batching. If the client isn't on the same host as the server or at least on a very low latency network, it can be much more efficient to batch queries together to minimize the number of network round trips required. -- Craig Ringer
On Thu, Apr 7, 2011 at 2:59 PM, Szymon Guz <mabewlun@gmail.com> wrote: > Hi, > this is maybe a stupid question, but I don't know how to explain to my > coworkers why performing many inserts in autocommit mode is so much slower > than making all of them in one transaction. > Why is that so slow? The real issue is that transactions have fairly high overhead. When in autocommit mode, the cost of the transaction is much higher than the individual insert, so it's relatively slow. OTOH, when inserting a dozen or a hundred or a thousand rows, the transactional overhead to build up and tear down the transaction becomes smaller and smaller in comparison to the inserts. The inserts in each instance cost the same / take just as long, but the transactional wrapping is only paid for once in the large transaction scenario, and it's paid every time in the autocommit. The good news is postgresql has no real practical limit to transaction size, and the theoretical limit is VERY large (like 2B or so statements I believe.) So no error about running out of rollback space etc.
On 8 April 2011 05:06, Scott Marlowe <scott.marlowe@gmail.com> wrote:
The real issue is that transactions have fairly high overhead. WhenOn Thu, Apr 7, 2011 at 2:59 PM, Szymon Guz <mabewlun@gmail.com> wrote:
> Hi,
> this is maybe a stupid question, but I don't know how to explain to my
> coworkers why performing many inserts in autocommit mode is so much slower
> than making all of them in one transaction.
> Why is that so slow?
in autocommit mode, the cost of the transaction is much higher than
the individual insert, so it's relatively slow. OTOH, when inserting
a dozen or a hundred or a thousand rows, the transactional overhead to
build up and tear down the transaction becomes smaller and smaller in
comparison to the inserts. The inserts in each instance cost the same
/ take just as long, but the transactional wrapping is only paid for
once in the large transaction scenario, and it's paid every time in
the autocommit.
The good news is postgresql has no real practical limit to transaction
size, and the theoretical limit is VERY large (like 2B or so
statements I believe.) So no error about running out of rollback
space etc.
Thank you all for clarification.
regards
Szymon
On Thu, Apr 7, 2011 at 4:59 PM, Szymon Guz <mabewlun@gmail.com> wrote:
this is maybe a stupid question, but I don't know how to explain to my coworkers why performing many inserts in autocommit mode is so much slower than making all of them in one transaction.
as others have said, there is overhead in each transaction. the biggest of these is the file sync to disk of the write-ahead log. on spinning media this involves on average one rotation of the platter. when you batch the inserts, you save that sync per insert.
On Thu, Apr 7, 2011 at 9:59 PM, Szymon Guz <mabewlun@gmail.com> wrote: > this is maybe a stupid question, but I don't know how to explain to my > coworkers why performing many inserts in autocommit mode is so much slower > than making all of them in one transaction. > Why is that so slow? There's a performance bug that slows things down more than necessary. I've worked out a solution that we can use to improve that. Not sure that we should backpatch it though. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Apr 07, 2011 at 10:59:50PM +0200, Szymon Guz wrote: > Hi, > this is maybe a stupid question, but I don't know how to explain to my > coworkers why performing many inserts in autocommit mode is so much slower > than making all of them in one transaction. > > Why is that so slow? If you do BEGIN; [statement]; COMMIT; one after another, is that as slow as autocommit? (My bet is yes.) A -- Andrew Sullivan ajs@crankycanuck.ca