On Fri, Dec 30, 2011 at 6:19 PM, Samuel Gendler
<sgendler@ideasculptor.com> wrote:
>
>
> On Fri, Dec 30, 2011 at 4:03 PM, Scott Marlowe <scott.marlowe@gmail.com>
> wrote:
>>
>> On Fri, Dec 30, 2011 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Jan Bakuwel <jan.bakuwel@greenpeace.org> writes:
>> >> Why-o-why have the PostgreSQL developers decided to do it this way...?
>> >
>> > Because starting and cleaning up a subtransaction is an expensive thing.
>> > If we had auto-rollback at the statement level, you would be paying that
>> > overhead for every statement in every transaction, whether you need it
>> > or not (since obviously there's no way to forecast in advance whether a
>> > statement will fail). Making it depend on explicit savepoints allows
>> > the user/application to control whether that overhead is expended or
>> > not.
>> >
>> > If you want to pay that price all the time, there are client-side
>> > frameworks that will do it for you, or you can roll your own easily
>> > enough. So we do not see it as a big deal that the database server
>> > itself doesn't act that way.
>>
>> Having used PostgreSQL a LOT, I find that being able to throw an
>> entire update at the db and having it fail / be rolled back / CTRL-C
>> out of and fix the problem is actually much less work than the
>> frameworks for other databases. Once you've chased down bad data in a
>> load file a few times, it's really pretty easy to spot and fix these
>> issues and just run the whole transaction again. Since PostgreSQL
>> doesn't have a very big penalty for rolling back a whole transaction
>> it's not that bad. Some dbs, like MySQL with innodb table handler
>> have a 10:1 or greater penalty for rollbacks. Insert a million rows
>> in innodb then issue a rollback and go get a sandwich. In PostgreSQL
>> a rollback is generally instantaneous, with the only real cost being
>> bloat in the tables or indexes.
>
>
> More to the point - if a statement is truly independent of all the other
> statements in a transaction, it would seem that the transaction itself is
> poorly defined. The whole point of a transaction is to define an atomic
> unit of work. If you don't care about atomicity, enable auto commit and just
> catch the constraint violation exception and continue on your merry way.
But the performance penalty for autocommit is huge. It's still almost
always faster to run a single big transaction and fix errors than to
do single commits when you're doing a large import.