Re: Problem with aborting entire transactions on error - Mailing list pgsql-general

From Kevin Grittner
Subject Re: Problem with aborting entire transactions on error
Date
Msg-id 20121210210715.80060@gmx.com
Whole thread Raw
In response to Problem with aborting entire transactions on error  (Zbigniew <zbigniew2011@gmail.com>)
Responses Re: Problem with aborting entire transactions on error  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Problem with aborting entire transactions on error  (Zbigniew <zbigniew2011@gmail.com>)
List pgsql-general
Zbigniew wrote:

> No idea, is it simple or very complicated. But if it is simple -
> why not?

When I first started using PostgreSQL I was used to a database
product which would roll back an entire statement on failure, but
leave the transaction open for another try. This is handy for
interactive work, which is why psql (an interactive tool) gives you
an easy way to do it, using subtransactions behind the scenes.
Doing it in the general case is risky partly because almost all
data definition language (DDL) statements are allowed in
transactions; so you could have a transaction which inserted all
data from one table into somewhere else and then dropped the source
table. If your insert fails, you probably don't want to drop the
table.

PostgreSQL behavior was a bit of a surprise to me at first, but it
wasn't a hard adjustment for me, and I never thought to suggest a
change because it just seemed safer this way. I've generally found
an intermediate table to be a good idea anyway.

> Maybe someone of the devs team could answer this doubt: is the
> proposed change simple to implement?

I now work on PostgreSQL internals as part of my job. To get a
handle on how much it would take to implement, you might want to
read these two README files which document the sub-systems most
directly involved.

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/README;hb=HEAD
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/executor/README;hb=HEAD

For sample queries and diagrams showing how it works a little
more graphically, you might wnat to look at:

http://momjian.us/main/writings/pgsql/mvcc.pdf

The problem is that you might have affected a million rows in the
transaction before the failing statement, and then that statement
might affect another million before it failes. What would you do
with the xmin/xmax values in the affected tuples to get visibility
correct if you haven't executed that failing transaction within a
subtransaction? I don't see how you could do it without something
very like a subtransaction for each statement executed within the
transaction, and that carries significant overhead. Nobody wants to
slow down all their work to cater to this one special case.

Could an option like this be added to the server? Probably. Has
anyone worked on such a feature? Not that I know of. Would such a
feature be accepted if written? I think a better case would need to
be made for its value compared to the added code complexity and
maintenance burden than has been made so far. Since this is
primarily of interest for bulk loads and interactive work, and
there are already solutions for both of those, it would probably be
seen as not worth its weight.

It's only by careful gatekeeping that the source code has been kept
manageable. A quick count shows that there are 1,167,732 lines of
code, counting only *.c and *.h files. Every added line has a
maintenance cost.

I hope this helps.

-Kevin


pgsql-general by date:

Previous
From: akp geek
Date:
Subject: to_tsquery and to_tsvector .. problem with Y
Next
From: Chris Angelico
Date:
Subject: Re: Problem with aborting entire transactions on error