Re: Nested transactions support for code composability - Mailing list psycopg

From Daniel Fortunov
Subject Re: Nested transactions support for code composability
Date
Msg-id CAH1rg6bzuNNcJGn94fLA2FTo08O3xxMDd-jos3g_5JEbhuTOqA@mail.gmail.com
In response to [psycopg] Nested transactions support for code composability  (Daniel Fortunov <psycopg-list@danielfortunov.com>)
Responses Nested transactions support for code composability  (Daniel Fortunov <postgresql@danielfortunov.com>)
Re: Nested transactions support for code composability  (Reuben Rissler <silrep@emypeople.net>)
List psycopg
Based on the motivations outlined in the message below, sent to this list three years ago, I have implemented a Transaction context manager that provides what I think is the most intuitive way to deal with transactions. (Functionally similar to xact / Django's atomic, but available directly on psycopg2 connections.)

We have been using this in production for over a year in a fairly large codebase that does a lot of database interactions directly using psycopg2 (without any other libraries layered on top).

I'd always thought that this would be a good candidate for inclusion in psycopg2, and now that it is stable, I'd like to solicit feedback.

This Transaction context manager replaces the DB-API model of handling transactions (which I have found unintuitive, prone to unexpected pitfalls, and ill-suited to code composability) and provides the following (a brief usage sketch follows the list):
1. Seamless support for nested transactions (implemented using Postgres savepoints).
2. Inner transactions are truly independent (meaning that a unit of code can decide whether to commit or roll back its own changes, without influencing any outer transaction(s), or even needing to be aware of whether outer transactions exist).
3. Transaction contexts function correctly inside "classic" transaction management, meaning that you can implement this incrementally, starting at the innermost libraries and working outward.
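
To make points 1 and 2 concrete, here is that sketch (the Transaction(cxn) / txn.rollback() API is as used in the examples further down; the table names, column names, and import path are hypothetical):

from transactions import Transaction  # hypothetical module path

def record_measurement(cxn, widget_id, value):
    # Inner unit of work: keeps or discards its own changes without
    # affecting (or needing to know about) any outer transaction.
    with Transaction(cxn) as txn:
        cur = cxn.cursor()
        cur.execute("INSERT INTO widget_measurement (widget_id, value) "
                    "VALUES (%s, %s)", (widget_id, value))
        if value < 0:
            txn.rollback()  # bad reading: discard just this insert

def refresh_widget(cxn, widget_id, value):
    # Outer unit of work: commits on exit. An inner rollback above undoes
    # only the inner changes; the UPDATE below is still committed.
    with Transaction(cxn):
        cur = cxn.cursor()
        cur.execute("UPDATE widget SET refreshed = now() WHERE id = %s",
                    (widget_id,))
        record_measurement(cxn, widget_id, value)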

For a motivating example, please see my original email to this list (below).

For details on the library, see the README at https://github.com/asqui/psycopg-nestedtransactions

Do people think this is worthy of inclusion in psycopg2.extras (or elsewhere in psycopg2)?

Thanks in advance,
Daniel Fortunov


On Mon, 16 Jan 2017 at 23:26, Daniel Fortunov <psycopg-list@danielfortunov.com> wrote:
I'd like to implement support for nested transactions in psycopg2, using a context manager that internally uses Postgres savepoints to allow transactions to nest within each other, with sensible commit and rollback semantics.

The end goal is to allow code composability through support for nested transactions.

Before I go ahead and implement this I wanted to solicit feedback on the suggestion, because I find it hard to imagine I'm the first person to dream up this idea, so I imagine I'm missing something. (Perhaps it's already implemented somewhere that I've not found, or maybe it's not implemented because it's a terrible idea, and if so I'd like to understand why!)

The rest of this email describes what I'm trying to achieve, using some motivating examples.

Please share your thoughts and suggestions.

Thanks in advance,
Daniel Fortunov

------------------

Suppose we have a database storing data about widgets, and we'd like to implement a function that does some updates on a few related widget tables. We'd like to do this atomically, so we wrap it in a transaction.

Let's assume that `Transaction` is a psycopg transaction context manager with support for nested transactions. It begins a transaction on __enter__, commits it on __exit__, and rolls back if there is an exception. (This is a simplified version of what I propose to implement.)
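
For concreteness, a minimal sketch of just these non-nested semantics might look something like this (hypothetical and simplified; the savepoint-based nesting is sketched further down):

class Transaction:
    # Minimal sketch: commit on success, roll back on exception.
    # (Simplified; no nesting support yet.)

    def __init__(self, cxn):
        self.cxn = cxn

    def __enter__(self):
        # psycopg2 begins a transaction implicitly on the first statement
        # executed on the connection, so there is nothing to do here.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.cxn.commit()    # success: keep the changes
        else:
            self.cxn.rollback()  # failure: discard the changes
        return False             # never swallow the exception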

def update_widget(cxn, widget):
    with Transaction(cxn):
        cur = cxn.cursor()
        cur.execute(...)  # Update table A
        cur.execute(...)  # Update table B
        cur.execute(...)  # Update table C

So far this works fine, and is also currently achievable with the 'autocommit' paradigm. Nothing new here. Anyone can call this code, provide a connection, and get an atomic update of tables A, B, and C.

Now, I'd like to write a maintenance job that updates a whole load of widgets. If the updates on any one widget fail, I'd still like the job to continue, and process as many widgets as it can. Specifically, I want to implement this job _without_ changing the existing update_widget() function, which is already in use.

def maintain_widgets(cxn):
    with Transaction(cxn):
        for widget in widgets:
            try:
                update_widget(cxn, widget)
            except Exception:
                pass  # log exception and continue with other widgets

The Transaction context manager would implement the begin/commit/rollback transaction semantics by beginning/releasing/rolling back to a savepoint, respectively. In this context, update_widget would seamlessly switch to using savepoints since it is not the outermost transaction.
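
A rough sketch of how that switching might look, extending the minimal sketch above (again hypothetical and simplified, not the library's actual implementation):

import itertools
import psycopg2.extensions

class Transaction:
    # Sketch of a nestable transaction context manager: the outermost
    # context uses an ordinary transaction; nested contexts detect the
    # transaction already in progress and use a savepoint instead.

    _counter = itertools.count()

    def __init__(self, cxn):
        self.cxn = cxn
        self.savepoint = None

    def __enter__(self):
        if self.cxn.status == psycopg2.extensions.STATUS_IN_TRANSACTION:
            # Already inside a transaction: nest by creating a savepoint.
            self.savepoint = 'nested_txn_%d' % next(self._counter)
            self.cxn.cursor().execute('SAVEPOINT ' + self.savepoint)
        # Otherwise psycopg2 will begin a transaction implicitly on the
        # first statement executed, so there is nothing to do.
        return self

    def rollback(self):
        # Discard this context's changes only.
        if self.savepoint:
            self.cxn.cursor().execute('ROLLBACK TO SAVEPOINT ' + self.savepoint)
        else:
            self.cxn.rollback()

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            if self.savepoint:
                # Nested success: merge the changes into the outer transaction.
                self.cxn.cursor().execute('RELEASE SAVEPOINT ' + self.savepoint)
            else:
                self.cxn.commit()
        else:
            self.rollback()
        return False  # never swallow the exception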

This is not possible to achieve with the 'autocommit' paradigm, because you need the ability to roll back the update of a single widget and then continue the transaction that updates the rest. It is possible to implement with explicit savepoints; however, that requires update_widget() to be updated, and breaks existing code, because it will no longer be possible to call update_widget() outside of a transaction!

Now I want to take it a level further and write a database integration test for my widget maintainer script, which executes the test scenario within a transaction, guaranteeing that the database is left untouched after my test, regardless of whether the test succeeds, fails, or aborts unexpectedly (e.g. maybe the test is running in an interactive debugger and I abort the debugging session! :-)

def test_maintain_widget_successful_run(self):
    with Transaction(cxn) as txn:
        # Set up starting database state
        # ....
        maintain_widgets(cxn)  # Call code under test
        # Check database state against expectations
        # ...
        txn.rollback()  # Rollback all changes

Note that for this final ROLLBACK to work, none of the called code is allowed to execute COMMIT at any point, which is why this breaks down completely with the autocommit = False; cxn.commit() paradigm: we'd be calling commit() explicitly all over the place!
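
For example, a helper written in the classic style (hypothetical sketch) makes that final rollback impossible:

def update_widget_classic(cxn, widget):
    cur = cxn.cursor()
    cur.execute("UPDATE widget SET name = %s WHERE id = %s",
                (widget.name, widget.id))
    # This COMMIT ends the enclosing transaction, so these changes can no
    # longer be undone by a rollback further up the call stack.
    cxn.commit()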

With the current paradigm you're forced to manage transactions at the uppermost level, which means callers of maintain_widgets() are obliged to call it within a pre-existing transaction in order to get correct behaviour (namely, an atomic update of all three tables in the face of errors). And even then, this precludes having a test one level above, which executes that "uppermost level" code within a transaction and then rolls everything back!

So, what am I missing? Am I the first to have this crazy idea? Is this already implemented? How do others solve the problem of code composability and transactions? Is this a terrible idea? (if so, why?)

Feedback and comments welcome.
Thanks for reading this far!
