Re: [Re] Re: PREPARE and transactions - Mailing list pgsql-hackers

From Jeroen T. Vermeulen
Subject Re: [Re] Re: PREPARE and transactions
Date
Msg-id 20040701181031.GA25134@xs4all.nl
Whole thread Raw
In response to Re: [Re] Re: PREPARE and transactions  (Oliver Jowett <oliver@opencloud.com>)
Responses Re: [Re] Re: PREPARE and transactions
Re: [Re] Re: PREPARE and transactions
List pgsql-hackers
Sorry for the delay, life tends to get complicated if you leave it
alone for a few days...

I see how making PREPARE obey rollbacks would be inconvenient for some
existing code, but frankly I'm getting a bit worried about the "why should
I care whether what I do is committed or not?" argument.  I guess one could
say that about lots of statements: "I don't really want this to be subject
to the transaction but it happens convenient for me to write it inside the
transaction, and then I have this problem that it's affected by rollbacks."
If it's that important, come up with a generic "session-not-transaction"
syntax to temporarily escape bracketing.

I'll wave my hands a bit here and use the term "middleware" for drivers,
front-end libraries, proxies, in-application abstraction layers, anything
that sits between the business logic (is that term still fashionable?)
firing queries and the backend handling them.

So far, as I believe Tom has pointed out, I've assumed regular code that
knows whether it's in a transaction, or where transactions begin and end, 
or what else the program is doing with the same session.  The arguments for
the nontransactional behaviour have been about code where this isn't the
case, but that does have exclusive control of prepared statements.

Frankly I think that's a strange point of view, since transactions are one
of the pillars of database management and elementary to statement semantics,
whereas prepared statements are a recently added optimization feature.

In this message I'll give examples of how the current behaviour may affect
other middleware that doesn't use prepared statements, but may have to deal
with applications that do.

So let's assume for a change that the middleware has at least some
knowledge or perhaps even control over transactionality, but doesn't know
about prepared statements (perhaps because it predates 7.4):

(*) The middleware may deal with transient errors, such as some deadlocks,
by rerunning the transaction.  Now if a PREPARE (and no matching DEALLOCATE)
preceded the point of failure, the retry would break under the
nontransactional semantics--just because the PREPARE escaped the rollback.

You can work around this by silently accepting redefinitions of prepared
statements, but a redefinition may also be a true honest bug.  Only
accepting identical redefinitions will not work, because a redefinition
during retry may have a slightly different body than the original definition
during the first attempt.  Another workaround as we've seen is to re-PREPARE
in a nested transaction, which besides offsetting any performance gain may
mean that you're blithely executing the wrong version of the statement!

(*) Transaction code may start out with a PREPARE statement, use the
prepared statement a few times, and then DEALLOCATE it at the very end.
This seems a clean way to work, and an attempt not to affect session state.
But now, paradoxically, there will be unintended residue of the transaction
only if the transaction _fails_.  This may break a future instance of the
transaction, which may be why the DEALLOCATE was there in the first place.

(*) Middleware may support some form of connection pooling, or other
reuse of connections.  The only difference between a reused connection
and a fresh one to the same database that I can think of right now are (i)
session variables and (ii) prepared statements.  So based on the 7.3 state
of things, the middleware might assume that a connection was reusable if (a)
no statements affecting session variables were issued; (b) any changes in
session variables are OK; or (c) no transaction affecting session variables
was committed.

Prepared statements can break each of these options--most notably, they
would break the even more diligent assumption that a session is clean and
unspoiled as long as any transactions (whether implicit or explicit) entered
inside it have been rolled back.

(*) Middleware may want to restore broken connections.  It would have to
restore any prepared statements from the old connection (quite possibly
unnecessarily) in addition to session variables.

Worse, the semantics for the two kinds of session state are different!
Middleware that tries to maintain session state but doesn't keep track of
rollbacks is really already broken when it comes to session variables.
The fix for that and a change to transactional PREPARE require the exact
same mechanism.


Now, two more issues for middleware that does prepare statements:

(*) What if preparing a statement fails?  Could be that you've just broken
the transaction at a point where the application didn't expect it, or in a
way it didn't expect.

(*) What if your prepared statement interferes with one prepared by the
application?

Sure, all of these could be worked around; there's a lot of "don't do that
then" in there--which IMHO cuts both ways.  And of course some of these
are simply real-life examples that I need to try and deal with elegantly if
behaviour is to stay nontransactional.  Just don't tell me that making
PREPARE respect rollbacks would break compatibility, or that it's possible
to write code that doesn't play well with transaction semantics, or that
it's inconvenient, because there are counterexamples for each.  And in case
of doubt, why not go with some form of transactional behaviour?

I should add here that session variables used to escape transaction
bracketing as well--but that was fixed some time ago.  Why are session
variables so different from prepared statements?  At the very least, it
would be nice for middleware to deal with one form of session state, not
"the kind you manipulate with regular statements" and "the kind that
ignores transaction bracketing except that it happens to be atomic also, and
that manipulations are still rejected inside transactions that are already
in failure mode."


Jeroen



pgsql-hackers by date:

Previous
From: Thomas Swan
Date:
Subject: Nested Transactions, Abort All
Next
From: Robert Treat
Date:
Subject: Re: possibly updating techdocs; mysql2pgsql on gborg