Re: Autonomous Transaction is back - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Autonomous Transaction is back
Date
Msg-id CAASwCXfY_3xqMCnUCLK2L5xc0s1M4ajAwuW07hjfDGweb3zysQ@mail.gmail.com
In response to Re: Autonomous Transaction is back  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Tue, Jul 28, 2015 at 12:56 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Ah, ok.  The goal of the project is that the writer of X() *cannot*
> prevent Y() from writing its data (B1) and committing it.
>
> One of the primary use cases for ATX is audit triggers.  If a function
> writer could override ATX and prevent the audit triggers from
> committing, then that use case would be violated.
>
> Can you explain what use case you have where simply telling the staff
> "if you use ATX without clearing it, you'll be fired" is not sufficient?
> Possibly there's something we failed to account for in the unconference
> discussion.

I fully understand and agree why you want to prevent X() from stopping Y() from committing, if the use case is e.g. auditing.

I'll try to explain where I'm coming from by providing a bit of background and context.

One of the greatest strengths of writing an entire application using only sql and plpgsql functions
is that you don't have to worry about side effects when calling functions, since you are always in total control and can
roll back all your writes at your stack depth and deeper down the stack, and the caller can likewise be certain
that if the function it called threw an exception, all of its work is rolled back.

However, if you use e.g. plperlu functions, those functions might have written files to disk
or made network connections to the outside world,
i.e. they may have caused side effects that naturally cannot be rolled back by postgres.

Currently, in our codebase at Trustly, we already have quite a few plperlu functions.
In each one of them, we need to think carefully about side effects;
it's usually fine, since most of them are immutable.

But if we had to worry about all plpgsql functions being able to write things without consent from the caller,
then we would have a completely different situation, with increased complexity and risk of failures.

>"if you use ATX without clearing it, you'll be fired"
We trust each other so we don't have that kind of problem.
But still, even when trusting your own and others' code, I would say quite often you make use of a function
for which there are small details you have forgotten about or never knew in the first place,
and if that little detail is something written in an ATX, that could be a problem
if you, the caller, don't want whatever was written to be written.

Don't get me wrong, ATX is something I would absolutely love, since then you could,
for instance, in a function doing password validation, update the FailedLoginAttempts column
in an ATX and then still raise an exception to roll back the operation and return an error to the client.

However, the need for ATXs is, at least for us, a special need you won't have in most functions,
and since the risk and complexity increase with it, I would prefer if it can be enabled/disabled
by default globally and explicitly enabled/disabled per function.

If the global default is "disabled", or if it's "disabled" for a specific function, like for X() in your example,
and if it's enabled for Y(), then when X() tries to call Y() you should get an error even before Y() is executed.

That way we can still do auditing, since X() couldn't execute Y() at all, because Y() was declared as AT,
and that would be the same thing as if X() didn't have the line of code in it that executes Y(),
which is within X()'s power anyway: whether X() calls Y() or not is ultimately X()'s decision.

If we declare entire functions as AT, then we only have to check, before executing the function,
whether AT is allowed in the context, as determined by the global default or by whether the caller function is defined AT or NOT AT.

Use cases:

1. X[AT] -> Y[AT]
OK, since caller X() is declared AT i.e. allows AT in itself and in callees.

2. X[AT] -> Y[NOT AT]
OK, since caller X() is declared AT i.e. allows AT in itself and in callees,
and since Y() is NOT AT, i.e. not making use of AT and not allowing it in callees,
that does not violate anything.

3. X[NOT AT] -> Y[AT]
Invalid, since caller X() is declared NOT AT, i.e. disallows AT in itself and in callees,
and Y() is declared AT, so it cannot be executed.

4. X[NOT AT] -> Y[NOT AT]
OK, since caller X() is declared NOT AT, i.e. disallows AT in itself and in callees,
and since Y() is also declared NOT AT, it can be executed.
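
To make the four cases above concrete, here is a minimal sketch in Python of the check I have in mind. It is only a model of the rule, not a proposal for how the backend would implement it; the names Mode, effective_mode and call_allowed are made up for illustration, and treating an undeclared function as inheriting the global default (with "disabled" as that default) is my assumption.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    AT = "AT"          # function may use autonomous transactions
    NOT_AT = "NOT AT"  # function, and everything it calls, may not


def effective_mode(declared: Optional[Mode], global_default: Mode) -> Mode:
    """An explicit per-function declaration wins; otherwise use the global default.

    Assumption: an undeclared function simply inherits the global default.
    """
    return declared if declared is not None else global_default


def call_allowed(caller: Optional[Mode], callee: Optional[Mode],
                 global_default: Mode = Mode.NOT_AT) -> bool:
    """Decide, before the callee is executed, whether the call may proceed."""
    caller_mode = effective_mode(caller, global_default)
    callee_mode = effective_mode(callee, global_default)
    # Only one combination is rejected: a NOT AT caller invoking an AT callee.
    return not (caller_mode is Mode.NOT_AT and callee_mode is Mode.AT)


if __name__ == "__main__":
    # Walk through the four use cases: X is the caller, Y is the callee.
    for caller in Mode:
        for callee in Mode:
            verdict = "OK" if call_allowed(caller, callee) else "Invalid"
            print(f"X[{caller.value}] -> Y[{callee.value}]: {verdict}")
```

Since the check is applied at every call, a NOT AT function anywhere in the stack blocks AT for everything beneath it, without having to inspect the whole stack.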
