Re: [HACKERS] Transaction control in procedures - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: [HACKERS] Transaction control in procedures
Date
Msg-id CAHyXU0wHC8NpGm+FQ1pf_gXznNG_WQJkEMBgf5mko-qFA3SQhA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Transaction control in procedures  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: Transaction control in procedures  (legrand legrand <legrand_legrand@hotmail.com>)
Re: [HACKERS] Transaction control in procedures  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Tue, Nov 14, 2017 at 12:09 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/14/17 09:27, Merlin Moncure wrote:
>> *) Will it be possible to do operations like this in pl/pgsql?
>>
>> BEGIN
>>   SELECT INTO r * FROM foo;
>>
>>   START TRANSACTION;  -- perhaps we ought to have a special function
>> for this instead (BEGIN is reserved, etc).
>>   SET transaction_isololation TO serializable;
>> ...
>
> Eventually, I don't see why not.  Currently, it's not complete.
>
> One detail in your example is that when you enter the procedure, you are
> already in a transaction, so you would have to run either COMMIT or
> ROLLBACK before the START TRANSACTION.

Ok, that's good, but it seems a little wonky to me to have to issue
COMMIT first.  Shouldn't that be the default?  Meaning you would not
be *in* a transaction unless you specified to be in one.

> Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would
> have to implement a separate code path for that, but that would just be
> a bit of leg work.

Roger -- I'm more interested in if your design generally supports this
being able to this (either now or in the future...).  I'm hammering on
this point for basically two reasons:

1) Trying to understand if the MVCC snapshot creation can be
meaningfully controlled (I think so, but I'll verify)).
2) This is an important case for databases that want to run in a mode
(typically serializeable) but lower the isolation for specific cases;
for example to loop on a special flag being set in a table.  It's
annoying to only be able to specify this on the client side; I tend to
like to abstract arcane database considerations into the database
whenever possible.

>>  *) Will there be any negative consequences to a procedure running
>> with an unbounded run time?  For example, something like:
>>
>> LOOP
>>   SELECT check_for_stuff_to_do();
>>
>>   IF stuff_to_do
>>   THEN
>>     do_stuff();
>>   ELSE
>>     PERFORM pg_sleep(1);
>>   END IF;
>> END LOOP;
>
> That procedure doesn't do anything with transactions, so it's just like
> a long-running function.  Otherwise, it'd just be like long-running
> client code managing transactions.

Can we zero in on this?  The question implied, 'can you do this
without being in a transaction'?  PERFORM do_stuff() is a implicit
transaction, so it ought to end when the function returns right?
Meaning, assuming I was not already in a transaction when hitting this
block, I would not be subject to an endless transaction duration?

>>> *) Will pg_cancel_backend() cancel the currently executing statement
>> or the procedure? (I guess probably the procedure but I'm curious)
>
> Same as the way it currently works.  It will raise an exception, which
> will travel up the stack and eventually issue an error or be caught.  If
> someone knows more specific concerns here I could look into it, but I
> don't see any problem.

Yeah, that works.

>> I'm somewhat surprised that SPI is the point of attack for this
>> functionality, but if it works that's really the best case scenario
>> (the only downside I can see is that the various out of core pl/s have
>> to implement the interface individually).
>
> So I tried different things here, and I'll list them here to explain how
> I got there.
>
> Option zero is to not use SPI at all and implement a whole new internal
> command execution system.  But that would obviously be a large amount of
> work, and it would look 85% like SPI, and as it turns out it's not
> necessary.
>
> The first thing I tried out what to run transaction control statements
> through SPI.  That turned out to be very complicated and confusing and
> fragile, mainly because of the dichotomy between the internal
> subtransaction management that the PLs do and the explicit transaction
> control statements on the other hand.  It was just a giant unworkable mess.
>
> The next thing I tried was to shut down (SPI_finish) SPI before a
> transaction boundary command and restart it (SPI_connect) it afterwards.
>  That would work in principle, but it would require a fair amount of
> work in each PL, because they implicitly rely on SPI (or perhaps are
> tangled up with SPI) for memory management.
>
> The setup I finally arrived at was to implement the transaction boundary
> commands as SPI API calls and let them internally make sure that only
> the appropriate stuff is cleared away at transaction boundaries.  This
> turned out to be the easiest and cleanest.  I have since the last patch
> implemented the transaction control capabilities in PL/pgSQL, PL/Perl,
> and PL/Tcl, and it was entirely trivial once the details were worked out
> as I had shown in PL/Python.  I will post an updated patch with this soon.

well, you've convinced me.  now that you've got pl/pgsql implemented
I'll fire it up and see if I can make qualitative assessments...

merlin


pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: [HACKERS] [PATCH] Generic type subscripting
Next
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] Proposal: Local indexes for partitioned table