Thread: DBI driver and transactions

DBI driver and transactions

From

"Nigel J. Andrews"

Date:

03 February 2003, 00:18:13

I'm getting very odd behaviour and am getting quite frustrated at fighting this
thing. So just so I know for the future does anyone know if there are problems
with using transactions when using cached dbi handles?

I was using my own caching of the handle(s) but I didn't have time to work out
who, what, why and where someone was disconnecting it so switched to the DBI
caching connect. Note, this is NOT persistent/pooled connections.

However, either there is something wrong (and it's been fixed since I loaded
the software) or I completely fail to grasp the concept of transactions as
understood by DBI.

With AutoCommit => 0 I see a BEGIN logged right before the
first query I send...

Oh I give up. Trying to clarify what happens when I'm now seeing rollbacks
issued in AutoCommit => mode where I'm not doing any and it's certainly not me
since I've got a trap on my interface layer to DBI for that and it's not been
tripped at all.

So,

a) anyone know of any problems,

b) can I assume it is me and not just give up on this DBI stuff in the future (
I haven't got the time to rewrite my application now; not that I can't see it
being a particularly onerous task for what I use from DBI )

c) can I just issue BEGIN, COMMIT, ROLLBACK when I want to via normal
$dbh->do(...) without causing myself even more problems?

and
d) I've said it before but what's the point of DBI if it's not to make life
easier?


This on a linux debian potato with:

Pg      7.2.3
Perl    5.8.0
DBI     1.30
DBD::Pg 1.13


--
Nigel J. Andrews

Re: DBI driver and transactions

From

Rudy Lippan

Date:

04 February 2003, 19:27:07

On Mon, 3 Feb 2003, Nigel J. Andrews wrote:

> I'm getting very odd behaviour and am getting quite frustrated at fighting this
> thing. So just so I know for the future does anyone know if there are problems
> with using transactions when using cached dbi handles?
>
You will have connections that are idle in transaction on the DB
server for the duration of the time that you hold a $dbh's handle  that
is not AutoCommit=1  :(


> I was using my own caching of the handle(s) but I didn't have time to work out
> who, what, why and where someone was disconnecting it so switched to the DBI
> caching connect. Note, this is NOT persistent/pooled connections.
>
> However, either there is something wrong (and it's been fixed since I loaded
> the software) or I completely fail to grasp the concept of transactions as
> understood by DBI.
>
> With AutoCommit => 0 I see a BEGIN logged right before the
> first query I send...

That is how DBD::Pg starts a transaction. When you do a AutoCommit=0,
DBD::Pg sends a BEGIN.  And when you do a $dbh->commit()/$dbh->rollback(),
DBD::Pg will do a COMMIT and then a BEGIN (there has been some discussion
on this  commit() behavior on dbi-dev)

>
> Oh I give up. Trying to clarify what happens when I'm now seeing rollbacks
> issued in AutoCommit => mode where I'm not doing any and it's certainly not me
> since I've got a trap on my interface layer to DBI for that and it's not been
> tripped at all.
>
IIRC, It will do a rollback for you on disconnect.

> So,
>
> a) anyone know of any problems,
>
> b) can I assume it is me and not just give up on this DBI stuff in the future (
> I haven't got the time to rewrite my application now; not that I can't see it
> being a particularly onerous task for what I use from DBI )
>
> c) can I just issue BEGIN, COMMIT, ROLLBACK when I want to via normal
> $dbh->do(...) without causing myself even more problems?
>

Yes and no. DBD::Pg will stop you from calling those commands;
however, I think that you can trick DBD::Pg by make it not look like a
transaction command (eg. $dbh->do(q{/* Trick Pg */ BEGIN}) ); although, I
have not tried it, and it is not recommended.

I, for the most part, do:

sub stuff {
    local($dbi->{AutoCommit}) = 0;

    eval {
        #do transaction stuff.
        $dbi->commit();
    }; if (my $e = $@)
        $dbi->rollback();
    }
   # $on exit $dbi->{AutoCommit} will go to 1 and DBD::Pg will call
   # commit() for you
}

DBI now supports a begin_work method, but DBD::Pg does not have
support for it at this time.



-r

Re: DBI driver and transactions

From

"Nigel J. Andrews"

Date:

04 February 2003, 20:28:29

On Tue, 4 Feb 2003, Rudy Lippan wrote:

> On Mon, 3 Feb 2003, Nigel J. Andrews wrote:
>
> > Oh I give up. Trying to clarify what happens when I'm now seeing rollbacks
> > issued in AutoCommit => mode where I'm not doing any and it's certainly not me
> > since I've got a trap on my interface layer to DBI for that and it's not been
> > tripped at all.
> >
> IIRC, It will do a rollback for you on disconnect.

Yep, it was a disconnect that was doing that. What I wasn't expecting was a
disconnect on a cached connection (with others still open) to be rolling
back.

It sort of makes sense to do that on the assumption that each connection is
opened for a particular transaction and others are idle. However, there is only
one connection to the backend and to make that assumption the driver should
also not send any requests for a particular cached connection to the backend
until it sees a commit/rollback. Why? Well because it's silly to make the
assumption that a single cached connection close is terminating a transaction
when there can be another one interleaving queries with it. For example take
the situation:

Object or Library 1:
  dbh1 = DBI->connect_cached

Object or Library 2:
  dbh2 = DBI->connect_cached

main:

 do_selects_in_lib1();
 do_inserts_in_lib2();
 close_lib1_without_commit();
 ...all of a sudden here we find lib2's inserts have been rolled back.

It may be that I've taken an odd design route and that a more usual one is to
create a connection and pass that around instead of letting subsystems create
their own (it's not as though I don't need to pass the information to make the
connection around).

My point, however, is that the connect_cached method is multiplexing request
channels to a single processing unit. It is not normal for one such channel to
invalidate all others, or indeed validate (commit), all others. On the one hand
DBI is saying each channel is a separate entity, otherwise it wouldn't rollback
when only one closes, but on the other hand all channels are one, otherwise it
wouldn't allow query interleaving, i.e. would have locking on the channel to
the server.

> > So,
> >
> > a) anyone know of any problems,
> >
> > b) can I assume it is me and not just give up on this DBI stuff in the future (
> > I haven't got the time to rewrite my application now; not that I can't see it
> > being a particularly onerous task for what I use from DBI )
> >
> > c) can I just issue BEGIN, COMMIT, ROLLBACK when I want to via normal
> > $dbh->do(...) without causing myself even more problems?
> >
>
> Yes and no. DBD::Pg will stop you from calling those commands;
> however, I think that you can trick DBD::Pg by make it not look like a
> transaction command (eg. $dbh->do(q{/* Trick Pg */ BEGIN}) ); although, I
> have not tried it, and it is not recommended.
>
> I, for the most part, do:
>
> sub stuff {
>     local($dbi->{AutoCommit}) = 0;
>
>     eval {
>         #do transaction stuff.
>         $dbi->commit();
>     }; if (my $e = $@)
>         $dbi->rollback();
>     }
>    # $on exit $dbi->{AutoCommit} will go to 1 and DBD::Pg will call
>    # commit() for you
> }
>
> DBI now supports a begin_work method, but DBD::Pg does not have
> support for it at this time.

Interesting. Just wondering though, why the comment on exit calling
commit? Within your eval block you've already done the commit and dbi won't
issue another one until another statement is executed.

--
Nigel J. Andrews

Re: DBI driver and transactions

From

Rudy Lippan

Date:

04 February 2003, 23:38:10

On Tue, 4 Feb 2003, Nigel J. Andrews wrote:

> > IIRC, It will do a rollback for you on disconnect.
>
> Yep, it was a disconnect that was doing that. What I wasn't expecting was a
> disconnect on a cached connection (with others still open) to be rolling
> back.
>
I do not think that connect_cached() is doing what you expect of it. When
you call disconnect() on your db handle it disconnects from the databse.
You were probably thinking of something like Apache::DBI which overloads
disconnect() so that it does not actually  disconnect from the db.


> It sort of makes sense to do that on the assumption that each connection is
> opened for a particular transaction and others are idle. However, there is only
> one connection to the backend and to make that assumption the driver should
> also not send any requests for a particular cached connection to the backend
> until it sees a commit/rollback. Why? Well because it's silly to make the
> assumption that a single cached connection close is terminating a transaction
> when there can be another one interleaving queries with it. For example take
> the situation:
>
I think you are missing the point behind connect_cached()

> Object or Library 1:
>   dbh1 = DBI->connect_cached
>
    # DBI creates a connection to the database, and returns it. remembering
    # that you connected with @stuff;

    $dbh1 = DBI=>connect_cahced(@stuff);

> Object or Library 2:
>   dbh2 = DBI->connect_cached
>
    #DBI looks up sees that you already connected with @suff and returns
    # a reference to handle that it already created (ie. $db1 == $dbh2)
    $dbh2 = DBI->connect_cached(@stuff);


   # now if you connect with @fooo -- you would get a different handle.
>
> main:
>
>  do_selects_in_lib1();
>  do_inserts_in_lib2();
>  close_lib1_without_commit();
>  ...all of a sudden here we find lib2's inserts have been rolled back.
>
because dbh1 == dbh2.

> It may be that I've taken an odd design route and that a more usual one is to
> create a connection and pass that around instead of letting subsystems create
> their own (it's not as though I don't need to pass the information to make the
> connection around).

If I can get away with it, I put $dbh in a global/package global.

>
> My point, however, is that the connect_cached method is multiplexing request
> channels to a single processing unit. It is not normal for one such channel to
> invalidate all others, or indeed validate (commit), all others. On the one hand
> DBI is saying each channel is a separate entity, otherwise it wouldn't rollback
> when only one closes, but on the other hand all channels are one, otherwise it
> wouldn't allow query interleaving, i.e. would have locking on the channel to
> the server.
>
connect_cached() is not multiplexing anything. It just returns a cached
copy of a valid database handle. Everything is going over one connection
to the backend because there is only one object talking to the backend.
If you were to drop the connect_cached() and do two connect calls you
would get the behavior that you expect.  Well, Except (maybe) for the
transaction behavour (depending on how you are using transactions) because
then you will be looking at to different data base connexions with
separate transactions -- And I don't think you want to go there.


>
> > sub stuff {
> >     local($dbi->{AutoCommit}) = 0;
> >
> >     eval {
> >         #do transaction stuff.
> >         $dbi->commit();
> >     }; if (my $e = $@)
> >         $dbi->rollback();
> >     }
> >    # $on exit $dbi->{AutoCommit} will go to 1 and DBD::Pg will call
> >    # commit() for you
> > }
> >
> > DBI now supports a begin_work method, but DBD::Pg does not have
> > support for it at this time.
>
> Interesting. Just wondering though, why the comment on exit calling
> commit? Within your eval block you've already done the commit and dbi won't
> issue another one until another statement is executed.
>

I ment on exit from the sub, or more correctly the exit from the scope in
which local() was called.  The reason for the commit is that
$dbi->{AutoCommit} gets changed from a value of 0 to a value of 1
(assuming that the value of $dbi->{AutoCommit} was 1 when you entered the
sub, of course). When autocommit is turned on DBD::Pg issues a commit.
And remember you are in transaction because the $dbi->commit/rollback()
automatically started another transaction for you.



-r

Re: DBI driver and transactions

From

"Nigel J. Andrews"

Date:

04 February 2003, 23:53:13

On Tue, 4 Feb 2003, Rudy Lippan wrote:

> On Tue, 4 Feb 2003, Nigel J. Andrews wrote:
>
> > Object or Library 1:
> >   dbh1 = DBI->connect_cached
> >
>     # DBI creates a connection to the database, and returns it. remembering
>     # that you connected with @stuff;
>
>     $dbh1 = DBI=>connect_cahced(@stuff);
>
> > Object or Library 2:
> >   dbh2 = DBI->connect_cached
> >
>     #DBI looks up sees that you already connected with @suff and returns
>     # a reference to handle that it already created (ie. $db1 == $dbh2)
>     $dbh2 = DBI->connect_cached(@stuff);
>
>
>    # now if you connect with @fooo -- you would get a different handle.
> >
> > main:
> >
> >  do_selects_in_lib1();
> >  do_inserts_in_lib2();
> >  close_lib1_without_commit();
> >  ...all of a sudden here we find lib2's inserts have been rolled back.
> >
> because dbh1 == dbh2.
>
> ...
>
> >
> > My point, however, is that the connect_cached method is multiplexing request
> > channels to a single processing unit. It is not normal for one such channel to
> > invalidate all others, or indeed validate (commit), all others. On the one hand
> > DBI is saying each channel is a separate entity, otherwise it wouldn't rollback
> > when only one closes, but on the other hand all channels are one, otherwise it
> > wouldn't allow query interleaving, i.e. would have locking on the channel to
> > the server.
> >
> connect_cached() is not multiplexing anything. It just returns a cached
> copy of a valid database handle. Everything is going over one connection
> to the backend because there is only one object talking to the backend.
> If you were to drop the connect_cached() and do two connect calls you
> would get the behavior that you expect.  Well, Except (maybe) for the
> transaction behavour (depending on how you are using transactions) because
> then you will be looking at to different data base connexions with
> separate transactions -- And I don't think you want to go there.

Bingo. Thanks for the explanation. You are indeed correct I was
misunderstanding the cached connection stuff was doing.

I thought it was doing something more like how I changed my usage to use
connect() and a reference count. My connection only gets closed when there the
reference count drops back to zero. To me that's much more sensible a scheme.


--
Nigel J. Andrews