Thread: PREPARE TRANSACTION and webapps

PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

10 November 2005, 16:28:32

Hi,

Is it OK to use PREPARE TRANSACTION and COMMIT PREPARED in order to have
transactions that last longer than just a single web request?

Previously it was usually a bad idea to keep database connections alive
just to keep a transaction pending.

Now I'm thinking that we could keep transactions around for as long as the
relevant web _session_ is valid. That will be quite nice for many things.

Would it be easier to support X "prepared transactions" than X database
connections for increasing values of X?

Thanks,

Link.

Re: PREPARE TRANSACTION and webapps

From

Tom Lane

Date:

10 November 2005, 17:11:10

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> Is it OK to use PREPARE TRANSACTION and COMMIT PREPARED in order to have
> transactions that last longer than just a single web request?

> Previously it was usually a bad idea to keep database connections alive
> just to keep a transaction pending.

A prepared transaction eats just about the same resources (other than an
active connection) as a live one.  In particular it still holds its
locks, which makes leaving it around for a long time just as evil as
simply sitting on it in an un-prepared state.

            regards, tom lane

Re: PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

11 November 2005, 05:50:14

At 04:11 PM 11/10/2005 -0500, Tom Lane wrote:

>Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> > Is it OK to use PREPARE TRANSACTION and COMMIT PREPARED in order to have
> > transactions that last longer than just a single web request?
>
> > Previously it was usually a bad idea to keep database connections alive
> > just to keep a transaction pending.
>
>A prepared transaction eats just about the same resources (other than an
>active connection) as a live one.  In particular it still holds its
>locks, which makes leaving it around for a long time just as evil as
>simply sitting on it in an un-prepared state.

Assuming the transactions don't explicitly do any locks (lock table, select
for update - just selects, inserts and normal updates), would it be
possible to have say 10000 pending prepared transactions? What would the
main limiters be?

It will be very nice if that sort of thing is viable. Previously if you
want to do transactional stuff with webapps, you'd have to simulate it at
the application layer (or leave db connections open[1]). Doing transaction
stuff at the application level seems rather MySQL-ish (OK MySQL 3-ish ;) ).
Having to have tables with transactionid columns, transaction table etc.

Leaving transactions pending will affect vacuuming, but perhaps we can just
put the web transaction stuff in a separate database, so it doesn't affect
vacuuming of other normal transactions. If you do such stuff at the
application layer, you will still have to keep those rows around anyway.

Last but not least, is this a silly thing to do? Are people already doing
such stuff on other databases, or they do such things in other ways for
good reasons (which are?)?

Best regards,
Link.

[1] Which I'd consider viable only in a controlled environment- internal
web app for internal users.

Hmm. I wonder if it would be possible to simulate 20K concurrent database
connections, using many db proxies (e.g. pgpool), and prepared transactions
(just prepare all transactions, but only process a manageable number of
transactions at a time).

Re: PREPARE TRANSACTION and webapps

From

Martijn van Oosterhout

Date:

11 November 2005, 09:22:23

On Fri, Nov 11, 2005 at 05:45:28PM +0800, Lincoln Yeoh wrote:
> At 04:11 PM 11/10/2005 -0500, Tom Lane wrote:
> >A prepared transaction eats just about the same resources (other than an
> >active connection) as a live one.  In particular it still holds its
> >locks, which makes leaving it around for a long time just as evil as
> >simply sitting on it in an un-prepared state.
>
> Assuming the transactions don't explicitly do any locks (lock table, select
> for update - just selects, inserts and normal updates), would it be
> possible to have say 10000 pending prepared transactions? What would the
> main limiters be?

Every transaction takes locks, on every table it accesses. Shared lock,
but locks anyway. UPDATEs take stronger locks, so any UPDATE may cause
other queries to wait until you COMMIT or ABORT.

See also:
: The state of each prepared transaction is kept in a so called "2PC
: state file" in the pg_twophase directory. There is one state file for
: each prepared transaction, and the filename is the xid of the
: transaction.
:
: The state file is created and populated when the transaction is
: prepared, and it's used in commit/rollback to finish the transaction
: on behalf of the original backend. It's also used on database
: recovery to recover any in-memory state the transaction must have,
: like locks held.
http://users.tkk.fi/~hlinnaka/pgsql/

> It will be very nice if that sort of thing is viable. Previously if you
> want to do transactional stuff with webapps, you'd have to simulate it at
> the application layer (or leave db connections open[1]). Doing transaction
> stuff at the application level seems rather MySQL-ish (OK MySQL 3-ish ;) ).
> Having to have tables with transactionid columns, transaction table etc.

But once you've prepared a transaction, you can't reopen it, all you
can do is either commit it or abort it. I don't see how prepared
transaction relate to webapps at all.

See also the docs:
http://www.postgresql.org/docs/8.1/static/sql-prepare-transaction.html

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

msg-32091-107410.dat

Re: PREPARE TRANSACTION and webapps

From

Martijn van Oosterhout

Date:

11 November 2005, 09:44:30

On Fri, Nov 11, 2005 at 02:22:05PM +0100, Martijn van Oosterhout wrote:
> Every transaction takes locks, on every table it accesses. Shared lock,
> but locks anyway. UPDATEs take stronger locks, so any UPDATE may cause
> other queries to wait until you COMMIT or ABORT.

Note also, you don't want to use prepare transactions until you know
exactly what you're doing. Take for example (in a single session):

test=# begin;
BEGIN
test=# update test set value=1 where value=1;
UPDATE 1
test=# prepare transaction 'test';
PREPARE TRANSACTION
test=# begin;
BEGIN
test=# update test set value=1 where value=1;

and you've deadlocked yourself. Until you create another connection and
commit or rollback the prepared transaction, this query will never end.
This is at the lowest isolation level. You do not want to keep
transactions open longer than absolutly necessary.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

msg-32091-107412.dat

Re: PREPARE TRANSACTION and webapps

From

Tom Lane

Date:

11 November 2005, 10:13:17

Martijn van Oosterhout <kleptog@svana.org> writes:
> On Fri, Nov 11, 2005 at 05:45:28PM +0800, Lincoln Yeoh wrote:
>> Assuming the transactions don't explicitly do any locks ...

> Every transaction takes locks, on every table it accesses. Shared lock,
> but locks anyway. UPDATEs take stronger locks, so any UPDATE may cause
> other queries to wait until you COMMIT or ABORT.

Also, the mere existence of an old open transaction restricts VACUUM's
ability to reclaim dead rows.

            regards, tom lane

Re: PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

11 November 2005, 23:59:32

At 02:22 PM 11/11/2005 +0100, Martijn van Oosterhout wrote:

>But once you've prepared a transaction, you can't reopen it, all you
>can do is either commit it or abort it. I don't see how prepared
>transaction relate to webapps at all.
>
>See also the docs:
>http://www.postgresql.org/docs/8.1/static/sql-prepare-transaction.html

Oh. I thought one could reopen transactions. Why can't we do that? ;)

Would it be reasonably possible to add a postgresql feature to save
transactions to disk, disconnect from the database, reconnect to the
database, reopen and continue a desired transaction? e.g. CONTINUE
TRANSACTION transaction_id

I think I asked for such a feature in postgresql years ago, but didn't get
a positive reply, so I figured it was not possible, but with the recent
announcement of the "prepare transaction" feature, I'm getting a bit more
hopeful :).

Is there a reason why transactions should be so tightly linked to database
connections?  Being able to decoupling transactions from database
connections could make a lot of tasks easier.

Vacuum not being able to reclaim dead rows isn't a big issue. Unless I'm
mistaken, doing such a thing at the application level will by necessity
result in a similar situation. It's all a necessary cost of supporting that
many concurrent outstanding _transactions_ (in contrast with the cost of
supporting "real" concurrent DB connections).

Deadlocking is an issue of course. But are there any differences in the
locking situation? Wouldn't it be the same as having a normal transaction
that takes a long time to complete? We already have users with transactions
that are open for days at least.

I think it's easier to use NOWAIT than to reimplement MVCC at a
webapplication level :).

If a CONTINUE TRANSACTION feature is possible, it could also allow
postgresql systems to "pretend" to support many more concurrent open
"database connections" ;).

Regards,

Link.

Re: PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

15 November 2005, 11:46:02

Hi,

Can we have a reconnect and "reopen prepared/saved transactions" feature?

Please? :)

I'm sure there'll be uses for it. e.g. the stuff I mentioned.

Maybe we can also use it to help migrate queries to a different node.

At 11:54 AM 11/12/2005 +0800, Lincoln Yeoh wrote:

>At 02:22 PM 11/11/2005 +0100, Martijn van Oosterhout wrote:
>
>>But once you've prepared a transaction, you can't reopen it, all you
>>can do is either commit it or abort it. I don't see how prepared
>>transaction relate to webapps at all.
>>
>>See also the docs:
>>http://www.postgresql.org/docs/8.1/static/sql-prepare-transaction.html
>
>Oh. I thought one could reopen transactions. Why can't we do that? ;)

Re: PREPARE TRANSACTION and webapps

From

Kris Jurka

Date:

15 November 2005, 16:04:06

On Tue, 15 Nov 2005, Lincoln Yeoh wrote:

> Can we have a reconnect and "reopen prepared/saved transactions" feature?
>
> Please? :)
>

Note that this (transaction suspend/resume) is also required for a full
implementation of XA.  Our current 2PC only supports the basics.  There's
a bunch of other complicated features, like transaction interleaving[1]
and multiple threads of control participating in the same backend
transaction[2] that we currently don't support either.  Now some of these
may be worked around and faked on the driver side, but it won't be able to
do these well.  For example you could implement suspend/resume by simply
holding the backend connection open or you could implement interleaved
transactions by opening multiple connections, but both have a serious cost
in the number of open connections.  It would be better to implement this
functionality in the backend, but I'm not sure how important these
situations are in the real world.  Some on the jdbc list have shown ways
to configure transaction managers to avoid using these exotic features.

Also I think that trying to use 2PC without a real transaction manager is
just asking for trouble.  Normal XA usage is two serverside resources held
open the time it takes to service a single request, not wait for user
input.  A random webapp leaving suspended or prepared transactions around
is going to lock things up in a hurry.

Kris Jurka

[1] http://archives.postgresql.org/pgsql-jdbc/2005-06/msg00165.php
[2] http://archives.postgresql.org/pgsql-jdbc/2005-06/msg00171.php

Re: PREPARE TRANSACTION and webapps

From

Tom Lane

Date:

16 November 2005, 00:27:36

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> Can we have a reconnect and "reopen prepared/saved transactions" feature?
> Please? :)

No.  A prepared transaction has already run its end-of-transaction
operations, eg at-commit triggers.  Reopening it would be a fundamental
semantics violation.

That said, it seems to me that the prepared-xacts infrastructure could
possibly support a separate "suspend transaction" and "resume
transaction" facility, if anyone wants to do the legwork to make it
happen.  What this would actually be useful for is a fair question
though --- what's it do that you don't have now?

            regards, tom lane

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

16 November 2005, 01:19:56

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> > Can we have a reconnect and "reopen prepared/saved transactions" feature?
> > Please? :)
>
> What this would actually be useful for is a fair question
> though --- what's it do that you don't have now?

I think what they want to do is make the database concept of transactions
match up 1-1 with their application's concept of transactions. Which may span
multiple stateless http requests.

That usually means reengineering the http server and driver layers to keep
backends around and pull out the right one for every http request. But doing
that involves keeping around entire backends, which isn't terribly efficient
with kernel resources. It also involves adding some connection pool
infrastructure which are always annoying. At every level it imposes lots of
constraints on the application design.

The alternative is to reimplement some limited locking at the application
layer or finessing the situation somehow. That's what's usually recommended
here and it's what I usually do. But if what you really need is full ACID
semantics presented to the user then reimplementing the entire MVCC system
seems unnecessary when it all already exists in the database.

--
greg

Re: PREPARE TRANSACTION and webapps

From

Tom Lane

Date:

16 November 2005, 01:35:13

Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> What this would actually be useful for is a fair question
>> though --- what's it do that you don't have now?

> I think what they want to do is make the database concept of transactions
> match up 1-1 with their application's concept of transactions. Which may span
> multiple stateless http requests.

[ itch... ]  This seems to me to fly right in the face of the
oft-repeated advice that you don't hold a transaction open while the
user thinks about it, goes off to lunch, vacations in the Bahamas, etc.

The question remains: what problem are we solving that actually
should be solved?

            regards, tom lane

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

16 November 2005, 01:58:54

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> What this would actually be useful for is a fair question
> >> though --- what's it do that you don't have now?
>
> > I think what they want to do is make the database concept of transactions
> > match up 1-1 with their application's concept of transactions. Which may span
> > multiple stateless http requests.
>
> [ itch... ]  This seems to me to fly right in the face of the
> oft-repeated advice that you don't hold a transaction open while the
> user thinks about it, goes off to lunch, vacations in the Bahamas, etc.

Sure, I said that was the answer people get when they ask about this. And it's
clearly better approach if it's available. But what if the locking or MVCC
semantcis *are* what you need?

If you really do need to allow one user to edit the information and still
present the existing information to others but not let them update it
concurrently, etc. Reimplementing full ACID semantics is hard and easy to get
wrong. We already have a tool that provides them properly.

--
greg

Re: PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

16 November 2005, 12:37:30

At 11:27 PM 11/15/2005 -0500, Tom Lane wrote:

>That said, it seems to me that the prepared-xacts infrastructure could
>possibly support a separate "suspend transaction" and "resume
>transaction" facility, if anyone wants to do the legwork to make it
>happen.  What this would actually be useful for is a fair question
>though --- what's it do that you don't have now?

MVCC-style transactions that are not limited by/to database connections.

This could be useful if you want to have X pending database transactions
and Y max concurrent database connections, where X is significantly greater
than Y (magnitudes higher?).

My assumption is that pending transactions (e.g. locks and other metainfo)
will take much less memory than database backends.

It'll be nice (but it might be difficult) to have an implementation that
allowed migration of transactions to a different node in a cluster - so
that one could bring down a database node server in the middle of a
transactions without affecting database users/applications severely. A
suitable protocol might allow a database client to automatically save its
transaction, and then resume it on another node, without the database
user/app noticing much (not sure if this is a good idea though).

With respect to concerns about users leaving transactions open for long
periods, this sort of thing already happens with the current implementation.

As such, similar measures can be taken: rollback/commit the offending
transactions. One needs a way of listing information about pending
transactions, and some methods to manage them.

My assumption is managing pending transactions would be easier than
reimplementing MVCC and the other stuff. Especially if only controlled
types of transactions are saved and resumed - one scenario might even put
such transactions in a different database so as not to affect other
transactions. But I could be wrong :).

Sure one could create a tangled mess with thousands of transactions. But I
don't think that's the fault of supplying X amounts of rope instead of Y
amounts of rope, where X >> Y.

Are there RDBMSes out there with this feature already? I'm not sure what
keywords to search for.

I suspect it might be very difficult to do on a database without an MVCC
architecture.

Regards,
Link.

Re: PREPARE TRANSACTION and webapps

From

Tom Lane

Date:

16 November 2005, 12:55:29

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> At 11:27 PM 11/15/2005 -0500, Tom Lane wrote:
>> That said, it seems to me that the prepared-xacts infrastructure could
>> possibly support a separate "suspend transaction" and "resume
>> transaction" facility, if anyone wants to do the legwork to make it
>> happen.  What this would actually be useful for is a fair question
>> though --- what's it do that you don't have now?

> MVCC-style transactions that are not limited by/to database connections.

> This could be useful if you want to have X pending database transactions
> and Y max concurrent database connections, where X is significantly greater
> than Y (magnitudes higher?).

I don't think the prepared-xacts facility has the performance that would
be needed to sustain that kind of usage.   Suspend/resume would not be
all that cheap, and a suspended transaction would still hold a lot of
resources (locks mostly).

            regards, tom lane

Re: PREPARE TRANSACTION and webapps

From

Martijn van Oosterhout

Date:

16 November 2005, 13:05:09

On Thu, Nov 17, 2005 at 12:29:25AM +0800, Lincoln Yeoh wrote:
> MVCC-style transactions that are not limited by/to database connections.
>
> This could be useful if you want to have X pending database transactions
> and Y max concurrent database connections, where X is significantly greater
> than Y (magnitudes higher?).
>
> My assumption is that pending transactions (e.g. locks and other metainfo)
> will take much less memory than database backends.

They make take less memory but they take many more resources. Backend
don't take locks by themselves, transactions do.

What I don't understand about this whole discussion is that the
concurrency control needed on a user level is of a completely different
nature to what a programmer needs when writing programs. Let me give an
example:

User 1: Opens record 1 and begins to edit
User 2: Opens record 1 and begins to edit

Obviously these should both succeed. reading data doesn't block. Ten
minutes later user 1 submits an update and goes to lunch without
committing. User 2 then does an update but he has to wait. How long?
Well, by your definition, forever. I doubt user 2 will be very happy
with that.

The way I would think about it would be to (a) let user 2 know straight
away someone else is already looking at this record. This is useful
info, maybe they talked to the same customer? and (b) when user 2
submits his edit he should be warned there are conflict and be asked to
resolve them. If you abort either transaction you're going to have some
annoyed users on your hands.

Both of these fall outside MVCC. You can already check if the record
was modified since you looked at it, no extra features needed there.
Can you give an example of where MVCC for long running transactions
makes sense?

The example given where you have an ordering system for a limited
number of widgets where the ordering process might take some time to
enter is silly. The discussion about concurrency control is bypassing
the fact that what you really want is a queue. You know, "there are 3
widgets available but 5 people started their orders before you. If they
cancel you get yours". Much better than waiting an hour for everyone
else to finish.

> It'll be nice (but it might be difficult) to have an implementation that
> allowed migration of transactions to a different node in a cluster - so
> that one could bring down a database node server in the middle of a
> transactions without affecting database users/applications severely. A
> suitable protocol might allow a database client to automatically save its
> transaction, and then resume it on another node, without the database
> user/app noticing much (not sure if this is a good idea though).

This is a completely different kettle of fish. I'm not sure what it
would take to serialise a transaction, maybe most of that is done
already.

> My assumption is managing pending transactions would be easier than
> reimplementing MVCC and the other stuff. Especially if only controlled
> types of transactions are saved and resumed - one scenario might even put
> such transactions in a different database so as not to affect other
> transactions. But I could be wrong :).

This is silly. Any transaction that updates a row will block any other
transaction using that row until he commits or aborts. Putting it on
another server doesn't change the fact that the row is locked *for
everybody*.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

msg-32091-107629.dat

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

16 November 2005, 13:05:20

Tom Lane <tgl@sss.pgh.pa.us> writes:

> > This could be useful if you want to have X pending database transactions
> > and Y max concurrent database connections, where X is significantly greater
> > than Y (magnitudes higher?).
>
> I don't think the prepared-xacts facility has the performance that would
> be needed to sustain that kind of usage.   Suspend/resume would not be
> all that cheap, and a suspended transaction would still hold a lot of
> resources (locks mostly).

Well it'll be better than having to maintain a connection for each
transaction.

--
greg

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

16 November 2005, 14:04:39

Martijn van Oosterhout <kleptog@svana.org> writes:

> They make take less memory but they take many more resources. Backend
> don't take locks by themselves, transactions do.

And backends have transactions implicitly. The point here is that if you're
going to suspend transactions by leaving idle backends around that's an added
cost over just suspending the transaction. It's not a trivial cost either,
processes consume memory, they consume kernel resources and cause extra
context switching.

> What I don't understand about this whole discussion is that the
> concurrency control needed on a user level is of a completely different
> nature to what a programmer needs when writing programs. Let me give an
> example:
>
> User 1: Opens record 1 and begins to edit
> User 2: Opens record 1 and begins to edit
>
> Obviously these should both succeed. reading data doesn't block. Ten
> minutes later user 1 submits an update and goes to lunch without
> committing. User 2 then does an update but he has to wait. How long?
> Well, by your definition, forever. I doubt user 2 will be very happy
> with that.

There's nothing stopping you from coding up a daemon that checks for suspended
transactions older than some predetermined policy and rolling them back
automatically. If you invent your own transaction semantics above Postgres's
you'll have to do the same thing anyways.

> The way I would think about it would be to (a) let user 2 know straight
> away someone else is already looking at this record. This is useful
> info, maybe they talked to the same customer? and (b) when user 2
> submits his edit he should be warned there are conflict and be asked to
> resolve them. If you abort either transaction you're going to have some
> annoyed users on your hands.

It's not obvious that these should both succeed anyways. I would have expected
you to do SELECT ... FOR UPDATE and lock the record. This would still allow
other sessions to retrieve the data but not begin editing it. You would
presumably want to use NOWAIT as well and handle the error if it's already
locked.

That would prevent two users from ever getting to the edit screen. You could
give the second user the option of breaking the lock -- rolling back the other
user's transaction.

> Both of these fall outside MVCC. You can already check if the record
> was modified since you looked at it, no extra features needed there.
> Can you give an example of where MVCC for long running transactions
> makes sense?

You're assuming a simple case of a single record. What if the update screen
covers a complete data structure represented by many records in many tables.
And the update itself requires multiple stages on several different screens.
Now you reach a conflict and want to roll back all the changes from all those
screens. That requires a fairly large amount of machinery and all that
machinery already exists in Postgres. If you really need all that complexity
it makes sense to leverage the tool you have that implements it all.

I agree with Tom Lane here and the conventional dogma that you can nearly
always avoid this entire problem. And avoiding the problem nearly always leads
to simpler cleaner systems than trying to present transactional semantics to
the user. Your complaints all boil down to it being a bad idea to have such a
complex interface. But if your business case requires it then you're going to
have to bite the bullet and eat the added complexity and you may as well use
the best tool available to do it.

--
greg

Re: PREPARE TRANSACTION and webapps

From

"Guy Rouillier"

Date:

16 November 2005, 15:51:39

Greg Stark wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> Greg Stark <gsstark@mit.edu> writes:
>>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>>> What this would actually be useful for is a fair question though
>>>> --- what's it do that you don't have now?
>>
>>> I think what they want to do is make the database concept of
>>> transactions match up 1-1 with their application's concept of
>>> transactions. Which may span multiple stateless http requests.
>>
>> [ itch... ]  This seems to me to fly right in the face of the
>> oft-repeated advice that you don't hold a transaction open while the
>> user thinks about it, goes off to lunch, vacations in the Bahamas,
>> etc.
>
> Sure, I said that was the answer people get when they ask about this.
> And it's clearly better approach if it's available. But what if the
> locking or MVCC semantcis *are* what you need?

This problem is well understood and solutions are readily available.  If
you have a small amount of data, you cache it in the web server's
session, then once the user "confirms" the transaction, you write it all
at once to the DB.  If you have a significant amount of information, you
create a set of "shopping cart" tables and populate those as the end
user progresses through the transaction.  Once the user confirms the
transaction, you read it from the shopping cart tables and write it all
at once into the final tables.

Having worked with web-based, transaction-oriented applications for
almost 10 years now, I don't see any justification for holding an actual
database transaction open between HTTP requests.  As Tom correctly
points out, there is no guarantee whatsoever that the end user will ever
complete such a transaction.

>
> If you really do need to allow one user to edit the information and
> still present the existing information to others but not let them
> update it concurrently, etc. Reimplementing full ACID semantics is
> hard and easy to get wrong. We already have a tool that provides them
> properly.

--
Guy Rouillier

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

16 November 2005, 17:09:15

"Guy Rouillier" <guyr@masergy.com> writes:

> Having worked with web-based, transaction-oriented applications for
> almost 10 years now, I don't see any justification for holding an actual
> database transaction open between HTTP requests.  As Tom correctly
> points out, there is no guarantee whatsoever that the end user will ever
> complete such a transaction.

Indeed I've never needed them either. But then I've never worked on a banking
system or an airline ticket reservations system, or anything that would need
anything but the simplest of transactions. So I've always found a way to
finesse the issue and avoid entirely the entire field of having to deal with
expiring sessions and conflict resolution.

But the fact that these problems exist don't militate for either database
transactions or an application level reimplementation of transactions. In
either case you'll have to deal with expiring and rolling back old
transactions and with resolving conflicts.

I take it as a given that if suspended transactions were ever to appear people
would expect a system table that let them list suspended transactions and how
when they were suspended. Otherwise they just wouldn't be very manageable.

--
greg

Re: PREPARE TRANSACTION and webapps

From

"Guy Rouillier"

Date:

16 November 2005, 20:04:59

gsstark@mit.edu wrote:

> I take it as a given that if suspended transactions were ever to
> appear people would expect a system table that let them list
> suspended transactions and how when they were suspended. Otherwise
> they just wouldn't be very manageable.

Regarding web applications, this turns out not to be too hard of a
problem to solve.  Using the two options I identified: if you are able
to keep all your transaction data in the web server session, then this
data just disappears when the session goes away.  No clean up necessary.
If you maintain state in a set of database tables, most implementations
assume that if the transaction ages past some threshold value (e.g., one
hour) without completion, the submitter decided not to complete.  So you
just run a cron job once an hour that sweeps through these tables and
deletes anything older than the threshold.  If you want to allow the
submitter to return at a later date and resume, then you are right, this
gets a little trickier, but not much.  You still do the threshold
checking in case the submitter never returns, but you up the threshold
value to two weeks (or whatever.)  And if the submitter does return, you
force him/her to resume or delete the existing transaction before they
can start a new one.

--
Guy Rouillier

Re: PREPARE TRANSACTION and webapps

From

Lincoln Yeoh

Date:

17 November 2005, 11:16:27

At 06:04 PM 11/16/2005 +0100, Martijn van Oosterhout wrote:

>On Thu, Nov 17, 2005 at 12:29:25AM +0800, Lincoln Yeoh wrote:
> > My assumption is that pending transactions (e.g. locks and other metainfo)
> > will take much less memory than database backends.
>
>They make take less memory but they take many more resources. Backend
>don't take locks by themselves, transactions do.

Just curious: how much memory do locks/transactions occupy as a rough
percentage of backend memory usage? Assume a "typical" active backend
(5MB?). If it's 50% then sure forget it. But if it's 5% or even 1%...

>Obviously these should both succeed. reading data doesn't block. Ten
>minutes later user 1 submits an update and goes to lunch without
>committing. User 2 then does an update but he has to wait. How long?
>Well, by your definition, forever. I doubt user 2 will be very happy
>with that.

I believe in postgresql there's "select for update ...  nowait" or
something like that, and transactions can have savepoints.

Also, if that sort of thing is a problem you could very easily link a user
session to pending uncommitted database transactions. Once the user session
times out you rollback all linked transactions.

I'm sure the solutions are decades old. After all in the dumb terminal
days, couldn't transactions be held open for quite a long time too?

>The way I would think about it would be to (a) let user 2 know straight
>away someone else is already looking at this record. This is useful
>info, maybe they talked to the same customer? and (b) when user 2
>submits his edit he should be warned there are conflict and be asked to
>resolve them. If you abort either transaction you're going to have some
>annoyed users on your hands.

What I used to do was make copies in event of a "collision" - but it starts
to get closer to a "version control and merging" problem, and less of a
transaction problem ;).

If so many people have no problems with doing transactions at the
application/middleware level, no wonder MySQL 3 was good enough for them -
they had little need for MVCC and ACID databases, since they were already
doing all that at a higher layer.

For what it is worth, I've done that sort of stuff at the application level
too. "shopping cart" tables, tables with "transaction_id" columns, a
transaction table, etc etc. I dunno about you all, but having to do that
feels a bit like using MySQL 4 - some tables "support transactions" and
some don't.

Oh well, maybe it's just not such a good idea after all. Just thought it
might be feasible and useful.

Regards,
Link.

Re: PREPARE TRANSACTION and webapps

From

Greg Stark

Date:

17 November 2005, 12:32:18

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:

> At 06:04 PM 11/16/2005 +0100, Martijn van Oosterhout wrote:
>
> >On Thu, Nov 17, 2005 at 12:29:25AM +0800, Lincoln Yeoh wrote:
> > > My assumption is that pending transactions (e.g. locks and other metainfo)
> > > will take much less memory than database backends.
> >
> >They make take less memory but they take many more resources. Backend
> >don't take locks by themselves, transactions do.
>
> Just curious: how much memory do locks/transactions occupy as a rough
> percentage of backend memory usage? Assume a "typical" active backend (5MB?).
> If it's 50% then sure forget it. But if it's 5% or even 1%...

I'm not sure I agree that 100% overhead is reasonable.

The biggest cost though is in context switching. Instead of having 8 processes
100% busy on 8 processors you have 100 or 1,000 processes mostly idle and
frantically context switching between them.

And in the additional complexity of having to make sure the right database
connection gets reassociated with each application request. The most popular
web server architecture can't even do this since they're in separate
processes; it would necessitate involving yet another process and another
context switch for every bit of data going both directions.

--
greg