Thread: Re: Transactions and web applications

Re: Transactions and web applications

From

Lincoln Yeoh

Date:

23 June 2000, 02:47:11

Hi people,

What are the recommended ways to do transactions in web applications?

Right now I have persistent database connections, but currently they are
shared by multiple sessions. So I make sure that database level
transactions are closed after each page/http request.

In theory I could tie database connections to sessions, but then if I have
1000 outstanding (not necessarily active) sessions that would mean 1000
database connections, and 1000 postgres backend processes right? Ouch!

Because of this it looks like I may actually have to do application level
transactions instead of being able to rely solely on database level
transactions.

Am I missing something? Are there better/good alternatives?

Would it be possible to "name" transactions and continue them from other
database connections? Sounds impractical tho- what happens if two
connections try to continue a transaction at overlapping times :) (user
opens up multiple browsers from one page).

Cheerio,
Link.

Re: Transactions and web applications

From

Lincoln Yeoh

Date:

27 June 2000, 04:39:17

At 10:01 PM 26-06-2000 -0400, Michael Mayo wrote:
>----- Original Message -----
>From: "Lincoln Yeoh" <lylyeoh@mecomb.com>
>> What are the recommended ways to do transactions in web applications?
>>
>> Right now I have persistent database connections, but currently they are
>> shared by multiple sessions. So I make sure that database level
>> transactions are closed after each page/http request.
>>
>> In theory I could tie database connections to sessions, but then if I have
>> 1000 outstanding (not necessarily active) sessions that would mean 1000
>> database connections, and 1000 postgres backend processes right? Ouch!

>    However, are you sure that you're not overestimating your requirements?
>Assuming the average database connection takes 2 seconds to complete, it
>would require a load of 500 hits/sec to reach that 1000 figure you
>state...only the largest of sites get that kind of traffic.  If your site
>does get that popular, it is likely you will have the $$ to invest in enough
>hardware to support it.

Let me illustrate the problem with an example:

page 1
user logs in
page 2
user does A
page 3
user does B
page 4
user does C
page 5
user says, oops forget the whole thing, please rollback. Or user just says
forget it and closes browser. Or browser crashes at page 4.

What you suggest (database connection takes 2 second) seems to be CGI
style: for each webpage, connect to database, do stuff then disconnect. You
will not be able to take full advantage of database level transactions with
this approach- because with most databases when you disconnect there'll be
a rollback or commit.

Whereas if I try to use database level transactions to handle actual
transactions (reservations, rollbacks etc) for a webapp, if a user visits
the site, does something and then quits halfway, there's no way to know
about it until the timeout.

If the timeout value is 15 minutes, that's 900 seconds of waiting before
rollback/continuing the database transaction.

If I get 2 connections per second and people don't log out I hit that 1000
figure in 500 seconds. For postgres, each connection will involve a
separate backend taking say about 1MB. Bad news if:
database_connection_limit < timeout x peak_connections_per_second.

So this approach is not suitable for serving the whole world, might work
for a controlled number of internal users (tellers, helpdesks etc).

Right now I use application level transactions, with shared persistent
database connections for performance.

What I want to know is if I'm missing something? Is there another way to do
things?

>> Because of this it looks like I may actually have to do application level
>> transactions instead of being able to rely solely on database level
>> transactions.
>
>    I fail to see how application-level tranactions will help you here.
>Even if you do app-level transactions, you will still need a database engine
>to hold and query whatever data you have.  Maybe you are suggesting building
>your own database engine built into your app.  IMHO, this would be a rather

The application still uses the database. What I mean by application level
transactions is that the rollback/commit stuff, transactional isolation
will be handled by the application (at the application layer).

For example if there's a commit, I then update all relevant records to
indicate they're committed.
i.e. update tablex set committed=true where transactionid=14452;

And that's not what I call "help", just something I had to put up with :)..

For MySQL I had to do things differently: no database level transactions so
can't update X tables atomically. Every row had a transactionid but no
committed flag. Had a transaction table keeping track of which transactions
are valid. This allowed me to mark rows in X tables as committed, just by
updating a row in the transaction table. But it meant that every select has
to be joined with the transaction table.

Fortunately I didn't have to deal with uniqueness problems and all that,
that would have been more painful. Also not all tables needed transactions.
And I didn't have to write my own WAL :).

Do those Java object thingies help for these problems?

Cheers,
Link.

Re: Transactions and web applications

From

brianb-pggeneral@edsamail.com

Date:

27 June 2000, 06:44:45

Lincoln Yeoh writes:

> At 10:01 PM 26-06-2000 -0400, Michael Mayo wrote:
> >----- Original Message -----
> >From: "Lincoln Yeoh" <lylyeoh@mecomb.com>
> >> What are the recommended ways to do transactions in web applications?
> Let me illustrate the problem with an example:
>
> page 1
> user logs in
> page 2
> user does A
> page 3
> user does B
> page 4
> user does C
> page 5
> user says, oops forget the whole thing, please rollback. Or user just says
> forget it and closes browser. Or browser crashes at page 4.

Typically, web applications do not need to store the intermediate results
in the database. You can store them as hidden form variables on the
in-between pages or as cookies on the browser. Only when all the data is
ready to be passed to the database do you need to worry about the
transaction, and then it's only done when the webserver handles that
particular page.

This way, you limit your transaction to the time it takes to handle the
final "Submit" page.

If you're doing serious web/db stuff, you may want to look at the numerous
resources available at http://www.arsdigita.com/asj (formerly
http://photo.net/wtr/).

> Do those Java object thingies help for these problems?

:-)

Brian
--
Brian Baquiran <brianb@edsamail.com>
http://www.baquiran.com/ AIM: bbaquiran
Work: (632)7182222       Home: (632)9227123

Re: Transactions and web applications

From

"Michael Mayo"

Date:

28 June 2000, 23:31:54

----- Original Message -----
From: "Lincoln Yeoh" <lylyeoh@mecomb.com>
> What are the recommended ways to do transactions in web applications?
>
> Right now I have persistent database connections, but currently they are
> shared by multiple sessions. So I make sure that database level
> transactions are closed after each page/http request.
>
> In theory I could tie database connections to sessions, but then if I have
> 1000 outstanding (not necessarily active) sessions that would mean 1000
> database connections, and 1000 postgres backend processes right? Ouch!

    It depends on what form of caching you use.  If you are using
AOLserver-style pooling, then only those connections that require the use of
a database will get one from the pool.  Thus, you can have 50 active
AOLserver threads, but only 3 database handles in use if only 3 threads
require it.

    If you are using Apache-style pooling, each pool is kept on a per-child
basis.  So, the minimum number would be 1 for each child, which means if you
have 1000 children, you have 1000 database connections.  Partly for this
reason, really big sites that use Apache normally have separate computers
that serve static, dynamic, and database requests.

    However, are you sure that you're not overestimating your requirements?
Assuming the average database connection takes 2 seconds to complete, it
would require a load of 500 hits/sec to reach that 1000 figure you
state...only the largest of sites get that kind of traffic.  If your site
does get that popular, it is likely you will have the $$ to invest in enough
hardware to support it.

> Because of this it looks like I may actually have to do application level
> transactions instead of being able to rely solely on database level
> transactions.

    I fail to see how application-level tranactions will help you here.
Even if you do app-level transactions, you will still need a database engine
to hold and query whatever data you have.  Maybe you are suggesting building
your own database engine built into your app.  IMHO, this would be a rather
bad idea.  Do you really think you can duplicate most of the functionality
in PostgreSQL, so that it is reasonably stable, and reasonably fast, in a
reasonable amount of time?

    Hardware is generally cheaper than programmer time...personally, I would
need a better reason than "Apache::DBI requires more resources than I would
like" to embark upon a project like this.  But maybe I am misinterpeting
your question.

> Am I missing something? Are there better/good alternatives?
>
> Would it be possible to "name" transactions and continue them from other
> database connections? Sounds impractical tho- what happens if two
> connections try to continue a transaction at overlapping times :) (user
> opens up multiple browsers from one page).

    What would be the purpose anyway?  I don't see how it would help
performance or
the # of needed database connections.

            -Mike