Re: Failed Statements within Transactions - Mailing list pgsql-general

From Tim Kientzle
Subject Re: Failed Statements within Transactions
Date
Msg-id 3A4D44D2.BB34862B@acm.org
Whole thread Raw
Responses Re: Re: Failed Statements within Transactions  (Jan Wieck <janwieck@Yahoo.com>)
List pgsql-general
> ... if a query fails inside a transactions block,
> PostgreSQL "RollBack"s the whole transaction ...

In my experience, it's unusual for SELECT statements
(queries) to cause errors.  If they do, the statements
themselves are likely broken.  (A SELECT that returns
zero rows is not an error.)  Likewise, many
other SQL statements (UPDATE, DELETE) rarely
cause errors; they usually "fail" by modifying
zero rows, which does not affect the transaction
state.

However, I do agree that a failed INSERT should
not automatically rollback the entire transaction.
INSERTs often fail due to integrity constraints.
The common suggestion of doing a test SELECT
first to determine whether to INSERT or UPDATE
is simple nonsense; that just creates a race
condition.  If you lose the race, you have
to replay the entire transaction, which
is a waste of CPU cycles.  (Worse, you're more
likely to lose the race on a more heavily loaded
system, which is exactly when you most want
to avoid replaying complex transactions.)

The INSERT/UPDATE problem is an inherent part
of the SQL language.  There is no particularly
direct way in standard SQL to ensure that a particular
(unique) piece of data is in a table.  The most
consistently reliable way to handle this is to have an
integrity constraint, try the INSERT, and then possibly
do an UPDATE if the INSERT fails.  I'm told this is the
preferred strategy on other DB systems, and it's the only
one that avoids any kind of race condition.  Furthermore,
it is usually much faster than doing a pre-SELECT.  (If the data
isn't already there, the INSERT is clearly fastest, if it is,
the INSERT will typically fail and return an error more
quickly than a SELECT would complete.  Other strategies
are more efficient if the data is usually already
there.)

PostgreSQL's behavior essentially forces you to
do BOTH of the following:
  * SELECT first to see whether to INSERT or UPDATE
  * AND be prepared to replay the _entire_ transaction
    when you lose the race.
You will sometimes lose the race, so the second
step is not optional.  I suspect many complex
PostgreSQL applications have infrequent failures
precisely because they aren't prepared to replay entire
transactions (with the correct INSERTs converted to
UPDATEs).  For a complex transaction which must ensure
certain data exists in several tables, this gets
pretty ugly.

Basically, the PostgreSQL developers have decided
that any integrity violation is a serious error;
therefore, PostgreSQL does not really permit
tentative INSERTs within transactions.  This violates
SQL conventions that are pretty well-established
in some circles, needlessly complicates
applications that use complex transactions
and introduces a fairly minor performance issue.

            - Tim

pgsql-general by date:

Previous
From: Hubert Hafner
Date:
Subject: trouble with phpPgAdmin
Next
From: Chris Green
Date:
Subject: authenticating local users