Thread: Re: [Fwd: Re: Python client + select = locked resources???]

Re: [Fwd: Re: Python client + select = locked resources???]

From
Craig Ringer
Date:
On Wed, 2009-07-01 at 09:23 +0200, durumdara wrote:

> > > In this world I was everytime in transaction, because of reads are
> > > also working under transactions.
> > >
> >
> > Just like PostgreSQL. You can't run a query without a transaction in
> > PostgreSQL; if you don't issue an explicit BEGIN, it'll do an implicit
> > BEGIN/COMMIT around the statement.
> >
> Aha... So if I'm getting out from normal transactions I moved into
> implicit autocommit way.

Correct.

> Hmmm... Then that is meaning that every statement is in new
> transaction context which can makes inconsistency in the views...
> For example (pseudo):
> select item_head, count(items)
> select items
>
> Possible: count(items) <> len(fetchall(items)) if someone committed a
> new record into "items" table...
>
> Am I thinking good?

You are. However, this is ALSO true in a transaction by default.
PostgreSQL defaults to the READ COMMITTED isolation level, which means
that statements may see data that was committed by another transaction
after the start of the transaction in which the statement is run, but
before the start of the statement.

If you want to avoid that, you may use the SERIALIZABLE isolation level.
That has its own complications and costs, though, including the need to
be prepared to retry any transaction after a serialization failure.

(Of course, your app should be prepared to retry a transaction ANYWAY
unless you're incredibly sure your code is perfectly free from lock
conflicts etc).

See:

http://www.postgresql.org/docs/8.3/static/transaction-iso.html

Once again, I VERY strongly recommend reading the whole PostgreSQL
manual. It'll teach you a lot about SQL and relational databases in
general as well as PostgreSQL in particular, and is very well written.

> So I need:
>
>         begin;
>         select item_head, count(items)
>         select items
>         rollback;
>
> to get full consistent data-sets?

That won't protect you from:

TRANSACTION 1                     TRANSACTION 2
begin;
                                  begin;
select item_head, count(items)
                                  INSERT INTO items(...)
                                  commit;
select items

"select items" in transaction 1 can see the changes made by transaction
2 after transaction 2 commits in the default READ COMMITTED isolation
level.

See the documentation I linked to above.

> And now I know from your mail that isn't true - it will be easier if I
> shut down the webserver, make the modifications on PGDB and after that
> I restart them all.

You can do that, but YOU SHOULD NOT HAVE TO.

If you have to, your web server or the code running in your web server
is buggy and is holding transactions open longer than it needs them.
Your web server and web app should be able to remain running and retain
connections to the database (so long as they're not idle in a
transaction that's done work). There will be a brief pause while the
ALTER TABLE executes, but that's all.

Make sure there are no connections that're idle in transaction:

   select * from pg_stat_activity

If there are, you may need to alter settings in your web server,
connection pooler, or web app.

> Yes. If I can make a rollback on it, all of resources released.
> Now I search for a way to "force dbutils to it must make a rollback
> before it re-move the connection into it's pool", or a way to I can do
> this easily from the webserver...

Yes, that's what you need to do.

--
Craig Ringer


Re: [Fwd: Re: Python client + select = locked resources???]

From
durumdara
Date:
Hi!

2009.07.01. 9:43 keltezéssel, Craig Ringer írta:
On Wed, 2009-07-01 at 09:23 +0200, durumdara wrote:

 
Firebird have repeatable read, but PG is not have it. Hmmm... Then that is meaning that every statement is in new
transaction context which can makes inconsistency in the views...
For example (pseudo):
select item_head, count(items)
select items

Possible: count(items) <> len(fetchall(items)) if someone committed a
new record into "items" table...

Am I thinking good?   
You are. However, this is ALSO true in a transaction by default. 
Sh*t... (* = [o | u ])... :-(

MySQL doc:

REPEATABLE READ

This is the default isolation level for InnoDB
. For consistent reads, there is an important difference from the READ COMMITTED isolation level: All consistent reads within the same transaction read the snapshot established by the first read. This convention means that if you issue several plain (nonlocking) SELECT statements within the same transaction, these SELECT statements are consistent also with respect to each other. See Section 13.2.8.2,
PostgreSQL defaults to the READ COMMITTED isolation level, which means
that statements may see data that was committed by another transaction
after the start of the transaction in which the statement is run, but
before the start of the statement. 
Ahhhh... this was I want to avoid.

In the Firebird world I simply used Repeatable Read without concurrency or other error.

If some updates had conflicts in nonwaiting mode, the FB sent an error message me to show, we had a problem.
But normally, if someone simply read the tables, or inserted new records to it, the FB handle this case without errors.

I got errormessage only if two transactions want to do something in same record in same time...

This is conflict:
tr1: update a set a.code = a.code
tr2: update a set a.name = "al" where a.code = 1

This is not:
tr1: select count(*) a1 from a
tr2: update a set a.name = "al" where a.code = 1
tr2: insert into a ...
tr1: select count(*) a2 from a (a1 = a2)
tr1: select * from a (a1 = a2 = fetched(records))

If you want to avoid that, you may use the SERIALIZABLE isolation level.
That has its own complications and costs, though, including the need to
be prepared to retry any transaction after a serialization failure.

(Of course, your app should be prepared to retry a transaction ANYWAY
unless you're incredibly sure your code is perfectly free from lock
conflicts etc). 
Sometimes we need consistent data, this [select count(*) from a <> fetchall(select * from a)] not good result.

But I'm not sure in SERIALIZABLE mode because I don't know, if I change the records, or add new records to table a, I can get some errors in any of the clients, or PG handle this without problems - as Firebird do it, or I got many errors in the clients.

Data integrity is very important sometimes  - for count(*) = len(fetched(*)), and for querys, sums, subqueries are let equal.

See:

http://www.postgresql.org/docs/8.3/static/transaction-iso.html

Once again, I VERY strongly recommend reading the whole PostgreSQL
manual. It'll teach you a lot about SQL and relational databases in
general as well as PostgreSQL in particular, and is very well written. 
Ok, I understand it, I read it, but experience is more and more than the read.

For example:
I read the apache/fastcgi documentation, but never I think that it is not working in Windows (as working in Linux)... :-(

Sometimes the people need to release her/his ideas from the idealist world, because hard to realize.

Possible if I change my default transactions to "serial", it is not working; and it is better to working with read committed - and with some little mistakes that client's won't see...

Thanks:
    dd