Thread: Transaction isolation levels

Transaction isolation levels

From

Geert Jansen

Date:

09 July 2005, 08:18:41

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I'm having some trouble with transaction isolation levels, and would
appreciate some input on this.

According to the documentation of PostgreSQL 8.0.3, section 12.2:

"Read Committed is the default isolation level in PostgreSQL. When a
transaction runs on this isolation level, a SELECT query sees only data
committed before the query began; it never sees either uncommitted data
or changes committed during query execution by concurrent transactions."

Therefore, in this isolation level, I should not see data committed by
another concurrent transaction.

Now I open two terminals with a 'pgsql' command, and do the following:

terminal 1:
    geertj=> \set AUTOCOMMIT off
    geertj=> set transaction isolation level read committed;
    SET
    geertj=> create table products (id integer not null, code char(20)
not null);
    CREATE TABLE
    geertj=> commit; begin;
    COMMIT
    BEGIN

terminal 2:
    geertj=> \set AUTOCOMMIT off
    geertj=> set transaction isolation level read committed;
    SET
    geertj=> select * from products;
     id | code
    ----+------
    (0 rows)

terminal 1:
    geertj=> insert into products values (10, 'pr10');
    INSERT 34071 1
    geertj=> commit;
    COMMIT

terminal 2:
    geertj=> select * from products;
     id |         code
    ----+----------------------
     10 | pr10


As you see, the row inserted in the transaction from terminal 1 appears
into terminal 2, but terminal 2 had a transaction open that did not commit.

The top of section 12.2 of the manual tells that nonrepeatable reads may
happen in the 'read committed' isolation level. I can understand the
above behaviour in terms of this.  However, the start of section 12.2.1
tells that in read committed mode, you get a snapshot of the database
when the transaction starts, and that snapshot will not inlude committed
changes from other transactions. Who is right here?

Regards,
Geert




-----BEGIN PGP SIGNATURE-----
Version: PGP Desktop 9.0.1 (Build 2185)

iQEVAwUBQs+ycZrPkVOV4lDgAQK0NggAonVk+Qmwui4a49UJr10P7mxcmrFblw+x
6HM6yXVCBk4qRczRT8BelQp3fBQR/8kostAbmnQ27Pes0wFPOsUEjiyWyskKdCtU
Mt5OXQsV7jAPAwPgFDwnYbd0geDVK76WwTJAKuXBunL5/Mz92nv6XHxWvFjLNNEs
laPpirH1xGFy9Po0kpYLx7Orgg6I/m/BM4V4BItZqVcFaiFLnCTt+Lolimk6j65Z
NzHIkTLq2C+ju1EqvbF6M1euAR0ni6D4DLIUd11XJnw0A8mx/+7ZV5ZGcHv+X6YT
dsfutW17jZkzlB9yUyKa91wII0leo/fFlRCnoSYqG3ONGJLoiNJdXw==
=/78D
-----END PGP SIGNATURE-----

Re: Transaction isolation levels

From

Michael Fuhr

Date:

09 July 2005, 10:03:02

On Sat, Jul 09, 2005 at 01:18:09PM +0200, Geert Jansen wrote:
>
> The top of section 12.2 of the manual tells that nonrepeatable reads may
> happen in the 'read committed' isolation level. I can understand the
> above behaviour in terms of this.  However, the start of section 12.2.1
> tells that in read committed mode, you get a snapshot of the database
> when the transaction starts, and that snapshot will not inlude committed
> changes from other transactions. Who is right here?

What words in section 12.2.1 are you referring to?  I see the
following (emphasis mine):

  In effect, a SELECT query sees a snapshot of the database as of
  the instant that that *query* begins to run.  Notice that two
  successive SELECT commands can see different data, even though
  they are within a single transaction, if other transactions commit
  changes during execution of the first SELECT.

And later (emphasis mine):

  Since in Read Committed mode each new *command* starts with a new
  snapshot that includes all transactions committed up to that
  instant, subsequent commands in the same transaction will see the
  effects of the committed concurrent transaction in any case.

Perhaps you're reading 12.2.2 on Serializable (again, emphasis mine):

  When a transaction is on the serializable level, a SELECT query
  sees only data committed before the *transaction* began; it never
  sees either uncommitted data or changes committed during transaction
  execution by concurrent transactions....This is different from
  Read Committed in that the SELECT sees a snapshot as of the start
  of the *transaction*, not as of the start of the current query
  within the transaction.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Transaction isolation levels

From

Geert Jansen

Date:

09 July 2005, 11:45:03

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Fuhr wrote:

>What words in section 12.2.1 are you referring to?  I see the
>following (emphasis mine):
>
>  In effect, a SELECT query sees a snapshot of the database as of
>  the instant that that *query* begins to run.  Notice that two
>  successive SELECT commands can see different data, even though
>  they are within a single transaction, if other transactions commit
>  changes during execution of the first SELECT.
>
>
Apologies for the confusion. Indeed the documentation is right. I
confused "start of query" with "start of transaction".

For my curiosity, what is the reason that in read comitted mode MVCC can
guarantee a consistent snapshot durign a query, while in serializable
mode MVCC can guarantee it for a longer interval (the entire
transaction). Are these different MVCC implementations, or is some kind
of locking performed when executing queries in read committed mode?

Thanks,
Geert

-----BEGIN PGP SIGNATURE-----
Version: PGP Desktop 9.0.1 (Build 2185)

iQEVAwUBQs/izprPkVOV4lDgAQLpFggAqbxaFnMfajg94aaUcWF/IK0ZlEuEVVGl
2hmb295ukCGOV7IjhMFocCQMWxLHcAuwP4vBJM/Dkcy9HW7PGCM6XWJyWbQMYHcn
iWZAw/p2bllxByEe6xli+J2VdeZAohHPixRibLa+uqCE594o2NPK8g06HmPz28rb
noZefXv0+4rGI4Nlw03bLbt80WV2+bYkLFfDveEHt8tx93KS8p+cfv+KL7sFwtzK
1ZbSm1TEdCmPfLAo8UI5YZKJtHA58XoW89WTgACATz04TNHMzni+U5ToMqX4tKOI
6AniwTocjko9Nt+ul8CFquOOoyk7WlOVrtUXpUAg4QEsBMA503Z3uA==
=JBQU
-----END PGP SIGNATURE-----

Re: Transaction isolation levels

From

Tom Lane

Date:

09 July 2005, 12:31:07

Geert Jansen <geert@boskant.nl> writes:
> For my curiosity, what is the reason that in read comitted mode MVCC can
> guarantee a consistent snapshot durign a query, while in serializable
> mode MVCC can guarantee it for a longer interval (the entire
> transaction). Are these different MVCC implementations, or is some kind
> of locking performed when executing queries in read committed mode?

It's the same implementation in either case; it's just a matter of which
"snapshot" we refer to when deciding whether recently-committed row
versions are visible to our own query.  The snapshot is basically a
list of open transactions, which are to be considered not-committed
even if they in fact commit while our query or transaction continues.

            regards, tom lane

checkpoint segments

From

Havasvölgyi Ottó

Date:

09 July 2005, 19:37:40

Hi,

I have a small database on my Windows XP, I rarely use it. Even so, the log
file show says this:

[2005-07-02 02:02:09] LOG:  received fast shutdown request

 [2005-07-02 02:02:09] LOG:  checkpoints are occurring too frequently (0
seconds apart)

 [2005-07-02 02:02:09] HINT:  Consider increasing the configuration
parameter "checkpoint_segments".

 [2005-07-02 02:02:09] LOG:  shutting down

 [2005-07-02 02:02:10] LOG:  database system is shut down

 [2005-07-02 02:02:13] LOG:  logger shutting down


My checkpoint_segments setting is default, 3, and there are 2 wal log files.
Why does it say, that "too frequenty"?
In fact what are checkpoints? These files? I couldn't understand it exactly
from the docs.

Thanks,
Otto

Re: checkpoint segments

From

Havasvölgyi Ottó

Date:

09 July 2005, 19:56:37

Hi,

Sorry, this is Pg 8.0.3

Regards,
Otto

Re: checkpoint segments

From

Bruce Momjian

Date:

09 July 2005, 20:05:32

There was a problem with this setting that it reports during shutdown
when in fact it should ignore shutdown checkpoints.  This is fixed in 8.1.

---------------------------------------------------------------------------

Havasv�lgyi Ott� wrote:
> Hi,
>
> I have a small database on my Windows XP, I rarely use it. Even so, the log
> file show says this:
>
> [2005-07-02 02:02:09] LOG:  received fast shutdown request
>
>  [2005-07-02 02:02:09] LOG:  checkpoints are occurring too frequently (0
> seconds apart)
>
>  [2005-07-02 02:02:09] HINT:  Consider increasing the configuration
> parameter "checkpoint_segments".
>
>  [2005-07-02 02:02:09] LOG:  shutting down
>
>  [2005-07-02 02:02:10] LOG:  database system is shut down
>
>  [2005-07-02 02:02:13] LOG:  logger shutting down
>
>
> My checkpoint_segments setting is default, 3, and there are 2 wal log files.
> Why does it say, that "too frequenty"?
> In fact what are checkpoints? These files? I couldn't understand it exactly
> from the docs.
>
> Thanks,
> Otto
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Update more than one table

From

David Pratt

Date:

09 July 2005, 22:55:38

Hi. I have a form that collects information from the user but then I
need to update three separate tables from what the user has submitted.
I could do this with application logic but I would feel it would be
best handled in Postgres as a transaction.

I need to do things in this order to satisfy the foreign key
constraints:

1.   Insert part of the data into 2 records of the first table (I need
to return theses ids so available for the next insert).

2.   Insert part of the data into a record in a second table.  The id's
created in 1. need to be part of this record (cannot be null values)
and have also have referential integrity with the first table

3.   Insert the last part of the data into a record in a third table.
The id created in 2 needs to be part of this record). This has
referential integrity with the second table.

Can someone suggest the best way of handling this.  Triggers are out
since each update requires different fields.  I am thinking the only
way to do this is a function. So biggest question is how to return the
ids created from the first update (so they can be used by the second)
and then need the id generated from second update (so it can be used
the third). The ids for each table are serial type so they each have a
sequence associated with them.  Would you nest functions?  Create each
separately and wrap them within one function? Even with this, I am
thinking the most important part is how do I return the id from the
record I just inserted in a table.

Many thanks
David

Re: Update more than one table

From

Roman Neuhauser

Date:

10 July 2005, 14:32:26

# fairwinds@eastlink.ca / 2005-07-09 22:55:26 -0300:
> Hi. I have a form that collects information from the user but then I
> need to update three separate tables from what the user has submitted.
> I could do this with application logic but I would feel it would be
> best handled in Postgres as a transaction.

    Those two don't conflict.

> I need to do things in this order to satisfy the foreign key
> constraints:
>
> 1.   Insert part of the data into 2 records of the first table (I need
> to return theses ids so available for the next insert).
>
> 2.   Insert part of the data into a record in a second table.  The id's
> created in 1. need to be part of this record (cannot be null values)
> and have also have referential integrity with the first table
>
> 3.   Insert the last part of the data into a record in a third table.
> The id created in 2 needs to be part of this record). This has
> referential integrity with the second table.

    metacode:

    BEGIN;
    INSERT INTO first_table ...;
    SELECT currval(first_table);
    INSERT INTO first_table ...;
    SELECT currval(first_table);
    INSERT INTO second_table ...;
    INSERT INTO third_table (... currval(second_table));
    COMMIT;

    You can do this with any CLI, like libpq, the Perl DBI, PHP/PEAR
    pgsql_* functions or DB...

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: Update more than one table

From

David Pratt

Date:

10 July 2005, 15:05:46

Hi Roman.  Many thanks for your reply.  This is interesting and will I
give this a try and let you know how it works out. With this you are
right, application logic and transaction don't have to be separate
which would be nice for this.  I was thinking the only way to solve was
a function that performed an update and returned the nextval at the
same time so that I could use that value to perform the update on next
table,etc.

Regards,
David

On Sunday, July 10, 2005, at 02:32 PM, Roman Neuhauser wrote:

> # fairwinds@eastlink.ca / 2005-07-09 22:55:26 -0300:
>> Hi. I have a form that collects information from the user but then I
>> need to update three separate tables from what the user has submitted.
>> I could do this with application logic but I would feel it would be
>> best handled in Postgres as a transaction.
>
>     Those two don't conflict.
>
>> I need to do things in this order to satisfy the foreign key
>> constraints:
>>
>> 1.   Insert part of the data into 2 records of the first table (I need
>> to return theses ids so available for the next insert).
>>
>> 2.   Insert part of the data into a record in a second table.  The
>> id's
>> created in 1. need to be part of this record (cannot be null values)
>> and have also have referential integrity with the first table
>>
>> 3.   Insert the last part of the data into a record in a third table.
>> The id created in 2 needs to be part of this record). This has
>> referential integrity with the second table.
>
>     metacode:
>
>     BEGIN;
>     INSERT INTO first_table ...;
>     SELECT currval(first_table);
>     INSERT INTO first_table ...;
>     SELECT currval(first_table);
>     INSERT INTO second_table ...;
>     INSERT INTO third_table (... currval(second_table));
>     COMMIT;
>
>     You can do this with any CLI, like libpq, the Perl DBI, PHP/PEAR
>     pgsql_* functions or DB...
>
> --
> How many Vietnam vets does it take to screw in a light bulb?
> You don't know, man.  You don't KNOW.
> Cause you weren't THERE.             http://bash.org/?255991
>

Re: Update more than one table

From

Bruno Wolff III

Date:

12 July 2005, 12:08:14

On Sun, Jul 10, 2005 at 15:05:30 -0300,
  David Pratt <fairwinds@eastlink.ca> wrote:
> Hi Roman.  Many thanks for your reply.  This is interesting and will I
> give this a try and let you know how it works out. With this you are
> right, application logic and transaction don't have to be separate
> which would be nice for this.  I was thinking the only way to solve was
> a function that performed an update and returned the nextval at the
> same time so that I could use that value to perform the update on next
> table,etc.

Normally you can just use currval. But in your case you insert insert two
records and currval will only return the value of the second record's key.
Assuming the first record's key is one less than the second's is not a good
idea. With the current version you can probably make this work reliably
by grabbing a block of ids for your session and making sure that the two
records get their keys from the same preallocated block.

Another option that I think could work is to make the two foreign key checks
deferrable and insert the record for table 2 before the two records in
table 1. You can use nextval(pg_get_serial_sequence('table1', 'table1key'))
twice in the insert. Then when inserting the two entries into table 1 you
can use currval to get the key value for the record in table 2 and use the
appropiate column for each of the two records. As long as you aren't
depending on the ordering of the key values for the two records in table 1
you should be OK.

Re: Update more than one table

From

Roman Neuhauser

Date:

12 July 2005, 12:35:45

# bruno@wolff.to / 2005-07-12 10:08:37 -0500:
> On Sun, Jul 10, 2005 at 15:05:30 -0300,
>   David Pratt <fairwinds@eastlink.ca> wrote:
> > Hi Roman.  Many thanks for your reply.  This is interesting and will I
> > give this a try and let you know how it works out. With this you are
> > right, application logic and transaction don't have to be separate
> > which would be nice for this.  I was thinking the only way to solve was
> > a function that performed an update and returned the nextval at the
> > same time so that I could use that value to perform the update on next
> > table,etc.
>
> Normally you can just use currval. But in your case you insert insert two
> records and currval will only return the value of the second record's key.
> Assuming the first record's key is one less than the second's is not a good
> idea. With the current version you can probably make this work reliably
> by grabbing a block of ids for your session and making sure that the two
> records get their keys from the same preallocated block.

    Notice the pseudo code I posted:

    INSERT INTO first_table ...;  <- insert one row
    SELECT currval(first_table);  <- first currval()
    INSERT INTO first_table ...;  <- insert another row
    SELECT currval(first_table);  <- second currval()
    INSERT INTO second_table ...; <- this ellipsis hides the two
                                     currval() return values

    See? I didn't assume anything.


--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: Update more than one table

From

Bruno Wolff III

Date:

12 July 2005, 14:11:24

On Tue, Jul 12, 2005 at 17:35:35 +0200,
  Roman Neuhauser <neuhauser@sigpipe.cz> wrote:
> # bruno@wolff.to / 2005-07-12 10:08:37 -0500:
> > On Sun, Jul 10, 2005 at 15:05:30 -0300,
> >   David Pratt <fairwinds@eastlink.ca> wrote:
> > > Hi Roman.  Many thanks for your reply.  This is interesting and will I
> > > give this a try and let you know how it works out. With this you are
> > > right, application logic and transaction don't have to be separate
> > > which would be nice for this.  I was thinking the only way to solve was
> > > a function that performed an update and returned the nextval at the
> > > same time so that I could use that value to perform the update on next
> > > table,etc.
> >
> > Normally you can just use currval. But in your case you insert insert two
> > records and currval will only return the value of the second record's key.
> > Assuming the first record's key is one less than the second's is not a good
> > idea. With the current version you can probably make this work reliably
> > by grabbing a block of ids for your session and making sure that the two
> > records get their keys from the same preallocated block.
>
>     Notice the pseudo code I posted:
>
>     INSERT INTO first_table ...;  <- insert one row
>     SELECT currval(first_table);  <- first currval()
>     INSERT INTO first_table ...;  <- insert another row
>     SELECT currval(first_table);  <- second currval()
>     INSERT INTO second_table ...; <- this ellipsis hides the two
>                                      currval() return values
>
>     See? I didn't assume anything.

I didn't claim that your approach was wrong. The issue is trying to reduce
the number of round trips by eliminating the two selects.

Re: Update more than one table

From

David Pratt

Date:

12 July 2005, 14:24:53

Hi Bruno and Roman. I am attempting to implement your advice. Bruno,
how do I make a foreign key deferable since this sounds like an
interesting approach.

I have got another problem on top of the first. For the first two
inserts I need to insert a multi-dimensional array into one of the
fields of the table, and the order of the arrays within the larger
array is important.  So I am in the process of making a function that
will insert the record into first table, rewrite the array and return
currval.  So problem I have run into is passing multi-dimensional array
as a parameter for a function.

To do a basic test of passing an array into a function I did this:

CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS '

    DECLARE

    test_array ALIAS FOR $1;      -- alias for input array

    BEGIN

        return array_upper(test_array,1)

    END;
' LANGUAGE 'plpgsql';

SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test
two']]) AS output;

but I am getting syntax errors and I tried a variety of ways to quote
the SELECT string and can't seem to get it take the array as an input :(

Given the fact I will now have a function returning the currval for
each insert (once I determine to pass array to function ), will the
approaches suggested still work or should I create another function for
doing the update for table 2 inserting currval each time as variable
for select statement in the function and have function for insert in
table 2 return currval as well?

Regards,
David

On Tuesday, July 12, 2005, at 12:08 PM, Bruno Wolff III wrote:

> On Sun, Jul 10, 2005 at 15:05:30 -0300,
>   David Pratt <fairwinds@eastlink.ca> wrote:
>> Hi Roman.  Many thanks for your reply.  This is interesting and will I
>> give this a try and let you know how it works out. With this you are
>> right, application logic and transaction don't have to be separate
>> which would be nice for this.  I was thinking the only way to solve
>> was
>> a function that performed an update and returned the nextval at the
>> same time so that I could use that value to perform the update on next
>> table,etc.
>
> Normally you can just use currval. But in your case you insert insert
> two
> records and currval will only return the value of the second record's
> key.
> Assuming the first record's key is one less than the second's is not a
> good
> idea. With the current version you can probably make this work reliably
> by grabbing a block of ids for your session and making sure that the
> two
> records get their keys from the same preallocated block.
>
> Another option that I think could work is to make the two foreign key
> checks
> deferrable and insert the record for table 2 before the two records in
> table 1. You can use nextval(pg_get_serial_sequence('table1',
> 'table1key'))
> twice in the insert. Then when inserting the two entries into table 1
> you
> can use currval to get the key value for the record in table 2 and use
> the
> appropiate column for each of the two records. As long as you aren't
> depending on the ordering of the key values for the two records in
> table 1
> you should be OK.
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that
> your
>        message can get through to the mailing list cleanly
>

Re: Update more than one table

From

Roman Neuhauser

Date:

12 July 2005, 16:46:14

# bruno@wolff.to / 2005-07-12 12:11:45 -0500:
> On Tue, Jul 12, 2005 at 17:35:35 +0200,
>   Roman Neuhauser <neuhauser@sigpipe.cz> wrote:
> > # bruno@wolff.to / 2005-07-12 10:08:37 -0500:
> > > On Sun, Jul 10, 2005 at 15:05:30 -0300,
> > >   David Pratt <fairwinds@eastlink.ca> wrote:
> > > > I was thinking the only way to solve was a function that
> > > > performed an update and returned the nextval at the same time so
> > > > that I could use that value to perform the update on next
> > > > table,etc.
> > >
> > > Normally you can just use currval. But in your case you insert insert two
> > > records and currval will only return the value of the second record's key.
> > > Assuming the first record's key is one less than the second's is not a good
> > > idea. With the current version you can probably make this work reliably
> > > by grabbing a block of ids for your session and making sure that the two
> > > records get their keys from the same preallocated block.
> >
> >     Notice the pseudo code I posted:
> >
> >     INSERT INTO first_table ...;  <- insert one row
> >     SELECT currval(first_table);  <- first currval()
> >     INSERT INTO first_table ...;  <- insert another row
> >     SELECT currval(first_table);  <- second currval()
> >     INSERT INTO second_table ...; <- this ellipsis hides the two
> >                                      currval() return values
> >
> >     See? I didn't assume anything.
>
> I didn't claim that your approach was wrong. The issue is trying to reduce
> the number of round trips by eliminating the two selects.

    Sorry, I reread David's email (the part in question quoted above),
    and see what you were talking about.

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991