Thread: Read only transactions - Commit or Rollback

Read only transactions - Commit or Rollback

From
Markus Schaber
Date:
Hello,

We have a database containing PostGIS map data; it is accessed mainly
via JDBC. There are multiple simultaneous read-only connections taken
from the JBoss connection pool, and there usually are no active
writers. We use connection.setReadOnly(true).

Now my question is what is best performance-wise, if it does make any
difference at all:

Having autocommit on or off? (I presume "off")

Using commit or rollback?

Committing / rolling back occasionally (e.g. when returning the
connection to the pool) or not at all (until the pool closes the
connection)?

Thanks,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: Read only transactions - Commit or Rollback

From
Nörder-Tuitje, Marcus
Date:
AFAIK, this should be completely negligible.

Starting a transaction implies write access. If there is none, you do not
need to think about transactions, because there are none.

Postgres needs to schedule the writing transactions with the reading ones anyway.

But I am not a performance professional anyway ;-)


regards,
Marcus

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Markus
Schaber
Sent: Tuesday, 20 December 2005 11:41
To: PostgreSQL Performance List
Subject: [PERFORM] Read only transactions - Commit or Rollback



Re: Read only transactions - Commit or Rollback

From
Grega Bremec
Date:
Nörder-Tuitje wrote:
|> We have a database containing PostGIS MAP data, it is accessed
|> mainly via JDBC. There are multiple simultaneous read-only
|> connections taken from the JBoss connection pooling, and there
|> usually are no active writers. We use connection.setReadOnly(true).
|>
|> Now my question is what is best performance-wise, if it does make
|> any difference at all:
|>
|> Having autocommit on or off? (I presume "off")
|>
|> Using commit or rollback?
|>
|> Committing / rolling back occasionally (e.g. when returning the
|> connection to the pool) or not at all (until the pool closes the
|> connection)?
|>
| AFAIK, this should be completely negligible.
|
| Starting a transaction implies write access. If there is none, you do
| not need to think about transactions, because there are none.
|
| Postgres needs to schedule the writing transactions with the reading
| ones anyway.
|
| But I am not a performance professional anyway ;-)

Hello, Marcus, Nörder, list.

What about isolation? For several dependent calculations, MVCC doesn't
help at all with autocommit turned on, right?

Cheers,
--
~    Grega Bremec
~    gregab at p0f dot net
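
A minimal sketch of the isolation point raised here, assuming an idle JDBC
connection (no transaction in progress) and two hypothetical scalar queries.
With autocommit on, each statement runs in its own transaction, so two
dependent reads may see different committed states. Running both in one
SERIALIZABLE transaction lets them share a single snapshot in PostgreSQL.
The class and method names are illustrative only, not part of any library.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class SnapshotReads {
    private SnapshotReads() {}

    /** Runs two scalar queries in one transaction so both see the same snapshot. */
    static long[] readPair(Connection conn, String firstSql, String secondSql)
            throws SQLException {
        // Assumption: no transaction is open yet, so the level may still be changed.
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            long first = scalar(st, firstSql);
            long second = scalar(st, secondSql);
            return new long[] {first, second};
        } finally {
            // Nothing was modified, so ending the transaction with rollback is enough.
            conn.rollback();
        }
    }

    /** Returns the first column of the first row as a long. */
    private static long scalar(Statement st, String sql) throws SQLException {
        try (ResultSet rs = st.executeQuery(sql)) {
            if (!rs.next()) {
                throw new SQLException("query returned no rows: " + sql);
            }
            return rs.getLong(1);
        }
    }
}
```

If the reads are independent of each other, READ COMMITTED with the same
transaction boundaries would do just as well.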

Re: Read only transactions - Commit or Rollback

From
Nörder-Tuitje, Marcus
Date:
Mmmm, good question.

MVCC blocks reading processes when data is modified. Using autocommit implies
that each modification statement is an atomic operation.

On a massive read-only table, where no data is altered, MVCC shouldn't have any
effect (but this is only an assumption), based on

http://en.wikipedia.org/wiki/Mvcc

Using row-level locks with write access should keep most of the table available
to read-only sessions, but this is an assumption only, too.

Maybe the community knows a little more ;-)

regards,
marcus


-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Grega Bremec
Sent: Tuesday, 20 December 2005 12:41
To: PostgreSQL Performance List
Subject: Re: [PERFORM] Read only transactions - Commit or Rollback



Re: Read only transactions - Commit or Rollback

From
Markus Schaber
Date:
Hi, Marcus,

Nörder-Tuitje wrote:
> AFAIK, this should be completely negligible.
>
> Starting a transaction implies write access. If there is none, you do
> not need to think about transactions, because there are none.

Hmm, I always thought that the transaction will be opened at the first
statement, because there _could_ be a parallel writing transaction
started later.

> Postgres needs to schedule the writing transactions with the reading
> ones anyway.

As I said, there usually are no writing transactions on the same database.

Btw, there's another setting that might make a difference:

Having ACID-Level SERIALIZABLE or READ COMMITTED?

Markus

Re: Read only transactions - Commit or Rollback

From
Michael Riess
Date:
Markus Schaber schrieb:
> Hello,
>
> We have a database containing PostGIS MAP data, it is accessed mainly
> via JDBC. There are multiple simultaneous read-only connections taken
> from the JBoss connection pooling, and there usually are no active
> writers. We use connection.setReadOnly(true).
>
> Now my question is what is best performance-wise, if it does make any
> difference at all:
>
> Having autocommit on or off? (I presume "off")


If you are using large ResultSets, it is interesting to know that
Statement.setFetchSize() does not do anything as long as you have
autocommit on. So you might want to always disable autocommit and set a
reasonable fetch size with large results, or otherwise have serious
memory problems in Java/JDBC.
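
A minimal sketch of the advice above, assuming the PostgreSQL JDBC driver and
a forward-only result set: autocommit is switched off before the query so
that the fetch size can take effect and rows arrive in batches instead of all
at once. The class name, method name and the row counting are illustrative
only; real code would process each row inside the loop.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class StreamingRead {
    private StreamingRead() {}

    /** Counts the rows of a query while fetching them in batches of fetchSize. */
    static long countRows(Connection conn, String sql, int fetchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        // Per the message above, the fetch size is ignored while autocommit is on.
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize);
            long rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows++; // process the current row here
                }
            }
            return rows;
        } finally {
            // End the read-only transaction, then restore the caller's setting.
            conn.rollback();
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A suitable fetch size depends on row width and available heap, so it is best
treated as a tuning parameter rather than a fixed value.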

Re: Read only transactions - Commit or Rollback

From
Andreas Seltenreich
Date:
Markus Schaber writes:

> As I said, there usually are no writing transactions on the same database.
>
> Btw, there's another setting that might make a difference:
>
> Having ACID-Level SERIALIZABLE or READ COMMITTED?

Well, if nonrepeatable or phantom reads would pose a problem because
of those occasional writes, you wouldn't be considering autocommit for
performance reasons either, would you?

regards,
Andreas
--

Re: Read only transactions - Commit or Rollback

From
Markus Schaber
Date:
Hello, Andreas,

Andreas Seltenreich wrote:


>>Btw, there's another setting that might make a difference:
>>Having ACID-Level SERIALIZABLE or READ COMMITTED?
>
> Well, if nonrepeatable or phantom reads would pose a problem because
> of those occasional writes, you wouldn't be considering autocommit for
> performance reasons either, would you?

Yes, the question is purely performance-wise. We don't care about any
read/write conflicts in this special case.

Some time ago, I ran some tests with large bulk insertions, and it
turned out that SERIALIZABLE seemed to be 30% faster, which surprised us.

That's why I am asking these questions, mainly because we currently
cannot do much benchmarking.

Thanks,
Markus



Re: Read only transactions - Commit or Rollback

From
Nicolas Barbier
Date:
On 12/20/05, Nörder-Tuitje, Marcus <noerder-tuitje@technology.de> wrote:

> MVCC blocks reading processes when data is modified.

That is incorrect. The main difference between 2PL and MVCC is that
readers are never blocked under MVCC.

greetings,
Nicolas

--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html

Re: Read only transactions - Commit or Rollback

From
Tom Lane
Date:
Markus Schaber <schabi@logix-tt.com> writes:
> Some time ago, I had some tests with large bulk insertions, and it
> turned out that SERIALIZABLE seemed to be 30% faster, which surprised us.

That surprises me too --- can you provide details on the test case so
other people can reproduce it?  AFAIR the only performance difference
between SERIALIZABLE and READ COMMITTED is the frequency with which
transaction status snapshots are taken; your report suggests you were
spending 30% of the time in GetSnapshotData, which is a lot higher than
I've ever seen in a profile.

As to the original question, a transaction that hasn't modified the
database does not bother to write either a commit or abort record to
pg_xlog.  I think you'd be very hard pressed to measure any speed
difference between saying COMMIT and saying ROLLBACK after a read-only
transaction.  It'd be worth your while to let transactions run longer
to minimize their startup/shutdown overhead, but there's a point of
diminishing returns --- you don't want client code leaving transactions
open for hours, because of the negative side-effects of holding locks
that long (eg, VACUUM can't reclaim dead rows).

            regards, tom lane
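
A minimal sketch of how the advice above might look from JDBC, assuming a
pooled connection that is already marked read-only: a batch of reads shares
one transaction to spread the startup/shutdown cost, and the transaction is
always ended before the connection goes back to the pool so that it cannot
stay open for hours. ReadOnlyBatch and Work are hypothetical names. Rollback
is used here, but going by the message above commit should cost the same.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class ReadOnlyBatch {
    private ReadOnlyBatch() {}

    /** A batch of read-only statements to run inside one transaction. */
    @FunctionalInterface
    interface Work<T> {
        T run(Connection conn) throws SQLException;
    }

    /** Runs the batch in a single transaction and always ends that transaction. */
    static <T> T run(Connection conn, Work<T> work) throws SQLException {
        conn.setAutoCommit(false);
        try {
            return work.run(conn);
        } finally {
            // No data was modified, so no commit or abort record has to be written.
            conn.rollback();
        }
    }
}
```

A caller would wrap each batch of queries in `ReadOnlyBatch.run(conn, c -> ...)`
and only then hand the connection back to the pool.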

Re: Read only transactions - Commit or Rollback

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> That surprises me too --- can you provide details on the test case so
> other people can reproduce it?  AFAIR the only performance difference
> between SERIALIZABLE and READ COMMITTED is the frequency with which
> transaction status snapshots are taken; your report suggests you were
> spending 30% of the time in GetSnapshotData, which is a lot higher than
> I've ever seen in a profile.

Perhaps it reduced the amount of i/o concurrent vacuums were doing?

--
greg

Re: Read only transactions - Commit or Rollback

From
Markus Schaber
Date:
Hi, Tom,

Tom Lane wrote:

>>Some time ago, I had some tests with large bulk insertions, and it
>>turned out that SERIALIZABLE seemed to be 30% faster, which surprised us.
>
> That surprises me too --- can you provide details on the test case so
> other people can reproduce it?  AFAIR the only performance difference
> between SERIALIZABLE and READ COMMITTED is the frequency with which
> transaction status snapshots are taken; your report suggests you were
> spending 30% of the time in GetSnapshotData, which is a lot higher than
> I've ever seen in a profile.

It was in my previous job two years ago, so I don't have access to the
exact code, and my memory is foggy. It was PostGIS 0.8 and PostgreSQL 7.4.

AFAIR, it was inserting into a table with about 6 columns and some
indices, some columns having database-provided values (now() and a
SERIAL column), where the other columns (a PostGIS Point, a long, a
foreign key into another table) were set via the application. We tried
different insertion methods (INSERT, prepared statements, a pgjdbc patch
to allow COPY support), different batch sizes and different numbers of
parallel connections to get the highest overall insert speed. However,
the project never went into production the way it was designed initially.

As you write about transaction snapshots: It may be that the PostgreSQL
config was not optimized well enough, and the hard disk was rather slow.

> As to the original question, a transaction that hasn't modified the
> database does not bother to write either a commit or abort record to
> pg_xlog.  I think you'd be very hard pressed to measure any speed
> difference between saying COMMIT and saying ROLLBACK after a read-only
> transaction.  It'd be worth your while to let transactions run longer
> to minimize their startup/shutdown overhead, but there's a point of
> diminishing returns --- you don't want client code leaving transactions
> open for hours, because of the negative side-effects of holding locks
> that long (eg, VACUUM can't reclaim dead rows).

Okay, so I'll stick with my current behaviour (Autocommit off and
ROLLBACK after each bunch of work).

Thanks,
Markus


Re: Read only transactions - Commit or Rollback

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> That surprises me too --- can you provide details on the test case so
>> other people can reproduce it?  AFAIR the only performance difference
>> between SERIALIZABLE and READ COMMITTED is the frequency with which
>> transaction status snapshots are taken; your report suggests you were
>> spending 30% of the time in GetSnapshotData, which is a lot higher than
>> I've ever seen in a profile.

> Perhaps it reduced the amount of i/o concurrent vacuums were doing?

Can't see how it would do that.

            regards, tom lane