Thread: start of transaction (was: Re: [PERFORM] Help with count(*))

start of transaction (was: Re: [PERFORM] Help with count(*))

From
Hannu Krosing
Date:
Redirected to -hackers

Neil Conway kirjutas L, 15.11.2003 kell 22:20:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > (I believe the previous discussion also agreed that we wanted to
> > postpone the freezing of now(), which currently also happens at
> > BEGIN rather than the first command after BEGIN.)
> 
> That doesn't make sense to me: from a user's perspective, the "start
> of the transaction" is when the BEGIN is issued, regardless of any
> tricks we may play in the backend.

For me, the "start of transaction" is not about time, but about grouping
a set of statements into one. So making the exact moment of "start" be
the first statement that actually does something with data seems
perfectly reasonable. If you really need to preserve time, do "select
current_timestamp" and use the result.

> Making now() return the time the current transaction started is
> reasonably logical; making now() return "the time when the first
> command after the BEGIN in the current transaction was issued" makes a
> lot less sense to me.

for me "the time the current transactuion is started" == "the time when
the first command after the BEGIN in the current transaction was issued"
and thus I see no conflict here ;)

Delaying the locking effects of transactions as long as possible can
increase performance overall, not just for pathological clients that sit
on idle open transactions.

Probably the latest time we can start the transaction is ath the start
of executor step after the first statement in a transaction is planned
and optimized.

---------------
Hannu




Re: start of transaction (was: Re: [PERFORM] Help with count(*))

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> Probably the latest time we can start the transaction is ath the start
> of executor step after the first statement in a transaction is planned
> and optimized.

The transaction has to exist before it can take locks, so the above
would not fly.

A complete example of what we have to think about is:
BEGIN;SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;LOCK TABLE foo;UPDATE foo ...  -- or in general a
SELECT/UPDATE/INSERT/DELETEquery... etc ...
 

The transaction snapshot *must* be set at the time of the first query
(here, the UPDATE).  It obviously can't be later, and it cannot be
earlier either, because in this sort of example you need the requested
locks to be taken before the snapshot is set.

The transaction must be created (as observed by other backends, in
particular VACUUM) not later than the LOCK statement, else there is
nothing that can own the lock.  In principle though, the effects of
BEGIN and perhaps SET could be strictly local to the current backend,
and only when we hit a LOCK or query do we create the transaction
externally.

In practice the problem we observe is clients that issue BEGIN and then
go to sleep (typically because of poorly-designed autocommit behavior in
interface libraries).  Postponing externally-visible creation of the
transaction to the first command after BEGIN would be enough to get
around the real-world issues, and it would not require code changes
nearly as extensive as trying to let other stuff like SET happen
"before" the transaction starts.

There isn't any compelling implementation reason when to freeze the
value of now().  Reasonable options are1. at BEGIN (current behavior)2. at transaction's external creation 3. at
freezingof transaction snapshot
 
#1 and #2 are actually the same at the moment, but could be decoupled
as sketched above, in which case the behavior of #2 would effectively
become "at first command afte BEGIN".

In the previous thread:
http://archives.postgresql.org/pgsql-hackers/2003-03/msg01178.php
I argued that now() should be frozen at the time of the transaction
snapshot, and I still think that that's a defensible behavior.
        regards, tom lane


Re: start of transaction (was: Re: [PERFORM] Help with

From
Dennis Bjorklund
Date:
On Sun, 16 Nov 2003, Tom Lane wrote:

> There isn't any compelling implementation reason when to freeze the
> value of now().  Reasonable options are
>     1. at BEGIN (current behavior)
>     2. at transaction's external creation 
>     3. at freezing of transaction snapshot
> #1 and #2 are actually the same at the moment, but could be decoupled
> as sketched above, in which case the behavior of #2 would effectively
> become "at first command afte BEGIN".
> 
> I argued that now() should be frozen at the time of the transaction
> snapshot, and I still think that that's a defensible behavior.

Is it important exactly what value is returned as long as it's the same in 
the whole transaction? I think not.

To me it would be just as logical to fix it at the first call to now() in
the transaction. The first time you call it you get the actual time as it
is now and the next time you get the same as before since every operation
in the transaction logically happens at the same time. If you don't call
now() at all, the system time will not be fetched at all.

-- 
/Dennis



Re: start of transaction (was: Re: [PERFORM] Help with count(*))

From
Neil Conway
Date:
Hannu Krosing <hannu@tm.ee> writes:
> For me, the "start of transaction" is not about time, but about grouping
> a set of statements into one. So making the exact moment of "start" be
> the first statement that actually does something with data seems
> perfectly reasonable.

This might be a perfectly logical change in semantics, but what
benefit does it provide over the old way of doing things?

What does BEGIN actually do now, from a user's perspective? At
present, it "starts a transaction block", which is pretty simple. If
we adopted the proposed change, it would "change the state of the
system so that the next command is part of a new transaction". This is
naturally more complex; but more importantly, what benefit does it
ACTUALLY provide to the user?

(I can't see one, but perhaps I'm missing something...)

> Delaying the locking effects of transactions as long as possible can
> increase performance overall, not just for pathological clients that sit
> on idle open transactions.

I agree, but this is irrelevant to the semantics of now().

-Neil



Re: start of transaction (was: Re: [PERFORM] Help with count(*))

From
Greg Stark
Date:
Neil Conway <neilc@samurai.com> writes:

> What does BEGIN actually do now, from a user's perspective? 

I think you're thinking about this all wrong. BEGIN doesn't "do" anything.
It's not a procedural statement, it's a declaration. It declares that the
block of statements form a transaction so reads should be consistent and
failures should be handled in a particular way to preserve data integrity.

Given that declaration and the guarantees it requires of the database it's
then up to the database to figure out what constraints that imposes on what
the database can do and still meet the guarantees the BEGIN declaration
requires. The more clever the database is about minimizing those restrictions
the better as it means the database can run more efficiently.

For what it's worth, this is how Oracle handles things too. On the
command-line issuing a BEGIN following a COMMIT is just noise; you're _always_
in a transaction. A COMMIT ends the previous the transaction and implicitly
starts the next transaction. But the snapshot isn't frozen until you first
read from a table.

I'm not sure what other databases do, but I think this is why clients behave
like this. They think of BEGIN as a declaration and therefore initiating a
COMMIT;BEGIN; at the end of every request is perfectly logical, and works fine
in at least Oracle, and probably other databases.

-- 
greg



Re: start of transaction (was: Re: [PERFORM] Help with

From
Stephan Szabo
Date:
On Sun, 17 Nov 2003, Greg Stark wrote:

> Neil Conway <neilc@samurai.com> writes:
>
> > What does BEGIN actually do now, from a user's perspective?
>
> I think you're thinking about this all wrong. BEGIN doesn't "do" anything.
> It's not a procedural statement, it's a declaration. It declares that the
> block of statements form a transaction so reads should be consistent and
> failures should be handled in a particular way to preserve data integrity.
>
> Given that declaration and the guarantees it requires of the database it's
> then up to the database to figure out what constraints that imposes on what
> the database can do and still meet the guarantees the BEGIN declaration
> requires. The more clever the database is about minimizing those restrictions
> the better as it means the database can run more efficiently.
>
> For what it's worth, this is how Oracle handles things too. On the
> command-line issuing a BEGIN following a COMMIT is just noise; you're _always_
> in a transaction. A COMMIT ends the previous the transaction and implicitly
> starts the next transaction. But the snapshot isn't frozen until you first
> read from a table.

The earlier portion of the described behavior is AFAICS not complient to
SQL99 at least. COMMIT (without AND CHAIN) terminates a transaction and
does not begin a new one. The new transaction does not begin until a
transaction initiating command (for example START TRANSACTION, CREATE
TABLE, INSERT, ...) is executed. The set of things you can do that aren't
initiating is fairly small admittedly, but it's not a null set.