Thread: start of transaction (was: Re: [PERFORM] Help with count(*))
Redirected to -hackers Neil Conway kirjutas L, 15.11.2003 kell 22:20: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > (I believe the previous discussion also agreed that we wanted to > > postpone the freezing of now(), which currently also happens at > > BEGIN rather than the first command after BEGIN.) > > That doesn't make sense to me: from a user's perspective, the "start > of the transaction" is when the BEGIN is issued, regardless of any > tricks we may play in the backend. For me, the "start of transaction" is not about time, but about grouping a set of statements into one. So making the exact moment of "start" be the first statement that actually does something with data seems perfectly reasonable. If you really need to preserve time, do "select current_timestamp" and use the result. > Making now() return the time the current transaction started is > reasonably logical; making now() return "the time when the first > command after the BEGIN in the current transaction was issued" makes a > lot less sense to me. for me "the time the current transactuion is started" == "the time when the first command after the BEGIN in the current transaction was issued" and thus I see no conflict here ;) Delaying the locking effects of transactions as long as possible can increase performance overall, not just for pathological clients that sit on idle open transactions. Probably the latest time we can start the transaction is ath the start of executor step after the first statement in a transaction is planned and optimized. --------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > Probably the latest time we can start the transaction is ath the start > of executor step after the first statement in a transaction is planned > and optimized. The transaction has to exist before it can take locks, so the above would not fly. A complete example of what we have to think about is: BEGIN;SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;LOCK TABLE foo;UPDATE foo ... -- or in general a SELECT/UPDATE/INSERT/DELETEquery... etc ... The transaction snapshot *must* be set at the time of the first query (here, the UPDATE). It obviously can't be later, and it cannot be earlier either, because in this sort of example you need the requested locks to be taken before the snapshot is set. The transaction must be created (as observed by other backends, in particular VACUUM) not later than the LOCK statement, else there is nothing that can own the lock. In principle though, the effects of BEGIN and perhaps SET could be strictly local to the current backend, and only when we hit a LOCK or query do we create the transaction externally. In practice the problem we observe is clients that issue BEGIN and then go to sleep (typically because of poorly-designed autocommit behavior in interface libraries). Postponing externally-visible creation of the transaction to the first command after BEGIN would be enough to get around the real-world issues, and it would not require code changes nearly as extensive as trying to let other stuff like SET happen "before" the transaction starts. There isn't any compelling implementation reason when to freeze the value of now(). Reasonable options are1. at BEGIN (current behavior)2. at transaction's external creation 3. at freezingof transaction snapshot #1 and #2 are actually the same at the moment, but could be decoupled as sketched above, in which case the behavior of #2 would effectively become "at first command afte BEGIN". In the previous thread: http://archives.postgresql.org/pgsql-hackers/2003-03/msg01178.php I argued that now() should be frozen at the time of the transaction snapshot, and I still think that that's a defensible behavior. regards, tom lane
On Sun, 16 Nov 2003, Tom Lane wrote: > There isn't any compelling implementation reason when to freeze the > value of now(). Reasonable options are > 1. at BEGIN (current behavior) > 2. at transaction's external creation > 3. at freezing of transaction snapshot > #1 and #2 are actually the same at the moment, but could be decoupled > as sketched above, in which case the behavior of #2 would effectively > become "at first command afte BEGIN". > > I argued that now() should be frozen at the time of the transaction > snapshot, and I still think that that's a defensible behavior. Is it important exactly what value is returned as long as it's the same in the whole transaction? I think not. To me it would be just as logical to fix it at the first call to now() in the transaction. The first time you call it you get the actual time as it is now and the next time you get the same as before since every operation in the transaction logically happens at the same time. If you don't call now() at all, the system time will not be fetched at all. -- /Dennis
Hannu Krosing <hannu@tm.ee> writes: > For me, the "start of transaction" is not about time, but about grouping > a set of statements into one. So making the exact moment of "start" be > the first statement that actually does something with data seems > perfectly reasonable. This might be a perfectly logical change in semantics, but what benefit does it provide over the old way of doing things? What does BEGIN actually do now, from a user's perspective? At present, it "starts a transaction block", which is pretty simple. If we adopted the proposed change, it would "change the state of the system so that the next command is part of a new transaction". This is naturally more complex; but more importantly, what benefit does it ACTUALLY provide to the user? (I can't see one, but perhaps I'm missing something...) > Delaying the locking effects of transactions as long as possible can > increase performance overall, not just for pathological clients that sit > on idle open transactions. I agree, but this is irrelevant to the semantics of now(). -Neil
Neil Conway <neilc@samurai.com> writes: > What does BEGIN actually do now, from a user's perspective? I think you're thinking about this all wrong. BEGIN doesn't "do" anything. It's not a procedural statement, it's a declaration. It declares that the block of statements form a transaction so reads should be consistent and failures should be handled in a particular way to preserve data integrity. Given that declaration and the guarantees it requires of the database it's then up to the database to figure out what constraints that imposes on what the database can do and still meet the guarantees the BEGIN declaration requires. The more clever the database is about minimizing those restrictions the better as it means the database can run more efficiently. For what it's worth, this is how Oracle handles things too. On the command-line issuing a BEGIN following a COMMIT is just noise; you're _always_ in a transaction. A COMMIT ends the previous the transaction and implicitly starts the next transaction. But the snapshot isn't frozen until you first read from a table. I'm not sure what other databases do, but I think this is why clients behave like this. They think of BEGIN as a declaration and therefore initiating a COMMIT;BEGIN; at the end of every request is perfectly logical, and works fine in at least Oracle, and probably other databases. -- greg
On Sun, 17 Nov 2003, Greg Stark wrote: > Neil Conway <neilc@samurai.com> writes: > > > What does BEGIN actually do now, from a user's perspective? > > I think you're thinking about this all wrong. BEGIN doesn't "do" anything. > It's not a procedural statement, it's a declaration. It declares that the > block of statements form a transaction so reads should be consistent and > failures should be handled in a particular way to preserve data integrity. > > Given that declaration and the guarantees it requires of the database it's > then up to the database to figure out what constraints that imposes on what > the database can do and still meet the guarantees the BEGIN declaration > requires. The more clever the database is about minimizing those restrictions > the better as it means the database can run more efficiently. > > For what it's worth, this is how Oracle handles things too. On the > command-line issuing a BEGIN following a COMMIT is just noise; you're _always_ > in a transaction. A COMMIT ends the previous the transaction and implicitly > starts the next transaction. But the snapshot isn't frozen until you first > read from a table. The earlier portion of the described behavior is AFAICS not complient to SQL99 at least. COMMIT (without AND CHAIN) terminates a transaction and does not begin a new one. The new transaction does not begin until a transaction initiating command (for example START TRANSACTION, CREATE TABLE, INSERT, ...) is executed. The set of things you can do that aren't initiating is fairly small admittedly, but it's not a null set.