Thread: serializable read only deferrable
I'm reviving the discussion on the subject topic because I just had an
epiphany which makes it seem simple to implement.  The concept of this
is that if you start a SERIALIZABLE READ ONLY transaction in an SSI
environment when certain conditions are true, it doesn't need to
acquire predicate locks or test for rw-conflicts.  This would be
particularly useful for pg_dump or large reports, as it would allow
them to read data which was guaranteed to be consistent with later
states of the database without risking serialization failure or
contributing to the failure of other transactions.  They should also
run a bit faster without the overhead of locking and checking.

Having completed the switch from a pair of rw-conflict pointers per
serializable transaction to a list of rw-conflicts, I'm working
through the more aggressive transaction clean-up strategies thereby
allowed in preparation for the graceful degradation code.  Along the
way, I noticed how easy it is to allow a READ ONLY transaction to opt
out of predicate locking and conflict detection when it starts with no
concurrent non-READ ONLY transactions active, or even to remove READ
ONLY transactions from those activities when such a state is reached
during their execution -- while properly recognizing the *additional*
conditions under which this would be valid is rather painful.  (Those
additional conditions being that no concurrent non-read-only
transaction may overlap a committed non-read-only transaction which
wrote data and committed before the read-only transaction acquired its
snapshot.)

The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
SSI would be to have each non-read-only serializable transaction
acquire a heavyweight lock which can coexist with other locks at the
same level (SHARE looks good) on some common object and hold that for
the duration of the transaction, while a SERIALIZABLE READ ONLY
DEFERRABLE transaction would need to acquire a conflicting lock
(EXCLUSIVE looks good) before it could acquire a snapshot, and release
the lock immediately after acquiring the snapshot.

For these purposes, it appears that advisory locks could work, as long
as the lock release does not wait for the end of the transaction
(which it doesn't, if I'm reading the docs right) and as long as I can
pick a lock ID which won't conflict with other uses.  That latter part
is the only iffy aspect of the whole thing that I can see.  Of course,
I could add a third lock method, but that seems like overkill to be
able to get one single lock.

Since I'm already allowing a transaction to opt out of predicate
locking and conflict detection if there are no non-read-only
transactions active when it acquires its snapshot, the work needed
within the SSI code is pretty trivial; it's all in adding the
DEFERRABLE word as a non-standard extension to SET TRANSACTION et al,
and finding a heavyweight lock to use.

Thoughts?

-Kevin
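For concreteness, here is a rough SQL-level sketch of the proposed
lock dance using session advisory locks.  The real implementation
would take the lock internally around snapshot acquisition rather than
through SQL, and the lock key 42 is an arbitrary placeholder for
whatever ID would ultimately be reserved:

    -- Each non-read-only serializable transaction, at startup:
    SELECT pg_advisory_lock_shared(42);
    -- ... and at commit or abort:
    SELECT pg_advisory_unlock_shared(42);

    -- A SERIALIZABLE READ ONLY DEFERRABLE transaction, before taking
    -- its snapshot:
    SELECT pg_advisory_lock(42);    -- blocks while any SHARE holder exists
    -- (the snapshot would be acquired at this point)
    SELECT pg_advisory_unlock(42);  -- released immediately afterward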
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I'm reviving the discussion on the subject topic because I just had > an epiphany which makes it seem simple to implement. The concept of > this is that if you start a SERIALIZABLE READ ONLY transaction in an > SSI environment when certain conditions are true, it doesn't need to > acquire predicate locks or test for rw-conflicts. I assume this would have to be a "hard" definition of READ ONLY, not the rather squishy definition we use now? How would we manage the compatibility implications? regards, tom lane
Tom Lane wrote:
> "Kevin Grittner" writes:
>> I'm reviving the discussion on the subject topic because I just
>> had an epiphany which makes it seem simple to implement.  The
>> concept of this is that if you start a SERIALIZABLE READ ONLY
>> transaction in an SSI environment when certain conditions are
>> true, it doesn't need to acquire predicate locks or test for
>> rw-conflicts.
>
> I assume this would have to be a "hard" definition of READ ONLY,
> not the rather squishy definition we use now?  How would we manage
> the compatibility implications?

If there are issues with whether READ ONLY covers the right ground,
it's likely to affect more than this particular issue -- I've taken
advantage of the READ ONLY property of transactions to allow quite a
few novel optimizations to SSI beyond what is presented in Cahill's
doctoral work.

I'm excluding temporary tables from SSI on the grounds that they are
only read and written by a single transaction and therefore can't be a
source of rw-dependencies, and I'm excluding system tables on the
grounds that they don't follow normal snapshot isolation rules.  Hint
bit rewrites are not an issue for SSI.  Are there any other squishy
aspects I might need to consider?

-Kevin
I wrote:
> Tom Lane wrote:

>> I assume this would have to be a "hard" definition of READ ONLY,
>> not the rather squishy definition we use now?

> I'm excluding temporary tables from SSI on the grounds that they
> are only read and written by a single transaction and therefore
> can't be a source of rw-dependencies, and I'm excluding system
> tables on the grounds that they don't follow normal snapshot
> isolation rules.  Hint bit rewrites are not an issue for SSI.  Are
> there any other squishy aspects I might need to consider?

I reviewed the documentation and played around with this a bit and
can't find any areas where the current PostgreSQL implementation of
READ ONLY is incompatible with what is needed for the SSI
optimizations where it is used.  There are a large number of tests
which exercise this, and they're all passing.

Did you have something in particular in mind which I should check?  An
example of code you think might break would be ideal, but anything
more concrete than the word "squishy" would be welcome.

Any thoughts on the original question about what to use as a
heavyweight lock to support the subject feature?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I reviewed the documentation and played around with this a bit and > can't find any areas where the current PostgreSQL implementation of > READ ONLY is incompatible with what is needed for the SSI > optimizations where it is used. There are a large number of tests > which exercise this, and they're all passing. > Did you have something in particular in mind which I should check? I did not, just thought it was a point that merited examination. regards, tom lane
On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
> SSI would be to have each non-read-only serializable transaction
> acquire a heavyweight lock which can coexist with other locks at the
> same level (SHARE looks good) on some common object and hold that
> for the duration of the transaction, while a SERIALIZABLE READ ONLY
> DEFERRABLE transaction would need to acquire a conflicting lock
> (EXCLUSIVE looks good) before it could acquire a snapshot, and
> release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE transaction is waiting
to acquire the lock, no other transaction would be allowed to start
until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able
to acquire its snapshot.  For pg_dump's purposes at least, that seems
undesirable, since a single long-running transaction at the time you
start pg_dump would effectively DoS your system until the long-running
transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE
READ ONLY DEFERRABLE transaction won't be starved forever by a stream
of overlapping non-READ ONLY transactions.  Then a flag in the proc
array that marks non-READ ONLY transactions should be sufficient, plus
a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY
DEFERRABLE transactions.

best regards,
Florian Pflug
Florian Pflug <fgp@phlo.org> wrote:
> On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
>> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
>> under SSI would be to have each non-read-only serializable
>> transaction acquire a heavyweight lock which can coexist with
>> other locks at the same level (SHARE looks good) on some common
>> object and hold that for the duration of the transaction, while a
>> SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
>> acquire a conflicting lock (EXCLUSIVE looks good) before it could
>> acquire a snapshot, and release the lock immediately after
>> acquiring the snapshot.
>
> Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to
> acquire the lock, no other transaction would be allowed to start
> until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been
> able to acquire its snapshot.  For pg_dump's purposes at least,
> that seems undesirable, since a single long-running transaction at
> the time you start pg_dump would effectively DoS your system until
> the long-running transaction finishes.

Well, when you put it that way, it sounds pretty grim.  :-(  Since one
of the bragging points of SSI is that it doesn't introduce any
blocking beyond current snapshot isolation, I don't want to do
something here which blocks anything except the transaction which has
explicitly requested the DEFERRABLE property.  I guess that, simple as
that technique might be, it just isn't a good idea.

> The alternative seems to be to drop the guarantee that a
> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
> stream of overlapping non-READ ONLY transactions.  Then a flag in
> the proc array that marks non-READ ONLY transactions should be
> sufficient, plus a wait-and-retry loop to take snapshots for
> SERIALIZABLE READ ONLY DEFERRABLE transactions.

If I can find a way to pause an active process, I already have
functions in which I maintain the count of active SERIALIZABLE READ
WRITE transactions as they begin and end -- I could release pending
DEFERRABLE transactions when the count hits zero, without any separate
loop.  That has the added attraction of being a path to the more
complex checking which could allow the deferrable process to start
sooner in some circumstances.  The "simple" solution with the
heavyweight lock would not have been a good path to that.

What would be the correct way for a process to put itself to sleep,
and for another process to later wake it up?

-Kevin
On 06.12.2010 22:53, Kevin Grittner wrote:
> What would be the correct way for a process to put itself to sleep,
> and for another process to later wake it up?

See ProcWaitSignal/ProcSendSignal.  Or the new 'latch' code.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 06.12.2010 22:53, Kevin Grittner wrote:
>> What would be the correct way for a process to put itself to
>> sleep, and for another process to later wake it up?
>
> See ProcWaitSignal/ProcSendSignal.  Or the new 'latch' code.

Is there a reason to prefer one over the other?

-Kevin
On Dec6, 2010, at 22:53 , Kevin Grittner wrote:
>> The alternative seems to be to drop the guarantee that a
>> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
>> stream of overlapping non-READ ONLY transactions.  Then a flag in
>> the proc array that marks non-READ ONLY transactions should be
>> sufficient, plus a wait-and-retry loop to take snapshots for
>> SERIALIZABLE READ ONLY DEFERRABLE transactions.
>
> If I can find a way to pause an active process, I already have
> functions in which I maintain the count of active SERIALIZABLE READ
> WRITE transactions as they begin and end -- I could release pending
> DEFERRABLE transactions when the count hits zero, without any
> separate loop.  That has the added attraction of being a path to
> the more complex checking which could allow the deferrable process
> to start sooner in some circumstances.  The "simple" solution with
> the heavyweight lock would not have been a good path to that.

I'm starting to wonder if you couldn't get a weaker form of the
non-starvation guarantee back by doing the waiting *after* you acquire
the snapshot of a SERIALIZABLE READ ONLY transaction instead of
before.  AFAICS, the main reason for a SERIALIZABLE READ ONLY
transaction's snapshot to be inconsistent is that it sees some
transaction A as committed and B as uncommitted when on the other hand
B must happen before A in any serial schedule.  In other words, if
there is no dangerous structure even if you add an rw-dependency edge
from the SERIALIZABLE READ ONLY transaction to every concurrent
transaction, the SERIALIZABLE READ ONLY transaction's snapshot is
consistent.

I'm thus envisioning something along the line of

1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY
   DEFERRED, and add a rw-dependency to every other running READ
   WRITE transaction
2) Wait for all these concurrent transactions to either COMMIT or
   ABORT
3) Check if the transaction has been marked INCONSISTENT.  If not,
   let the transaction proceed.  If it was, start over with (1)

*) During conflict detection, you'd check if one of the participating
   transactions is flagged as SERIALIZABLE READ ONLY DEFERRED and
   mark it INCONSISTENT if it is.

Essentially, instead of adding dependencies as you go along and
aborting once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
transactions would assume the worst case from the start and thus be
able to bypass the more detailed checks later on.

With this scheme, you'd at least stand some chance of eventually
acquiring a consistent snapshot, even in the case of an endless stream
of overlapping READ WRITE transactions.

I have to admit though that I didn't really think this through
thoroughly yet; it was more of a quick idea I got after pondering this
for a bit before I went to bed yesterday.

best regards,
Florian Pflug
Florian Pflug <fgp@phlo.org> wrote:
> reason for a SERIALIZABLE READ ONLY transaction's snapshot to be
> inconsistent is that it sees some transaction A as committed and B
> as uncommitted when on the other hand B must happen before A in any
> serial schedule.

Precisely right, and very well stated.

> I'm thus envisioning something along the line of
>
> 1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY
>    DEFERRED, and add a rw-dependency to every other running READ
>    WRITE transaction
> 2) Wait for all these concurrent transactions to either COMMIT or
>    ABORT
> 3) Check if the transaction has been marked INCONSISTENT.  If not,
>    let the transaction proceed.  If it was, start over with (1)
>
> *) During conflict detection, you'd check if one of the
>    participating transactions is flagged as SERIALIZABLE READ ONLY
>    DEFERRED and mark it INCONSISTENT if it is.

That is brilliant.

> Essentially, instead of adding dependencies as you go along and
> aborting once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
> transactions would assume the worst case from the start and thus
> be able to bypass the more detailed checks later on.

Right -- such a transaction, having acquired a good snapshot, could
release all SSI resources and run without any of the SSI overhead.

> With this scheme, you'd at least stand some chance of eventually
> acquiring a consistent snapshot, even in the case of an endless
> stream of overlapping READ WRITE transactions.

Yeah, I'd been twisting ideas around trying to find a good way to do
this; you've got it right at the conceptual level, I think.

> I have to admit though that I didn't really think this through
> thoroughly yet; it was more of a quick idea I got after pondering
> this for a bit before I went to bed yesterday.

[reads through it a few more times, sips caffeine, and thinks]

Really, what you care about is whether any of the READ WRITE
transactions active at the time the snapshot was acquired commit after
developing a rw-conflict with a transaction which committed before the
READ ONLY DEFERRABLE snapshot was acquired.  (The reader would have to
appear first in any serial schedule, yet the READ ONLY transaction can
see the effects of the writer but not the reader.)  Which brings up
another point: that reader must also write to a permanent table before
it commits in order to become the pivot in the dangerous structure.

Pseudo-code of idea (conveniently ignoring locking issues and
non-serializable transactions):

  // serializable read only deferrable xact
  do
  {
      get a snapshot
      clear inconsistent flag
      if (no concurrent read write xacts)
          break;  // we got it the easy way
      associate all active read write xacts with this xact
      block until told to wake
  } while (inconsistent);
  clear xact from any SSI structures it's in
  run with the snapshot

  // each xact associated with the above, on transaction completion
  if (commit and has written and has conflict out to xact
      committed before deferrable snapshot)
  {
      flag deferrable as inconsistent
      unblock deferrable xact
  }
  else if this is termination of last associated read write xact
      unblock deferrable xact

Seem sane?

-Kevin
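To make the goal concrete, this is roughly how such a transaction
would be used once the DEFERRABLE extension to SET TRANSACTION
proposed upthread exists.  The exact spelling is an assumption based
on that proposal, and the table name is hypothetical:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
    -- the first statement blocks until a snapshot known to be safe is
    -- available, then runs with no predicate locking or conflict checks
    SELECT count(*) FROM some_big_table;
    COMMIT;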
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I assume this would have to be a "hard" definition of READ ONLY,
> not the rather squishy definition we use now?

Oh, I just went through the code on setting READ ONLY and discovered
that contrary to the standard *and* the PostgreSQL documentation, you
can change the status of a transaction between READ ONLY and READ
WRITE at will.  Yeah, that's a problem for my intended use.  Many
optimizations would need to go right out the window, and the false
positive rate under SSI would be high.

> How would we manage the compatibility implications?

Comply with the standard.  The bright side of this is that it wouldn't
require any change to our user docs.

http://www.postgresql.org/docs/current/interactive/sql-start-transaction.html

| This command begins a new transaction block. If the isolation
| level or read/write mode is specified, the new transaction has
| those characteristics, as if SET TRANSACTION was executed. This is
| the same as the BEGIN command.

and on the same page:

| Compatibility
|
| In the standard, it is not necessary to issue START TRANSACTION to
| start a transaction block: any SQL command implicitly begins a
| block. PostgreSQL's behavior can be seen as implicitly issuing a
| COMMIT after each command that does not follow START TRANSACTION
| (or BEGIN), and it is therefore often called "autocommit". Other
| relational database systems might offer an autocommit feature as a
| convenience.

No mention of "and you can change back and forth between READ ONLY and
READ WRITE any time during the transaction, including between reads
and writes, as many times as you like."  Was there a justification for
this behavior, or was it just not implemented carefully?  Does anyone
currently depend on the current behavior?

test=# create table asdf (id int not null primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"asdf_pkey" for table "asdf"
CREATE TABLE
test=# set default_transaction_isolation = serializable;
SET
test=# set transaction read only;
SET
test=# begin;
BEGIN
test=# set transaction read only;
SET
test=# select 1;
 ?column?
----------
        1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (1);
INSERT 0 1
test=# set transaction read only;
SET
test=# select * from asdf;
 id
----
  1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (2);
INSERT 0 1
test=# commit;
COMMIT

I find that to be a huge POLA violation.  I will happily prepare a
patch to fix this if there is agreement that we want it.  I really
need READ ONLY *transactions*, not READ ONLY *moments* within
transactions, to do any optimization based on the property.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Oh, I just went through the code on setting READ ONLY and discovered > that contrary to the standard *and* the PostgreSQL documentation, > you can change the status of a transaction between READ ONLY and > READ WRITE at will. Yeah, that's a problem for my intended use. > Many optimizations would need to go right out the window, and the > false positive rate under SSI would be high. I believe you had better support the locution begin;set transaction read only;... I agree that letting it be changed back to read/write after that is surprising and unnecessary. Perhaps locking down the setting at the time of first grabbing a snapshot would be appropriate. IIRC that's how it works for transaction isolation level, and this seems like it ought to work the same. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree that letting it be changed back to read/write after that
> is surprising and unnecessary.  Perhaps locking down the setting
> at the time of first grabbing a snapshot would be appropriate.
> IIRC that's how it works for transaction isolation level, and this
> seems like it ought to work the same.

Agreed.  I can create a patch today to implement this.

The thing which jumps out first is that assign_transaction_read_only
probably needs to move to variable.c so that it can reference
FirstSnapshotSet as the transaction isolation code does.  The
alternative would be to include snapmgr.h in guc.c, which seems less
appealing.

Agreed?  Other ideas?

-Kevin
I wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> I agree that letting it be changed back to read/write after that
>> is surprising and unnecessary.  Perhaps locking down the setting
>> at the time of first grabbing a snapshot would be appropriate.
>> IIRC that's how it works for transaction isolation level, and
>> this seems like it ought to work the same.
>
> Agreed.  I can create a patch today to implement this.

Attached.

Accomplished more through mimicry (based on setting transaction
isolation level) than profound understanding of the code involved; but
it passes all regression tests on both `make check` and `make
installcheck-world`.  This includes a new regression test showing that
an attempt to change the setting after a query fails.  I've poked at
it with various ad hoc tests, and it is behaving as expected in those.

I wasn't too confident how to word the new failure messages.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Attached. > Accomplished more through mimicry (based on setting transaction > isolation level) than profound understanding of the code involved; > but it passes all regression tests on both `make check` and `make > installcheck-world`. This includes a new regression test that an > attempt to change it after a query fails. I've poked at it with > various ad hoc tests, and it is behaving as expected in those. Hmm. This patch disallows the case of creating a read-only subtransaction of a read-write parent. That's a step backwards. I'm not sure how we could enforce that the property not change after the first query of a subxact, but maybe we don't care that much? Do your optimizations pay attention to local read-only in a subxact? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm.  This patch disallows the case of creating a read-only
> subtransaction of a read-write parent.  That's a step backwards.
> I'm not sure how we could enforce that the property not change
> after the first query of a subxact, but maybe we don't care that
> much?  Do your optimizations pay attention to local read-only in a
> subxact?

No, it's all about the top level transaction, as long as the
subtransaction doesn't do anything which violates the requirements of
the top level.  (That is, if the top level is not READ ONLY, I can't
do the optimizations, but it would cause no problem if a
subtransaction was READ ONLY -- it just wouldn't allow any special
optimizations.)

I noticed that the standard seems (if I'm reading it correctly) to
allow subtransactions to switch to more restrictive settings for both
transaction isolation and read only status than the enclosing
transaction, but not looser.  I don't think it makes sense in
PostgreSQL to say (for example) that the top level transaction is READ
COMMITTED but the subtransaction is SERIALIZABLE, but it might make
sense to say that the top level transaction is READ WRITE but the
subtransaction is READ ONLY.  And I see where I broke support for that
in the patch.

I can fix up the patch to support it again if you like.  (I think it's
just a matter of replacing a few lines that I replaced in the original
patch.)  If you'd rather do it, I'll stay out of your way.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I noticed that the standard seems (if I'm reading it correctly) to > allow subtransactions to switch to more restrictive settings for > both transaction isolation and read only status than the enclosing > transaction, but not looser. Yeah. My recollection is that we've discussed exactly this point with respect to isolation level, and decided that we couldn't (or it wasn't worthwhile to) allow serializable subxacts inside repeatable read. I don't know whether your patch will change that tradeoff. But I don't think it's really been discussed much with respect to read-only, perhaps because nobody's paid all that much attention to read-only at all. In any case, the behavior you state seems obviously correct, so let's see what we can do about getting closer to that. My guess is that a reasonable fix is to remember the read-only setting as of snapshot lockdown, and thereafter to allow changing from read-write to read-only but not vice versa. One thing to watch for is allowing subxact exit to restore the previous read-write state. (BTW it looks like assign_XactIsoLevel emits a rather useless debug message in that case, so that code could stand some cleanup too. Also, that code is set so that it will throw an error even if you're assigning the currently active setting, which maybe is overly strict? Not sure.) > I can fix up the patch if to support it again if you like. (I think > it's just a matter of replacing a few lines that I replaced in the > original patch.) If you'd rather do it, I'll stay out of your way. Feel free to have a go at it. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

> One thing to watch for is allowing subxact exit to restore the
> previous read-write state.

OK.

> BTW, it looks like assign_XactIsoLevel emits a rather useless debug
> message in that case, so that code could stand some cleanup too.

I'll take a look.

> Also, that code is set so that it will throw an error even if
> you're assigning the currently active setting, which maybe is
> overly strict?  Not sure.

The standard is tricky to read, but my reading of it is that only
"LOCAL" changes are allowed after the transaction is underway (which I
*think* effectively means a subtransaction), and those can't make the
setting less strict -- you're allowed to specify the same level or
more strict.  There would be no harm from the perspective of anything
I'm working on to allow an in-progress transaction to be set to what
it already has, but that seems to invite confusion and error more than
provide a helpful feature, as far as I can tell.  I'm inclined not to
allow it except at the start of a subtransaction, but don't feel
strongly about it.

>> I can fix up the patch to support it again if you like.

> Feel free to have a go at it.

Will do.  I notice that I also removed the check for
RecoveryInProgress(), which I will also restore.  And maybe a
regression test or two for the subtransaction usages....

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Also, that code is set so that it will throw an error even if >> you're assigning the currently active setting, which maybe is >> overly strict? Not sure. > The standard is tricky to read, but my reading of it is that only > "LOCAL" changes are allowed after the transaction is underway (which > I *think* effectively means a subtransaction), and those can't make > the setting less strict -- you're allowed to specify the same level > or more strict. There would be no harm from the perspective of > anything I'm working on to allow an in-progress transaction to be > set to what it already has, but that seems to invite confusion and > error more than provide a helpful feature, as far as I can tell. > I'm inclined not to allow it except at the start of a > subtransaction, but don't feel strongly about it. Hmm ... (1) If the standard says that you're allowed to apply a redundant setting, I think we'd better accept that. (2) I'm not thrilled about the idea of tracking the equivalent of FirstSnapshotSet for each subtransaction, which I think would be necessary infrastructure to error-check this as tightly as you seem to have in mind. I'd prefer to be a bit laxer in order to have less overhead for what is in the end just nanny-ism. So the implementation I had in mind would allow SET TRANSACTION operations to occur later in a subxact, as long as they were redundant and weren't actually trying to change the active value. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If the standard says that you're allowed to apply a redundant
> setting, I think we'd better accept that.

OK

> So the implementation I had in mind would allow SET TRANSACTION
> operations to occur later in a subxact, as long as they were
> redundant and weren't actually trying to change the active value.

It's easy to see how I can allow changes in the subtransaction as long
as they don't specify READ WRITE when the top level is READ ONLY, but
it isn't obvious to me how to only allow it at the start of the
subtransaction.  I'm OK with taking the easy route on this aspect of
things, but if someone needs READ ONLY to "stick" for the duration of
a subtransaction, I'm not sure how to do that.  (And I'm not sure you
were actually suggesting that, either.)

To restate, since I'm not sure how clear that is, what I have at the
moment is:

(1) A top level transaction can only set READ ONLY or READ WRITE until
it has acquired its first snapshot.

(2) A subtransaction can set it at will, as many times as desired, to
match the top level or specify READ ONLY.

(3) During recovery the setting cannot be changed from READ ONLY to
READ WRITE.

I'm not clear whether #2 is OK or how to do it only at the start.  I
haven't yet looked at the other issues you mentioned.

-Kevin
On Dec8, 2010, at 20:39 , Kevin Grittner wrote:
> The standard is tricky to read, but my reading of it is that only
> "LOCAL" changes are allowed after the transaction is underway (which
> I *think* effectively means a subtransaction), and those can't make
> the setting less strict -- you're allowed to specify the same level
> or more strict.  There would be no harm from the perspective of
> anything I'm working on to allow an in-progress transaction to be
> set to what it already has, but that seems to invite confusion and
> error more than provide a helpful feature, as far as I can tell.
> I'm inclined not to allow it except at the start of a
> subtransaction, but don't feel strongly about it.

Hm, I think being able to assert that the isolation level really is
SERIALIZABLE by simply doing "SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE" would be a great feature for SSI.

Say you've written a trigger which enforces some complex constraint,
but is correct only for SERIALIZABLE transactions.  By simply sticking
a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" at the top of the
trigger you'd both document the fact that it is correct only for
SERIALIZABLE transactions *and* prevent corruption should the
isolation level be something else due to a pilot error.  Nice, simple
and quite effective.

BTW, I hope to find some time this evening to review your more
detailed proposal for "serializable read only deferrable".

best regards,
Florian Pflug
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So the implementation I had in mind would allow SET TRANSACTION >> operations to occur later in a subxact, as long as they were >> redundant and weren't actually trying to change the active value. > It's easy to see how I can allow changes in the subtransaction as > long as they don't specify READ WRITE when the top level is READ > ONLY, but it isn't obvious to me how to only allow it at the start > of the subtransaction. I'm OK with taking the easy route on this > aspect of things, but if someone needs READ ONLY to "stick" for the > duration of a subtransaction, I'm not sure how to do that. (And I'm > not sure you were actually suggesting that, either.) What I suggested was to not allow read-only -> read-write state transitions except (1) before first snapshot in the main xact and (2) at subxact exit (the OVERRIDE case). That seems to accomplish the goal. Now it will also allow dropping down to read-only mid-subtransaction, but I don't think forbidding that case is worth extra complexity. regards, tom lane
Florian Pflug <fgp@phlo.org> wrote:
> Hm, I think being able to assert that the isolation level really
> is SERIALIZABLE by simply doing "SET TRANSACTION ISOLATION LEVEL
> SERIALIZABLE" would be a great feature for SSI.
>
> Say you've written a trigger which enforces some complex
> constraint, but is correct only for SERIALIZABLE transactions.  By
> simply sticking a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"
> at the top of the trigger you'd both document the fact that it is
> correct only for SERIALIZABLE transactions *and* prevent
> corruption should the isolation level be something else due to a
> pilot error.  Nice, simple and quite effective.

It would be great to have a way within a trigger, or possibly other
functions, to assert that the transaction isolation level is
serializable.  What gives me pause here is that the standard allows
you to specify a more strict transaction isolation level within a
subtransaction without error, so this way of spelling the feature is
flirting with rather nonstandard behavior.  Is there maybe a better
way to check this?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Florian Pflug <fgp@phlo.org> wrote: >> Say you've written a trigger which enforces some complex >> constraint, but is correct only for SERIALIZABLE transactions. By >> simply sticking a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" >> at the top of the trigger you'd both document that fact it is >> correct only for SERIALIZABLE transactions *and* prevent >> corruption should the isolation level be something else due to >> a pilot error. Nice, simply and quite effective. > It would be great to have a way within a trigger, or possibly other > functions, to assert that the transaction isolation level is > serializable. What gives me pause here is that the standard allows > you to specify a more strict transaction isolation level within a > subtransaction without error, so this way of spelling the feature is > flirting with rather nonstandard behavior. Yes. This is not the way to provide a feature like that. > Is there maybe a better way to check this? You can always read the current setting and throw an error if you don't like it. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I suggested was to not allow read-only -> read-write state
> transitions except (1) before first snapshot in the main xact
> and (2) at subxact exit (the OVERRIDE case).  That seems to
> accomplish the goal.  Now it will also allow dropping down to
> read-only mid-subtransaction, but I don't think forbidding that
> case is worth extra complexity.

Attached is version 2.  I think it does everything we discussed,
except that I'm not sure whether I addressed the assign_XactIsoLevel
issue you mentioned, since you mentioned a debug message and I only
see things that look like errors to me.  If I did miss something, I'll
be happy to take another look if you can point me to the right place.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > except that I'm not sure whether I addressed the assign_XactIsoLevel > issue you mentioned, since you mentioned a debug message and I only > see things that look like errors to me. If I did miss something, > I'll be happy to take another look if you can point me to the right > place. GUC_complaint_elevel() can return DEBUGn, and in fact will do so in the PGC_S_OVERRIDE case. I'm thinking that the code ought to look more like /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */if (source != PGC_S_OVERRIDE){ check and report allthe complaint-worthy cases;} The original coding was probably sane for initial development, but now it just results in useless debug-log traffic for predictable perfectly valid cases. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> GUC_complaint_elevel() can return DEBUGn, and in fact will do so in
> the PGC_S_OVERRIDE case.  I'm thinking that the code ought to look
> more like
>
> 	/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
> 	if (source != PGC_S_OVERRIDE)
> 	{
> 		check and report all the complaint-worthy cases;
> 	}

Done.  Version 3 attached.  I think I've covered all bases now.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Done. Version 3 attached. My final tweaks didn't make it into that diff. Attached is 3a as a patch over the version 3 patch. -Kevin
On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote:
>> Essentially, instead of adding dependencies as you go along and
>> aborting once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
>> transactions would assume the worst case from the start and thus
>> be able to bypass the more detailed checks later on.
>
> Right -- such a transaction, having acquired a good snapshot, could
> release all SSI resources and run without any of the SSI overhead.

Yes, this makes sense.  If no running transaction has ever read, and
will never read before COMMIT, any value that's modified by a
concurrent transaction, then they will not create snapshot anomalies,
and the current snapshot has a place in the serial ordering.

>> With this scheme, you'd at least stand some chance of eventually
>> acquiring a consistent snapshot, even in the case of an endless
>> stream of overlapping READ WRITE transactions.
>
> Yeah, I'd been twisting ideas around trying to find a good way to do
> this; you've got it right at the conceptual level, I think.

The only thing I'm worried about here is how much risk of starvation
remains.  You'd need to wait until there are no running r/w
transactions accessing overlapping data sets; for some applications
that might not be any better than waiting for the system to be idle.
But I think there's no way around that; it's just the price you have
to pay to get a snapshot that can never see an anomaly.

> Pseudo-code of idea (conveniently ignoring locking issues and
> non-serializable transactions):

This seems reasonable to me.  Let me know if you need help
implementing it; I have some spare cycles right now.

Dan

--
Dan R. K. Ports              MIT CSAIL              http://drkp.net/
Dan Ports wrote:
> On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote:

> The only thing I'm worried about here is how much risk of
> starvation remains.  You'd need to wait until there are no running
> r/w transactions accessing overlapping data sets; for some
> applications that might not be any better than waiting for the
> system to be idle.  But I think there's no way around that; it's
> just the price you have to pay to get a snapshot that can never see
> an anomaly.

Right -- this can't be a default behavior because of that, but it
rounds out the options for backups and big reports.  Without it you
have the choice between the potential for other transactions to cancel
because a cycle was completed by the READ ONLY transaction or getting
a view of data which may not be consistent with the later state of the
database[1].  This guarantees consistency without causing rollbacks,
with the additional benefit of faster runtime by skipping SSI logic.

>> Pseudo-code of idea (conveniently ignoring locking issues and
>> non-serializable transactions):
>
> This seems reasonable to me.  Let me know if you need help
> implementing it; I have some spare cycles right now.

That would be great.  I'm getting on a train today to go spend a week
on vacation in New Orleans, and I've been fretting about where this
patch is at compared to the release cycle.  :-(  I can suck down my
hurricanes with a calmer mind if I know you're on this.  :-)

In conjunction with this feature, it would be great if you could take
a look at how to recognize these conditions for a READ ONLY
transaction which is running under SSI, and back it out of SSI when it
hits that condition.  SIRead predicate locks, conflicts, and other
structures can be released, and we can stop checking the MVCC data on
reads.  Basically, we should be able to get to the DEFERRABLE type of
state while running -- it's just that we might cause some number of
transactions to cancel along the way before we hit that state.  (These
two seem likely to be less work if done at the same time.)

-Kevin

[1] It has struck me that the receipting example is one case of a more
general pattern which I've frequently seen in business software which
is vulnerable to this sort of anomaly -- batch processing.  Basically,
any time you have a control record which controls the batch into which
detail is placed, if the control information is updated and that is
committed while detail is still in flight, you can have this class of
anomaly.