Thread: serializable read only deferrable

serializable read only deferrable

From
"Kevin Grittner"
Date:
I'm reviving the discussion on the subject topic because I just had
an epiphany which makes it seem simple to implement.  The concept of
this is that if you start a SERIALIZABLE READ ONLY transaction in an
SSI environment when certain conditions are true, it doesn't need to
acquire predicate locks or test for rw-conflicts.  This would be
particularly useful for pg_dump or large reports, as it would allow
them to read data which was guaranteed to be consistent with later
states of the database without risking serialization failure or
contributing to the failure of other transactions.  They should also
run a bit faster without the overhead of locking and checking.
Having completed the switch from a pair of rw-conflict pointers per
serializable transaction to a list of rw-conflicts, I'm working
through the more aggressive transaction clean-up strategies thereby
allowed in preparation for the graceful degradation code.  Along the
way, I noticed how easy it is to allow a READ ONLY transaction to opt
out of predicate locking and conflict detection when it starts with
no concurrent non READ ONLY transactions active, or even to remove
READ ONLY transactions from those activities when such a state is
reached during the execution of READ ONLY transactions.  Properly
recognizing the *additional* conditions under which this would be
valid, though, is rather painful.  (Those additional conditions being
that no concurrent non-read-only transaction may overlap a committed
non-read-only transaction which wrote data and committed before the
read-only transaction acquired its snapshot.)
The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
SSI would be to have each non-read-only serializable transaction
acquire a heavyweight lock which can coexist with other locks at the
same level (SHARE looks good) on some common object and hold that for
the duration of the transaction, while a SERIALIZABLE READ ONLY
DEFERRABLE transaction would need to acquire a conflicting lock
(EXCLUSIVE looks good) before it could acquire a snapshot, and
release the lock immediately after acquiring the snapshot.
For these purposes, it appears that advisory locks could work, as
long as the lock release does not wait for the end of the transaction
(which it doesn't, if I'm reading the docs right) and as long as I
can pick a lock ID which won't conflict with other uses.  That latter
part is the only iffy aspect of the whole thing that I can see.  Of
course, I could add a third lock method, but that seems like overkill
to be able to get one single lock.
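As a rough analogy of how those lock modes would interact, here's a
toy share/exclusive lock in Python (nothing like the real lock
manager code; all names are mine):

```python
import threading

class ShareExclusiveLock:
    """Toy analogue of the heavyweight lock: any number of SHARE
    holders may coexist, but EXCLUSIVE must wait for all of them."""

    def __init__(self):
        self._cond = threading.Condition()
        self._sharers = 0

    def acquire_share(self):
        with self._cond:
            self._sharers += 1

    def release_share(self):
        with self._cond:
            self._sharers -= 1
            if self._sharers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        # Wait until no read-write transaction holds the SHARE lock.
        self._cond.acquire()
        while self._sharers > 0:
            self._cond.wait()

    def release_exclusive(self):
        self._cond.release()


lock = ShareExclusiveLock()
results = []

def read_write_xact():
    lock.acquire_share()          # held for the duration of the xact
    results.append("rw work")
    lock.release_share()

def deferrable_xact():
    lock.acquire_exclusive()      # blocks while any rw xact holds SHARE
    results.append("snapshot")    # snapshot taken: no rw xact is active
    lock.release_exclusive()      # released right after the snapshot

writers = [threading.Thread(target=read_write_xact) for _ in range(3)]
for t in writers:
    t.start()
for t in writers:
    t.join()
reader = threading.Thread(target=deferrable_xact)
reader.start()
reader.join()
```

Note that a real heavyweight lock queues new SHARE requests behind a
waiting EXCLUSIVE request rather than letting them jump ahead as this
toy does.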
Since I'm already allowing a transaction to opt out of predicate
locking and conflict detection if there are no non-read-only
transactions active when it acquires its snapshot, the work needed
within the SSI code is pretty trivial; it's all in adding the
DEFERRABLE word as a non-standard extension to SET TRANSACTION et al,
and finding a heavyweight lock to use.
Thoughts?
-Kevin




Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I'm reviving the discussion on the subject topic because I just had
> an epiphany which makes it seem simple to implement.  The concept of
> this is that if you start a SERIALIZABLE READ ONLY transaction in an
> SSI environment when certain conditions are true, it doesn't need to
> acquire predicate locks or test for rw-conflicts.

I assume this would have to be a "hard" definition of READ ONLY, not
the rather squishy definition we use now?  How would we manage the
compatibility implications?
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane  wrote:
> "Kevin Grittner"  writes:
>> I'm reviving the discussion on the subject topic because I just
>> had an epiphany which makes it seem simple to implement. The
>> concept of this is that if you start a SERIALIZABLE READ ONLY
>> transaction in an SSI environment when certain conditions are
>> true, it doesn't need to acquire predicate locks or test for
>> rw-conflicts.
> 
> I assume this would have to be a "hard" definition of READ ONLY,
> not the rather squishy definition we use now? How would we manage
> the compatibility implications?
If there are issues with whether READ ONLY covers the right ground,
it's likely to affect more than this particular issue -- I've taken
advantage of the READ ONLY property of transactions to allow quite a
few novel optimizations to SSI beyond what is presented in Cahill's
doctoral work.
I'm excluding temporary tables from SSI on the grounds that they are
only read and written by a single transaction and therefore can't be
a source of rw-dependencies, and I'm excluding system tables on the
grounds that they don't follow normal snapshot isolation rules.  Hint
bit rewrites are not an issue for SSI.  Are there any other squishy
aspects I might need to consider?
-Kevin


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
I wrote:
> Tom Lane  wrote:
>> I assume this would have to be a "hard" definition of READ ONLY,
>> not the rather squishy definition we use now?
> I'm excluding temporary tables from SSI on the grounds that they
> are only read and written by a single transaction and therefore
> can't be a source of rw-dependencies, and I'm excluding system
> tables on the grounds that they don't follow normal snapshot
> isolation rules. Hint bit rewrites are not an issue for SSI.  Are
> there any other squishy aspects I might need to consider?
I reviewed the documentation and played around with this a bit and
can't find any areas where the current PostgreSQL implementation of
READ ONLY is incompatible with what is needed for the SSI
optimizations where it is used.  There are a large number of tests
which exercise this, and they're all passing.
Did you have something in particular in mind which I should check? 
An example of code you think might break would be ideal, but
anything more concrete than the word "squishy" would be welcome.
Any thoughts on the original question about what to use as a
heavyweight lock to support the subject feature?
-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I reviewed the documentation and played around with this a bit and
> can't find any areas where the current PostgreSQL implementation of
> READ ONLY is incompatible with what is needed for the SSI
> optimizations where it is used.  There are a large number of tests
> which exercise this, and they're all passing.
> Did you have something in particular in mind which I should check? 

I did not, just thought it was a point that merited examination.
        regards, tom lane


Re: serializable read only deferrable

From
Florian Pflug
Date:
On Dec 5, 2010, at 16:11, Kevin Grittner wrote:
> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
> SSI would be to have each non-read-only serializable transaction
> acquire a heavyweight lock which can coexist with other locks at the
> same level (SHARE looks good) on some common object and hold that for
> the duration of the transaction, while a SERIALIZABLE READ ONLY
> DEFERRABLE transaction would need to acquire a conflicting lock
> (EXCLUSIVE looks good) before it could acquire a snapshot, and
> release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE transaction is
waiting to acquire the lock, no other transaction would be allowed to
start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has
been able to acquire its snapshot.  For pg_dump's purposes at least,
that seems undesirable, since a single long-running transaction at
the time you start pg_dump would effectively DoS your system until
the long-running transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE
READ ONLY DEFERRABLE transaction won't be starved forever by a stream
of overlapping non-READ ONLY transactions.  Then a flag in the proc
array that marks non-READ ONLY transactions should be sufficient,
plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ
ONLY DEFERRABLE transactions.
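A minimal sketch of that alternative (Python threading primitives
standing in for the proc array flag and backend wakeups; all names
are mine):

```python
import threading

active_rw = 0                       # xacts flagged non-READ ONLY
state = threading.Condition()

def begin_rw():
    global active_rw
    with state:
        active_rw += 1              # flag set as a READ WRITE xact starts

def end_rw():
    global active_rw
    with state:
        active_rw -= 1
        state.notify_all()          # wake any waiting deferrable xact

def take_deferrable_snapshot():
    # Wait-and-retry: block until no flagged xact is active, then take
    # the snapshot.  Nothing else is ever blocked, but a steady stream
    # of READ WRITE xacts could starve this loop indefinitely.
    with state:
        while active_rw > 0:
            state.wait()
        return "snapshot"

result = []
begin_rw()                          # one READ WRITE xact in progress
waiter = threading.Thread(
    target=lambda: result.append(take_deferrable_snapshot()))
waiter.start()
end_rw()                            # last READ WRITE xact finishes
waiter.join()
```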

best regards,
Florian Pflug



Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote:
> On Dec 5, 2010, at 16:11, Kevin Grittner wrote:
>> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
>> under SSI would be to have each non-read-only serializable
>> transaction acquire a heavyweight lock which can coexist with
>> other locks at the same level (SHARE looks good) on some common
>> object and hold that for the duration of the transaction, while a
>> SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
>> acquire a conflicting lock (EXCLUSIVE looks good) before it could
>> acquire a snapshot, and release the lock immediately after
>> acquiring the snapshot.
> 
> Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE transaction is
> waiting to acquire the lock, no other transaction would be allowed
> to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction
> has been able to acquire its snapshot. For pg_dump's purposes at
> least, that seems undesirable, since a single long-running
> transaction at the time you start pg_dump would effectively DoS
> your system until the long-running transaction finishes.
Well, when you put it that way, it sounds pretty grim.  :-(  Since
one of the bragging points of SSI is that it doesn't introduce any
blocking beyond current snapshot isolation, I don't want to do
something here which blocks anything except the transaction which
has explicitly requested the DEFERRABLE property.  I guess that,
simple as that technique might be, it just isn't a good idea.
> The alternative seems to be to drop the guarantee that a
> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
> stream of overlapping non-READ ONLY transactions. Then a flag in
> the proc array that marks non-READ ONLY transactions should be
> sufficient, plus a wait-and-retry loop to take snapshots for
> SERIALIZABLE READ ONLY DEFERRABLE transactions.
If I can find a way to pause an active process, I already have
functions in which I maintain the count of active SERIALIZABLE READ
WRITE transactions as they begin and end -- I could release pending
DEFERRABLE transactions when the count hits zero without any
separate loop.  That has the added attraction of being a path to the
more complex checking which could allow the deferrable process to
start sooner in some circumstances.  The "simple" solution with the
heavyweight lock would not have been a good path to that.
What would be the correct way for a process to put itself to sleep,
and for another process to later wake it up?
-Kevin


Re: serializable read only deferrable

From
Heikki Linnakangas
Date:
On 06.12.2010 22:53, Kevin Grittner wrote:
> What would be the correct way for a process to put itself to sleep,
> and for another process to later wake it up?

See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code.
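Either way the shape is the same; as a rough Python analogy
(threading.Event standing in for the latch; the real APIs are the C
functions named above):

```python
import threading

latch = threading.Event()           # stand-in for the per-process latch
log = []

def deferrable_backend():
    log.append("sleeping")
    latch.wait()                    # cf. ProcWaitSignal() / WaitLatch()
    log.append("woken")

def last_read_write_backend():
    latch.set()                     # cf. ProcSendSignal() / SetLatch()

sleeper = threading.Thread(target=deferrable_backend)
sleeper.start()
last_read_write_backend()
sleeper.join()
```

(The latch code is the newer mechanism, designed so that a wakeup
arriving before the wait starts is not lost.)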

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 06.12.2010 22:53, Kevin Grittner wrote:
>> What would be the correct way for a process to put itself to
>> sleep, and for another process to later wake it up?
> 
> See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code.
Is there a reason to prefer one over the other?
-Kevin


Re: serializable read only deferrable

From
Florian Pflug
Date:
On Dec 6, 2010, at 22:53, Kevin Grittner wrote:
>> The alternative seems to be to drop the guarantee that a
>> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
>> stream of overlapping non-READ ONLY transactions. Then a flag in
>> the proc array that marks non-READ ONLY transactions should be
>> sufficient, plus a wait-and-retry loop to take snapshots for
>> SERIALIZABLE READ ONLY DEFERRABLE transactions.
>
> If I can find a way to pause an active process I already have
> functions in which I maintain the count of active SERIALIZABLE READ
> WRITE transactions as they begin and end -- I could release pending
> DEFERRABLE transactions when the count hits zero without any
> separate loop.  That has the added attraction of being a path to the
> more complex checking which could allow the deferrable process to
> start sooner in some circumstances.  The "simple" solution with the
> heavyweight lock would not have been a good path to that.

I'm starting to wonder if you couldn't get a weaker form of the
non-starvation guarantee back by doing the waiting *after* you
acquire the snapshot of a SERIALIZABLE READ ONLY transaction instead
of before.  AFAICS, the main reason for a SERIALIZABLE READ ONLY
transaction's snapshot to be inconsistent is that it sees some
transaction A as committed and B as uncommitted when on the other
hand B must happen before A in any serial schedule.  In other words,
if there is no dangerous structure even if you add an rw-dependency
edge from the SERIALIZABLE READ ONLY transaction to every concurrent
transaction, the SERIALIZABLE READ ONLY transaction's snapshot is
consistent.  I'm thus envisioning something along the line of

1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY
DEFERRED, and add a rw-dependency to every other running READ WRITE
transaction
2) Wait for all these concurrent transactions to either COMMIT or ABORT
3) Check if the transaction has been marked INCONSISTENT. If not, let
the transaction proceed. If it was, start over with (1)

*) During conflict detection, you'd check if one of the participating
transactions is flagged as SERIALIZABLE READ ONLY DEFERRED and mark
it INCONSISTENT if it is.

Essentially, instead of adding dependencies as you go along and abort
once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED transactions
would assume the worst case from the start and thus be able to bypass
the more detailed checks later on.

With this scheme, you'd at least stand some chance of eventually
acquiring a consistent snapshot, even in the case of an endless
stream of overlapping READ WRITE transactions.

I have to admit though that I didn't really think this through
thoroughly yet; it was more of a quick idea I got after pondering
this for a bit before I went to bed yesterday.
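For concreteness, here's the idea as a tiny single-process Python
sketch (every name here is invented; it models only the retry loop,
not real SSI code):

```python
def run_deferrable(rounds):
    """rounds: one list per attempt, holding a bool per concurrent READ
    WRITE xact -- True if that xact develops a conflict that would make
    the deferrable snapshot inconsistent."""
    attempts = 0
    for concurrent in rounds:
        attempts += 1                      # (1) take a fresh snapshot
        inconsistent = False
        for conflicts in concurrent:       # (2) wait for COMMIT/ABORT
            if conflicts:                  # (*) conflict detection
                inconsistent = True        #     marks us INCONSISTENT
        if not inconsistent:               # (3) snapshot is consistent:
            return attempts                #     proceed, checks bypassed
    raise RuntimeError("every attempt was marked INCONSISTENT")
```

For example, run_deferrable([[True, False], []]) models a first
attempt that is marked INCONSISTENT, followed by one with no
concurrent READ WRITE transactions, which succeeds on the second try.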

best regards,
Florian Pflug



Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote:
> reason for a SERIALIZABLE READ ONLY transaction's snapshot to be
> inconsistent is that it sees some transaction A as committed and B
> as uncommitted when on the other hand B must happen before A in
> any serial schedule.
Precisely right, and very well stated.
> I'm thus envisioning something along the line of
> 
> 1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY
> DEFERRED, and add a rw-dependency to every other running READ
> WRITE transaction
> 2) Wait for all these concurrent transactions to either COMMIT or
> ABORT
> 3) Check if the transaction has been marked INCONSISTENT. If not,
> let the transaction proceed. If it was, start over with (1)
> 
> *) During conflict detection, you'd check if one of the
> participating transactions is flagged as SERIALIZABLE READ ONLY
> DEFERRED and mark it INCONSISTENT if it is.
That is brilliant.
> Essentially, instead of adding dependencies as you go along and
> abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
> transactions would assume the worst case from the start and thus
> be able to bypass the more detailed checks later on.
Right -- such a transaction, having acquired a good snapshot, could
release all SSI resources and run without any of the SSI overhead.
> With this scheme, you'd at least stand some chance of eventually
> acquiring a consistent snapshot, even in the case of an endless
> stream of overlapping READ WRITE transactions.
Yeah, I'd been twisting ideas around trying to find a good way to do
this; you've got it right at the conceptual level, I think.
> I have to admit though that I didn't really think this through
> thoroughly yet, it was more of a quick idea I got after pondering
> this for a bit before I went to bed yesterday.
[reads through it a few more times, sips caffeine, and thinks]
Really, what you care about is whether any of the READ WRITE
transactions active at the time the snapshot was acquired commit
after developing a rw-conflict with a transaction which committed
before the READ ONLY DEFERRABLE snapshot was acquired.  (The reader
would have to appear first in any serial schedule, yet the READ ONLY
transaction can see the effects of the writer but not the reader.)
Which brings up another point: that reader must also write to a
permanent table before it commits in order to become the pivot in
the dangerous structure.

Pseudo-code of idea (conveniently ignoring locking issues and
non-serializable transactions):

  // serializable read only deferrable xact
  do {
    get a snapshot
    clear inconsistent flag
    if (no concurrent read write xacts)
      break;  // we got it the easy way
    associate all active read write xacts with this xact
    block until told to wake
  } while (inconsistent);
  clear xact from any SSI structures it's in
  run with the snapshot

  // each xact associated with the above, on transaction completion
  if (commit
      and has written
      and has conflict out to xact committed before deferrable snapshot)
  {
    flag deferrable as inconsistent
    unblock deferrable xact
  }
  else
    if this is termination of last associated read write xact
      unblock deferrable xact

Seem sane?
-Kevin


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I assume this would have to be a "hard" definition of READ ONLY,
> not the rather squishy definition we use now?

Oh, I just went through the code on setting READ ONLY and discovered
that contrary to the standard *and* the PostgreSQL documentation,
you can change the status of a transaction between READ ONLY and
READ WRITE at will.  Yeah, that's a problem for my intended use.
Many optimizations would need to go right out the window, and the
false positive rate under SSI would be high.

> How would we manage the compatibility implications?

Comply with the standard.  The bright side of this is that it
wouldn't require any change to our user docs.

http://www.postgresql.org/docs/current/interactive/sql-start-transaction.html

| This command begins a new transaction block. If the isolation
| level or read/write mode is specified, the new transaction has
| those characteristics, as if SET TRANSACTION was executed. This is
| the same as the BEGIN command.

and on the same page:

| Compatibility
|
| In the standard, it is not necessary to issue START TRANSACTION to
| start a transaction block: any SQL command implicitly begins a
| block. PostgreSQL's behavior can be seen as implicitly issuing a
| COMMIT after each command that does not follow START TRANSACTION
| (or BEGIN), and it is therefore often called "autocommit". Other
| relational database systems might offer an autocommit feature as a
| convenience.

No mention of "and you can change back and forth between READ ONLY
and READ WRITE any time during the transaction, including between
reads and writes, as many times as you like."

Was there a justification for this behavior, or was it just not
implemented carefully?  Does anyone currently depend on the current
behavior?

test=# create table asdf (id int not null primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"asdf_pkey" for table "asdf"
CREATE TABLE
test=# set default_transaction_isolation = serializable;
SET
test=# set transaction read only;
SET
test=# begin;
BEGIN
test=# set transaction read only;
SET
test=# select 1;
 ?column? 
----------
        1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (1);
INSERT 0 1
test=# set transaction read only;
SET
test=# select * from asdf;
 id 
----
  1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (2);
INSERT 0 1
test=# commit;
COMMIT
I find that to be a huge POLA violation.  I will happily prepare a
patch to fix this if there is agreement that we want it.  I really
need READ ONLY *transactions*, not READ ONLY *moments* within
transactions to do any optimization based on the property.
-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Oh, I just went through the code on setting READ ONLY and discovered
> that contrary to the standard *and* the PostgreSQL documentation,
> you can change the status of a transaction between READ ONLY and
> READ WRITE at will.  Yeah, that's a problem for my intended use.
> Many optimizations would need to go right out the window, and the
> false positive rate under SSI would be high.

I believe you had better support the locution

    begin;
    set transaction read only;
    ...

I agree that letting it be changed back to read/write after that is
surprising and unnecessary.  Perhaps locking down the setting at the
time of first grabbing a snapshot would be appropriate.  IIRC that's
how it works for transaction isolation level, and this seems like it
ought to work the same.
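Roughly, as a toy model (invented names; only the shape matters):

```python
class Xact:
    """Toy model: read_only may be changed freely until the first
    snapshot is taken, and is locked down afterwards, the way the
    isolation level already is."""

    def __init__(self):
        self.read_only = False
        self.first_snapshot_set = False   # cf. FirstSnapshotSet

    def set_read_only(self, value):
        if self.first_snapshot_set and value != self.read_only:
            raise RuntimeError(
                "read-only property must be set before any query")
        self.read_only = value

    def run_query(self):
        self.first_snapshot_set = True    # first query grabs the snapshot

x = Xact()
x.set_read_only(True)        # fine: no snapshot yet
x.run_query()
x.set_read_only(True)        # redundant assignment still succeeds
try:
    x.set_read_only(False)   # switching back after a query...
    blocked = False
except RuntimeError:
    blocked = True           # ...is rejected
```

(I've let a redundant assignment through here; whether that should
error instead is a separate question.)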
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree that letting it be changed back to read/write after that
> is surprising and unnecessary.  Perhaps locking down the setting
> at the time of first grabbing a snapshot would be appropriate. 
> IIRC that's how it works for transaction isolation level, and this
> seems like it ought to work the same.
Agreed.  I can create a patch today to implement this.  The thing
which jumps out first is that assign_transaction_read_only probably
needs to move to variable.c so that it can reference
FirstSnapshotSet as the transaction isolation code does.  The
alternative would be to include snapmgr.h in guc.c, which seems less
appealing.  Agreed?  Other ideas?
-Kevin


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
I wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> I agree that letting it be changed back to read/write after that
>> is surprising and unnecessary.  Perhaps locking down the setting
>> at the time of first grabbing a snapshot would be appropriate.
>> IIRC that's how it works for transaction isolation level, and
>> this seems like it ought to work the same.
>
> Agreed.  I can create a patch today to implement this.

Attached.

Accomplished more through mimicry (based on setting transaction
isolation level) than profound understanding of the code involved;
but it passes all regression tests on both `make check` and `make
installcheck-world`.  This includes a new regression test verifying
that an attempt to change it after a query fails.  I've poked at it with
various ad hoc tests, and it is behaving as expected in those.

I wasn't too confident how to word the new failure messages.

-Kevin



Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Attached.
> Accomplished more through mimicry (based on setting transaction
> isolation level) than profound understanding of the code involved;
> but it passes all regression tests on both `make check` and `make
> installcheck-world`.  This includes a new regression test verifying
> that an attempt to change it after a query fails.  I've poked at it with
> various ad hoc tests, and it is behaving as expected in those.

Hmm.  This patch disallows the case of creating a read-only
subtransaction of a read-write parent.  That's a step backwards.
I'm not sure how we could enforce that the property not change
after the first query of a subxact, but maybe we don't care that much?
Do your optimizations pay attention to local read-only in a subxact?
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm.  This patch disallows the case of creating a read-only
> subtransaction of a read-write parent.  That's a step backwards.
> I'm not sure how we could enforce that the property not change
> after the first query of a subxact, but maybe we don't care that
> much?  Do your optimizations pay attention to local read-only in a
> subxact?
No, it's all about the top level transaction, as long as the
subtransaction doesn't do anything which violates the requirements
of the top level.  (That is, if the top level is not READ ONLY, I
can't do the optimizations, but it would cause no problem if a
subtransaction was READ ONLY -- it just wouldn't allow any special
optimizations.)
I noticed that the standard seems (if I'm reading it correctly) to
allow subtransactions to switch to more restrictive settings for
both transaction isolation and read only status than the enclosing
transaction, but not looser.  I don't think it makes sense in
PostgreSQL to say (for example) that the top level transaction is
READ COMMITTED but the subtransaction is SERIALIZABLE, but it might
make sense to say that the top level transaction is READ WRITE but
the subtransaction is READ ONLY.  And I see where I broke support
for that in the patch.
I can fix up the patch to support it again if you like.  (I think
it's just a matter of restoring a few lines that I replaced in the
original patch.)  If you'd rather do it, I'll stay out of your way.
-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I noticed that the standard seems (if I'm reading it correctly) to
> allow subtransactions to switch to more restrictive settings for
> both transaction isolation and read only status than the enclosing
> transaction, but not looser.

Yeah.  My recollection is that we've discussed exactly this point with
respect to isolation level, and decided that we couldn't (or it wasn't
worthwhile to) allow serializable subxacts inside repeatable read.
I don't know whether your patch will change that tradeoff.  But I don't
think it's really been discussed much with respect to read-only, perhaps
because nobody's paid all that much attention to read-only at all.
In any case, the behavior you state seems obviously correct, so let's
see what we can do about getting closer to that.

My guess is that a reasonable fix is to remember the read-only setting
as of snapshot lockdown, and thereafter to allow changing from
read-write to read-only but not vice versa.  One thing to watch for
is allowing subxact exit to restore the previous read-write state.
(BTW it looks like assign_XactIsoLevel emits a rather useless debug
message in that case, so that code could stand some cleanup too.  Also,
that code is set so that it will throw an error even if you're assigning
the currently active setting, which maybe is overly strict?  Not sure.)

> I can fix up the patch to support it again if you like.  (I think
> it's just a matter of replacing a few lines that I replaced in the
> original patch.)  If you'd rather do it, I'll stay out of your way.

Feel free to have a go at it.
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> One thing to watch for is allowing subxact exit to restore the
> previous read-write state.
OK.
> BTW it looks like assign_XactIsoLevel emits a rather useless debug
> message in that case, so that code could stand some cleanup too.
I'll take a look.
> Also, that code is set so that it will throw an error even if
> you're assigning the currently active setting, which maybe is
> overly strict?  Not sure.
The standard is tricky to read, but my reading of it is that only
"LOCAL" changes are allowed after the transaction is underway (which
I *think* effectively means a subtransaction), and those can't make
the setting less strict -- you're allowed to specify the same level
or more strict.  There would be no harm from the perspective of
anything I'm working on to allow an in-progress transaction to be
set to what it already has, but that seems to invite confusion and
error more than provide a helpful feature, as far as I can tell. 
I'm inclined not to allow it except at the start of a
subtransaction, but don't feel strongly about it.
>> I can fix up the patch to support it again if you like.
> Feel free to have a go at it.
Will do.  I notice that I also removed the check for
RecoveryInProgress(), which I will also restore.  And maybe a
regression test or two for the subtransaction usages....
-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Also, that code is set so that it will throw an error even if
>> you're assigning the currently active setting, which maybe is
>> overly strict?  Not sure.
> The standard is tricky to read, but my reading of it is that only
> "LOCAL" changes are allowed after the transaction is underway (which
> I *think* effectively means a subtransaction), and those can't make
> the setting less strict -- you're allowed to specify the same level
> or more strict.  There would be no harm from the perspective of
> anything I'm working on to allow an in-progress transaction to be
> set to what it already has, but that seems to invite confusion and
> error more than provide a helpful feature, as far as I can tell. 
> I'm inclined not to allow it except at the start of a
> subtransaction, but don't feel strongly about it.

Hmm ...

(1) If the standard says that you're allowed to apply a redundant setting,
I think we'd better accept that.

(2) I'm not thrilled about the idea of tracking the equivalent of
FirstSnapshotSet for each subtransaction, which I think would be
necessary infrastructure to error-check this as tightly as you seem to
have in mind.  I'd prefer to be a bit laxer in order to have less
overhead for what is in the end just nanny-ism.  So the implementation
I had in mind would allow SET TRANSACTION operations to occur later in a
subxact, as long as they were redundant and weren't actually trying to
change the active value.
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If the standard says that you're allowed to apply a redundant
> setting, I think we'd better accept that.
OK
> So the implementation I had in mind would allow SET TRANSACTION
> operations to occur later in a subxact, as long as they were
> redundant and weren't actually trying to change the active value.
It's easy to see how I can allow changes in the subtransaction as
long as they don't specify READ WRITE when the top level is READ
ONLY, but it isn't obvious to me how to only allow it at the start
of the subtransaction.  I'm OK with taking the easy route on this
aspect of things, but if someone needs READ ONLY to "stick" for the
duration of a subtransaction, I'm not sure how to do that.  (And I'm
not sure you were actually suggesting that, either.)
To restate, since I'm not sure how clear that is, what I have at the
moment is:
(1) A top level transaction can only set READ ONLY or READ WRITE
until it has acquired its first snapshot.
(2) A subtransaction can set it at will, as many times as desired,
to match the top level or specify READ ONLY.
(3) During recovery the setting cannot be changed from READ ONLY to
READ WRITE.
I'm not clear whether #2 is OK or how to do it only at the start.
I haven't yet looked at the other issues you mentioned.
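To make sure I have the rules straight, they reduce to roughly this
predicate (my own encoding, not actual code; I've treated a redundant
top-level assignment as allowed):

```python
def read_only_change_allowed(top_read_only, new_read_only,
                             in_subxact, snapshot_taken, in_recovery):
    # (3) during recovery the setting can never become READ WRITE
    if in_recovery and not new_read_only:
        return False
    if in_subxact:
        # (2) a subxact may repeat the top-level setting or tighten to
        #     READ ONLY, but may not relax READ ONLY to READ WRITE
        return new_read_only or not top_read_only
    # (1) the top level is locked down at its first snapshot
    #     (redundant assignments stay allowed)
    return not snapshot_taken or new_read_only == top_read_only
```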
-Kevin


Re: serializable read only deferrable

From
Florian Pflug
Date:
On Dec 8, 2010, at 20:39, Kevin Grittner wrote:
> The standard is tricky to read, but my reading of it is that only
> "LOCAL" changes are allowed after the transaction is underway (which
> I *think* effectively means a subtransaction), and those can't make
> the setting less strict -- you're allowed to specify the same level
> or more strict.  There would be no harm from the perspective of
> anything I'm working on to allow an in-progress transaction to be
> set to what it already has, but that seems to invite confusion and
> error more than provide a helpful feature, as far as I can tell.
> I'm inclined not to allow it except at the start of a
> subtransaction, but don't feel strongly about it.

Hm, I think being able to assert that the isolation level really is
SERIALIZABLE by simply doing "SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE" would be a great feature for SSI.

Say you've written a trigger which enforces some complex constraint,
but is correct only for SERIALIZABLE transactions.  By simply sticking
a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" at the top of the
trigger you'd both document the fact that it is correct only for
SERIALIZABLE transactions *and* prevent corruption should the
isolation level be something else due to a pilot error.  Nice, simple,
and quite effective.
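The trigger idea might look like this (hypothetical table and function names; under the proposed semantics the SET would act as an assertion, succeeding when redundant and erroring otherwise -- today PostgreSQL rejects SET TRANSACTION ISOLATION LEVEL after the transaction's first query):

```sql
CREATE OR REPLACE FUNCTION check_receipt_batch() RETURNS trigger AS $$
BEGIN
    -- Assert we are running at SERIALIZABLE; under Florian's proposal
    -- this is a no-op when true and an error otherwise.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- ... complex constraint checking that is only correct under SSI ...
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER receipt_batch_check
    BEFORE INSERT ON receipt            -- hypothetical table
    FOR EACH ROW EXECUTE PROCEDURE check_receipt_batch();
```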

BTW, I hope to find some time this evening to review your more
detailed proposal for "serializable read only deferrable".

best regards,
Florian Pflug



Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So the implementation I had in mind would allow SET TRANSACTION
>> operations to occur later in a subxact, as long as they were
>> redundant and weren't actually trying to change the active value.
> It's easy to see how I can allow changes in the subtransaction as
> long as they don't specify READ WRITE when the top level is READ
> ONLY, but it isn't obvious to me how to only allow it at the start
> of the subtransaction.  I'm OK with taking the easy route on this
> aspect of things, but if someone needs READ ONLY to "stick" for the
> duration of a subtransaction, I'm not sure how to do that.  (And I'm
> not sure you were actually suggesting that, either.)

What I suggested was to not allow read-only -> read-write state
transitions except (1) before first snapshot in the main xact
and (2) at subxact exit (the OVERRIDE case).  That seems to accomplish
the goal.  Now it will also allow dropping down to read-only
mid-subtransaction, but I don't think forbidding that case is worth
extra complexity.
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote:
> Hm, I think being able to assert that the isolation level really
> is SERIALIZABLE by simply doing "SET TRANSACTION ISOLATION LEVEL
> SERIALIZABLE" would be a great feature for SSI.
> 
> Say you've written a trigger which enforces some complex
> constraint, but is correct only for SERIALIZABLE transactions. By
> simply sticking a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"
> at the top of the trigger you'd both document the fact that it is
> correct only for SERIALIZABLE transactions *and* prevent
> corruption should the isolation level be something else due to
> a pilot error. Nice, simple, and quite effective.
It would be great to have a way within a trigger, or possibly other
functions, to assert that the transaction isolation level is
serializable.  What gives me pause here is that the standard allows
you to specify a more strict transaction isolation level within a
subtransaction without error, so this way of spelling the feature is
flirting with rather nonstandard behavior.
Is there maybe a better way to check this?
-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Florian Pflug <fgp@phlo.org> wrote:
>> Say you've written a trigger which enforces some complex
>> constraint, but is correct only for SERIALIZABLE transactions. By
>> simply sticking a "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"
>> at the top of the trigger you'd both document the fact that it is
>> correct only for SERIALIZABLE transactions *and* prevent
>> corruption should the isolation level be something else due to
>> a pilot error. Nice, simple, and quite effective.
> It would be great to have a way within a trigger, or possibly other
> functions, to assert that the transaction isolation level is
> serializable.  What gives me pause here is that the standard allows
> you to specify a more strict transaction isolation level within a
> subtransaction without error, so this way of spelling the feature is
> flirting with rather nonstandard behavior.

Yes.  This is not the way to provide a feature like that.
> Is there maybe a better way to check this?

You can always read the current setting and throw an error if you
don't like it.
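That check can be written today along these lines (a sketch using the real `current_setting` function and the `transaction_isolation` GUC):

```sql
DO $$
BEGIN
    IF current_setting('transaction_isolation') <> 'serializable' THEN
        RAISE EXCEPTION
            'this code requires SERIALIZABLE isolation, not %',
            current_setting('transaction_isolation');
    END IF;
END;
$$;
```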
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> What I suggested was to not allow read-only -> read-write state
> transitions except (1) before first snapshot in the main xact
> and (2) at subxact exit (the OVERRIDE case).  That seems to
> accomplish the goal.  Now it will also allow dropping down to
> read-only mid-subtransaction, but I don't think forbidding that
> case is worth extra complexity.

Attached is version 2.  I think it does everything we discussed,
except that I'm not sure whether I addressed the assign_XactIsoLevel
issue you mentioned, since you mentioned a debug message and I only
see things that look like errors to me.  If I did miss something,
I'll be happy to take another look if you can point me to the right
place.

-Kevin


Re: serializable read only deferrable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> except that I'm not sure whether I addressed the assign_XactIsoLevel
> issue you mentioned, since you mentioned a debug message and I only
> see things that look like errors to me.  If I did miss something,
> I'll be happy to take another look if you can point me to the right
> place.

GUC_complaint_elevel() can return DEBUGn, and in fact will do so in
the PGC_S_OVERRIDE case.  I'm thinking that the code ought to look
more like
    /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
    if (source != PGC_S_OVERRIDE)
    {
        check and report all the complaint-worthy cases;
    }

The original coding was probably sane for initial development, but
now it just results in useless debug-log traffic for predictable,
perfectly valid cases.
        regards, tom lane


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> GUC_complaint_elevel() can return DEBUGn, and in fact will do so in
> the PGC_S_OVERRIDE case.  I'm thinking that the code ought to look
> more like
>
>     /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort
*/
>     if (source != PGC_S_OVERRIDE)
>     {
>         check and report all the complaint-worthy cases;
>     }

Done.  Version 3 attached.  I think I've covered all bases now.

-Kevin



Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

> Done.  Version 3 attached.

My final tweaks didn't make it into that diff.  Attached is 3a as a
patch over the version 3 patch.

-Kevin



Re: serializable read only deferrable

From
Dan Ports
Date:
On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote:
> > Essentially, instead of adding dependencies as you go along and
> > abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
> > transactions would assume the worst case from the start and thus
> > be able to bypass the more detailed checks later on.
>  
> Right -- such a transaction, having acquired a good snapshot, could
> release all SSI resources and run without any of the SSI overhead.

Yes, this makes sense. If no running transaction has ever read, and
will never read before COMMIT, any value that's modified by a
concurrent transaction, then no snapshot anomalies can arise, and the
current snapshot has a place in the serial ordering.

> > With this scheme, you'd at least stand some chance of eventually
> > acquiring a consistent snapshot, even in the case of an endless
> > stream of overlapping READ WRITE transactions.
>  
> Yeah, I'd been twisting ideas around trying to find a good way to do
> this; you've got it right at the conceptual level, I think.

The only thing I'm worried about here is how much risk of starvation
remains. You'd need to wait until there are no running r/w transactions
accessing overlapping data sets; for some applications that might not
be any better than waiting for the system to be idle. But I think
there's no way around that, it's just the price you have to pay to get
a snapshot that can never see an anomaly.

> Pseudo-code of idea (conveniently ignoring locking issues and
> non-serializable transactions):

This seems reasonable to me. Let me know if you need help implementing
it; I have some spare cycles right now.

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/


Re: serializable read only deferrable

From
"Kevin Grittner"
Date:
Dan Ports  wrote:
> On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote:
> The only thing I'm worried about here is how much risk of
> starvation remains. You'd need to wait until there are no running
> r/w transactions accessing overlapping data sets; for some
> applications that might not be any better than waiting for the
> system to be idle. But I think there's no way around that, it's
> just the price you have to pay to get a snapshot that can never see
> an anomaly.
Right -- this can't be the default behavior because of that, but it
rounds out the options for backups and big reports.  Without it you
must choose between the potential for other transactions to be
cancelled because the READ ONLY transaction completed a dangerous
cycle, and getting a view of the data which may not be consistent
with the later state of the database[1].  This guarantees consistency
without causing rollbacks, with the additional benefit of faster
runtime from skipping the SSI logic.
>> Pseudo-code of idea (conveniently ignoring locking issues and
>> non-serializable transactions):
>
> This seems reasonable to me. Let me know if you need help
> implementing it; I have some spare cycles right now.
That would be great.  I'm getting on a train today to go spend a week
on vacation in New Orleans, and I've been fretting about where this
patch is at compared to the release cycle.  :-(  I can suck down my
hurricanes with a calmer mind if I know you're on this.  :-)
In conjunction with this feature, it would be great if you could take
a look at how to recognize these conditions for a READ ONLY
transaction which is running under SSI, and back it out of SSI when
it hits that condition.  SIRead predicate locks, conflicts, and other
structures can be released, and we can stop checking the MVCC data on
reads.  Basically, we should be able to get to the DEFERRABLE type of
state while running -- it's just that we might cause some number of
transactions to cancel along the way before we hit that state. (These
two seem likely to be less work if done at the same time.)
-Kevin
[1] It has struck me that the receipting example is one case of a
more general pattern which I've frequently seen in business software
which is vulnerable to this sort of anomaly -- batch processing.
Basically, any time you have a control record which controls the
batch into which detail is placed, if the control information is
updated and that is committed while detail is still in flight, you
can have this class of anomaly.
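For concreteness, the control-record pattern described in [1] might look like this (hypothetical schema; the transaction interleaving in the comments is what produces the anomaly under snapshot isolation):

```sql
-- A single-row control table directs detail into the current batch:
CREATE TABLE control (deposit_no int NOT NULL);
CREATE TABLE receipt (deposit_no int NOT NULL, amount numeric NOT NULL);

-- T1 inserts detail using the current control value:
--   INSERT INTO receipt SELECT deposit_no, 100.00 FROM control;
-- T2 concurrently closes the batch and commits first:
--   UPDATE control SET deposit_no = deposit_no + 1;
-- A report that reads the "closed" batch after T2 commits, but before
-- T1 commits, can miss T1's receipt -- this class of anomaly.
```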