Thread: SKIP LOCKED DATA (work in progress)
Hi
A couple of years ago I posted an outline of a plan [1] and an
initial patch [2] for implementing SKIP LOCKED DATA. I have
recently come back to this idea, rebased the patch and added a
simple isolation test -- please see attached.
simple isolation test -- please see attached.
However, heap_lock_tuple is clearly not for the faint hearted,
and I freely admit that I don't understand half of the things
going on in there yet. My general approach has been to follow
the example set by NOWAIT, generalising that flag into a 3-way
policy... but of course NOWAIT doesn't have to worry about
cleaning anything up, because it uses ereport, hence my TODOs.
As always, I would be grateful for any feedback.
Thanks!
Thanks!
Thomas Munro
Attachment
On 05/14/2014 07:06 AM, Thomas Munro wrote: > Hi > > A couple of years ago I posted an outline of a plan [1] and an > initial patch [2] for implementing SKIP LOCKED DATA. I have > recently come back to this idea, rebased the patch and added a > simple isolation test -- please see attached. Simon did some similar work a while ago, so I've cc'd him for his input. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 13 May 2014 23:06, Thomas Munro <munro@ip9.org> wrote:
A couple of years ago I posted an outline of a plan [1] and aninitial patch [2] for implementing SKIP LOCKED DATA. I haverecently come back to this idea, rebased the patch and added a
simple isolation test -- please see attached.
I have attached a new version which copies the skipLocked member
in _copyPlanRowMark, _copyRowMarkClause and _copyLockClause.
Without these, SKIP LOCKED DATA didn't work in pl/pgsql (as reported
by David Rowley off-list). (Also some whitespace changes.)
Best regards,
Thomas Munro
in _copyPlanRowMark, _copyRowMarkClause and _copyLockClause.
Without these, SKIP LOCKED DATA didn't work in pl/pgsql (as reported
by David Rowley off-list). (Also some whitespace changes.)
Best regards,
Thomas Munro
Attachment
I'm rebasing another implementation of this against current HEAD at the moment. It was well tested but has bitrotted a bit, in particular it needs merging with the multixact changes (eep). That should provide a useful basis for comparison and a chance to share ideas. I'll follow up with the patch and a git tree when it's ready, hopefully tonight. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 05/16/2014 04:46 PM, Craig Ringer wrote: > > I'll follow up with the patch and a git tree when it's ready, hopefully > tonight. Here's a rebased version of Simon's original patch that runs on current master. I still need to merge the isolation tests for it merged and sorted out, and after re-reading it I'd like to change "waitMode" into an enum, not just some #defines . Hope it's useful for comparison and ideas. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 16 May 2014 13:21, Craig Ringer <craig@2ndquadrant.com> wrote:
Thank you! At first glance they're sort of similar which is reassuring. I'm especially interested in the buffer release semantics which I was confused about and will look into that (to resolve the TODO notes in my patch).
I noticed that in applyLockingClause, Simon has "rc->waitMode |= waitMode". Is that right? The values are 0, 1, and 2, but if you had both NOWAIT and SKIP LOCKED somewhere in your query you could up with rc->waitMode == 3 unless I'm mistaken. In my patch I have code that would give precedence to NOWAIT, though looking at it again it could be simpler. (One reviewer pointed out, that it should really be a single unified enum. In fact I have been a bit unsure about what scope such an enumeration should have in the application -- could it even be used in parser code? I tried to follow existing examples which is why I used #define macros in gram.y).
From a bikeshed colour point of view:
* I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like Oracle, and I guess shorter is sweeter
* I used the term wait_policy and an enum, Simon used waitMode and an int
* I had noWait and skipLocked travelling separately in some places, Simon had a single parameter, which is much better
Best regards,
Thomas Munro
On 05/16/2014 04:46 PM, Craig Ringer wrote:Here's a rebased version of Simon's original patch that runs on current
>
> I'll follow up with the patch and a git tree when it's ready, hopefully
> tonight.
master.
I still need to merge the isolation tests for it merged and sorted out,
and after re-reading it I'd like to change "waitMode" into an enum, not
just some #defines .
Hope it's useful for comparison and ideas.
Thank you! At first glance they're sort of similar which is reassuring. I'm especially interested in the buffer release semantics which I was confused about and will look into that (to resolve the TODO notes in my patch).
I noticed that in applyLockingClause, Simon has "rc->waitMode |= waitMode". Is that right? The values are 0, 1, and 2, but if you had both NOWAIT and SKIP LOCKED somewhere in your query you could up with rc->waitMode == 3 unless I'm mistaken. In my patch I have code that would give precedence to NOWAIT, though looking at it again it could be simpler. (One reviewer pointed out, that it should really be a single unified enum. In fact I have been a bit unsure about what scope such an enumeration should have in the application -- could it even be used in parser code? I tried to follow existing examples which is why I used #define macros in gram.y).
From a bikeshed colour point of view:
* I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like Oracle, and I guess shorter is sweeter
* I used the term wait_policy and an enum, Simon used waitMode and an int
* I had noWait and skipLocked travelling separately in some places, Simon had a single parameter, which is much better
Best regards,
Thomas Munro
On 05/17/2014 05:24 AM, Thomas Munro wrote: > I noticed that in applyLockingClause, Simon has "rc->waitMode |= > waitMode". Is that right? The values are 0, 1, and 2, but if you had > both NOWAIT and SKIP LOCKED somewhere in your query you could up with > rc->waitMode == 3 unless I'm mistaken. I do not think that |= is correct there. It may be that no case can arise where you get the bogus value, but since in all other places the values are tested for equalty not as bit fields ( waitMode == NOWAIT not waitMode & NOWAIT ) it doesn't make sense to |= it there. ? In my patch I have code that > would give precedence to NOWAIT, though looking at it again it could be > simpler. I agree; if NOWAIT is present anywhere it should be preferred to SKIP_LOCKED, as it's OK to apply NOWAIT where SKIP LOCKED appears, but possibly semantically incorrect to apply SKIP LOCKED where only NOWAIT was expected. > (One reviewer pointed out, that it should really be a single > unified enum. In fact I have been a bit unsure about what scope such an > enumeration should have in the application -- could it even be used in > parser code? I tried to follow existing examples which is why I used > #define macros in gram.y). Not sure there. > From a bikeshed colour point of view: > * I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like > Oracle, and I guess shorter is sweeter We have a long tradition of trying to allow noise keywords where it's harmless. So the clause should probably be SKIP LOCKED [DATA] in much the same way we have BEGIN [ WORK | TRANSACTION ] ... There won't be any ambiguity there. > * I used the term wait_policy and an enum, Simon used waitMode and an int I prefer an enum and intended to change Simon's patch but didn't have the time. I have some isolation tester and regression tests that are still to follow. > * I had noWait and skipLocked travelling separately in some places, > Simon had a single parameter, which is much better Yes, I strongly prefer that. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 17 May 2014 05:02, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 05/17/2014 05:24 AM, Thomas Munro wrote:
> > I noticed that in applyLockingClause, Simon has "rc->waitMode |= > > waitMode". Is that right? The values are 0, 1, and 2, but if you had
> > both NOWAIT and SKIP LOCKED somewhere in your query you could up with
> > rc->waitMode == 3 unless I'm mistaken.
>
> I do not think that |= is correct there.
>
> It may be that no case can arise where you get the bogus value, but
> since in all other places the values are tested for equalty not as bit
> fields ( waitMode == NOWAIT not waitMode & NOWAIT ) it doesn't make
> sense to |= it there.
There are indeed ways (subselects etc) to have more than one
locking clause applying to the same table, which are supposed to
be merged into the appropriate wait policy by that line.
> ? In my patch I have code that
> > would give precedence to NOWAIT, though looking at it again it could be
> > simpler.
>
> I agree; if NOWAIT is present anywhere it should be preferred to > SKIP_LOCKED, as it's OK to apply NOWAIT where SKIP LOCKED appears, but
> possibly semantically incorrect to apply SKIP LOCKED where only NOWAIT
> was expected.
In the new v5 patch (attached) I used the same approach as
the "lock strength" support, that is, relying on the values of
the enumeration being arranged so that the enumerators that
should take precendence have higher numerical values so I could
just write:
rc->strength = Max(rc->strength, strength);
+ rc->waitPolicy = Max(rc->waitPolicy, waitPolicy);
> > (One reviewer pointed out, that it should really be a single
> > unified enum. In fact I have been a bit unsure about what scope such an
> > enumeration should have in the application -- could it even be used in
> > parser code? I tried to follow existing examples which is why I used
> > #define macros in gram.y).
>
> Not sure there.
I removed those macros and went for an enumeration -- see below.
> > From a bikeshed colour point of view:
> > * I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like
> > Oracle, and I guess shorter is sweeter
>
> We have a long tradition of trying to allow noise keywords where it's
> harmless.
>
> So the clause should probably be
>
> SKIP LOCKED [DATA]
Ok, done in the v5 patch.
> I have some isolation tester and regression tests that are still to follow.
Great, I am looking forward to seeing those. I have added an
isolation test for NOWAIT, and I am planning to design some tests
that will test the various affected branched of
heap_lock_tuple (eg multi-transaction ID etc), and test
interaction with various other features (various lock stregths,
UPDATEs, combinations of SKIP LOCKED DATA, NOWAIT and default
policies etc).
> > * I had noWait and skipLocked travelling separately in some places,
> > Simon had a single parameter, which is much better
>
> Yes, I strongly prefer that.
I agree in principal but couldn't decide *how many* enumerations
to use. In v5 I have added two more:
* one called LockClauseWaitPolicy defined in parsenodes.h that is
used in LockClause and RowMarkCause
* another called RowWaitPolicy defined in plannodes.h that is
used in PlanRowMark and ExecRowMark
The parser produces LockClauseWaitPolicy values, which InitPlan
converts to RowWaitPolicy values by simple assignment (the values
are the same, which obviously wouldn't work if this were compiled
as C++). Then ExecLockRows converts those into LockWaitPolicy
values when calling heal_lock_tuple.
Obviously there is an opportunity to unify all three, or at least
LockClauseWaitPolicy (parser) and RowWaitPolicy (plan and
execution), but I didn't know how appropriate that would be, and
followed the new lock strength feature for inspiration.
One difference between my patch and Simon's is that mine
introduces a new return value for heap_lock_type called
HeapTupleWouldBlock, whereas his returns the existin
HeapTupleBeingUpdated and then handles it differently in
ExecLockRows if in SKIP LOCKED mode.
Best regards,
Thomas Munro
Attachment
Hello
As a simple example for people wondering what the point of this
feature is, I created a table work (id, data, status)
and then create 10,000 items with status 'NEW' and then started
a number of worker threads that did the following pair of
As a simple example for people wondering what the point of this
feature is, I created a table work (id, data, status)
and then create 10,000 items with status 'NEW' and then started
a number of worker threads that did the following pair of
transactions, with and without SKIP LOCKED DATA on the end of the
SELECT statement, until all rows were deleted:
BEGIN
SELECT id, data FROM work WHERE status = 'NEW' LIMIT 1 FOR UPDATE
-- if no rows returned, then finish
UPDATE work SET status = 'WORK' WHERE id = $id
COMMIT
BEGIN
DELETE FROM work WHERE id = $id
COMMIT
Here are the times taken to process all items, in elapsed seconds, on
a slow laptop (i5 4 core with an SSD, with default postgresql.conf
except for checkpoint_segments set to 300):
| Threads | default | SKIP |
| 1 | 26.78 | 27.02 |
| 2 | 23.46 | 22.00 |
| 3 | 22.02 | 14.83 |
| 4 | 22.59 | 11.16 |
| 5 | 22.37 | 9.05 |
| 6 | 22.55 | 7.66 |
| 7 | 22.46 | 6.69 |
| 8 | 22.57 | 8.39 |
| 9 | 22.40 | 8.38 |
| 10 | 22.38 | 7.93 |
| 11 | 22.43 | 6.86 |
| 12 | 22.34 | 6.77 |
I am not experienced at benchmarking and I don't claim that this
particular workload or configuration is particularly sensible or
representative of anything but it might give some idea of the
motivation.
Best regards,
Thomas Munro
On Sat, May 17, 2014 at 1:02 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > We have a long tradition of trying to allow noise keywords where it's > harmless. > > So the clause should probably be > > SKIP LOCKED [DATA] > > in much the same way we have > > BEGIN [ WORK | TRANSACTION ] ... > > There won't be any ambiguity there. We've had some problems in the past where allowing optional noise words resulted in grammar conflicts that made future features harder to add. See previous discussions about LOCK TABLE, wherein we almost went to the extreme of adding a completely separate ACQUIRE LOCK command. A lot of these things seem harmless when you first do them, and then later they seem less harmless. Anyway, +1 for the general idea of this feature. It's come up a number of times on this mailing list, and we've had customer requests for it, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, May 17, 2014 at 1:02 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> We have a long tradition of trying to allow noise keywords where it's >> harmless. >> >> So the clause should probably be >> >> SKIP LOCKED [DATA] >> >> in much the same way we have >> >> BEGIN [ WORK | TRANSACTION ] ... >> >> There won't be any ambiguity there. > We've had some problems in the past where allowing optional noise > words resulted in grammar conflicts that made future features harder > to add. In this particular case, I'd be worried about whether we'd not end up having to fully reserve DATA in order to allow it to be optional here. That would be necessary if this clause could be followed immediately by an identifier, either now or in the future. That would be a mighty high price to pay for failing to make up our minds about which syntax to use. (How many tables out there do you think have "data" as a column name?) A different concern is that this patch adds not one but two new unreserved keywords, ie SKIP and LOCKED. That bloats our parser tables, which are too darn large already, and it has a nonzero compatibility cost (since we only allow AS-less column aliases when they are no keyword at all). If we're pulling syntax out of the air it'd be nice if we could avoid adding new keywords to the grammar. regards, tom lane
<div dir="ltr">On 23 May 2014 15:40, Tom Lane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br/>> A different concern is that this patch adds not one but two new unreserved<br />> keywords, ie SKIP andLOCKED. That bloats our parser tables, which are<br /> > too darn large already, and it has a nonzero compatibilitycost (since<br />> we only allow AS-less column aliases when they are no keyword at all).<br />> If we'repulling syntax out of the air it'd be nice if we could avoid<br /> > adding new keywords to the grammar.<br /> <br/>How about some of these combinations of existing words:<br /><br />EXCLUDE LOCK<br />NOWAIT EXCLUDE<br />NOWAIT NEXT<br/>NOWAIT FOLLOWING<br />NOWAIT DISCARD<br /><br />Of those I think I prefer NOWAIT EXCLUDE (perhaps with NOWAIT ABORTas a long version of the existing NOWAIT behaviour for contrast).<br /><br />Or adding just one new keyword:<br /><br/>NOWAIT SKIP<br />SKIP LOCK<br /><br />Regards,<br />Thomas Munro</div>
On 23 May 2014 10:40, Tom Lane <tgl@sss.pgh.pa.us> wrote: > If we're pulling syntax out of the air it'd be nice if we could avoid > adding new keywords to the grammar. Oracle, SQLServer and DB2 have this capability. MySQL does not. SQLServer implements that using the table hint of READPAST. Since that whole syntax area is radically different to what we have, it isn't easy to maintain code compatibility. DB2 z/OS 10 provides SKIP LOCKED DATA clause to allow moving past already locked rows. That's fairly recent and I don't believe there will be many programs using that. DB2 UDB supports some complex functionality using DB2_SKIPINSERTED, DB2_EVALUNCOMMITTED and DB2_SKIPDELETED, all of which is complex and mostly exists for benchmarks, AFAICS. Oracle uses both SKIP LOCKED and NOWAIT. PostgreSQL already chose to follow the Oracle syntax when we implemented NOWAIT. So my proposal is that we follow the Oracle syntax again and use the words SKIP LOCKED. I don't see any advantage in inventing new syntax that leaves us incompatible with Oracle, nor do I see any need to be compatible with both Oracle and DB2 since the latter is much less likely to gain us anything in practice. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
2014-05-23 21:24 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
+1
On 23 May 2014 10:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we're pulling syntax out of the air it'd be nice if we could avoid
> adding new keywords to the grammar.
Oracle, SQLServer and DB2 have this capability. MySQL does not.
SQLServer implements that using the table hint of READPAST. Since that
whole syntax area is radically different to what we have, it isn't
easy to maintain code compatibility.
DB2 z/OS 10 provides SKIP LOCKED DATA clause to allow moving past
already locked rows. That's fairly recent and I don't believe there
will be many programs using that. DB2 UDB supports some complex
functionality using DB2_SKIPINSERTED, DB2_EVALUNCOMMITTED and
DB2_SKIPDELETED, all of which is complex and mostly exists for
benchmarks, AFAICS.
Oracle uses both SKIP LOCKED and NOWAIT.
PostgreSQL already chose to follow the Oracle syntax when we
implemented NOWAIT. So my proposal is that we follow the Oracle syntax
again and use the words SKIP LOCKED.
I don't see any advantage in inventing new syntax that leaves us
incompatible with Oracle, nor do I see any need to be compatible with
both Oracle and DB2 since the latter is much less likely to gain us
anything in practice.
+1
Pavel
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 23, 2014 at 3:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > PostgreSQL already chose to follow the Oracle syntax when we > implemented NOWAIT. So my proposal is that we follow the Oracle syntax > again and use the words SKIP LOCKED. > > I don't see any advantage in inventing new syntax that leaves us > incompatible with Oracle, nor do I see any need to be compatible with > both Oracle and DB2 since the latter is much less likely to gain us > anything in practice. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 23 May 2014 21:18, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, May 23, 2014 at 3:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> PostgreSQL already chose to follow the Oracle syntax when we >> implemented NOWAIT. So my proposal is that we follow the Oracle syntax >> again and use the words SKIP LOCKED. >> >> I don't see any advantage in inventing new syntax that leaves us >> incompatible with Oracle, nor do I see any need to be compatible with >> both Oracle and DB2 since the latter is much less likely to gain us >> anything in practice. > > +1. Hello Please find attached a rebased version of my SKIP LOCKED patch (formerly SKIP LOCKED DATA), updated to support only the Oracle-like syntax. I posted a small test program here: https://github.com/macdice/skip-locked-tests Would anyone who is interested in a SKIP LOCKED feature and attending the CHAR(14)/PGDay UK conference next week be interested in a birds-of-a-feather discussion? Best regards, Thomas Munro
Attachment
On 29 June 2014 10:01, Thomas Munro <munro@ip9.org> wrote: > Would anyone who is interested in a SKIP LOCKED feature and > attending the CHAR(14)/PGDay UK conference next week be > interested in a birds-of-a-feather discussion? Sounds like a plan. I'll check my schedule. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Jun 29, 2014 at 9:01 PM, Thomas Munro <munro@ip9.org> wrote:
Please find attached a rebased version of my SKIP LOCKED
patch (formerly SKIP LOCKED DATA), updated to support only the
Oracle-like syntax.
Hi Thomas,
Apologies for taking this long to get to reviewing this, I'd gotten a bit side tracked with my own patches during this commitfest.
I'm really glad to see this patch is back again. I think it will be very useful for processing queues. I could have made good use of it in my last work, using it for sending unsent emails which were "queued" up in a table in the database.
I've so far read over the patch and done only some brief tests of the functionality.
Here's what I've picked up on so far:
* In heap_lock_tuple() there's a few places where you test the wait_policy, but you're not always doing this in the same order. The previous code did if (nolock) first each time, but since there's now 3 values I think if (wait_policy == LockWaitError) should be first each time as likely this is the most common case.
* The following small whitespace change can be removed in gram.y:
@@ -119,7 +119,6 @@ typedef struct PrivTarget
#define CAS_NOT_VALID 0x10
#define CAS_NO_INHERIT 0x20
-
#define parser_yyerror(msg) scanner_yyerror(msg, yyscanner)
#define parser_errposition(pos) scanner_errposition(pos, yyscanner)
* analyze.c. rc->waitPolicy = Max(rc->waitPolicy, waitPolicy);
I'm not quite sure I understand this yet, but I'm guessing the original code was there so that something like:
SELECT * FROM (SELECT * FROM a FOR UPDATE) a FOR UPDATE NOWAIT;
Would give:
ERROR: could not obtain lock on row in relation "a"
So it seems that with the patch as you've defined in by the order of the enum values in LockClauseWaitPolicy that SKIP LOCKED overrides FOR UPDATE. I'm wondering if this is dangerous.
Should the following really skip locked tuples?
SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE LIMIT 1;
But on the other hand perhaps I've missed a discussion on this, if so then I think the following comment should be updated to explain it all:
* We also consider that NOWAIT wins if it's specified both ways. This
* is a bit more debatable but raising an error doesn't seem helpful.
* (Consider for instance SELECT FOR UPDATE NOWAIT from a view that
* internally contains a plain FOR UPDATE spec.)
*
* plannodes.h -> RowWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA options */
Should be "NOWAIT and SKIP LOCKED options" since DATA has now been removed from the syntax.
* nodeLockRow.c has extra space in if condition: else if (erm->waitPolicy == RWP_SKIP )
I'm also wondering about this block of code in general:
if (erm->waitPolicy == RWP_WAIT)
wait_policy = LockWaitBlock;
else if (erm->waitPolicy == RWP_SKIP )
wait_policy = LockWaitSkip;
else /* erm->waitPolicy == RWP_NOWAIT */
wait_policy = LockWaitError;
Just after this code heap_lock_tuple() is called, and if that returns HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that whole block again. I'm wondering why there's 2 enums that are for the same thing? if you just had 1 then you'd not need this block of code at all, you could just pass erm->waitPolicy to heap_lock_tuple().
* parsenodes.h comment does not meet project standards (http://www.postgresql.org/docs/devel/static/source-format.html)
typedef enum LockClauseWaitPolicy
{
/* order is important (see applyLockingClause which takes the greatest
value when several wait policies have been specified), and values must
match RowWaitPolicy from plannodes.h */
* parsenode.h remove "DATA" from LockClauseWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA */
I have noticed the /* TODO -- work out what needs to be released here */ comments in head_lock_tuple(), and perhaps the lack of cleaning up is what is causing the following:
create table skiptest (id int primary key); insert into skiptest (id) select x.x from generate_series(1,1000000) x(x);
Session 1: begin work; select * from skiptest for update limit 999999;
Session 2: select * from skiptest for update skip locked limit 1;
WARNING: out of shared memory
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
Yet if I do:
session 1: begin work; select * from skiptest where id > 1 for update;
session 2: select * from skiptest for update skip locked limit 1;
id
----
1
(1 row)
That test makes me think your todo comments are in the right place, something is missing there for sure!
* plays about with patch for a bit *
I don't quite understand how heap_lock_tuple works, as if I debug session 2 from the above set of queries the first call to ConditionalLockTupleTuplock() (heapam.c line 4236) succeeds where I'd have thought it would fail, since session 1 should be locking this tuple? Later at line 4527 on the line if (!ConditionalXactLockTableWait(xwait)), it fails to grab the lock and returns HeapTupleWouldBlock. The above query seems to work ok if I just apply the following patch:
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b661d0e..b3e9dcc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4525,8 +4525,10 @@ l3:
else /* wait_policy == LockWaitSkip */
{
if (!ConditionalXactLockTableWait(xwait))
- /* TODO -- work out what needs to be released here */
+ {
+ UnlockTupleTuplock(relation, tid, mode);
return HeapTupleWouldBlock;
+ }
}
/* if there are updates, follow the update chain */
Running the test query again, I get:
select * from skiptest for update skip locked limit 1;
id
---------
1000000
(1 row)
Would you also be able to supply a rebased patch with current master? The applying the patch is a bit noisy and the isolation tests part does not seem to apply at all now, so I've not managed to look at those yet.
I've still got more to look at, but hopefully this will get things moving again.
Regards
David Rowley
On 23 July 2014 13:15, David Rowley <dgrowleyml@gmail.com> wrote: > * In heap_lock_tuple() there's a few places where you test the wait_policy, > but you're not always doing this in the same order. The previous code did if > (nolock) first each time, but since there's now 3 values I think if > (wait_policy == LockWaitError) should be first each time as likely this is > the most common case. Ok, I have made the them consistent -- though I put LockWaitBlock first as that is surely the most common case (plain old FOR UPDATE). > * The following small whitespace change can be removed in gram.y: > [snip] Fixed. > * analyze.c. rc->waitPolicy = Max(rc->waitPolicy, waitPolicy); > I'm not quite sure I understand this yet, but I'm guessing the original code > was there so that something like: > > SELECT * FROM (SELECT * FROM a FOR UPDATE) a FOR UPDATE NOWAIT; > Would give: > ERROR: could not obtain lock on row in relation "a" Surely only if another session already has one or more tuples locked? > So it seems that with the patch as you've defined in by the order of the > enum values in LockClauseWaitPolicy that SKIP LOCKED overrides FOR UPDATE. > I'm wondering if this is dangerous. My understanding was that we have to choose a single policy for attempts to lock rows in that relation, and if you've stated in at least one place that you don't want to wait, then it makes sense for NOWAIT to take precedence. I had to figure out how to fit SKIP LOCKED into that hierarchy, and figured that NOWAIT should still be the strongest policy, overriding all others, and SKIP LOCKED should overrule default FOR UPDATE (ie blocking). > Should the following really skip locked tuples? > SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE LIMIT 1; I agree that it's somewhat arbitrary... I suppose the two most obvious options are either to reduce to a single policy using some rule (such as the NOWAIT > SKIP LOCKED > default hierarchy I propose), or simply reject such queries (which I'm guessing wouldn't fly at this point since existing valid queries would become invalid). > But on the other hand perhaps I've missed a discussion on this, if so then I > think the following comment should be updated to explain it all: > > * We also consider that NOWAIT wins if it's specified both ways. This > * is a bit more debatable but raising an error doesn't seem helpful. > * (Consider for instance SELECT FOR UPDATE NOWAIT from a view that > * internally contains a plain FOR UPDATE spec.) Modified to address SKIP LOCKED (there are also couple of words to the same effect in select.sgml). > * plannodes.h -> RowWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA > options */ > Should be "NOWAIT and SKIP LOCKED options" since DATA has now been removed > from the syntax. > > * nodeLockRow.c has extra space in if condition: else if (erm->waitPolicy == > RWP_SKIP ) Fixed. > I'm also wondering about this block of code in general: > > if (erm->waitPolicy == RWP_WAIT) > wait_policy = LockWaitBlock; > else if (erm->waitPolicy == RWP_SKIP ) > wait_policy = LockWaitSkip; > else /* erm->waitPolicy == RWP_NOWAIT */ > wait_policy = LockWaitError; > > Just after this code heap_lock_tuple() is called, and if that returns > HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that > whole block again. I'm wondering why there's 2 enums that are for the same > thing? if you just had 1 then you'd not need this block of code at all, you > could just pass erm->waitPolicy to heap_lock_tuple(). True. Somewhere upthread I mentioned the difficulty I had deciding how many enumerations were needed, for the various subsystems, ie which headers and type they were allowed to share. Then I put off working on this for so long that a nice example came along that showed me the way: the lock strength enums LockTupleMode (heapam.h) and RowMarkType (plannodes.h). The wait policy enums LockWaitPolicy (heapam.h) and RowWaitPolicy (plannodes.h) mirror them closely, and the same type of enumeration translation takes place in nodeLockRows.c immediately above the code you pasted. I don't have any more principled argument than monkey-see-monkey-do for that one... > * parsenodes.h comment does not meet project standards > (http://www.postgresql.org/docs/devel/static/source-format.html) > > typedef enum LockClauseWaitPolicy > { > /* order is important (see applyLockingClause which takes the greatest > value when several wait policies have been specified), and values must > match RowWaitPolicy from plannodes.h */ Fixed. > * parsenode.h remove "DATA" from LockClauseWaitPolicy waitPolicy; /* NOWAIT > and SKIP LOCKED DATA */ Fixed. > I have noticed the /* TODO -- work out what needs to be released here */ > comments in head_lock_tuple(), and perhaps the lack of cleaning up is what > is causing the following: > > create table skiptest (id int primary key); insert into skiptest (id) select > x.x from generate_series(1,1000000) x(x); > > Session 1: begin work; select * from skiptest for update limit 999999; > Session 2: select * from skiptest for update skip locked limit 1; > WARNING: out of shared memory > ERROR: out of shared memory > HINT: You might need to increase max_locks_per_transaction. > > Yet if I do: > > session 1: begin work; select * from skiptest where id > 1 for update; > session 2: select * from skiptest for update skip locked limit 1; > id > ---- > 1 > (1 row) > > That test makes me think your todo comments are in the right place, > something is missing there for sure! > > * plays about with patch for a bit * > > I don't quite understand how heap_lock_tuple works, as if I debug session 2 > from the above set of queries the first call to > ConditionalLockTupleTuplock() (heapam.c line 4236) succeeds where I'd have > thought it would fail, since session 1 should be locking this tuple? Later > at line 4527 on the line if (!ConditionalXactLockTableWait(xwait)), it fails > to grab the lock and returns HeapTupleWouldBlock. The above query seems to > work ok if I just apply the following patch: > > diff --git a/src/backend/access/heap/heapam.c > b/src/backend/access/heap/heapam.c > index b661d0e..b3e9dcc 100644 > --- a/src/backend/access/heap/heapam.c > +++ b/src/backend/access/heap/heapam.c > @@ -4525,8 +4525,10 @@ l3: > else /* wait_policy == LockWaitSkip */ > { > if > (!ConditionalXactLockTableWait(xwait)) > - /* TODO -- work out what > needs to be released here */ > + { > + UnlockTupleTuplock(relation, > tid, mode); > return HeapTupleWouldBlock; > + } > } > > /* if there are updates, follow the update > chain */ > > Running the test query again, I get: > select * from skiptest for update skip locked limit 1; > id > --------- > 1000000 > (1 row) Ahh! Ok, take a look now, I included that change, but also made the unlocking conditional on have_tuple_lock, in the second two cases where HeapTupleWouldBlock is returned. In the first case it doesn't look possible for it to be true. > Would you also be able to supply a rebased patch with current master? The > applying the patch is a bit noisy and the isolation tests part does not seem > to apply at all now, so I've not managed to look at those yet. Done, and attached with the above changes. > I've still got more to look at, but hopefully this will get things moving > again. Thank you very much for taking the time to look at this! Best regards, Thomas Munro
Attachment
On 24 July 2014 00:52, Thomas Munro <munro@ip9.org> wrote: > On 23 July 2014 13:15, David Rowley <dgrowleyml@gmail.com> wrote: >> I'm also wondering about this block of code in general: >> >> if (erm->waitPolicy == RWP_WAIT) >> wait_policy = LockWaitBlock; >> else if (erm->waitPolicy == RWP_SKIP ) >> wait_policy = LockWaitSkip; >> else /* erm->waitPolicy == RWP_NOWAIT */ >> wait_policy = LockWaitError; >> >> Just after this code heap_lock_tuple() is called, and if that returns >> HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that >> whole block again. I'm wondering why there's 2 enums that are for the same >> thing? if you just had 1 then you'd not need this block of code at all, you >> could just pass erm->waitPolicy to heap_lock_tuple(). > > True. Somewhere upthread I mentioned the difficulty I had deciding > how many enumerations were needed, for the various subsystems, ie > which headers and type they were allowed to share. Then I put off > working on this for so long that a nice example came along that showed > me the way: the lock strength enums LockTupleMode (heapam.h) and > RowMarkType (plannodes.h). The wait policy enums LockWaitPolicy > (heapam.h) and RowWaitPolicy (plannodes.h) mirror them closely, and > the same type of enumeration translation takes place in nodeLockRows.c > immediately above the code you pasted. I don't have any more > principled argument than monkey-see-monkey-do for that one... On reflection, I agree that this sucks, and I would like to unify the three new enums in the current patch (see below for recap) into one that can be passed between parser, planner, executor and heap access manager code as I think you are suggesting. My only question is: in which header should the enum be defined, that all those modules could include? Best regards, Thomas Munro Enumeration explosion recap: * parsenode.h defines enum LockClauseWaitPolicy, which is used in the structs LockClause and RowMarkClause, for use by theparser code * plannodes.h defines enum RowWaitPolicy, which is used in the structs PlanRowMark and ExecRowMark, for use by the plannerand executor code (numbers coincide with LockClauseWaitPolicy) * heapam.h defines enum LockWaitPolicy, which is used as a parameter to heap_lock_tuple, for use by heap access code The parser produces LockClauseWaitPolicy values. InitPlan converts these to RowWaitPolicy values in execMain.c. Then nodeLockRows.c converts RowWaitPolicy values to LockWaitPolicy values (by if-then-else) so it can call heap_lock_tuple. This roughly mirrors what happens to lock strength information. The unpatched code simply passes a boolean 'nowait' flag around. An earlier version of my patch passed a pair of booleans around. Simon's independent patch[1] uses an int in the various node structs and the heap_lock_tuple function, and in execNode.h it has macros to give names to the values, and that is included by access/heapm.h. [1] http://www.postgresql.org/message-id/537610D5.3090405@2ndquadrant.com
On 24 July 2014 00:52, Thomas Munro <munro@ip9.org> wrote: > On 23 July 2014 13:15, David Rowley <dgrowleyml@gmail.com> wrote: >> I'm also wondering about this block of code in general: >> >> if (erm->waitPolicy == RWP_WAIT) >> wait_policy = LockWaitBlock; >> else if (erm->waitPolicy == RWP_SKIP ) >> wait_policy = LockWaitSkip; >> else /* erm->waitPolicy == RWP_NOWAIT */ >> wait_policy = LockWaitError; >> >> Just after this code heap_lock_tuple() is called, and if that returns >> HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that >> whole block again. I'm wondering why there's 2 enums that are for the same >> thing? if you just had 1 then you'd not need this block of code at all, you >> could just pass erm->waitPolicy to heap_lock_tuple(). > > True. Somewhere upthread I mentioned the difficulty I had deciding > how many enumerations were needed, for the various subsystems, ie > which headers and type they were allowed to share. [...] I tried getting rid of the offending if-then-else enum conversion code and replaced it with a simple assignment -- please see attached. I also added compile time assertions that the enum values line up to make that work, and are correctly ordered for use in that 'Max' expression. Please let me know if you think this is an improvement or an abomination. I couldn't find an existing reasonable place to share a single wait policy enumeration between parser/planner/executor and the heap access module, and I get the feeling that it would be unacceptable to introduce one. I suppose that the LockClauseWaitPolicy and RowWaitPolicy could at least be merged into a single enum defined in nodes.h following the example of CmdType, which is also used by both parsenodes.h and plannnode.h, but do I detect a tiny hint of reluctance in its comment, "so put it here..."? (The attached patch also has a couple of trivial typo fixes in documentation and comments). Best regards, Thomas Munro
Attachment
On Sat, Jul 26, 2014 at 9:34 PM, Thomas Munro <munro@ip9.org> wrote:
I couldn't find an existing reasonable place to share a single waitpolicy enumeration between parser/planner/executor and the heap access
module, and I get the feeling that it would be unacceptable to
introduce one.
I guess the way I justify it in my head is something like, "the 3 enums are for the same purpose, so having 3 exist all with different names is confusing and it makes the code harder to follow". So to fix that up I think, "oh we can just give them all the same name... But then, how can be we be sure each definition matches the other 2?" ... hmm, "just merge it into one and put it somewhere that can be accessed from everywhere."
Saying that I don't know what the project best practises are for locations for sharing such things, but if nothing exists then maybe this would be a good time to invent somewhere.
Maybe someone with more experience can chime in and give advice on this?
Regards
David Rowley
Thomas Munro <munro@ip9.org> writes: > I couldn't find an existing reasonable place to share a single wait > policy enumeration between parser/planner/executor and the heap access > module, and I get the feeling that it would be unacceptable to > introduce one. There is a precedent in the form of AclMode, which is needed throughout the system and is currently declared in parsenodes.h. I can't say I've ever been particularly pleased with that arrangement though, since it forces inclusion of parsenodes.h in many places that might not otherwise have any interest in parse nodes. It might be better if we'd declared AclMode in a single-purpose header, say utils/aclmode.h, and then #include'd that into parsenodes.h. There's certainly plenty of other single-datatype headers laying about. regards, tom lane
On 26 July 2014 15:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <munro@ip9.org> writes: >> I couldn't find an existing reasonable place to share a single wait >> policy enumeration between parser/planner/executor and the heap access >> module, and I get the feeling that it would be unacceptable to >> introduce one. > > There is a precedent in the form of AclMode, which is needed throughout > the system and is currently declared in parsenodes.h. I can't say I've > ever been particularly pleased with that arrangement though, since it > forces inclusion of parsenodes.h in many places that might not otherwise > have any interest in parse nodes. > > It might be better if we'd declared AclMode in a single-purpose header, > say utils/aclmode.h, and then #include'd that into parsenodes.h. > There's certainly plenty of other single-datatype headers laying about. Here is a new version of the patch with a single enum LockWaitPolicy defined in utils/lockwaitpolicy.h. Best regards, Thomas Munro
Attachment
On Sun, Jul 27, 2014 at 4:49 AM, Thomas Munro <munro@ip9.org> wrote:
defined in utils/lockwaitpolicy.h.Here is a new version of the patch with a single enum LockWaitPolicy
That seems much cleaner
A few more comments:
You seem to have lost the comment which indicates that the values of the enum are important due to the code in applyLockingClause(), but I see now instead that you've added some assert checks in applyLockingClause(), likely these should use Assert() rather than StaticAssertExpr().
I've also been looking at the isolation tests and I see that you've added a series of tests for NOWAIT. I was wondering why you did that as that's really existing code, probably if you thought the tests were a bit thin around NOWAIT then maybe that should be a separate patch?
In ExecLockRows(), is there a need to define the wait_policy variable now? It's just used once so you could probably just pass erm->waitPolicy directly to heap_lock_tuple().
I'm a bit confused at some of the code in heap_lock_tuple(). If I'm not wrong then after the line that does have_tuple_lock = true; it's never possible for have_tuple_lock to be false, but I see you've added checks to ensure we only unlock if have_tuple_lock is true. I'm thinking you probably did this because in the goto failed situation the check is done, but I was thinking that was perhaps put there in case a goto jump was added before have_tuple_lock is set to true. I'm wondering if it would be ok just to replace the test with an Assert() instead, or maybe just no check.
Also, I'm just looking at some of the changes that you've done to function signatures... I see quite a few of them are now beyond 80 chars wide (see http://www.postgresql.org/docs/devel/static/source-format.html).
Regards
David Rowley
On 27 July 2014 14:31, David Rowley <dgrowleyml@gmail.com> wrote: > On Sun, Jul 27, 2014 at 4:49 AM, Thomas Munro <munro@ip9.org> wrote: >> >> Here is a new version of the patch with a single enum LockWaitPolicy >> defined in utils/lockwaitpolicy.h. >> > > That seems much cleaner > > A few more comments: > You seem to have lost the comment which indicates that the values of the > enum are important due to the code in applyLockingClause(), but I see now > instead that you've added some assert checks in applyLockingClause(), likely > these should use Assert() rather than StaticAssertExpr(). Here's a new version with explicit numerical values and a comment in lockwaitpolicy.h to explain that the order is important and point to the relevant code. The assertions are about the relationship between constant values known at compile time, so why would we want a runtime assertion? I have changed it from StaticAssertExpr to StaticAssertStmt though. > I've also been looking at the isolation tests and I see that you've added a > series of tests for NOWAIT. I was wondering why you did that as that's > really existing code, probably if you thought the tests were a bit thin > around NOWAIT then maybe that should be a separate patch? Since I was meddling with code that controls both NOWAIT and SKIP LOCKED, I wanted to convince myself that I had not broken NOWAIT using at least a basic smoke test . I suppose by the same logic I should also wite isolation tests for default blocking FOR UPDATE... Ok, I've taken nowait.spec out for now. On the subject of isolation tests, I think skip-locked.spec is only producing schedules that reach third of the three 'return HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up with some more thorough isolation tests in the next week or so to cover the other two, and some other scenarios and interactions with other feature. > In ExecLockRows(), is there a need to define the wait_policy variable now? > It's just used once so you could probably just pass erm->waitPolicy directly > to heap_lock_tuple(). Fixed. > I'm a bit confused at some of the code in heap_lock_tuple(). If I'm not > wrong then after the line that does have_tuple_lock = true; it's never > possible for have_tuple_lock to be false, but I see you've added checks to > ensure we only unlock if have_tuple_lock is true. I'm thinking you probably > did this because in the goto failed situation the check is done, but I was > thinking that was perhaps put there in case a goto jump was added before > have_tuple_lock is set to true. I'm wondering if it would be ok just to > replace the test with an Assert() instead, or maybe just no check. Right, I have removed the redundant conditionals. > Also, I'm just looking at some of the changes that you've done to function > signatures... I see quite a few of them are now beyond 80 chars wide (see > http://www.postgresql.org/docs/devel/static/source-format.html). Fixed. Best regards, Thomas Munro
Attachment
On 27 July 2014 23:19, Thomas Munro <munro@ip9.org> wrote: > On the subject of isolation tests, I think skip-locked.spec is only > producing schedules that reach third of the three 'return > HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up > with some more thorough isolation tests in the next week or so to > cover the other two, and some other scenarios and interactions with > other feature. Now with extra isolation tests so that the three different code branches that can skip rows are covered. I temporarily added some logging lines to double check that the expected branches are reached by each permutation while developing the specs. They change the output and are not part of the patch -- attaching separately.
Attachment
Tom Lane wrote: > It might be better if we'd declared AclMode in a single-purpose header, > say utils/aclmode.h, and then #include'd that into parsenodes.h. > There's certainly plenty of other single-datatype headers laying about. Do you mean src/include/datatype/aclmode.h? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley wrote: > I've also been looking at the isolation tests and I see that you've added a > series of tests for NOWAIT. I was wondering why you did that as that's > really existing code, probably if you thought the tests were a bit thin > around NOWAIT then maybe that should be a separate patch? The isolation tester is new so we don't have nearly enough tests for it. Adding more meaningful tests is good even if they're unrelated to the patch at hand. FWIW you can use configure --enable-coverage and "make coverage-html" to get coverage reports. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> It might be better if we'd declared AclMode in a single-purpose header, >> say utils/aclmode.h, and then #include'd that into parsenodes.h. >> There's certainly plenty of other single-datatype headers laying about. > Do you mean src/include/datatype/aclmode.h? I was thinking src/include/utils/, actually, but maybe datatype/ would be a good choice. OTOH, what we've got in there now is just timestamp.h, and IIRC it was put there because it needed to be accessible from both frontend and backend contexts. That would not be true of aclmode.h, so perhaps aclmode.h doesn't belong there. regards, tom lane
On 29 July 2014 02:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > David Rowley wrote: > >> I've also been looking at the isolation tests and I see that you've added a >> series of tests for NOWAIT. I was wondering why you did that as that's >> really existing code, probably if you thought the tests were a bit thin >> around NOWAIT then maybe that should be a separate patch? > > The isolation tester is new so we don't have nearly enough tests for it. > Adding more meaningful tests is good even if they're unrelated to the > patch at hand. Here are my isolation tests for NOWAIT as a separate patch, independent of SKIP LOCKED. They cover the tuple lock, regular row lock and multixact row lock cases. I guess this might be called white box testing, since it usese knowledge of how to construct schedules that hit the three interesting code paths that trigger the error, even though you can't see from the output why the error was raised in each case without extra instrumentation (though it did cross my mind that it could be interesting at the very least for testing if the error message were different in each case). If there are no objections I will add this to the next commitfest. Best regards Thomas Munro
Attachment
On Tue, Jul 29, 2014 at 1:35 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
David Rowley wrote:The isolation tester is new so we don't have nearly enough tests for it.
> I've also been looking at the isolation tests and I see that you've added a
> series of tests for NOWAIT. I was wondering why you did that as that's
> really existing code, probably if you thought the tests were a bit thin
> around NOWAIT then maybe that should be a separate patch?
Adding more meaningful tests is good even if they're unrelated to the
patch at hand.
I completely agree that some more isolation tests coverage would be a good thing. I just saw it as something not directly related to this feature, so thought it would be better as a separate patch. From my experience with the project, normally when I try to sneak something extra in, it either does not make the final commit, or gets added in a separate commit.
Regards
David Rowley
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Jul 29, 2014 at 9:48 AM, Thomas Munro <span dir="ltr"><<ahref="mailto:munro@ip9.org" target="_blank">munro@ip9.org</a>></span> wrote:<br /><blockquote class="gmail_quote"style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="">On27 July 2014 23:19, Thomas Munro <<a href="mailto:munro@ip9.org">munro@ip9.org</a>> wrote:<br /> > Onthe subject of isolation tests, I think skip-locked.spec is only<br /> > producing schedules that reach third of thethree 'return<br /> > HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up<br /> > with some morethorough isolation tests in the next week or so to<br /> > cover the other two, and some other scenarios and interactionswith<br /> > other feature.<br /><br /></div>Now with extra isolation tests so that the three different code<br/> branches that can skip rows are covered. I temporarily added some<br /> logging lines to double check that theexpected branches are reached<br /> by each permutation while developing the specs. They change the<br /> output andare not part of the patch -- attaching separately.<br /></blockquote></div><br /></div><div class="gmail_extra">I've hada look over the isolation tests now and I can't see too much wrong, just a couple of typos... </div><div class="gmail_extra"><br/></div><div class="gmail_extra"><div class="gmail_extra">* skip-locked-2.spec</div><div class="gmail_extra"><br/></div><div class="gmail_extra"># s2 againt skips because it can't acquired a multi-xact lock</div><divclass="gmail_extra"><br /></div><div class="gmail_extra"> "againt" should be "again"</div><div class="gmail_extra"><br/></div><div class="gmail_extra">also mixed use of multixact and multi-xact, probably would be betterto stick to just 1.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">Also this file would probablybe slightly easier to read with a new line after each "permutation" line.</div><div class="gmail_extra"><br /></div><divclass="gmail_extra">* skip_locked-3.spec</div><div class="gmail_extra"><br /></div><div class="gmail_extra">#s3 skips to second record due to tuple lock held by s2</div><div class="gmail_extra"><br /></div><divclass="gmail_extra">There's a missing "the" after "skips to"</div><div class="gmail_extra"><br /></div><div class="gmail_extra">Also,won't the lock be held by s1 not s2?</div><div class="gmail_extra"><br /></div><div class="gmail_extra">There'sjust a couple of other tiny things.</div><div class="gmail_extra"><br /></div><div class="gmail_extra"><divclass="gmail_extra">* Some whitespace issues shown by git diff --check</div><div class="gmail_extra"><br/></div><div class="gmail_extra">src/backend/parser/gram.y:9221: trailing whitespace.</div><div class="gmail_extra">+opt_nowait_or_skip:</div><divclass="gmail_extra">src/backend/rewrite/rewriteHandler.c:65: trailing whitespace.</div><divclass="gmail_extra">+ LockClauseStrengthstrength, LockWaitPolicy waitPolicy,</div><div class="gmail_extra"><br /></div><div class="gmail_extra">*analyze.c</div><div class="gmail_extra"><br /></div><div class="gmail_extra">The StaticAssertStmt'sI think these should be removed. The only place where this behaviour can be changed</div><div class="gmail_extra">isin lockwaitpolicy.h, I think it would be better to just strengthen the comment on the enum's definition.</div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Perhaps something more along the lines of:</div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Policy for what to do when a row lock cannot be obtainedimmediately.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">The enum values defined here havecritical importance to how the parser</div><div class="gmail_extra">treats multiple FOR UPDATE/SHARE statements in differentnested levels of</div><div class="gmail_extra">the query. Effectively if multiple locking clauses are defined inthe query</div><div class="gmail_extra"> then the one with the highest enum value takes precedence over the others.</div><divclass="gmail_extra"><br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Apart fromthis I can't see any other problems with the patch and I'd be very inclined, once the above are fixed up to mark thepatch ready for commiter.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">Good work</div><div class="gmail_extra"><br/></div><div class="gmail_extra">Regards</div><div class="gmail_extra"><br /></div><div class="gmail_extra">DavidRowley </div></div></div></div>
On 1 August 2014 10:37, David Rowley <dgrowleyml@gmail.com> wrote: > * skip-locked-2.spec > > # s2 againt skips because it can't acquired a multi-xact lock > > "againt" should be "again" Fixed. > also mixed use of multixact and multi-xact, probably would be better to > stick to just 1. Changed to multixact as seen in other places. > Also this file would probably be slightly easier to read with a new line > after each "permutation" line. Done. > * skip_locked-3.spec > > # s3 skips to second record due to tuple lock held by s2 > > There's a missing "the" after "skips to" Fixed. > Also, won't the lock be held by s1 not s2? s1 holds the row lock, and s2 holds the tuple lock (because it is at the head of the queue waiting for the row lock). s3 skips because it couldn't acquire the tuple lock held by s2. The other two specs are about not being able to acquire a row lock and only need two sessions. This one tests the code path when there is already a queue forming and you can't even get the tuple lock, which requires an extra session. I have updated the comment to make that clearer. > There's just a couple of other tiny things. > > * Some whitespace issues shown by git diff --check > > src/backend/parser/gram.y:9221: trailing whitespace. > +opt_nowait_or_skip: > src/backend/rewrite/rewriteHandler.c:65: trailing whitespace. > + > LockClauseStrength strength, LockWaitPolicy waitPolicy, Fixed. > * analyze.c > > The StaticAssertStmt's I think these should be removed. The only place where > this behaviour can be changed > is in lockwaitpolicy.h, I think it would be better to just strengthen the > comment on the enum's definition. Removed. > Perhaps something more along the lines of: > > Policy for what to do when a row lock cannot be obtained immediately. > > The enum values defined here have critical importance to how the parser > treats multiple FOR UPDATE/SHARE statements in different nested levels of > the query. Effectively if multiple locking clauses are defined in the query > then the one with the highest enum value takes precedence over the others. Added something along those lines. > Apart from this I can't see any other problems with the patch and I'd be > very inclined, once the above are fixed up to mark the patch ready for > commiter. > > Good work Thanks for all the guidance, I appreciate it! My review karma account is now well overdrawn. Best regards, Thomas Munro
Attachment
On Sat, Aug 2, 2014 at 3:55 AM, Thomas Munro <munro@ip9.org> wrote:
On 1 August 2014 10:37, David Rowley <dgrowleyml@gmail.com> wrote:> Apart from this I can't see any other problems with the patch and I'd beThanks for all the guidance, I appreciate it! My review karma account
> very inclined, once the above are fixed up to mark the patch ready for
> commiter.
>
> Good work
is now well overdrawn.
Ok, then I have nothing more so it's time to pass this one along.
The only notes I can think to leave for the commiter would be around the precedence order of the lock policy, especially around a query such as:
SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE; -- skip locked wins
Of course the current behaviour is that NOWAIT wins over the standard FOR UPDATE, but with NOWAIT, there's only a chance of an error, there's no chance of giving incorrect results.
I checked what Oracle did in this situation and I see that they completely disallow FOR UPDATE inside of views and subqueries.
I could see an argument here that the outer most FOR UPDATE clause should be used, but I guess that ship has sailed when NOWAIT was added.
Marking as ready for commiter.
Regards
David Rowley
David Rowley wrote: > The only notes I can think to leave for the commiter would be around the > precedence order of the lock policy, especially around a query such as: > > SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE; -- > skip locked wins > > Of course the current behaviour is that NOWAIT wins over the standard FOR > UPDATE, but with NOWAIT, there's only a chance of an error, there's no > chance of giving incorrect results. Another option is to throw an error at parse analysis time if there is a conflict in the specified locking policies, as in the above case. Are there cases in which it would make sense to have one clause trump the other? It seems reasonable to have NOWAIT trump regular FOR UPDATE (as it already does), since, as you say, there's chance of error being thrown at runtime, but not of incorrect result. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
One thing I just noticed is that we uselessly set an error context callback when "waiting" in ConditionalMultiXactIdWait, which is pretty useless (because we don't actually wait there at all) -- we don't set one in ConditionalXactLockTableWait, which makes sense, but for some reason I failed to realize this in review of f88d4cfc9. I think I will take that out instead of cluttering this patch with it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
heap_lock_tuple() has the following comment on top: * In the failure cases, the routine fills *hufd with the tuple's t_ctid, * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax * (the last only for HeapTupleSelfUpdated, since we * cannot obtain cmax from a combocid generated by another transaction). * See comments for struct HeapUpdateFailureData for additional info. With the patch as submitted, this struct is not filled in the HeapTupleWouldBlock case. I'm not sure this is okay, though I admit the only caller that passes LockWaitSkip doesn't care, so maybe we could just ignore the issue (after properly modifying the comment). It seems easy to add a LockBuffer() and "goto failed" rather than a return; that would make that failure case conform to HeapTupleUpdated and HeapTupleSelfUpdated. OTOH it might be pointless to worry about what would be essentially dead code currently ... Did you consider heap_lock_updated_tuple? A rationale for saying it doesn't need to pay attention to the wait policy is: if you're trying to lock-skip-locked an updated tuple, then you either skip it because its updater is running, or you return it because it's no longer running; and if you return it, it is not possible for the updater to be locking the updated version. However, what if there's a third transaction that locked the updated version? It might be difficult to hit this case or construct an isolationtester spec file though, since there's a narrow window you need to race to. I also pgindented. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 22 August 2014 23:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > heap_lock_tuple() has the following comment on top: > > * In the failure cases, the routine fills *hufd with the tuple's t_ctid, > * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax > * (the last only for HeapTupleSelfUpdated, since we > * cannot obtain cmax from a combocid generated by another transaction). > * See comments for struct HeapUpdateFailureData for additional info. > > With the patch as submitted, this struct is not filled in the > HeapTupleWouldBlock case. I'm not sure this is okay, though I admit the > only caller that passes LockWaitSkip doesn't care, so maybe we could > just ignore the issue (after properly modifying the comment). It seems > easy to add a LockBuffer() and "goto failed" rather than a return; that > would make that failure case conform to HeapTupleUpdated and > HeapTupleSelfUpdated. OTOH it might be pointless to worry about what > would be essentially dead code currently ... Thanks Alvaro. Forgive me if I have misunderstood but it looks like your incremental patch included a couple of unrelated changes, namely s/0/InvalidCommandId/ and a reversion of ConditionalMultiXactIdWait. Undoing those gave me skip-locked-v12-b.patch, attached for reference, and I've included those changes in a new full patch skip-locked-v13.patch (also rebased). +1 for the change from if-then-else to switch statements. I was less sure about the 'goto failed' change, but I couldn't measure any change in concurrent throughput in my simple benchmark, so I guess that extra buffer lock/unlock doesn't matter and I can see why you prefer that control flow. > Did you consider heap_lock_updated_tuple? A rationale for saying it > doesn't need to pay attention to the wait policy is: if you're trying to > lock-skip-locked an updated tuple, then you either skip it because its > updater is running, or you return it because it's no longer running; and > if you return it, it is not possible for the updater to be locking the > updated version. However, what if there's a third transaction that > locked the updated version? It might be difficult to hit this case or > construct an isolationtester spec file though, since there's a narrow > window you need to race to. Hmm. I will look into this, thanks. Best regards, Thomas Munro
Attachment
On 24 August 2014 22:04, Thomas Munro <munro@ip9.org> wrote: > On 22 August 2014 23:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Did you consider heap_lock_updated_tuple? A rationale for saying it >> doesn't need to pay attention to the wait policy is: if you're trying to >> lock-skip-locked an updated tuple, then you either skip it because its >> updater is running, or you return it because it's no longer running; and >> if you return it, it is not possible for the updater to be locking the >> updated version. However, what if there's a third transaction that >> locked the updated version? It might be difficult to hit this case or >> construct an isolationtester spec file though, since there's a narrow >> window you need to race to. > > Hmm. I will look into this, thanks. While trying to produce the heap_lock_updated_tuple_rec case you describe (so far unsuccessfully), I discovered I could make SELECT ... FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different code path after heap_lock_tuple returns: in another session, UPDATE, COMMIT, then UPDATE, all after the first session has taken its snapshot but before it tries to lock a given row. The code in EvalPlanQualFetch (reached from the HeapTupleUpdated case in ExecLockRow) finishes up waiting for the uncommitted transaction. I think I see how to teach EvalPlanQualFetch how to handle wait policies: for NOWAIT it should ereport (fixing a pre-existing bug (?)), and I guess it should handle SKIP LOCKED by returning NULL, similarly to the way it handles deleted rows, and of course in all cases passing the wait policy forward to heap_lock_tuple, which it eventually calls. Still looking at heap_lock_updated_tuple. The difficulty of course will be testing all these racy cases reproducibly... Best regards, Thomas Munro
On 08/25/2014 09:44 AM, Thomas Munro wrote: > On 24 August 2014 22:04, Thomas Munro <munro@ip9.org> wrote: >> On 22 August 2014 23:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >>> Did you consider heap_lock_updated_tuple? A rationale for saying it >>> doesn't need to pay attention to the wait policy is: if you're trying to >>> lock-skip-locked an updated tuple, then you either skip it because its >>> updater is running, or you return it because it's no longer running; and >>> if you return it, it is not possible for the updater to be locking the >>> updated version. However, what if there's a third transaction that >>> locked the updated version? It might be difficult to hit this case or >>> construct an isolationtester spec file though, since there's a narrow >>> window you need to race to. >> >> Hmm. I will look into this, thanks. > > While trying to produce the heap_lock_updated_tuple_rec case you > describe (so far unsuccessfully), I discovered I could make SELECT ... > FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different > code path after heap_lock_tuple returns: in another session, UPDATE, > COMMIT, then UPDATE, all after the first session has taken its > snapshot but before it tries to lock a given row. The code in > EvalPlanQualFetch (reached from the HeapTupleUpdated case in > ExecLockRow) finishes up waiting for the uncommitted transaction. I think that's the issue Andres and I patched for 9.3, but I don't know if it got committed. I'll need to have a look. A search in the archives for heap_lock_tuple and nowait might be informative. > The difficulty of course will be testing all these racy cases reproducibly... Yep, especially as isolationtester can only really work at the statement level, and can only handle one blocked connection at a time. It's possible a helper extension could be used - set up some locks in shmem, register two sessions for different "test roles" within a given test to install appropriate hooks, wait until they're both blocked on the locks the hooks acquire, then release the locks. That might land up with hook points scattered everywhere, but they could be some pretty minimal macros at least. IMO that's a separate project, though. For this kind of testing I've tended to just set conditional breakpoints in gdb, wait until they trap, then once everything's lined up release the breakpoints in both sessions as simultaneously as possible. Not perfect, but has tended to work. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Thomas Munro wrote: > While trying to produce the heap_lock_updated_tuple_rec case you > describe (so far unsuccessfully), I discovered I could make SELECT ... > FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different > code path after heap_lock_tuple returns: in another session, UPDATE, > COMMIT, then UPDATE, all after the first session has taken its > snapshot but before it tries to lock a given row. The code in > EvalPlanQualFetch (reached from the HeapTupleUpdated case in > ExecLockRow) finishes up waiting for the uncommitted transaction. Interesting -- perhaps this helps explain the deadlock issue reported in http://www.postgresql.org/message-id/CAKrjmhdN+GhAjNwqfHsOtGp+7YN27zR79m99RcAJMNazt5NJrA@mail.gmail.com > I think I see how to teach EvalPlanQualFetch how to handle wait > policies: for NOWAIT it should ereport (fixing a pre-existing bug > (?)), and I guess it should handle SKIP LOCKED by returning NULL, > similarly to the way it handles deleted rows, and of course in all > cases passing the wait policy forward to heap_lock_tuple, which it > eventually calls. > > Still looking at heap_lock_updated_tuple. > > The difficulty of course will be testing all these racy cases reproducibly... Does this help? http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com The useful trick there is forcing a query to get its snapshot and then go to sleep before actually doing anything, by way of an advisory lock. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Thomas Munro wrote: > On 22 August 2014 23:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Forgive me if I have misunderstood but it looks like your incremental > patch included a couple of unrelated changes, namely > s/0/InvalidCommandId/ and a reversion of ConditionalMultiXactIdWait. Yeah, sorry about those, will push separately. > Undoing those gave me skip-locked-v12-b.patch, attached for reference, > and I've included those changes in a new full patch > skip-locked-v13.patch (also rebased). > > +1 for the change from if-then-else to switch statements. > > I was less sure about the 'goto failed' change, but I couldn't measure > any change in concurrent throughput in my simple benchmark, so I guess > that extra buffer lock/unlock doesn't matter and I can see why you > prefer that control flow. I was also thinking in reducing the lock level acquired to shared rather than exclusive in all the paths that "goto failed". Since the lock is only used to read a couple of fields from the tuple, shared is enough and should give slightly better concurrency. Per buffer access rules in src/backend/storage/buffer/README: : 1. To scan a page for tuples, one must hold a pin and either shared or : exclusive content lock. To examine the commit status (XIDs and status bits) : of a tuple in a shared buffer, one must likewise hold a pin and either shared : or exclusive lock. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 07/31/2014 12:29 AM, Thomas Munro wrote: > On 29 July 2014 02:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> David Rowley wrote: >> >>> I've also been looking at the isolation tests and I see that you've added a >>> series of tests for NOWAIT. I was wondering why you did that as that's >>> really existing code, probably if you thought the tests were a bit thin >>> around NOWAIT then maybe that should be a separate patch? >> >> The isolation tester is new so we don't have nearly enough tests for it. >> Adding more meaningful tests is good even if they're unrelated to the >> patch at hand. > > Here are my isolation tests for NOWAIT as a separate patch, > independent of SKIP LOCKED. They cover the tuple lock, regular row > lock and multixact row lock cases. Thanks, committed. > I guess this might be called white > box testing, since it usese knowledge of how to construct schedules > that hit the three interesting code paths that trigger the error, even > though you can't see from the output why the error was raised in each > case without extra instrumentation (though it did cross my mind that > it could be interesting at the very least for testing if the error > message were different in each case). Yeah, seems reasonable. This kind of tests might become obsolete in the future if the internals change a lot, so that e.g. we don't use multixids anymore. But even if that happens, there's little harm in keeping the tests, and meanwhile it's good to have coverage of these cases. - Heikki
On 25 August 2014 02:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: >> The difficulty of course will be testing all these racy cases reproducibly... > > Does this help? > http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com > The useful trick there is forcing a query to get its snapshot and then > go to sleep before actually doing anything, by way of an advisory lock. Yes it does, thanks Alvaro and Craig. I think the attached spec reproduces the problem using that trick, ie shows NOWAIT blocking, presumably in EvalPlanQualFetch (though I haven't stepped through it with a debugger yet). I'm afraid I'm out of Postgres hacking cycles for a few days, but next weekend I should have a new patch that fixes this by teaching EvalPlanQualFetch about wait policies, with isolation tests for NOWAIT and SKIP LOCKED. Best regards, Thomas Munro
Attachment
Thomas Munro wrote: > On 25 August 2014 02:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Thomas Munro wrote: > >> The difficulty of course will be testing all these racy cases reproducibly... > > > > Does this help? > > http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com > > The useful trick there is forcing a query to get its snapshot and then > > go to sleep before actually doing anything, by way of an advisory lock. > > Yes it does, thanks Alvaro and Craig. I think the attached spec > reproduces the problem using that trick, ie shows NOWAIT blocking, > presumably in EvalPlanQualFetch (though I haven't stepped through it > with a debugger yet). I'm afraid I'm out of Postgres hacking cycles > for a few days, but next weekend I should have a new patch that fixes > this by teaching EvalPlanQualFetch about wait policies, with isolation > tests for NOWAIT and SKIP LOCKED. Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 27 August 2014 17:18, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: >> On 25 August 2014 02:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> > Thomas Munro wrote: >> >> The difficulty of course will be testing all these racy cases reproducibly... >> > >> > Does this help? >> > http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com >> > The useful trick there is forcing a query to get its snapshot and then >> > go to sleep before actually doing anything, by way of an advisory lock. >> >> Yes it does, thanks Alvaro and Craig. I think the attached spec >> reproduces the problem using that trick, ie shows NOWAIT blocking, >> presumably in EvalPlanQualFetch (though I haven't stepped through it >> with a debugger yet). I'm afraid I'm out of Postgres hacking cycles >> for a few days, but next weekend I should have a new patch that fixes >> this by teaching EvalPlanQualFetch about wait policies, with isolation >> tests for NOWAIT and SKIP LOCKED. > > Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com Thanks, I hadn't seen this, I should have checked the archives better. I have actually already updated my patch to handle EvalPlanQualFetch with NOWAIT and SKIP LOCKED with isolation specs, see attached. I will compare with Craig's and see if I screwed anything up... of course I am happy to merge and submit a new patch on top of Craig's if it's going to be committed. I haven't yet figured out how to get get into a situation where heap_lock_updated_tuple_rec waits. Best regards, Thomas Munro
Attachment
Thomas Munro wrote: > On 27 August 2014 17:18, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Thomas Munro wrote: > >> Yes it does, thanks Alvaro and Craig. I think the attached spec > >> reproduces the problem using that trick, ie shows NOWAIT blocking, > >> presumably in EvalPlanQualFetch (though I haven't stepped through it > >> with a debugger yet). I'm afraid I'm out of Postgres hacking cycles > >> for a few days, but next weekend I should have a new patch that fixes > >> this by teaching EvalPlanQualFetch about wait policies, with isolation > >> tests for NOWAIT and SKIP LOCKED. > > > > Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com > > Thanks, I hadn't seen this, I should have checked the archives better. > I have actually already updated my patch to handle EvalPlanQualFetch > with NOWAIT and SKIP LOCKED with isolation specs, see attached. I > will compare with Craig's and see if I screwed anything up... of > course I am happy to merge and submit a new patch on top of Craig's if > it's going to be committed. I tried Craig's patch with your test case and found that it stalls in XactLockTableWait inside EPQFetch because it doesn't throw an error in the noWait case before waiting. I think I will fix that and push, including both test cases. Then we can see about rebasing your patch. I am wondering about backpatching Craig's fix. It looks to me like it should be backpatched as far back as NOWAIT exists, but that was in 8.1 and we haven't ever gotten a complaint until Craig's report AFAIK, which I understand wasn't coming from a user finding a problem but rather some new development. So I hesitate. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Thomas Munro wrote: > Thanks, I hadn't seen this, I should have checked the archives better. > I have actually already updated my patch to handle EvalPlanQualFetch > with NOWAIT and SKIP LOCKED with isolation specs, see attached. I > will compare with Craig's and see if I screwed anything up... of > course I am happy to merge and submit a new patch on top of Craig's if > it's going to be committed. Thanks, please rebase. > I haven't yet figured out how to get get into a situation where > heap_lock_updated_tuple_rec waits. Well, as I think I said in the first post I mentioned this, maybe there is no such situation. In any case, like the EvalPlanQualFetch issue, we can fix it later if we find it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 28 August 2014 00:25, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: > >> Thanks, I hadn't seen this, I should have checked the archives better. >> I have actually already updated my patch to handle EvalPlanQualFetch >> with NOWAIT and SKIP LOCKED with isolation specs, see attached. I >> will compare with Craig's and see if I screwed anything up... of >> course I am happy to merge and submit a new patch on top of Craig's if >> it's going to be committed. > > Thanks, please rebase. Thanks -- new patch attached. Best regards, Thomas Munro
Attachment
On 28 August 2014 00:25, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: >> I haven't yet figured out how to get get into a situation where >> heap_lock_updated_tuple_rec waits. > > Well, as I think I said in the first post I mentioned this, maybe there > is no such situation. In any case, like the EvalPlanQualFetch issue, we > can fix it later if we find it. I finally came up with a NOWAIT spec that reaches heap_lock_updated_rec and then blocks. I can't explain why exactly... but please see attached. The fix seems fairly straightforward. Do you think I should submit an independent patch to fix this case (well there are really two cases, since there is a separate multixact path) for the existing NOWAIT support and then tackle the SKIP LOCKED equivalent separately? Best regards, Thomas Munro
Attachment
On 31 August 2014 01:36, Thomas Munro <munro@ip9.org> wrote: > On 28 August 2014 00:25, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Thomas Munro wrote: >>> I haven't yet figured out how to get get into a situation where >>> heap_lock_updated_tuple_rec waits. >> >> Well, as I think I said in the first post I mentioned this, maybe there >> is no such situation. In any case, like the EvalPlanQualFetch issue, we >> can fix it later if we find it. > > I finally came up with a NOWAIT spec that reaches > heap_lock_updated_rec and then blocks. I can't explain why exactly... > but please see attached. The fix seems fairly straightforward. Do > you think I should submit an independent patch to fix this case (well > there are really two cases, since there is a separate multixact path) > for the existing NOWAIT support and then tackle the SKIP LOCKED > equivalent separately? Oops, that isolation wasn't right, please disregard. Unless you think this obscure case needs to be addressed before the SKIP LOCKED patch can be committed, I will investigate it separately and maybe submit something to a future commitfest. Best regards, Thomas Munro
Thomas Munro wrote: > @@ -2022,7 +2030,7 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode, bool noWait, > */ > test = heap_lock_tuple(relation, &tuple, > estate->es_output_cid, > - lockmode, noWait, > + lockmode, wait_policy, > false, &buffer, &hufd); > /* We now have two pins on the buffer, get rid of one */ > ReleaseBuffer(buffer); Doesn't this heap_lock_tuple() need to check for a WouldBlock result? Not sure that this is the same case that you were trying to test in heap_lock_updated_tuple; I think this requires an update chain (so that EPQFetch is invoked) and some tuple later in the chain is locked by a third transaction. I attach some additional minor suggestions to your patch. Please feel free to reword comments differently if you think my wording isn't an improvements (or I've maked an english mistakes). -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Sep 10, 2014 at 9:47 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I attach some additional minor suggestions to your patch. I don't see any attachment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Wed, Sep 10, 2014 at 9:47 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > I attach some additional minor suggestions to your patch. > > I don't see any attachment. Uh. Here it is. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 10 September 2014 14:47, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: > >> @@ -2022,7 +2030,7 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode, bool noWait, >> */ >> test = heap_lock_tuple(relation, &tuple, >> estate->es_output_cid, >> - lockmode, noWait, >> + lockmode, wait_policy, >> false, &buffer, &hufd); >> /* We now have two pins on the buffer, get rid of one */ >> ReleaseBuffer(buffer); > > Doesn't this heap_lock_tuple() need to check for a WouldBlock result? > Not sure that this is the same case that you were trying to test in > heap_lock_updated_tuple; I think this requires an update chain (so that > EPQFetch is invoked) and some tuple later in the chain is locked by a > third transaction. You're right, that case was clearly lacking. The new patch, attached, releases the buffer and returns NULL in that case. But I have no idea how to reach it in an isolation test! skip-locked-4.spec gets pretty close, it creates the right kind of update chain to get into EvalPlanQualFetch and then enter the conditional block beginning: if (TransactionIdIsValid(SnapshotDirty.xmax)) But to reach the case you mentioned, it would need to get past that (xmax is not a valid transaction) but then the tuple would need to be locked by another session before heap_lock_tuple is called a few lines below. That's a race scenario that I don't believe we can create using advisory lock tricks in an isolation test. > I attach some additional minor suggestions to your patch. Please feel > free to reword comments differently if you think my wording isn't an > improvements (or I've maked an english mistakes). Thanks, these are incorporated in the new version (also rebased). Best regards, Thomas Munro
Attachment
Thomas Munro wrote: > > Doesn't this heap_lock_tuple() need to check for a WouldBlock result? > > Not sure that this is the same case that you were trying to test in > > heap_lock_updated_tuple; I think this requires an update chain (so that > > EPQFetch is invoked) and some tuple later in the chain is locked by a > > third transaction. > > You're right, that case was clearly lacking. The new patch, attached, > releases the buffer and returns NULL in that case. But I have no idea > how to reach it in an isolation test! skip-locked-4.spec gets pretty > close, it creates the right kind of update chain to get into > EvalPlanQualFetch and then enter the conditional block beginning: > > if (TransactionIdIsValid(SnapshotDirty.xmax)) > > But to reach the case you mentioned, it would need to get past that > (xmax is not a valid transaction) but then the tuple would need to be > locked by another session before heap_lock_tuple is called a few lines > below. That's a race scenario that I don't believe we can create > using advisory lock tricks in an isolation test. Hm, are you able to reproduce it using GDB? Craig Ringer was saying elsewhere that there are other cases that are impossible to test reliably and was proposing addings hooks or something to block backends at convenient times. Not an easy problem ... > > I attach some additional minor suggestions to your patch. Please feel > > free to reword comments differently if you think my wording isn't an > > improvements (or I've maked an english mistakes). > > Thanks, these are incorporated in the new version (also rebased). Great, thanks; I'll look at it again soon to commit, as I think we're done now. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 12 September 2014 03:56, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Thomas Munro wrote: >> But to reach the case you mentioned, it would need to get past that >> (xmax is not a valid transaction) but then the tuple would need to be >> locked by another session before heap_lock_tuple is called a few lines >> below. That's a race scenario that I don't believe we can create >> using advisory lock tricks in an isolation test. > > Hm, are you able to reproduce it using GDB? > > Craig Ringer was saying elsewhere that there are other cases that are > impossible to test reliably and was proposing addings hooks or > something to block backends at convenient times. Not an easy problem ... +1, I think that is a great idea. FWIW here's some throwaway code that I used to do that: diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 79667f1..fbb3b55 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -54,6 +54,7 @@ #include "storage/lmgr.h" #include "tcop/utility.h" #include "utils/acl.h" +#include "utils/builtins.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/snapmgr.h" @@ -2029,6 +2030,20 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode, } /* + * Begin wait point debugging hack... + * TODO: Only in a special build mode... + * We tell anyone waiting that we have reached wait point #42. + * We wait for permission to proceed from wait point #43. + */ + elog(WARNING, "XXX reached point 42, waiting at point 43"); + DirectFunctionCall1(pg_advisory_unlock_int8, Int64GetDatum(42)); + DirectFunctionCall1(pg_advisory_lock_int8, Int64GetDatum(43)); + elog(WARNING, "XXX continuing after point 43"); + /* + * End wait point debugging hack. + */ + + /* * This is a live tuple, so now try to lock it. */ test = heap_lock_tuple(relation, &tuple, Using the attached isolation spec, that race case is reached. Yeah, it's crude and confusing having those three advisory locks (one to allow an update chain to be created after s1 takes a snapshot, and the other two so that s2 can block s1 at the right point to produce that race case), but I found this less messy than trying to reproduce complicated concurrency scenarios with GDB. IMHO it would be great if there were a tidy and supported way to do this kind of thing, perhaps with a formal notion of named wait points which are only compiled in in special test builds, and an optional set of extra isolation specs that use them. >> > I attach some additional minor suggestions to your patch. Please feel >> > free to reword comments differently if you think my wording isn't an >> > improvements (or I've maked an english mistakes). >> >> Thanks, these are incorporated in the new version (also rebased). > > Great, thanks; I'll look at it again soon to commit, as I think we're > done now. Thanks! Thomas Munro
Attachment
Thomas Munro wrote: > > I attach some additional minor suggestions to your patch. Please feel > > free to reword comments differently if you think my wording isn't an > > improvements (or I've maked an english mistakes). > > Thanks, these are incorporated in the new version (also rebased). Pushed, thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services