Thread: foreign key locks

foreign key locks

From
Alvaro Herrera
Date:
Hi,

This is v12 of the foreign key locks patch.

I gave a talk about this patch at PGCon:
http://www.pgcon.org/2012/schedule/events/483.en.html
There is an article there that explains this patch.

I described the original goal of this in the thread starting here:
http://archives.postgresql.org/message-id/1290721684-sup-3951@alvh.no-ip.org

The patch itself was first submitted here:
http://archives.postgresql.org/message-id/1320343602-sup-2290@alvh.no-ip.org

There were many problems that we had to shake off during the 9.2
development cycle; this new version covers most of them.  There is a
github clone here: http://github.com/alvherre/postgres branch fklocks
If you see the "compare" view, changes in this submission that weren't
present in v11 start with commit 19dc85f1, here:
https://github.com/alvherre/postgres/compare/master...fklocks

I know of one significant issue left, possible cause of nasty bugs,
which is the way this patch interacts with EvalPlanQual.  I mentioned
erroneously in my PGcon talk that the way we handle this is by shutting
off EPQ recursion -- this was wrong.  What I did was shut off
heap_lock_tuple from following the update chain, letting it walk only in
the cases where it comes from high-level callers.  Others such as EPQ,
which do their own update-chain walking, do not request locks to follow
update.  I believe this is correct.  This was changed in this commit:
https://github.com/alvherre/postgres/commit/e00e6827c55128ed737172a6dd35c581d437f969

There is also a matter of a performance regression we measured in stock
pgbench.  I have profiled this to come from GetMultiXactMembers and
AFAICS the way to fix it is to improve multixact.c's cache of old
multis.  Right now we ditch the cache at end of transaction, which was
already considered wrong in comments in the code but never fixed.  I am
going to make the cache entries live until the oldest Xid in each multi
is below its visibility horizon (so RecentXmin for multis that only
contain locks, and something like FreezeLimit for multis that contain
updates).

I apologize for the size of this patch.  I am not really sure that there
are pieces to split out for separate review.

--
Álvaro Herrera <alvherre@alvh.no-ip.org>

Attachment

Re: foreign key locks

From
Jaime Casanova
Date:
On Thu, Jun 14, 2012 at 11:41 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Hi,
>
> This is v12 of the foreign key locks patch.
>

Hi Álvaro,

Just noticed that this patch needs a rebase because of the refactoring
Tom did in ri_triggers.c

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


Re: foreign key locks

From
Tom Lane
Date:
Jaime Casanova <jaime@2ndquadrant.com> writes:
> On Thu, Jun 14, 2012 at 11:41 AM, Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
>> This is v12 of the foreign key locks patch.

> Just noticed that this patch needs a rebase because of the refactoring
> Tom did in ri_triggers.c

Hold on a bit before you work on that code --- I've got one more bit of
hacking I want to try before walking away from it.  I did some oprofile
work on Dean's example from
<CAEZATCWm8M00RA814o4DC2cD_aj44gQLb0tDdxMHA312qg7HCQ@mail.gmail.com>
and noticed that it looks like ri_FetchConstraintInfo is eating a
noticeable fraction of the runtime, which is happening because it is
called to deconstruct the relevant pg_constraint row for each tuple
we consider firing the trigger for (and then again, if we do fire the
trigger).  I'm thinking it'd be worth maintaining a backend-local cache
for the deconstructed data, and am going to go try that.
        regards, tom lane


Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of mié jun 20 12:54:24 -0400 2012:
> Jaime Casanova <jaime@2ndquadrant.com> writes:
> > On Thu, Jun 14, 2012 at 11:41 AM, Alvaro Herrera
> > <alvherre@alvh.no-ip.org> wrote:
> >> This is v12 of the foreign key locks patch.
>
> > Just noticed that this patch needs a rebase because of the refactoring
> > Tom did in ri_triggers.c
>
> Hold on a bit before you work on that code --- I've got one more bit of
> hacking I want to try before walking away from it.  I did some oprofile
> work on Dean's example from
> <CAEZATCWm8M00RA814o4DC2cD_aj44gQLb0tDdxMHA312qg7HCQ@mail.gmail.com>
> and noticed that it looks like ri_FetchConstraintInfo is eating a
> noticeable fraction of the runtime, which is happening because it is
> called to deconstruct the relevant pg_constraint row for each tuple
> we consider firing the trigger for (and then again, if we do fire the
> trigger).  I'm thinking it'd be worth maintaining a backend-local cache
> for the deconstructed data, and am going to go try that.

I had merged already when I got your email, but merging that additional
change did not cause any conflicts.  So here's the rebased version.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: foreign key locks

From
"Kevin Grittner"
Date:
Alvaro Herrera  wrote:

> So here's the rebased version.

I found a couple problems on `make check-world`.  Attached is a patch
to fix one of them.  The other is on pg_upgrade, pasted below.

+ pg_upgrade -d
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/data.old -D
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/data -b
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin -B
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories                 ok
Checking cluster versions                                   ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Creating catalog dump                                       ok
Checking for presence of required libraries                 ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows on the new cluster                        ok
Deleting files from new pg_clog                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID for new cluster                 ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        sh:
/home/kevin/pg/master/contrib/pg_upgrade/tmp_check/install//home/
kevin/pg/master/Debug/bin: Permission denied
*failure*

Consult the last few lines of ""%s/pg_resetxlog" -O %u -m %u,%u "%s"
> /dev/null" for the probable cause of the failure.
Failure, exiting
make[2]: *** [check] Error 1
make[2]: Leaving directory `/home/kevin/pg/master/contrib/pg_upgrade'
make[1]: *** [check-pg_upgrade-recurse] Error 2
make[1]: Leaving directory `/home/kevin/pg/master/contrib'
make: *** [check-world-contrib-recurse] Error 2

I'm marking it as "Waiting for Author" since this seems like a "must
fix", but I'll keep looking at it,

-Kevin



Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Kevin Grittner's message of sáb jun 23 18:38:10 -0400 2012:
> Alvaro Herrera  wrote:
>
> > So here's the rebased version.
>
> I found a couple problems on `make check-world`.  Attached is a patch
> to fix one of them.  The other is on pg_upgrade, pasted below.

Ah, I mismerged pg_upgrade.  The fix is trivial, AFAICS -- a function
call is missing a log file name to be specified.  Strange that the
compiler didn't warn me of a broken printf format specifier.  However,
pg_upgrade itself has been broken by an unrelated commit and that fix is
not so trivial, so I'm stuck waiting for that to be fixed.  (We really
need automated pg_upgrade testing.)

Thanks for the patch for the other problem, I'll push it out.

> I'm marking it as "Waiting for Author" since this seems like a "must
> fix", but I'll keep looking at it,

Great, thanks.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of dom jun 24 23:24:47 -0400 2012:
> Excerpts from Kevin Grittner's message of sáb jun 23 18:38:10 -0400 2012:
> > Alvaro Herrera  wrote:
> >
> > > So here's the rebased version.
> >
> > I found a couple problems on `make check-world`.  Attached is a patch
> > to fix one of them.  The other is on pg_upgrade, pasted below.
>
> Ah, I mismerged pg_upgrade.  The fix is trivial, AFAICS -- a function
> call is missing a log file name to be specified.  Strange that the
> compiler didn't warn me of a broken printf format specifier.  However,
> pg_upgrade itself has been broken by an unrelated commit and that fix is
> not so trivial, so I'm stuck waiting for that to be fixed.  (We really
> need automated pg_upgrade testing.)

Well, I ended up pushing the pg_upgrade fix anyway even though I
couldn't test it, because I had already committed it to my tree before
your patch.  If it still doesn't work after the other guys fix the
outstanding problem I'll just push a new patch.

In the meantime, here's fklocks v14, which also merges to new master as
there were several other conflicts.  It passes make installcheck-world
now.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: foreign key locks

From
"Kevin Grittner"
Date:
Alvaro Herrera  wrote:

> here's fklocks v14, which also merges to new master as there were
> several other conflicts. It passes make installcheck-world now.

Recent commits broke it again, so here's a rebased v15.  (No changes
other than attempting to merge your work with the pg_controldata and
pg_resetxlog changes and wrapping that FIXME in a comment.)

Using that patch, pg_upgrade completes, but pg_dumpall fails:


Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    delete_old_cluster.sh
+ pg_ctl start -l
/home/kevin/pg/master/contrib/pg_upgrade/log/postmaster2.log -w
waiting for server to start.... done
server started
+ pg_dumpall
pg_dump: Dumping the contents of table "hslot" failed: PQgetResult()
failed.
pg_dump: Error message from server: ERROR:  MultiXactId 115 does no
longer exist -- apparent wraparound
pg_dump: The command was: COPY public.hslot (slotname, hubname,
slotno, slotlink) TO stdout;
pg_dumpall: pg_dump failed on database "regression", exiting
+ pg_dumpall2_status=1
+ pg_ctl -m fast stop
waiting for server to shut down.... done
server stopped
+ [ -n 1 ]
+ echo pg_dumpall of post-upgrade database cluster failed
pg_dumpall of post-upgrade database cluster failed
+ exit 1
make[2]: *** [check] Error 1
make[2]: Leaving directory `/home/kevin/pg/master/contrib/pg_upgrade'
make[1]: *** [check-pg_upgrade-recurse] Error 2
make[1]: Leaving directory `/home/kevin/pg/master/contrib'
make: *** [check-world-contrib-recurse] Error 2


Did I merge it wrong?

-Kevin



Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012:
> Alvaro Herrera  wrote:
>
> > here's fklocks v14, which also merges to new master as there were
> > several other conflicts. It passes make installcheck-world now.
>
> Recent commits broke it again, so here's a rebased v15.  (No changes
> other than attempting to merge your work with the pg_controldata and
> pg_resetxlog changes and wrapping that FIXME in a comment.)

Oooh, so sorry -- I was supposed to have git-stashed that before
producing the patch.  This change cannot have caused the pg_dumpall
problem below though, because it only touched the multixact cache code.

> Using that patch, pg_upgrade completes, but pg_dumpall fails:

Hmm, something changed enough that requires more than just a code merge.
I'll look into it.

Thanks for the merge.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012:
> Alvaro Herrera  wrote:
>
> > here's fklocks v14, which also merges to new master as there were
> > several other conflicts. It passes make installcheck-world now.
>
> Recent commits broke it again, so here's a rebased v15.  (No changes
> other than attempting to merge your work with the pg_controldata and
> pg_resetxlog changes and wrapping that FIXME in a comment.)

Here's v16, merged to latest stuff, before attempting to fix this:

> Using that patch, pg_upgrade completes, but pg_dumpall fails:

> + pg_ctl start -l
> /home/kevin/pg/master/contrib/pg_upgrade/log/postmaster2.log -w
> waiting for server to start.... done
> server started
> + pg_dumpall
> pg_dump: Dumping the contents of table "hslot" failed: PQgetResult()
> failed.
> pg_dump: Error message from server: ERROR:  MultiXactId 115 does no
> longer exist -- apparent wraparound

I was only testing migrating from an old version into patched, not
same-version upgrades.

I think I see what's happening here.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of vie jun 29 19:17:02 -0400 2012:
> Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012:
> > Alvaro Herrera  wrote:
> >
> > > here's fklocks v14, which also merges to new master as there were
> > > several other conflicts. It passes make installcheck-world now.
> >
> > Recent commits broke it again, so here's a rebased v15.  (No changes
> > other than attempting to merge your work with the pg_controldata and
> > pg_resetxlog changes and wrapping that FIXME in a comment.)
>
> Here's v16, merged to latest stuff,

Really attached now.

BTW you can get this on github:
https://github.com/alvherre/postgres/tree/fklocks

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of vie jun 29 19:17:02 -0400 2012:

> I was only testing migrating from an old version into patched, not
> same-version upgrades.
>
> I think I see what's happening here.

Okay, I have pushed the fix to github -- as I suspected, code-wise the
fix was simple.  I will post an updated consolidated patch later today.

make check of pg_upgrade now passes for me.

It would be nice that "make installcheck" would notify me that some
modules' tests are being skipped ...

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: foreign key locks

From
Alvaro Herrera
Date:
Updated patch attached.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of jue jul 05 18:44:11 -0400 2012:
>
> Updated patch attached.
>

Patch rebased to latest sources.  This is also on Github:
https://github.com/alvherre/postgres/tree/fklocks


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Patch v19 contains some tweaks.  Most notably,

1. if an Xid requests a lock A, and then a lock B which is stronger than
A, then record only lock B and forget lock A.  This is important for
performance, because EvalPlanQual obtains ForUpdate locks on the tuples
that it chases on an update chain.  If EPQ was called by an update or
delete, previously a MultiXact was being created.

In a stock pgbench run this was causing lots of multis to be created,
even when there are no FKs.

This was most likely involved in the 9% performance decrease that was
previously reported.

2. Save a few TransactionIdIsInProgress calls in MultiXactIdWait.  We
were calling it to determine how many transactions were still running,
but not all callers were interested in that info.  XidIsInProgress is
not particularly cheap, so it seems good to skip it if possible.  I
didn't try to measure the difference, however.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Robert Haas
Date:
On Wed, Aug 22, 2012 at 5:31 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Patch v19 contains some tweaks.  Most notably,
>
> 1. if an Xid requests a lock A, and then a lock B which is stronger than
> A, then record only lock B and forget lock A.  This is important for
> performance, because EvalPlanQual obtains ForUpdate locks on the tuples
> that it chases on an update chain.  If EPQ was called by an update or
> delete, previously a MultiXact was being created.
>
> In a stock pgbench run this was causing lots of multis to be created,
> even when there are no FKs.
>
> This was most likely involved in the 9% performance decrease that was
> previously reported.

Ah-ha!  Neat.  I'll try to find some time to re-benchmark this during
the next CF, unless you did that already.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: foreign key locks

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of mié ago 22 17:31:28 -0400 2012:
>
> Patch v19 contains some tweaks.  Most notably,

v20 is merged to latest master; the only change other than automatic
merge is to update for pg_upgrade API fixes.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
v21 is merged to latest master.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Andres Freund
Date:
On Friday, August 31, 2012 06:59:51 AM Alvaro Herrera wrote:
> v21 is merged to latest master.
Ok, I am starting to look at this.

(working with a git merge of alvherre/fklocks into todays master)

In a very first pass as somebody who hasn't followed the discussions in the
past I took notice of the following things:

* compiles and survives make check
* README.tuplock jumps in quite fast without any introduction

* the reasons for the redesign aren't really included in the patch but just in
the - very nice - pgcon paper.

* "This is a bit of a performance regression, but since we expect FOR SHARE
locks to be seldom used, we don’t feel this is a serious problem." makes me a
bit uneasy, will look at performance separately

* Should RelationGetIndexattrBitmap check whether a unique index is referenced
from a pg_constraint row? Currently we pay part of the overhead independent
from the existance of foreign keys.

* mode == LockTupleKeyShare, == LockTupleShare, == LockTupleUpdate in
heap_lock_tuple's BeingUpdated branch look like they should be an if/else if

* I don't really like HEAP_UPDATE_KEY_REVOKED as a name, what about
HEAP_KEYS_UPDATED (akin to HEAP_HOT_UPDATED)

* how can we get infomask2 & HEAP_UPDATE_KEY_REVOKED && infomask &
HEAP_XMAX_LOCK_ONLY?

* heap_lock_tuple with mode == LockTupleKeyShare && nowait looks like it would
wait anyway in heap_lock_updated_tuple_rec

* rename HeapSatisfiesHOTUpdate, adjust comments?

* I wonder whether HeapSatisfiesHOTUpdate could be made to compute the result
for key_attrs and hot_attrs at the same time, often enough they will do the
same thing on the same column

* you didn't start it but I find that all those l$i jump labels make the code
harder to follow

* shouldn't XmaxGetUpdateXid be rather called MultiXactIdGetUpdateXid or such?

*    /*     * If the existing lock mode is identical to or weaker than the new     * one, we can act as though there is
noexisting lock and have the     * upper block handle this.     */ 
"block"?

* I am not yet sure whether the (xmax == add_to_xmax) optimization in
compute_new_xmax_infomask is actually safe

* ConditionalMultiXactIdWait and MultiXactIdWait should probably rely on a
common implementation

* I am not that convinced that moving the waiting functions away from
multixact.c is a good idea, but I guess the required knowledge about lockmodes
makes it hard not to do so

* Haven't looked yet, but I have a slightly uneasy feeling about Hot Standby
interactions. Did you look at that?

* Hm?
HeapTupleSatisfiesVacuum
#if 0            ResetMultiHintBit(tuple, buffer, xmax, true);
#endif

This obviously is not a real review, but just learning what the problem & patch
actually try to do. This is quite a bit to take in ;). I will let it sink in
ant try to have a look at the architecture and performance next.


Greetings,

Andres

.oO(and people call catalog timetravel complicated)

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On Friday, August 31, 2012 06:59:51 AM Alvaro Herrera wrote:
> > v21 is merged to latest master.
> Ok, I am starting to look at this.
>
> (working with a git merge of alvherre/fklocks into todays master)
>
> In a very first pass as somebody who hasn't followed the discussions in the
> past I took notice of the following things:
>
> * compiles and survives make check

Please verify src/test/isolation's make installcheck as well.

> * README.tuplock jumps in quite fast without any introduction

Hmm .. I'll give that a second look.  Maybe some material from the pgcon
paper should be added as introduction.

> * "This is a bit of a performance regression, but since we expect FOR SHARE
> locks to be seldom used, we don’t feel this is a serious problem." makes me a
> bit uneasy, will look at performance separately

Thanks.  Keep in mind though that the bits we really want good
performance for is FOR KEY SHARE, which is going to be used in foreign
keys.  FOR SHARE is staying just for hypothetical backwards
compatibility.  I just found that people is using it, see for example
http://stackoverflow.com/a/3243083

> * Should RelationGetIndexattrBitmap check whether a unique index is referenced
> from a pg_constraint row? Currently we pay part of the overhead independent
> from the existance of foreign keys.

Noah also suggested something more or less in the along the same lines.
My answer is that I don't want to do it currently; maybe we can improve
on what indexes we use later, but since this can be seen as a modularity
violation, I prefer to keep it as simple as possible.

> * mode == LockTupleKeyShare, == LockTupleShare, == LockTupleUpdate in
> heap_lock_tuple's BeingUpdated branch look like they should be an if/else if

Hm, will look at that.

> * I don't really like HEAP_UPDATE_KEY_REVOKED as a name, what about
> HEAP_KEYS_UPDATED (akin to HEAP_HOT_UPDATED)

Thanks.

> * how can we get infomask2 & HEAP_UPDATE_KEY_REVOKED && infomask &
> HEAP_XMAX_LOCK_ONLY?

SELECT FOR KEY UPDATE?  This didn't exist initially, but previous
reviewers (Noah) felt that it made sense to have it for consistency.  It
makes the lock conflict table make more sense, anyway.

> * heap_lock_tuple with mode == LockTupleKeyShare && nowait looks like it would
> wait anyway in heap_lock_updated_tuple_rec

Oops.

> * rename HeapSatisfiesHOTUpdate, adjust comments?

Yeah.

> * I wonder whether HeapSatisfiesHOTUpdate could be made to compute the result
> for key_attrs and hot_attrs at the same time, often enough they will do the
> same thing on the same column

Hmm, I will look at that.

> * you didn't start it but I find that all those l$i jump labels make the code
> harder to follow

You mean you suggest to have them renamed?  That can be arranged.

> * shouldn't XmaxGetUpdateXid be rather called MultiXactIdGetUpdateXid or such?

Yes.

>         /*
>          * If the existing lock mode is identical to or weaker than the new
>          * one, we can act as though there is no existing lock and have the
>          * upper block handle this.
>          */
> "block"?

Yeah, as in "the if {} block above".  Since this is not clear maybe it
needs rewording.

> * I am not yet sure whether the (xmax == add_to_xmax) optimization in
> compute_new_xmax_infomask is actually safe

Will think about it and add a/some comment(s).

> * ConditionalMultiXactIdWait and MultiXactIdWait should probably rely on a
> common implementation

Will look at that.

> * I am not that convinced that moving the waiting functions away from
> multixact.c is a good idea, but I guess the required knowledge about lockmodes
> makes it hard not to do so

Yeah.  The line between multixact.c and heapam.c is a zigzag currently,
I think.  Maybe it needs to be more clear; I think the multixact modes
really belong into heapam.c, and they are just uint32 to multixact.c.  I
wasn't brave enough to attempt this; maybe I should.

> * Haven't looked yet, but I have a slightly uneasy feeling about Hot Standby
> interactions. Did you look at that?
>
> * Hm?
> HeapTupleSatisfiesVacuum
> #if 0
>                 ResetMultiHintBit(tuple, buffer, xmax, true);
> #endif

Yeah.  I had a couple of ResetMultiHintBit calls sprinkled in some of
the tqual routines, with the idea that if they saw multis that were no
longer live they could be removed, and replaced by just the remaining
updater.  However, this doesn't really work because it's not safe to
change the Xmax when not holding exclusive lock, and the tqual routines
don't know that.  So we're stuck with keeping the multi in there, even
when we know they are no longer interesting.  This is a bit sad, because
it would be a performance benefit to remove them; but it's not strictly
necessary for correctness, so we can leave the optimization for a later
phase.  I let those #ifdefed out calls in place so that it's easy to see
where the reset needs to happen.


> This obviously is not a real review, but just learning what the problem & patch
> actually try to do. This is quite a bit to take in ;). I will let it sink in
> ant try to have a look at the architecture and performance next.

Well, the patch is large and complex so any review is obviously going to
take a long time.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Andres Freund wrote:

> > * heap_lock_tuple with mode == LockTupleKeyShare && nowait looks like it would
> > wait anyway in heap_lock_updated_tuple_rec
>
> Oops.

I took a stab at fixing this.  However, it is not easy.  First you need
a way to reproduce the problem, which involves setting breakpoints in
GDB.  (Since a REPEATABLE READ transaction will fail to follow an update
chain due to "tuple concurrently updated", you need to use a READ
COMMITTED transaction; but obviously the timing to insert the bunch of
updates in the middle is really short.  Hence I inserted a breakpoint at
the end of GetSnapshotData, had a SELECT FOR KEY SHARE NOWAIT get stuck
in it, and then launched a couple of updates in another session).  I was
able to reproduce the undesirable wait.

I quickly patched heap_lock_updated_tuple to pass down the nowait flag,
but this is actually not reached, because the update chain is first
followed by EvalPlanQual in ExecLockRows, and not by
heap_lock_updated_tuple directly.  And EPQ does not have the nowait
behavior.  So it still blocks.

Maybe what we need to do is prevent ExecLockRows from following the
update chain altogether -- after all, if heap_lock_tuple is going to do
it by itself maybe it's wholly redundant.

Not really sure what's the best way to approach this.  At this stage I'm
inclined to ignore the problem, unless some EPQ expert shows up and
tells me that (1) it's okay to patch EPQ in that way, or (2) we should
hack ExecLockRows (and remove EPQ?).


I pushed (to github) patches for a couple of other issues you raised.
Some others still need a bit more of my attention.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Here is version 22 of this patch.  This version contains fixes to issues
reported by Andres, as well as a rebase to latest master.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Noah Misch
Date:
On Thu, Oct 18, 2012 at 04:58:20PM -0300, Alvaro Herrera wrote:
> Here is version 22 of this patch.  This version contains fixes to issues
> reported by Andres, as well as a rebase to latest master.

I scanned this for obvious signs of work left to do.  It contains numerous XXX
and FIXME comments.  Many highlight micro-optimization opportunities and the
like; those can stay.  Others preclude commit, either highlighting an unsolved
problem or wrongly highlighting a non-problem:

> +     /*
> +      * XXX we do not lock this tuple here; the theory is that it's sufficient
> +      * with the buffer lock we're about to grab.  Any other code must be able
> +      * to cope with tuple lock specifics changing while they don't hold buffer
> +      * lock anyway.
> +      */

>    * so they will be uninteresting by the time our next transaction starts.
>    * (XXX not clear that this is correct --- other members of the MultiXact
>    * could hang around longer than we did.  However, it's not clear what a
> !  * better policy for flushing old cache entries would be.)  FIXME actually
> !  * this is plain wrong now that multixact's may contain update Xids.

> !     nmembers = GetMultiXactIdMembers(multi, &members, true);
> !     /*
> !      * XXX we don't have the infomask to run the required consistency check
> !      * here; the required notational overhead seems excessive.
> !      */

>           /* We assume the cache entries are sorted */
> !         /* XXX we assume the unused bits in "status" are zeroed */
> !         if (memcmp(members, entry->members, nmembers * sizeof(MultiXactMember)) == 0)

> !  * XXX do we have any issues with needing to checkpoint here?
>    */
> ! static void
> ! TruncateMultiXact(void)
>   {

> +     /* FIXME what should we initialize this to? */
> +     newFrozenMulti = ReadNextMultiXactId();

> +      * FIXME -- the XMAX_IS_MULTI test is a bit wrong .. it's possible to
> +      * have tuples with that bit set that are dead.  However, if that's
> +      * changed, the RawXmax() call below should probably be researched as well.
>        */
>       if (tuple->t_infomask &
> !         (HEAP_XMAX_INVALID | HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_IS_MULTI))
>           return false;



Re: foreign key locks

From
Andres Freund
Date:
On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
> Here is version 22 of this patch.  This version contains fixes to issues
> reported by Andres, as well as a rebase to latest master.

Ok, I now that pgconf.eu has ended I am starting to do a real review:

* Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and earlier 
you could be sure that if you FOR UPDATE'ed a row you could delete it. Unless 
I miss something now this will not block somebody else acquiring a FOR KEY 
SHARE lock, so this guarantee is gone.
This seems likely to introduce subtle problems in user applications.

I propose renaming FOR UPDATE => FOR NO KEY UPDATE, FOR KEY UPDATE => FOR 
UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of FOR 
UPDATE.

You write "SELECT FOR UPDATE is a standards-compliant exclusive lock". I 
didn't really find all that much about the semantics of FOR UPDATE on cursors 
in the standard other than "The operations of update and delete are permitted 
for updatable cursors, subject to constraining Access Rules.". 

* I would welcome adding some explanatory comments about the point of 
TupleLockExtraInfo and MultiXactStatusLock at the respective definition.

* Why do we have the HEAP_XMAX_IS_MULTI && HEAP_XMAX_LOCK_ONLY case?

* I think some of the longer comments could use the attention of a native 
speaker, unfortunately thats not me.

* I am still uncomfortable with the FOR SHARE deoptimization. I have seen 
people lock larger parts of their table to make some READ COMMITTED things 
actually safe.
Is there any problem retaining the non XMAX_IS_MULTI behaviour except space in 
infomask? That seems solveable by something like

#define HEAP_XMAX_SHR_LOCK     0x0010
#define HEAP_XMAX_EXCL_LOCK        0x0040
#define HEAP_XMAX_KEYSHR_LOCK   (HEAP_XMAX_SHR_LOCK  | HEAP_XMAX_EXCL_LOCK)

and somewhat more complex expressions when testing the locks ((infomask & 
HEAP_XMAX_KEYSHR_LOCK ) == HEAP_XMAX_KEYSHR_LOCK, etc).

* In heap_lock_tuple's  XMAX_IS_MULTI case
        for (i = 0; i < nmembers; i++)        {            if (TransactionIdIsCurrentTransactionId(members[i].xid))
      {                LockTupleMode    membermode;
 
                membermode = 
TUPLOCK_from_mxstatus(members[i].status);
                if (membermode > mode)                {                    if (have_tuple_lock)
UnlockTupleTuplock(relation,tid, mode);
 
                    pfree(members);                    return HeapTupleMayBeUpdated;                }            }
 }
 

why is it membermode > mode and not membermode >= mode?

* Is the case of a a non-key-update followed by a key-update actually handled 
when doing a heap_lock_tuple with mode = LockTupleKeyShare and follow_updates 
= false? I don't really see how, so it seems to deserve at least a comment.

But then I don't yet understand why follow_update makes sense.

* In heap_lock_tuple with mode == LockTupleUpdate && infomask & 
HEAP_XMAX_IS_MULTI, were leaking members when doing goto l3. Probably not 
relevant, but given the code tries to be careful everywhere else...
We might also leak in the members == 0 case there, not sure yet.

Ok, this is at about 35% of the diff in my second pass, but I just arrived back 
in Berlin, and this seems useful enough on its own...

Andres
-- 
Andres Freund        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Simon Riggs
Date:
On 27 October 2012 00:06, Andres Freund <andres@2ndquadrant.com> wrote:
> On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
>> Here is version 22 of this patch.  This version contains fixes to issues
>> reported by Andres, as well as a rebase to latest master.
>
> Ok, I now that pgconf.eu has ended I am starting to do a real review:
>
> * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and earlier
> you could be sure that if you FOR UPDATE'ed a row you could delete it. Unless
> I miss something now this will not block somebody else acquiring a FOR KEY
> SHARE lock, so this guarantee is gone.

Yes, that is exactly the point of this.

> This seems likely to introduce subtle problems in user applications.

Yes, it could. So we need some good docs on explaining this.

Which applications use FOR UPDATE?
Any analysis of particular situations would be very helpful in doing
the correct thing here.

I think introducing FOR DELETE would be much clearer as an addition/
synonym for FOR KEY UPDATE.


> I propose renaming FOR UPDATE => FOR NO KEY UPDATE, FOR KEY UPDATE => FOR
> UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of FOR
> UPDATE.

Which is essentially unwinding the feature, to some extent.

Does FOR UPDATE throw an error if the user later issues an UPDATE of
the PK or a DELETE? That sequence of actions would cause lock
escalation in the application, which could also lead to
deadlock/contention.

This sounds like we need a GUC or table-level default to control
whether FOR UPDATE means FOR UPDATE or FOR DELETE

More thought/input required on this point, it seems important.


> You write "SELECT FOR UPDATE is a standards-compliant exclusive lock". I
> didn't really find all that much about the semantics of FOR UPDATE on cursors
> in the standard other than "The operations of update and delete are permitted
> for updatable cursors, subject to constraining Access Rules.".

> * I think some of the longer comments could use the attention of a native
> speaker, unfortunately thats not me.



-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Andres Freund
Date:
On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
> On 27 October 2012 00:06, Andres Freund <andres@2ndquadrant.com> wrote:
> > On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
> >> Here is version 22 of this patch.  This version contains fixes to issues
> >> reported by Andres, as well as a rebase to latest master.
> > 
> > Ok, I now that pgconf.eu has ended I am starting to do a real review:
> > 
> > * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and
> > earlier you could be sure that if you FOR UPDATE'ed a row you could
> > delete it. Unless I miss something now this will not block somebody else
> > acquiring a FOR KEY SHARE lock, so this guarantee is gone.
> 
> Yes, that is exactly the point of this.

Yes, sure. The point is the introduction of a weaker lock level which can be 
used by the ri triggers. I don't see any imperative that the semantics of the 
old lock level need to be redefined. That just seems dangerous to me.

We need to take care to reduce the complications of upgrades not introduce 
changes that require complex code reviews.

> > This seems likely to introduce subtle problems in user applications.
> 
> Yes, it could. So we need some good docs on explaining this.
> 
> Which applications use FOR UPDATE?

Any that want to protect themselves against deadlocks and/or visibility 
problems due to READ COMMITTED. Thats quite a bunch.

> Any analysis of particular situations would be very helpful in doing
> the correct thing here.

Usual things include

* avoiding problems due to lock upgrades in combination with foreign keys (as 
far as I can see some of those still persist).
* prevent rows being deleted by other transactions
* prepare for updating if computation of the new values take some time
* guarantee order of locking to make sure rows are DELETE/UPDATEed in the same 
order (still no ORDER BY in UPDATE/DELETE)
...

> I think introducing FOR DELETE would be much clearer as an addition/
> synonym for FOR KEY UPDATE.

Hm. Not really convinced. For me that seems to just make the overall situation 
even more complex.

> > I propose renaming FOR UPDATE => FOR NO KEY UPDATE, FOR KEY UPDATE => FOR
> > UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of
> > FOR UPDATE.
> 
> Which is essentially unwinding the feature, to some extent.

Why? The overall features available are just the same? The only thing is that 
existing semantics aren't changed.

> Does FOR UPDATE throw an error if the user later issues an UPDATE of
> the PK or a DELETE? That sequence of actions would cause lock
> escalation in the application, which could also lead to
> deadlock/contention.

Unless I miss something it precisely does *not* result in lock escalation with 
the 9.2 semanticsbut it *would* with fklocks applied. Thats exactly my point.

> This sounds like we need a GUC or table-level default to control
> whether FOR UPDATE means FOR UPDATE or FOR DELETE

I don't like adding a new guc for something that should be solveable with some 
creative naming. If a new user doesn't get a bit more concurrency due to 
manually issued 9.2 FOR UPDATE implicitly being converted into a FOR NO KEY 
UPDATE its not too bad. The code needs to be checked whether thats valid 
anyway. The reverse is not true...

> More thought/input required on this point, it seems important.

Yep, more input welcome.

Greetings,

Andres
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Simon Riggs
Date:
On 29 October 2012 12:27, Andres Freund <andres@2ndquadrant.com> wrote:

>> This sounds like we need a GUC or table-level default to control
>> whether FOR UPDATE means FOR UPDATE or FOR DELETE
>
> I don't like adding a new guc for something that should be solveable with some
> creative naming. If a new user doesn't get a bit more concurrency due to
> manually issued 9.2 FOR UPDATE implicitly being converted into a FOR NO KEY
> UPDATE its not too bad. The code needs to be checked whether thats valid
> anyway. The reverse is not true...

The main point here is that introducing new non-optional behaviour is
a compatibility problem and we shouldn't rush that.

If we decide to change FOR UPDATE to mean FOR NO KEY UPDATE that needs
separate consideration and shouldn't happen until sometime after the
feature goes in (months, or perhaps releases).

We're introducing a new feature that will allow us to avoid lock
problems, by taking the current FOR UPDATE behaviour and splitting it
into two options: one weaker and one stronger. We need explicit names
for both of those options. The most obvious naming is
FOR [[NO] KEY] UPDATE

Don't much like that, but its pretty unambiguous and fairly neat.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
"Kevin Grittner"
Date:
Andres Freund wrote:
> On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
>> Andres Freund <andres@2ndquadrant.com> wrote:

>>> * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
>>> and earlier you could be sure that if you FOR UPDATE'ed a row you
>>> could delete it. Unless I miss something now this will not block
>>> somebody else acquiring a FOR KEY SHARE lock, so this guarantee
>>> is gone.
>> 
>> Yes, that is exactly the point of this.
> 
> The point is the introduction of a weaker lock level which can be 
> used by the ri triggers. I don't see any imperative that the
> semantics of the old lock level need to be redefined. That just
> seems dangerous to me.

I agree with Andres here -- the new lock level is needed within RI
triggers, and we might as well expose it for application programmer
use (with proper documentations), but there's no reason to break
existing application code by making the semantics of SELECT FOR
UPDATE less strict than they were before. Let's keep that term
meaning exactly the same thing if we can, and use new names for the
new levels.

We already present a barrier to people migrating from Oracle because
SELECT FOR UPDATE in PostgreSQL is weaker than it is in Oracle (where
it is just as strong as if an actual UPDATE were done -- write
conflicts and all); let's not increase the barrier by making SELECT
FOR UPDATE even weaker.

-Kevin



Re: foreign key locks

From
Noah Misch
Date:
On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
> Andres Freund wrote:
> > On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
> >> Andres Freund <andres@2ndquadrant.com> wrote:
> 
> >>> * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
> >>> and earlier you could be sure that if you FOR UPDATE'ed a row you
> >>> could delete it. Unless I miss something now this will not block
> >>> somebody else acquiring a FOR KEY SHARE lock, so this guarantee
> >>> is gone.
> >> 
> >> Yes, that is exactly the point of this.
> > 
> > The point is the introduction of a weaker lock level which can be 
> > used by the ri triggers. I don't see any imperative that the
> > semantics of the old lock level need to be redefined. That just
> > seems dangerous to me.
> 
> I agree with Andres here -- the new lock level is needed within RI
> triggers, and we might as well expose it for application programmer
> use (with proper documentations), but there's no reason to break
> existing application code by making the semantics of SELECT FOR
> UPDATE less strict than they were before. Let's keep that term
> meaning exactly the same thing if we can, and use new names for the
> new levels.

+1.  I had not considered this angle during previous reviews, but I agree with
Andres's position.  Since this patch does not strengthen the strongest tuple
lock relative to its PostgreSQL 9.2 counterpart, that lock type should
continue to use the syntax "FOR UPDATE".  It will come to mean "the lock type
sufficient for all possible UPDATEs of the row".



Re: foreign key locks

From
Simon Riggs
Date:
On 31 October 2012 19:35, Noah Misch <noah@leadboat.com> wrote:
> On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
>> Andres Freund wrote:
>> > On Sunday, October 28, 2012 11:47:16 PM Simon Riggs wrote:
>> >> Andres Freund <andres@2ndquadrant.com> wrote:
>>
>> >>> * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2
>> >>> and earlier you could be sure that if you FOR UPDATE'ed a row you
>> >>> could delete it. Unless I miss something now this will not block
>> >>> somebody else acquiring a FOR KEY SHARE lock, so this guarantee
>> >>> is gone.
>> >>
>> >> Yes, that is exactly the point of this.
>> >
>> > The point is the introduction of a weaker lock level which can be
>> > used by the ri triggers. I don't see any imperative that the
>> > semantics of the old lock level need to be redefined. That just
>> > seems dangerous to me.
>>
>> I agree with Andres here -- the new lock level is needed within RI
>> triggers, and we might as well expose it for application programmer
>> use (with proper documentations), but there's no reason to break
>> existing application code by making the semantics of SELECT FOR
>> UPDATE less strict than they were before. Let's keep that term
>> meaning exactly the same thing if we can, and use new names for the
>> new levels.
>
> +1.  I had not considered this angle during previous reviews, but I agree with
> Andres's position.  Since this patch does not strengthen the strongest tuple
> lock relative to its PostgreSQL 9.2 counterpart, that lock type should
> continue to use the syntax "FOR UPDATE".  It will come to mean "the lock type
> sufficient for all possible UPDATEs of the row".

So we have syntax

FOR NON KEY UPDATE
FOR KEY UPDATE

KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Simon Riggs wrote:
> On 31 October 2012 19:35, Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Oct 29, 2012 at 08:18:33AM -0400, Kevin Grittner wrote:
> >> Andres Freund wrote:

> >> > The point is the introduction of a weaker lock level which can be
> >> > used by the ri triggers. I don't see any imperative that the
> >> > semantics of the old lock level need to be redefined. That just
> >> > seems dangerous to me.
> >>
> >> I agree with Andres here -- the new lock level is needed within RI
> >> triggers, and we might as well expose it for application programmer
> >> use (with proper documentations), but there's no reason to break
> >> existing application code by making the semantics of SELECT FOR
> >> UPDATE less strict than they were before. Let's keep that term
> >> meaning exactly the same thing if we can, and use new names for the
> >> new levels.
> >
> > +1.  I had not considered this angle during previous reviews, but I agree with
> > Andres's position.  Since this patch does not strengthen the strongest tuple
> > lock relative to its PostgreSQL 9.2 counterpart, that lock type should
> > continue to use the syntax "FOR UPDATE".  It will come to mean "the lock type
> > sufficient for all possible UPDATEs of the row".

Yeah, I agree with this too.

> So we have syntax
>
> FOR NON KEY UPDATE
> FOR KEY UPDATE
>
> KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE

Not really sure about the proposed syntax, but yes clearly we need some
other syntax to mean "FOR NON KEY UPDATE".  I would rather keep FOR
UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
the other (weaker) lock level welcome (but if you love FOR NON KEY
UPDATE, please chime in too)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Noah Misch
Date:
On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:
> Simon Riggs wrote:
> > On 31 October 2012 19:35, Noah Misch <noah@leadboat.com> wrote:
> > > +1.  I had not considered this angle during previous reviews, but I agree with
> > > Andres's position.  Since this patch does not strengthen the strongest tuple
> > > lock relative to its PostgreSQL 9.2 counterpart, that lock type should
> > > continue to use the syntax "FOR UPDATE".  It will come to mean "the lock type
> > > sufficient for all possible UPDATEs of the row".
> 
> Yeah, I agree with this too.
> 
> > So we have syntax
> > 
> > FOR NON KEY UPDATE
> > FOR KEY UPDATE
> > 
> > KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE
> 
> Not really sure about the proposed syntax, but yes clearly we need some
> other syntax to mean "FOR NON KEY UPDATE".  I would rather keep FOR
> UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
> the other (weaker) lock level welcome (but if you love FOR NON KEY
> UPDATE, please chime in too)

Agree on having "FOR UPDATE" without any "FOR KEY UPDATE" synonym.  For the
weaker lock, I mildly preferred the proposal of "FOR NO KEY UPDATE".  NON KEY
captures the idea better in English, but NO is close enough and already part
of the SQL lexicon.



Re: foreign key locks

From
Alvaro Herrera
Date:
FWIW I have gotten a lot of feedback about this patch, and since I don't
have time right now to produce an updated version, that I'm going to
close this as Returned with Feedback, and submit an updated version to
the upcoming commitfest.

This patch still needs much more review -- for example, as far as I
know, the multixact.c changes have gone largely unreviewed; they have
changed somewhat since Noah reviewed them (back in version 15 or so, I
think).  Of course, to me it all makes sense, but then I'm only its
author.

-- 
Alvaro Herrera                         http://www.flickr.com/photos/alvherre/
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."



Re: foreign key locks

From
Andres Freund
Date:
On Monday, November 05, 2012 02:37:15 PM Alvaro Herrera wrote:
> FWIW I have gotten a lot of feedback about this patch, and since I don't
> have time right now to produce an updated version, that I'm going to
> close this as Returned with Feedback, and submit an updated version to
> the upcoming commitfest.
> 
> This patch still needs much more review -- for example, as far as I
> know, the multixact.c changes have gone largely unreviewed; they have
> changed somewhat since Noah reviewed them (back in version 15 or so, I
> think).  Of course, to me it all makes sense, but then I'm only its
> author.

There also hasn't been any recent performance testing. I am not sure if I can 
get the time to do so before the next commitfest...

Andres
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Dimitri Fontaine
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> FOR NON KEY UPDATE
>> FOR KEY UPDATE
>> 
>> KEY is the default, so FOR UPDATE is a synonym of FOR KEY UPDATE
>
> Not really sure about the proposed syntax, but yes clearly we need some
> other syntax to mean "FOR NON KEY UPDATE".  I would rather keep FOR
> UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
> the other (weaker) lock level welcome (but if you love FOR NON KEY
> UPDATE, please chime in too)

FOR ANY UPDATE, synonym of FOR UPDATE
FOR KEY UPDATE, optimized version, when it applies to your case

I also tend to think that we should better not change the current
meaning of FOR UPDATE and have it default to FOR ANY UPDATE.

Unless it's easy to upgrade from ANY to KEY, and do that automatically
at the right time, but I fear there lie dragons (or something).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: foreign key locks

From
Alvaro Herrera
Date:
Noah Misch wrote:
> On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:

> > Not really sure about the proposed syntax, but yes clearly we need some
> > other syntax to mean "FOR NON KEY UPDATE".  I would rather keep FOR
> > UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
> > the other (weaker) lock level welcome (but if you love FOR NON KEY
> > UPDATE, please chime in too)
>
> Agree on having "FOR UPDATE" without any "FOR KEY UPDATE" synonym.  For the
> weaker lock, I mildly preferred the proposal of "FOR NO KEY UPDATE".  NON KEY
> captures the idea better in English, but NO is close enough and already part
> of the SQL lexicon.

This is the proposal I like best; however there is an asymmetry, because
the locking options now are

FOR KEY SHARE
FOR SHARE
FOR NO KEY UPDATE
FOR UPDATE

I used to have comments such as

/* Currently, SELECT ... FOR [KEY] UPDATE/SHARE requires UPDATE privileges */
#define ACL_SELECT_FOR_UPDATE    ACL_UPDATE

but now they are slightly incorrect because the NO is not illustrated.
I guess I could use SELECT ... FOR [NO KEY] UPDATE/SHARE but this leaves
out the "FOR KEY SHARE" case (and can be thought to introduce a FOR NO
KEY SHARE case).  And getting much more verbose than that is probably
not warranted.  In some places I would like the use a phrase like "the
locking clause", but I'm not sure that it's clear enough.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Here's version 23 of this patch, with fixes for the below comments and
a merge to master.

Andres Freund wrote:
> On Thursday, October 18, 2012 09:58:20 PM Alvaro Herrera wrote:
> > Here is version 22 of this patch.  This version contains fixes to issues
> > reported by Andres, as well as a rebase to latest master.
>
> Ok, I now that pgconf.eu has ended I am starting to do a real review:
>
> * Is it ok to make FOR UPDATE somewhat weaker than before? In 9.2 and earlier
> you could be sure that if you FOR UPDATE'ed a row you could delete it. Unless
> I miss something now this will not block somebody else acquiring a FOR KEY
> SHARE lock, so this guarantee is gone.
> This seems likely to introduce subtle problems in user applications.
>
> I propose renaming FOR UPDATE => FOR NO KEY UPDATE, FOR KEY UPDATE => FOR
> UPDATE or similar (or PREVENT KEY UPDATE?). That keeps the old meaning of FOR
> UPDATE.

Okay, in subsequent discussion everyone agreed that this is necessary;
thanks for noticing the problem.  Instead of your suggestion, however, I
used Noah's; luckily I just noticed that it was identical to yours.  So
apparently there is also agreement on this point.  I have updated the
code, comments and docs (and renamed some other stuff to match).

> You write "SELECT FOR UPDATE is a standards-compliant exclusive lock". I
> didn't really find all that much about the semantics of FOR UPDATE on cursors
> in the standard other than "The operations of update and delete are permitted
> for updatable cursors, subject to constraining Access Rules.".

I don't really know where that comes from, but it's a statement I
grabbed from the docs somewhere.

> * I would welcome adding some explanatory comments about the point of
> TupleLockExtraInfo and MultiXactStatusLock at the respective definition.

Done.  If that's not enough, I would welcome suggestions or specific
question needing clarification.

> * Why do we have the HEAP_XMAX_IS_MULTI && HEAP_XMAX_LOCK_ONLY case?

For pg_upgrade.  HEAP_XMAX_LOCK_ONLY is the value previously used by
HEAP_XMAX_SHARED_LOCK, so a database containing such values that is
upgraded will contain that bit pattern.

> * I think some of the longer comments could use the attention of a native
> speaker, unfortunately thats not me.

Sorry about that.

> * I am still uncomfortable with the FOR SHARE deoptimization. I have seen
> people lock larger parts of their table to make some READ COMMITTED things
> actually safe.

I haven't changed this yet.  There are bits available in infomask2 that
we could use, if we agree that it is necessary; or, as you say, we could
use two bits to distinguish three different states, but we need to
ensure that pg_upgraded databases will work sanely.

> * In heap_lock_tuple's  XMAX_IS_MULTI case
>
> [snip]
>
> why is it membermode > mode and not membermode >= mode?

Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
there was a deadlock possible here.  Maybe I should add a test to ensure
this doesn't happen.

> * Is the case of a a non-key-update followed by a key-update actually handled
> when doing a heap_lock_tuple with mode = LockTupleKeyShare and follow_updates
> = false? I don't really see how, so it seems to deserve at least a comment.

I wasn't able to figure out what you think is the problem.

> But then I don't yet understand why follow_update makes sense.

Basically heap_lock_tuple can be told by the caller to follow the update
chain to lock subsequent tuples, or not.  Mainly, EvalPlanQual does not
want this to happen, because it does its own update-chain walking.

In all honesty, it is not clear to me that this is the proper solution
to the problem.  Maybe some hacking around ExecLockRows is necessary.

> * In heap_lock_tuple with mode == LockTupleUpdate && infomask &
> HEAP_XMAX_IS_MULTI, were leaking members when doing goto l3. Probably not
> relevant, but given the code tries to be careful everywhere else...

Right, plugged.

> We might also leak in the members == 0 case there, not sure yet.

Hm, I don't think GetMultiXactIdMembers really considers the case when 0
members are to be returned.  Maybe that needs some more tweaking.

> Ok, this is at about 35% of the diff in my second pass, but I just arrived back
> in Berlin, and this seems useful enough on its own...

Many thanks.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> > * In heap_lock_tuple's  XMAX_IS_MULTI case
> >
> > [snip]
> >
> > why is it membermode > mode and not membermode >= mode?
>
> Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
> there was a deadlock possible here.  Maybe I should add a test to ensure
> this doesn't happen.

Done:
https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

(I don't think this is worth a v24 submission).

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Noah Misch
Date:
On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
> https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
> 
> (I don't think this is worth a v24 submission).

Are you aware of any defects in or unanswered questions of this version that
would stall your commit thereof?



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > > * In heap_lock_tuple's  XMAX_IS_MULTI case
> > >
> > > [snip]
> > >
> > > why is it membermode > mode and not membermode >= mode?
> >
> > Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
> > there was a deadlock possible here.  Maybe I should add a test to ensure
> > this doesn't happen.
>
> Done:
> https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
>
> (I don't think this is worth a v24 submission).

I have started doing some performance testing and I fear I was right in
being suspicious about the performance difference for FOR SHARE locks:

Tested with
pgbench -p 5442 -U andres \    -c 30  -j 30 \    -M prepared -f ~/tmp/postgres-fklocks/select-for-share.sql \    -T 20
postgres

on a pgbench -i -s 10 database, where select-for-share.sql is:

BEGIN;
\set naccounts 1000000
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid FOR SHARE;
COMMIT;

which very roughly resembles workloads I have seen in reality (locking
some records your rely on while you do some work).

With
52b4729fcfc20f056f17531a6670d8c4b9696c39 (alvherre/fklocks)
vs
273986bf0d39e5166eb15ba42ebff4803e23a688 (latest merged master)

I get
tps = 8986.133496 (excluding connections establishing)
vs
tps = 25307.861193 (excluding connections establishing)

Thats nearly a factor of three which seems to be too big to be
acceptable to me.
So I really think we need to bring FOR SHARE locks back as a flag.

I have done some benchmarking of other cases (plain pgbench, pgbench
with foreign keys, large insertions, large amounts of FOR SHARE locks)
and haven't found anything really outstanding so far.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Noah Misch wrote:
> On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
> > https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
> >
> > (I don't think this is worth a v24 submission).
>
> Are you aware of any defects in or unanswered questions of this version that
> would stall your commit thereof?

Yeah, I am revisiting the list of XXX/FIXME comments you pointed out
previously.

And I would still like someone with EPQ understanding to review the
ExecLockRows / EvalPlanQual / heap_lock_tuple interactions.

Andres is on the verge of convincing me that we need to support
singleton FOR SHARE without multixacts due to performance concerns.  It
would be useful for more people to chime in here: is FOR SHARE an
important case to cater for?  I wonder if using FOR KEY SHARE (keep
performance characteristics, but would need to revise application code)
would satisfy Andres' users, for example.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
> Noah Misch wrote:
> > On Wed, Nov 14, 2012 at 01:27:26PM -0300, Alvaro Herrera wrote:
> > > https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
> > >
> > > (I don't think this is worth a v24 submission).
> >
> > Are you aware of any defects in or unanswered questions of this version that
> > would stall your commit thereof?
>
> Yeah, I am revisiting the list of XXX/FIXME comments you pointed out
> previously.
>
> And I would still like someone with EPQ understanding to review the
> ExecLockRows / EvalPlanQual / heap_lock_tuple interactions.

I am in the process of looking at those atm, but we need somebody that
actually understands the intricacies here...

> Andres is on the verge of convincing me that we need to support
> singleton FOR SHARE without multixacts due to performance concerns.

I don't really see any arguments against doing so. We aren't in a that
big shortage of flags and if we need more than available I think we can
free some (e.g. XMAX/XMIN_INVALID).

> It
> would be useful for more people to chime in here: is FOR SHARE an
> important case to cater for?  I wonder if using FOR KEY SHARE (keep
> performance characteristics, but would need to revise application code)
> would satisfy Andres' users, for example.

It definitely wouldn't help in the cases I have seen because the point
is to protect against actual content changes of the rows, not just the
keys.
Note that you actually need to use explicit FOR SHARE/UPDATE for
correctness purposes in many scenarios unless youre running in 9.1+
serializable mode. And that cannot be used in some cases (try queuing
for example) because the rollback rates would be excessive.

Even if applications could be fixed, requiring changes to applications
locking behaviour, which possibly is far from trivial, seems like a too
big upgrade barrier.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Noah Misch
Date:
On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
> On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
> > Andres is on the verge of convincing me that we need to support
> > singleton FOR SHARE without multixacts due to performance concerns.
> 
> I don't really see any arguments against doing so. We aren't in a that
> big shortage of flags and if we need more than available I think we can
> free some (e.g. XMAX/XMIN_INVALID).

The patch currently leaves two unallocated bits.  Reclaiming currently-used
bits means a binary compatibility break.  Until we plan out such a thing,
reclaimable bits are not as handy as never-allocated bits.  I don't think we
should lightly expend one of the final two.

> > It
> > would be useful for more people to chime in here: is FOR SHARE an
> > important case to cater for?  I wonder if using FOR KEY SHARE (keep
> > performance characteristics, but would need to revise application code)
> > would satisfy Andres' users, for example.
> 
> It definitely wouldn't help in the cases I have seen because the point
> is to protect against actual content changes of the rows, not just the
> keys.
> Note that you actually need to use explicit FOR SHARE/UPDATE for
> correctness purposes in many scenarios unless youre running in 9.1+
> serializable mode. And that cannot be used in some cases (try queuing
> for example) because the rollback rates would be excessive.

I agree that tripling FOR SHARE cost is risky.  Where is the added cost
concentrated?  Perchance that multiple belies optimization opportunities.

Thanks,
nm



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
> On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
> > On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
> > > Andres is on the verge of convincing me that we need to support
> > > singleton FOR SHARE without multixacts due to performance concerns.
> >
> > I don't really see any arguments against doing so. We aren't in a that
> > big shortage of flags and if we need more than available I think we can
> > free some (e.g. XMAX/XMIN_INVALID).
>
> The patch currently leaves two unallocated bits.  Reclaiming currently-used
> bits means a binary compatibility break.  Until we plan out such a thing,
> reclaimable bits are not as handy as never-allocated bits.  I don't think we
> should lightly expend one of the final two.

Not sure what you mean with a binary compatibility break?
pg_upgrade'ability?

What I previously suggested somewhere was:

#define HEAP_XMAX_SHR_LOCK    0x0010
#define HEAP_XMAX_EXCL_LOCK   0x0040
#define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
/** Different from _LOCK_BITS because it doesn't include LOCK_ONLY*/
#define HEAP_LOCK_MASK        (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)

#define HEAP_XMAX_IS_SHR_LOCKED(tup) \   (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
#define HEAP_XMAX_IS_EXCL_LOCKED(tup) \   (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
#define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \   (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)

It makes checking for locks sightly more more complicated, but its not
too bad...

> > > It
> > > would be useful for more people to chime in here: is FOR SHARE an
> > > important case to cater for?  I wonder if using FOR KEY SHARE (keep
> > > performance characteristics, but would need to revise application code)
> > > would satisfy Andres' users, for example.
> >
> > It definitely wouldn't help in the cases I have seen because the point
> > is to protect against actual content changes of the rows, not just the
> > keys.
> > Note that you actually need to use explicit FOR SHARE/UPDATE for
> > correctness purposes in many scenarios unless youre running in 9.1+
> > serializable mode. And that cannot be used in some cases (try queuing
> > for example) because the rollback rates would be excessive.
>
> I agree that tripling FOR SHARE cost is risky.  Where is the added cost
> concentrated?  Perchance that multiple belies optimization opportunities.

Good question, let me play a bit.

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Noah Misch
Date:
On Sat, Nov 17, 2012 at 03:20:20PM +0100, Andres Freund wrote:
> On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
> > On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
> > > On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
> > > > Andres is on the verge of convincing me that we need to support
> > > > singleton FOR SHARE without multixacts due to performance concerns.
> > >
> > > I don't really see any arguments against doing so. We aren't in a that
> > > big shortage of flags and if we need more than available I think we can
> > > free some (e.g. XMAX/XMIN_INVALID).
> >
> > The patch currently leaves two unallocated bits.  Reclaiming currently-used
> > bits means a binary compatibility break.  Until we plan out such a thing,
> > reclaimable bits are not as handy as never-allocated bits.  I don't think we
> > should lightly expend one of the final two.
> 
> Not sure what you mean with a binary compatibility break?
> pg_upgrade'ability?

Yes.  If we decide HEAP_XMIN_INVALID isn't helping, we can stop adding it to
tuples anytime.  Old tuples may continue to carry the bit, with no particular
deadline for their demise.  To reuse that bit in the mean time, we'll need to
prove that no tuple writable by PostgreSQL 8.3+ will get an unacceptable
interpretation under the new meaning of the bit.  Alternately, build the
mechanism to prove that all such old bits are gone before using the bit in the
new way.  This keeps HEAP_MOVED_IN and HEAP_MOVED_OFF unavailable today.

> What I previously suggested somewhere was:
> 
> #define HEAP_XMAX_SHR_LOCK    0x0010
> #define HEAP_XMAX_EXCL_LOCK   0x0040
> #define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
> /*
>  * Different from _LOCK_BITS because it doesn't include LOCK_ONLY
>  */
> #define HEAP_LOCK_MASK        (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
> 
> #define HEAP_XMAX_IS_SHR_LOCKED(tup) \
>     (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
> #define HEAP_XMAX_IS_EXCL_LOCKED(tup) \
>     (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
> #define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \
>     (((tup)->t_infomask & HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)
> 
> It makes checking for locks sightly more more complicated, but its not
> too bad...

Agreed; that seems reasonable.



Re: foreign key locks

From
Andres Freund
Date:
> > I agree that tripling FOR SHARE cost is risky.  Where is the added cost
> > concentrated?  Perchance that multiple belies optimization opportunities.
>
> Good question, let me play a bit.

Ok, I benchmarked around and from what I see there is no single easy
target.
The biggest culprits I could find are:
1. higher amount of XLogInsert calls per transaction (visible
in pgbench -t instead of -T mode while watching the WAL volume)
2. Memory allocations in GetMultiXactIdMembers
3. Memory allocations in mXactCachePuta) cache entry itselfb) the cache context
4. More lwlocks acquisitions

We can possibly optimize a bit with 2) by using a static buffer for
common member sizes, but thats not going to buy us too much...

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Noah Misch
Date:
On Sat, Nov 17, 2012 at 05:07:18PM +0100, Andres Freund wrote:
> > > I agree that tripling FOR SHARE cost is risky.  Where is the added cost
> > > concentrated?  Perchance that multiple belies optimization opportunities.
> >
> > Good question, let me play a bit.
> 
> Ok, I benchmarked around and from what I see there is no single easy
> target.
> The biggest culprits I could find are:
> 1. higher amount of XLogInsert calls per transaction (visible
> in pgbench -t instead of -T mode while watching the WAL volume)
> 2. Memory allocations in GetMultiXactIdMembers
> 3. Memory allocations in mXactCachePut
>  a) cache entry itself
>  b) the cache context
> 4. More lwlocks acquisitions
> 
> We can possibly optimize a bit with 2) by using a static buffer for
> common member sizes, but thats not going to buy us too much...

In that case, +1 for your proposal to prop up FOR SHARE.



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > > * In heap_lock_tuple's  XMAX_IS_MULTI case
> > >
> > > [snip]
> > >
> > > why is it membermode > mode and not membermode >= mode?
> >
> > Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
> > there was a deadlock possible here.  Maybe I should add a test to ensure
> > this doesn't happen.
>
> Done:
> https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770

Some more review bits, based on ffd6250d1d393f2ecb9bfc55c2c6f715dcece557

- if oldestMultiXactId + db is set and then that database is dropped we seem to have a problem because
MultiXactAdvanceOldestwon't overwrite those values. Should probably use SetMultiXactIdLimit directly.
 

- what stop multixacts only being filled out (i.e RecordNewMultiXact()) *after* the XLogInsert() *and* after a
MultiXactGetCheckptMulti()?Afaics MultiXactGenLock is not hold in CreateMultiXactId(). If we crash in that moment we
loosethe multixact data which now means potential data loss...
 

- multixact member group data crossing 512 sector boundaries makes me uneasy (as its 5 bytes). I don't really see a
scenariowhere its dangerous, but ... Does anybody see a problem here?
 

- there are quite some places that domultiStopLimit = multiWrapLimit - 100;if (multiStopLimit < FirstMultiXactId)
multiStopLimit-= FirstMultiXactId;
 
 perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?

- I find using a default: clause in switches with enum types where everything is expected to be handled like the
followinga bad idea, this way the compiler won't warn you if youve missed case's which makes changing/extending code
harder:   switch (rc->strength)    {        case LCS_FORNOKEYUPDATE:            newrc->markType = ROW_MARK_EXCLUSIVE;
        break;        case LCS_FORSHARE:            newrc->markType = ROW_MARK_SHARE;            break;        case
LCS_FORKEYSHARE:           newrc->markType = ROW_MARK_KEYSHARE;            break;        case LCS_FORUPDATE:
newrc->markType= ROW_MARK_KEYEXCLUSIVE;            break;        default:            elog(ERROR, "unsupported rowmark
type%d", rc->strength);    }
 
-
#if 0        /*         * The idea here is to remove the IS_MULTI bit, and replace the         * xmax with the
updater'sXid.  However, we can't really do it:         * modifying the Xmax is not allowed under our buffer locking
   * rules, unless we have an exclusive lock; but we don't know that         * we have it.  So the multi needs to
remainin place :-(         */        ResetMultiHintBit(tuple, buffer, xmax, true);
 
#endif
Three things:   - HeapTupleSatisfiesUpdate is actually always called exclusively locked ;)   - Extending something like
LWLockHeldByMeto also return the current     lockmode doesn't sound hard   - we seem to be using #ifdef NOT_YET for
suchcases
 

- Using a separate production for the lockmode seems to be nicer imo, not really important though
for_locking_item:        FOR UPDATE locked_rels_list opt_nowait
...        | FOR NO KEY UPDATE locked_rels_list opt_nowait
...        | FOR SHARE locked_rels_list opt_nowait
...        | FOR KEY SHARE locked_rels_list opt_nowait    ;

- not really padding, MultiXactStatus is 4bytes.../* * XXX Note: there's a lot of padding space in MultiXactMember.  We
could* find a more compact representation of this Xlog record -- perhaps all the * status flags in one XLogRecData,
thenall the xids in another one?  Not * clear that it's worth the trouble though. */
 
- why
#define SizeOfMultiXactCreate (offsetof(xl_multixact_create, nmembers) + sizeof(int32))
and not
#define SizeOfMultiXactCreate offsetof(xl_multixact_create, members)
- starting a critical section in GetNewMultiXactId but not ending it is ugly, but not new

Greetings,

Andres

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-14 13:27:26 -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > > * In heap_lock_tuple's  XMAX_IS_MULTI case
> > >
> > > [snip]
> > >
> > > why is it membermode > mode and not membermode >= mode?
> >
> > Uh, that's a bug.  Fixed.  As noticed in the comment above that snippet,
> > there was a deadlock possible here.  Maybe I should add a test to ensure
> > this doesn't happen.
>
> Done:
> https://github.com/alvherre/postgres/commit/df2847e38198e99f57e52490e1e9391ebb70d770
>
> (I don't think this is worth a v24 submission).

One more observation:       /*        * Get and lock the updated version of the row; if fail, return       NULL.
*/
-       copyTuple = EvalPlanQualFetch(estate, relation, LockTupleExclusive,
+       copyTuple = EvalPlanQualFetch(estate, relation, LockTupleNoKeyExclusive,

That doesn't seem to be correct to me. Why is it ok to acquire a
potentially too low locklevel here?

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Andres Freund wrote:


> - I find using a default: clause in switches with enum types where everything
>   is expected to be handled like the following a bad idea, this way the
>   compiler won't warn you if youve missed case's which makes changing/extending code harder:
>         switch (rc->strength)
>         {

I eliminated some of these default clauses, but the compiler is not
happy about not having some of them, so they remain.

> -
> #if 0
>             /*
>              * The idea here is to remove the IS_MULTI bit, and replace the
>              * xmax with the updater's Xid.  However, we can't really do it:
>              * modifying the Xmax is not allowed under our buffer locking
>              * rules, unless we have an exclusive lock; but we don't know that
>              * we have it.  So the multi needs to remain in place :-(
>              */
>             ResetMultiHintBit(tuple, buffer, xmax, true);
> #endif
>
>  Three things:
>     - HeapTupleSatisfiesUpdate is actually always called exclusively locked ;)
>     - Extending something like LWLockHeldByMe to also return the current
>       lockmode doesn't sound hard
>     - we seem to be using #ifdef NOT_YET for such cases

After spending some time trying to make this work usefully, I observed
that it's pointless, at least if we apply it only in
HeapTupleSatisfiesUpdate, because the IS_MULTI bit is going to be
removed by compute_new_xmax_infomask if the multi is gone.  Something
like this would be useful if we could do it in other places; say when
we're merely scanning page without the specific intention of modifying
that particular tuple.  Maybe in heap_prune_page, for example.  ISTM we
can see that as an optimization we can add later.

For the record, the implementation of ResetMultiHintBit looks like this:

/** When a tuple is found to be marked by a MultiXactId, all whose members are* completed transactions, the
HEAP_XMAX_IS_MULTIbit can be removed.* Additionally, at the request of caller, we can set the Xmax to the given* Xid,
andset HEAP_XMAX_COMMITTED.** Note that per buffer access rules, the caller to this function must hold* exclusive
contentlock on the affected buffer.*/ 
static inline void
ResetMultiHintBit(HeapTupleHeader tuple, Buffer buffer,                  TransactionId xid, bool mark_committed)
{Assert(LWLockModeHeld((&BufferDescriptors[buffer])->content_lock ==
LW_EXCLUSIVE));Assert(tuple->t_infomask& HEAP_XMAX_IS_MULTI);Assert(!(tuple->t_infomask &
HEAP_XMAX_INVALID));Assert(!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple)));
tuple->t_infomask &= ~HEAP_XMAX_BITS;HeapTupleHeaderSetXmax(tuple, xid);if (!TransactionIdIsValid(xid))
tuple->t_infomask|= HEAP_XMAX_INVALID;/* note that HEAP_KEYS_UPDATED persists, if set */ 
if (mark_committed)    SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED, xid);else
SetBufferCommitInfoNeedsSave(buffer);
}

(This is removed by commit 52f86f82fb3e01a6ed0b8106bac20f319bb3ad0a in
my github tree, but that commit might be lost in the future, hence this
paste.)


Some of your other observations have been fixed by commits that I have
pushed to my github tree.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Andres Freund
Date:
On 2012-11-20 17:36:14 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>
>
> > - I find using a default: clause in switches with enum types where everything
> >   is expected to be handled like the following a bad idea, this way the
> >   compiler won't warn you if youve missed case's which makes changing/extending code harder:
> >         switch (rc->strength)
> >         {
>
> I eliminated some of these default clauses, but the compiler is not
> happy about not having some of them, so they remain.

You have removed the ones that looked removable to me...

>
> > -
> > #if 0
> >             /*
> >              * The idea here is to remove the IS_MULTI bit, and replace the
> >              * xmax with the updater's Xid.  However, we can't really do it:
> >              * modifying the Xmax is not allowed under our buffer locking
> >              * rules, unless we have an exclusive lock; but we don't know that
> >              * we have it.  So the multi needs to remain in place :-(
> >              */
> >             ResetMultiHintBit(tuple, buffer, xmax, true);
> > #endif
> >
> >  Three things:
> >     - HeapTupleSatisfiesUpdate is actually always called exclusively locked ;)
> >     - Extending something like LWLockHeldByMe to also return the current
> >       lockmode doesn't sound hard
> >     - we seem to be using #ifdef NOT_YET for such cases
>
> After spending some time trying to make this work usefully, I observed
> that it's pointless, at least if we apply it only in
> HeapTupleSatisfiesUpdate, because the IS_MULTI bit is going to be
> removed by compute_new_xmax_infomask if the multi is gone.  Something
> like this would be useful if we could do it in other places; say when
> we're merely scanning page without the specific intention of modifying
> that particular tuple.  Maybe in heap_prune_page, for example.  ISTM we
> can see that as an optimization we can add later.

heap_prune_page sounds like a good fit, yes. One other place would be
when following hot chains, but the locking would be far more critical in
that path, so its less likely to be beneficial.
Pushing it off sounds good to me.

> Some of your other observations have been fixed by commits that I have
> pushed to my github tree.

A short repetition of the previous pgbench run of many SELECT FOR
SHARE's:

10s test:
master:  22673 24637 23874 25527
fklocks: 24835 24601 24606 24868

60s test:
master:  32609 33087
fklocks: 33350 33359

Very nice!

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Here's version 24.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Here's version 24.

Old review emails still contains some items that didn't lead to any
changes.  I tried to keep close track of those.  To that list I add a
couple of things of my own.  Here they are, for those following along.
I welcome suggestions.


- Fix the multixact cache in multixact.c -- see mXactCacheEnt.

- ResetHintBitMulti() was removed; need to remove old XMAX_IS_MULTI somewhere; perhaps during HOT page pruning?

- EvalPlanQual and ExecLockRows maybe need a different overall locking strategy.  Right now follow_updates=false is
usedto avoid deadlocks. 

- Ensure multixact.c behaves sanely on wraparound.

- Is the case of a a non-key-update followed by a key-update actually handled when doing a heap_lock_tuple with mode =
LockTupleKeyShareand follow_updates = false? I don't really see how, so it seems to deserve at least a comment. 

- if oldestMultiXactId + db is set and then that database is dropped we seem to have a problem because
MultiXactAdvanceOldestwon't overwrite those values. Should probably use SetMultiXactIdLimit directly. 

- what stop multixacts only being filled out (i.e RecordNewMultiXact()) *after* the XLogInsert() *and* after a
MultiXactGetCheckptMulti()?Afaics MultiXactGenLock is not hold in CreateMultiXactId(). If we crash in that moment we
loosethe multixact data which now means potential data loss... 

- multixact member group data crossing 512 sector boundaries makes me uneasy (as its 5 bytes). I don't really see a
scenariowhere its dangerous, but ... Does anybody see a problem here? 

- there are quite some places that domultiStopLimit = multiWrapLimit - 100;if (multiStopLimit < FirstMultiXactId)
multiStopLimit-= FirstMultiXactId; 
 perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?

- not really padding, MultiXactStatus is 4bytes.../* * XXX Note: there's a lot of padding space in MultiXactMember.  We
could* find a more compact representation of this Xlog record -- perhaps all the * status flags in one XLogRecData,
thenall the xids in another one?  Not * clear that it's worth the trouble though. */ 

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
"Kevin Grittner"
Date:
Alvaro Herrera wrote:

> Here's version 24.

This no longer applies after the rmgr rm_desc patch.

-Kevin



Re: foreign key locks

From
"Kevin Grittner"
Date:
Kevin Grittner wrote:
> Alvaro Herrera wrote:
>
>> Here's version 24.
>
> This no longer applies after the rmgr rm_desc patch.

I took a shot at merging those so I could review the patch against
HEAD; attached in hopes that it will be useful.

Hopefully this isn't too far off from what .

-Kevin

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Kevin Grittner wrote:
> Kevin Grittner wrote:
> > Alvaro Herrera wrote:
> >
> >> Here's version 24.
> >
> > This no longer applies after the rmgr rm_desc patch.
>
> I took a shot at merging those so I could review the patch against
> HEAD; attached in hopes that it will be useful.
>
> Hopefully this isn't too far off from what .

Uh, sorry for remaining silent -- I'm about to post an updated patch,
having just pushed my merge to github.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Kevin Grittner wrote:
> > Kevin Grittner wrote:
> > > Alvaro Herrera wrote:
> > >
> > >> Here's version 24.
> > >
> > > This no longer applies after the rmgr rm_desc patch.
> >
> > I took a shot at merging those so I could review the patch against
> > HEAD; attached in hopes that it will be useful.
> >
> > Hopefully this isn't too far off from what .
>
> Uh, sorry for remaining silent -- I'm about to post an updated patch,
> having just pushed my merge to github.

Here it is.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks - EvalPlanQual interactions

From
Andres Freund
Date:
On 2012-11-27 12:07:34 -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Here's version 24.
>
> Old review emails still contains some items that didn't lead to any
> changes.  I tried to keep close track of those.  To that list I add a
> couple of things of my own.  Here they are, for those following along.
> I welcome suggestions.
>
> - EvalPlanQual and ExecLockRows maybe need a different overall locking
>   strategy.  Right now follow_updates=false is used to avoid deadlocks.

I think this really might need some work. Afaics the EPQ code now needs
to actually determine what locklevel needs to be passed to
EvalPlanQualFetch via EvalPlanQual otherwise some of the locking issues
that triggered all this remain. That sucks a bit from a modularity
perspective, but I don't see another way.

Greetings,

Andres

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
"Erik Rijkers"
Date:
On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> Here it is.
>
>   fklocks-26.patch.gz

This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
chunk. (It's not really a problem.)

I also wondered if anyone has any perl/python/bash programs handy that uses these new options.  I
can hack something together myself, but I just thought I'd ask.


Thanks,


Erik Rijkers





Re: foreign key locks

From
Andres Freund
Date:
On 2012-12-22 10:53:47 +0100, Erik Rijkers wrote:
> On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> > Here it is.
> >
> >   fklocks-26.patch.gz
>
> This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
> chunk. (It's not really a problem.)
>
> I also wondered if anyone has any perl/python/bash programs handy that uses these new options.  I
> can hack something together myself, but I just thought I'd ask.

I have mostly done code review and some very targeted testing (pgbench,
gdb + psql). So no ready scripts, sorry.
There are imo some unfinished pieces (the whole EPQ integration) that
will require significant retesting once Alvaro has time to work on this
again..

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Erik Rijkers wrote:
> On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> > Here it is.
> >
> >   fklocks-26.patch.gz
>
> This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
> chunk. (It's not really a problem.)

FWIW, while it's probably not interesting to you, it must be noted that
the catversion.h number to use must match a
contrib/pg_upgrade/pg_upgrade.h symbol.

As far as file_fdw hunks, I only see this one:

--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -191,7 +191,7 @@ ERROR:  cannot change foreign table "agg_csv"
 DELETE FROM agg_csv WHERE a = 100;
 ERROR:  cannot change foreign table "agg_csv"
 SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
-ERROR:  SELECT FOR UPDATE/SHARE cannot be used with foreign table "agg_csv"
+ERROR:  SELECT FOR UPDATE/SHARE/KEY UPDATE/KEY SHARE cannot be used with foreign table "agg_csv"
 LINE 1: SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
                                             ^
 -- but this should be ignored

Surely that needs to be patched still?  And this hunk applies without
any trouble, so I don't see why you had to remove it.

Anyway, here's v27, which is the same code merged to latest master.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Alvaro Herrera
Date:
Here's version 28 of this patch.  The only substantive change here from
v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
or LockTupleNoKeyExclusive, depending on whether the key columns are
being modified by the update.  This needs to go through EvalPlanQual, so
that function is now getting the lock mode as a parameter instead of
hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
still use LockTupleExclusive, so there's no change for anybody other
than FOR EACH ROW BEFORE UPDATE triggers).

At this point, I think I've done all I wanted to do in connection with
this patch, and I intend to commit.  I think this is a good time to get
it committed, so that people can play with it more extensively and
report any breakage I might have caused before we even get into beta.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Andres Freund
Date:
On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> Here's version 28 of this patch.  The only substantive change here from
> v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> or LockTupleNoKeyExclusive, depending on whether the key columns are
> being modified by the update.  This needs to go through EvalPlanQual, so
> that function is now getting the lock mode as a parameter instead of
> hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
> still use LockTupleExclusive, so there's no change for anybody other
> than FOR EACH ROW BEFORE UPDATE triggers).

Is that enough in case of a originally non-key update in read committed
mode that turns into a key update due to a concurrent key update?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> > Here's version 28 of this patch.  The only substantive change here from
> > v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> > or LockTupleNoKeyExclusive, depending on whether the key columns are
> > being modified by the update.  This needs to go through EvalPlanQual, so
> > that function is now getting the lock mode as a parameter instead of
> > hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
> > still use LockTupleExclusive, so there's no change for anybody other
> > than FOR EACH ROW BEFORE UPDATE triggers).
>
> Is that enough in case of a originally non-key update in read committed
> mode that turns into a key update due to a concurrent key update?

Hm, let me try to work through your example.  You say that a transaction
T1 does a non-key update, and is working through the BEFORE UPDATE
trigger; then transaction T2 does a key update and changes the key
underneath T1?  So at that point T1 becomes a key update, because it's
now using the original key values which are no longer the key?

I don't think this can really happen, because T2 (which is requesting
TupleLockExclusive) would block on the lock that the trigger is grabbing
(TupleLockNoKeyExclusive) on the tuple.  So T2 would sleep until T1 is
committed.


Now, maybe you meant that the BEFORE UPDATE trigger changes the key
value but the user-supplied UPDATE query does not.  So the trigger turns
the no-key update into a key update.  What would happen here is that
GetTupleForTrigger would acquire TupleLockNoKeyExclusive on the tuple,
and later heap_update would acquire TupleLockExclusive.  So there is
lock escalation happening.  This could cause a deadlock against someone
else grabbing a TupleLockKeyShare on the tuple.  I think the answer is
"don't do that", i.e. don't update the key columns in a BEFORE UPDATE
trigger.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Andres Freund
Date:
On 2013-01-11 12:11:47 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> > > Here's version 28 of this patch.  The only substantive change here from
> > > v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> > > or LockTupleNoKeyExclusive, depending on whether the key columns are
> > > being modified by the update.  This needs to go through EvalPlanQual, so
> > > that function is now getting the lock mode as a parameter instead of
> > > hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
> > > still use LockTupleExclusive, so there's no change for anybody other
> > > than FOR EACH ROW BEFORE UPDATE triggers).
> >
> > Is that enough in case of a originally non-key update in read committed
> > mode that turns into a key update due to a concurrent key update?
>
> Hm, let me try to work through your example.  You say that a transaction
> T1 does a non-key update, and is working through the BEFORE UPDATE
> trigger; then transaction T2 does a key update and changes the key
> underneath T1?  So at that point T1 becomes a key update, because it's
> now using the original key values which are no longer the key?
>
> I don't think this can really happen, because T2 (which is requesting
> TupleLockExclusive) would block on the lock that the trigger is grabbing
> (TupleLockNoKeyExclusive) on the tuple.  So T2 would sleep until T1 is
> committed.

No, I was thinking about an update without triggers present.

T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
T1: BEGIN; -- read committed
T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
T2: BEGIN; -- read committed
T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
T1: COMMIT;
T2: /* UPDATE follows to updated row, due to the changed name its a key update now */

Does that make sense?

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Andres Freund wrote:

> No, I was thinking about an update without triggers present.
>
> T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
> T1: BEGIN; -- read committed
> T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
> T2: BEGIN; -- read committed
> T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
> T1: COMMIT;
> T2: /* UPDATE follows to updated row, due to the changed name its a key update now */
>
> Does that make sense?

So I guess your question is "is T2 now holding a TupleLockExclusive
lock?"  To answer it, I turned your example into a isolationtester spec:

setup
{
CREATE TABLE tbl(id serial primary key, name text unique, data text);
INSERT INTO tbl VALUES (1, 'blarg', 'no data');
}

teardown
{
DROP TABLE tbl;
}

session "s1"
step "s1b" { BEGIN; }
step "s1u" { UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; }
step "s1c" { COMMIT; }

session "s2"
step "s2b" { BEGIN; }
step "s2u" { UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; }
step "s2c" { COMMIT; }

session "s3"
step "s3l" { SELECT * FROM tbl FOR KEY SHARE; }

permutation "s1b" "s1u" "s2b" "s2u" "s1c" "s3l" "s2c"


And the results:
Parsed test spec with 3 sessions

starting permutation: s1b s1u s2b s2u s1c s3l s2c
step s1b: BEGIN;
step s1u: UPDATE tbl SET name = 'foo' WHERE name = 'blarg';
step s2b: BEGIN;
step s2u: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; <waiting ...>
step s1c: COMMIT;
step s2u: <... completed>
step s3l: SELECT * FROM tbl FOR KEY SHARE; <waiting ...>
step s2c: COMMIT;
step s3l: <... completed>
id             name           data

1              blarg          blarg


So session 3 is correctly waiting for session 2 to finish before being
ablt to grab its FOR KEY SHARE lock, indicating that session 2 is
holding a FOR UPDATE lock.  Good.

If I change session 1 to update the data column instead of name, session
3 no longer needs to wait for session 2, meaning session 2 now only
grabs a FOR NO KEY UPDATE lock.  Also good.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Andres Freund
Date:
On 2013-01-11 13:10:49 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>
> > No, I was thinking about an update without triggers present.
> >
> > T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
> > T1: BEGIN; -- read committed
> > T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
> > T2: BEGIN; -- read committed
> > T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
> > T1: COMMIT;
> > T2: /* UPDATE follows to updated row, due to the changed name its a key update now */
> >
> > Does that make sense?
>
> So I guess your question is "is T2 now holding a TupleLockExclusive
> lock?"  To answer it, I turned your example into a isolationtester spec:

Great! I reread the code and it does make sense the way its implemented
now. I misremembered something...

I vote for adding that spectest including some appropriate permutations.

FWIW: Looks good to me. It could use another pair of eyes, but I guess
that will have to come by being used.

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Here's version 28 of this patch.  The only substantive change here from
> v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> or LockTupleNoKeyExclusive, depending on whether the key columns are
> being modified by the update.  This needs to go through EvalPlanQual, so
> that function is now getting the lock mode as a parameter instead of
> hardcoded LockTupleExclusive.  (All other users of GetTupleForTrigger
> still use LockTupleExclusive, so there's no change for anybody other
> than FOR EACH ROW BEFORE UPDATE triggers).
>
> At this point, I think I've done all I wanted to do in connection with
> this patch, and I intend to commit.  I think this is a good time to get
> it committed, so that people can play with it more extensively and
> report any breakage I might have caused before we even get into beta.

While trying this out this morning I noticed a bug in the XLog code:
trying to lock the updated version of a tuple when the page that
contains the updated version requires a backup block, would cause this:

PANIC:  invalid xlog record length 0

The reason is that there is an (unknown to me) rule that there must be
some data not associated with a buffer:

    /*
     * NOTE: We disallow len == 0 because it provides a useful bit of extra
     * error checking in ReadRecord.  This means that all callers of
     * XLogInsert must supply at least some not-in-a-buffer data. [...]
     */

This seems pretty strange to me.  And having the rule be spelled out
only in a comment within XLogInsert and not at its top, and not nearby
the XLogRecData struct definition either, seems pretty strange to me.
I wonder how come every PG hacker except me knows this.

For the curious, I attach an isolationtester spec file I used to
reproduce the problem (and verify the fix).

To fix the crash I needed to do this:

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 99fced1..9762890 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4838,7 +4838,7 @@ l4:
         {
             xl_heap_lock_updated xlrec;
             XLogRecPtr    recptr;
-            XLogRecData    rdata;
+            XLogRecData    rdata[2];
             Page        page = BufferGetPage(buf);

             xlrec.target.node = rel->rd_node;
@@ -4846,13 +4846,18 @@ l4:
             xlrec.xmax = new_xmax;
             xlrec.infobits_set = compute_infobits(new_infomask, new_infomask2);

-            rdata.data = (char *) &xlrec;
-            rdata.len = SizeOfHeapLockUpdated;
-            rdata.buffer = buf;
-            rdata.buffer_std = true;
-            rdata.next = NULL;
+            rdata[0].data = (char *) &xlrec;
+            rdata[0].len = SizeOfHeapLockUpdated;
+            rdata[0].buffer = InvalidBuffer;
+            rdata[0].next = &(rdata[1]);
+
+            rdata[1].data = NULL;
+            rdata[1].len = 0;
+            rdata[1].buffer = buf;
+            rdata[1].buffer_std = true;
+            rdata[1].next = NULL;

-            recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, &rdata);
+            recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, rdata);

             PageSetLSN(page, recptr);
             PageSetTLI(page, ThisTimeLineID);

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: foreign key locks

From
Simon Riggs
Date:
On 18 January 2013 20:02, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> XLOG_HEAP2_LOCK_UPDATED

Every xlog record needs to know where to put the block.

Compare with XLOG_HEAP_LOCK

xlrec.target.node = relation->rd_node;

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> The reason is that there is an (unknown to me) rule that there must be
> some data not associated with a buffer:

>     /*
>      * NOTE: We disallow len == 0 because it provides a useful bit of extra
>      * error checking in ReadRecord.  This means that all callers of
>      * XLogInsert must supply at least some not-in-a-buffer data. [...]
>      */

> This seems pretty strange to me.  And having the rule be spelled out
> only in a comment within XLogInsert and not at its top, and not nearby
> the XLogRecData struct definition either, seems pretty strange to me.
> I wonder how come every PG hacker except me knows this.

I doubt it ever came up before.  What use is logging only the content of
a buffer page?  Surely you'd need to know, for example, which relation
and page number it is from.
        regards, tom lane



Re: foreign key locks

From
Andres Freund
Date:
On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > The reason is that there is an (unknown to me) rule that there must be
> > some data not associated with a buffer:
> 
> >     /*
> >      * NOTE: We disallow len == 0 because it provides a useful bit of extra
> >      * error checking in ReadRecord.  This means that all callers of
> >      * XLogInsert must supply at least some not-in-a-buffer data. [...]
> >      */
> 
> > This seems pretty strange to me.  And having the rule be spelled out
> > only in a comment within XLogInsert and not at its top, and not nearby
> > the XLogRecData struct definition either, seems pretty strange to me.
> > I wonder how come every PG hacker except me knows this.
> 
> I doubt it ever came up before.  What use is logging only the content of
> a buffer page?  Surely you'd need to know, for example, which relation
> and page number it is from.

It only got to be a length of 0 because the the data got removed due to
a logged full page write. And the backup block contains the data about
which blocks it is logging in itself.
I wonder if the check shouldn't just check write_len instead, directly
below the loop that ads backup blocks.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
>> I doubt it ever came up before.  What use is logging only the content of
>> a buffer page?  Surely you'd need to know, for example, which relation
>> and page number it is from.

> It only got to be a length of 0 because the the data got removed due to
> a logged full page write. And the backup block contains the data about
> which blocks it is logging in itself.

And if the full-page-image case *hadn't* been invoked, what then?  I
still don't see a very good argument for xlog records with no fixed
data.

> I wonder if the check shouldn't just check write_len instead, directly
> below the loop that ads backup blocks.

We're not changing this unless you can convince me that the read-side
error check mentioned in the comment is useless.
        regards, tom lane



Re: foreign key locks

From
Andres Freund
Date:
On 2013-01-18 16:01:18 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
> >> I doubt it ever came up before.  What use is logging only the content of
> >> a buffer page?  Surely you'd need to know, for example, which relation
> >> and page number it is from.
> 
> > It only got to be a length of 0 because the the data got removed due to
> > a logged full page write. And the backup block contains the data about
> > which blocks it is logging in itself.
> 
> And if the full-page-image case *hadn't* been invoked, what then?  I
> still don't see a very good argument for xlog records with no fixed
> data.

In that case data would have been logged?

The code in question was:
           xl_heap_lock_updated xlrec;           xlrec.target.node = rel->rd_node;
...           xlrec.xmax = new_xmax;

-           rdata.data = (char *) &xlrec;
-           rdata.len = SizeOfHeapLockUpdated;
-           rdata.buffer = buf;
-           rdata.buffer_std = true;
-           rdata.next = NULL;
-           recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, &rdata);

Other wal logging code (and fklocks now as well) just put those into
two XLogRecData blocks to avoid the issue.

> > I wonder if the check shouldn't just check write_len instead, directly
> > below the loop that ads backup blocks.
> 
> We're not changing this unless you can convince me that the read-side
> error check mentioned in the comment is useless.

Youre right, the read side check is worth quite a bit. I think I am
retracting my suggestion.
I guess the amount of extra data thats uselessly logged although never
used in in the redo functions doesn't really amass to anything
significant in comparison to the backup block data.


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: foreign key locks

From
Simon Riggs
Date:
On 18 January 2013 21:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
>>> I doubt it ever came up before.  What use is logging only the content of
>>> a buffer page?  Surely you'd need to know, for example, which relation
>>> and page number it is from.
>
>> It only got to be a length of 0 because the the data got removed due to
>> a logged full page write. And the backup block contains the data about
>> which blocks it is logging in itself.
>
> And if the full-page-image case *hadn't* been invoked, what then?  I
> still don't see a very good argument for xlog records with no fixed
> data.
>
>> I wonder if the check shouldn't just check write_len instead, directly
>> below the loop that ads backup blocks.
>
> We're not changing this unless you can convince me that the read-side
> error check mentioned in the comment is useless.


There's some confusion here, I think. Alvaro wasn't proposing a WAL
record that had no fixed data.

The current way XLogRecData works is that if you put data and buffer
together on the same chunk then we optimise the data away if we take a
backup block.

Alvaro chose what looks like the right way to do this when you have
simple data: use one XLogRecData chunk and let the data part get
optimised away. But doing that results in a WAL record that has a
backup block, plus data of 0 length, which then fails the later check.

All current code gets around this by including multiple XLogRecData
chunks, which results in including additional data that is superfluous
when the backup block is present. Alvaro was questioning this; I
didn't understand that either when he said it, but I do now.

The zero length check should stay, definitely.

What this looks like is that further compression of the WAL is
possible, but given its alongside backup blocks, that's on the order
of a 1% saving, so probably isn't worth considering right now. The way
to do that would to include a small token to show record has been
optimised, rather than being zero length.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: foreign key locks

From
Alvaro Herrera
Date:
I just pushed this patch to the master branch.  There was a
corresponding catversion bump and pg_control version bump.  I have
verified that "make check-world" passes on my machine, as well as
isolation tests and pg_upgrade.

Tom Lane said at one point "this is too complex to maintain".  Several
times during the development I feared he might well be right.  I am sure
he will be discovering many oversights and poor design choices, when
gets around to reviewing the code; hopefully some extra effort will be
all that's needed to make the whole thing work sanely and not eat
anyone's data.  I just hope that nothing so serious comes up that the
patch will need to be reverted completely.

This patch is the result of the work of many people.  I am not allowed
to mention the patch sponsors in the commit message, so I'm doing it
here: first and foremost I need to thank my previous and current
employers Command Prompt and 2ndQuadrant -- they were extremely kind in
allowing me to work on this for days on end (and not all of it was
supported by financial sponsors).  Joel Jacobson started the whole
effort by posting a screencast of a problem his company was having; I
hope they found a workaround in the meantime, because his post was in
mid 2010.  The key idea of this patch' design came from Simon Riggs;
Noah Misch provided additional design advice that allowed the project
torun to completion.  Noah and Andres Freund spent considerable time
reviewing early versions of this patch; they uncovered many embarrasing
bugs in my implementation.  Kevin Grittner, Robert Haas, and others,
provided useful comments as well.  Noah Misch, Andres Freund, Marti
Raudsepp and Alexander Shulgin also provided bits of code.

Any bugs that remain are, of course, my own fault only.

Financial support came from

* Command Prompt Inc. of Washington, US
* 2ndQuadrant Ltd. of United Kingdom
* Trustly (then Glue Finance) of Sweden
* Novozymes A/S of Denmark
* MailerMailer LLC of Maryland, US
* PostgreSQL Experts, Inc. of California, US.

My sincerest thanks to everyone.


I seriously hope that no patch of mine ever becomes this monstruous
again.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services