(Followup) Request for suggestions - Mailing list pgsql-hackers

From Stephan Szabo
Subject (Followup) Request for suggestions
Date
Msg-id 20021009073209.O3488-100000@megazone23.bigpanda.com
In response to Request for suggestions  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-hackers

I wasn't particularly clear (sorry, I wrote the message
half right before bed and half right after getting up), so
I'm going to follow up with details and hope that
I'm more awake.

A little background just in case there are people
that haven't looked.

Right now, foreign key checks always use
HeapTupleSatisfiesNow to check the validity of
rows and use "for update" to do the locking.  I believe
that we can use something like the lock suggested
by Alex Hayard, which does not actually lock the row
but only waits for any concurrent modification that
actually holds a lock to finish.  However, doing
so would break the constraint unless checks
for changes to the primary key could actually see
uncommitted changes to the foreign key table.  The
exception is when the old row being worked on was made
by this transaction, in which case you shouldn't need
to do a dirty check.



To that end, I've put together some modifications
for testing on my local system (since I'm not 100%
sure that the above is true for all cases).  The
primary key triggers now use
HeapTupleSatisfiesDirty via the
ReferentialIntegritySnapshotOverride hack (which I
changed to hold three values: none, use now, use dirty),
and I added a "for foreign key" specifier to selects
which has the semantics of Alex's lock.  The code
in heap_mark4fk is called in effectively the same place
as heap_mark4update in the execution code.  Basically
if (...) { heap_mark4update() } else { heap_mark4fk() }.
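
As a rough sketch of the arrangement described above: every type and
function name below is a hypothetical stand-in, not the actual backend
definition, and the assumption that "none" means the query's regular
snapshot is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-valued form of the snapshot override. */
typedef enum {
    RI_OVERRIDE_NONE,   /* no override (assumed: use the query's snapshot) */
    RI_OVERRIDE_NOW,    /* check rows with HeapTupleSatisfiesNow */
    RI_OVERRIDE_DIRTY   /* check rows with HeapTupleSatisfiesDirty */
} RiSnapshotOverride;

/* Hypothetical label for which visibility rule a scan ends up using. */
typedef enum { VIS_QUERY_SNAPSHOT, VIS_NOW, VIS_DIRTY } VisibilityRule;

/* Hypothetical label for the two row-marking paths in the executor. */
typedef enum { MARK_FOR_UPDATE, MARK_FOR_FK } RowMarkKind;

/* Map the override setting to the visibility rule used by the check. */
VisibilityRule ri_visibility_rule(RiSnapshotOverride o)
{
    assert(o <= RI_OVERRIDE_DIRTY);
    if (o == RI_OVERRIDE_DIRTY)
        return VIS_DIRTY;
    if (o == RI_OVERRIDE_NOW)
        return VIS_NOW;
    return VIS_QUERY_SNAPSHOT;
}

/*
 * "for update" (heap_mark4update) takes a row lock; the proposed
 * "for foreign key" path (heap_mark4fk) only waits for concurrent
 * lock holders to finish and takes no lock itself.
 */
bool mark_takes_row_lock(RowMarkKind kind)
{
    return kind == MARK_FOR_UPDATE;
}
```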

However, the heap_mark4update code (which I cribbed
the mark4fk code from) doesn't like getting rows which
HeapTupleSatisfiesUpdate says are invisible (it throws an
invalid tid error IIRC).
Right now, I'm waiting for the transaction that made
the row to complete and returning HeapTupleUpdated if
it rolled back and HeapTupleMayBeUpdated if it didn't,
but I know that's wrong.

I think the logic needs to be something like:
 If the row is invisible,
  If the row has xmax==0, wait for xmin to complete
   If the transaction rolled back, ignore the row.
   Otherwise, check to see if someone else has locked it.
    If so, go back to the HeapTupleSatisfiesUpdate test
    Otherwise, work with the row as it was.
  Otherwise,
   If xmax==xmin, we want to ignore the row
   Otherwise, -- can this case even occur? --
    Wait on xmax per normal rules of heap_mark4update
 
but I'm very fuzzy on this.
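
To make the branches easier to discuss, here is the same decision tree
as a small standalone C model.  It is only a sketch: RowModel,
FkInvisibleAction and fk_action_for_invisible_row are hypothetical
names, the waits are modelled as precomputed flags rather than real
transaction waits, and none of this is the actual heap_mark4fk code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Xid;   /* stand-in for a transaction id */

/* What to do with a row that HeapTupleSatisfiesUpdate called invisible. */
typedef enum {
    FK_IGNORE_ROW,   /* treat the row as if it were not there */
    FK_RETEST,       /* go back to the HeapTupleSatisfiesUpdate test */
    FK_USE_ROW,      /* work with the row as it was */
    FK_WAIT_XMAX     /* wait on xmax per the normal heap_mark4update rules */
} FkInvisibleAction;

/* Hypothetical model of the facts the decision depends on. */
typedef struct {
    Xid  xmin;
    Xid  xmax;             /* 0 means no deleter or locker recorded */
    bool xmin_aborted;     /* outcome once the wait on xmin has finished */
    bool locked_by_other;  /* someone else locked the row in the meantime */
} RowModel;

FkInvisibleAction fk_action_for_invisible_row(const RowModel *row)
{
    assert(row != NULL);

    if (row->xmax == 0) {
        /* The real code would wait for xmin to complete at this point. */
        if (row->xmin_aborted)
            return FK_IGNORE_ROW;
        if (row->locked_by_other)
            return FK_RETEST;
        return FK_USE_ROW;
    }

    /* Row was created and removed by the same transaction. */
    if (row->xmax == row->xmin)
        return FK_IGNORE_ROW;

    /* The "can this case even occur?" branch. */
    return FK_WAIT_XMAX;
}
```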

In addition, at some point I'm going to have to modify
the actual referential actions (as opposed to no action)
to do a similar check, which means I'm going to want
a delete or update statement which needs to wait
on uncommitted transactions to modify the rows.  It looks
like heap_delete and heap_update also will error
on rows that HeapTupleSatisfiesUpdate says are invisible.
For heap_mark4fk it was reasonably safe to change the
result==HeapTupleInvisible case since it was new code
that I was adding, but I'm a bit leery about doing
something similar to heap_delete or heap_update.
Is the handling of result==HeapTupleInvisible in those
functions meant as a defensive measure for a case that
shouldn't occur?


