Thread: Re: [HACKERS] CIC and deadlocks

Re: [HACKERS] CIC and deadlocks

From
"Pavan Deolasee"
Date:

On 4/1/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Good point.  I'm envisioning a procarray.c function along the
lines of
        bool TransactionHasSnapshot(xid)
which returns true if the xid is currently listed in PGPROC
and has a nonzero xmin.  CIC's cleanup wait loop would check
this and ignore the xid if it returns false.  Your point means
that this function would have to take exclusive not shared lock
while scanning the procarray, which is kind of annoying, but
it seems not fatal since CIC isn't done all that frequently.


When I looked at the code, it occurred to me that possibly we are
OK with just taking shared lock on the procarray. That means that
some other transaction can concurrently set its serializable snapshot
while we are scanning the procarray. But that should not harm us:
if we see the snapshot set, we wait for the transaction. A transaction
which is setting its serializable snapshot NOW, can not see the
tuples that we did not index, isn't it ?

A patch based on the discussion is attached.

Thanks,
Pavan

--

EnterpriseDB     http://www.enterprisedb.com
Attachment

Re: [HACKERS] CIC and deadlocks

From
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> When I looked at the code, it occurred to me that possibly we are
> OK with just taking shared lock on the procarray. That means that
> some other transaction can concurrently set its serializable snapshot
> while we are scanning the procarray. But that should not harm us:
> if we see the snapshot set, we wait for the transaction. A transaction
> which is setting its serializable snapshot NOW, can not see the
> tuples that we did not index, isn't it ?

[ itch... ]  The problem is with time-extended execution of
GetSnapshotData; what happens if the other guy lost the CPU for a good
long time while in the middle of GetSnapshotData?  He might set his
xmin based on info you saw as long gone.

You might be correct that it's safe, but the argument would have to
hinge on the OldestXmin process being unable to commit because of
someone holding shared ProcArrayLock; a point you are definitely not
making above.  (Study the comments in GetSnapshotData for awhile,
also those in xact.c's commit-related code.)

I'm about to head to bed and am certainly in no condition to carry the
proof through.  Have at it ...

            regards, tom lane

Re: [HACKERS] CIC and deadlocks

From
"Pavan Deolasee"
Date:

On 4/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ itch... ]  The problem is with time-extended execution of
GetSnapshotData; what happens if the other guy lost the CPU for a good
long time while in the middle of GetSnapshotData?  He might set his
xmin based on info you saw as long gone.

You might be correct that it's safe, but the argument would have to
hinge on the OldestXmin process being unable to commit because of
someone holding shared ProcArrayLock; a point you are definitely not
making above.  (Study the comments in GetSnapshotData for awhile,
also those in xact.c's commit-related code.)


My argument was based on what you said above, but I obviously did not
state it well :)

Anyways, I think its better to be safe and we agree that its not such a
bad thing to take exclusive lock on procarray because CIC is not something
that happens very often. Attached is a revised patch which takes exclusive
lock on the procarray, rest remaining the same.

Thanks,
Pavan

--

EnterpriseDB     http://www.enterprisedb.com
Attachment

Re: [HACKERS] CIC and deadlocks

From
Bruce Momjian
Date:
This has been saved for the 8.4 release:

    http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Pavan Deolasee wrote:
> On 4/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> >
> > [ itch... ]  The problem is with time-extended execution of
> > GetSnapshotData; what happens if the other guy lost the CPU for a good
> > long time while in the middle of GetSnapshotData?  He might set his
> > xmin based on info you saw as long gone.
> >
> > You might be correct that it's safe, but the argument would have to
> > hinge on the OldestXmin process being unable to commit because of
> > someone holding shared ProcArrayLock; a point you are definitely not
> > making above.  (Study the comments in GetSnapshotData for awhile,
> > also those in xact.c's commit-related code.)
> >
> >
> My argument was based on what you said above, but I obviously did not
> state it well :)
>
> Anyways, I think its better to be safe and we agree that its not such a
> bad thing to take exclusive lock on procarray because CIC is not something
> that happens very often. Attached is a revised patch which takes exclusive
> lock on the procarray, rest remaining the same.
>
> Thanks,
> Pavan
>
> --
>
> EnterpriseDB     http://www.enterprisedb.com

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] CIC and deadlocks

From
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> [ patch to reduce probability of deadlock of CREATE INDEX CONCURRENTLY
>   with other things ]

This patch no longer applies because of the VirtualXid changes.
Looking at it again, I'm fairly dissatisfied with it anyway;
I really don't like moving the GetTransactionSnapshot calls around
like that, because it opens a risk that GetTransactionSnapshot won't
get called at all.

Since the autovacuum case is already dealt with separately, I'm
thinking there is no problem here that we actually need to solve.
C.I.C. can never be guaranteed free of deadlock risk, so I don't
see a lot of value in making it free of deadlock risk against
just CLUSTER and VACUUM FULL.

            regards, tom lane