Re: [PATCH] lock_timeout and common SIGALRM framework - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject Re: [PATCH] lock_timeout and common SIGALRM framework
Date
Msg-id 506019E2.4010806@cybertec.at
Whole thread Raw
In response to Re: [PATCH] lock_timeout and common SIGALRM framework  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
2012-09-22 20:49 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> new version with a lot more cleanup is attached.
> I looked at this patch, and frankly I'm rather dismayed.  It's a mess.

Thank you for the kind words. :-)

> To start at the bottom level, the changes to PGSemaphoreLock broke it,
> and seem probably unnecessary anyway.  As coded, calling the "condition
> checker" breaks handling of error returns from semop(), unless the checker
> is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
> shouldn't need to be anyway).

I would call it an oversight, nothing more.

>    More, if the checker does return true,
> it causes PGSemaphoreLock to utterly violate its contract: it returns to
> the caller without having acquired the semaphore, and without even telling
> the caller so.  Worse, if we *did* acquire the semaphore, we might still
> exit via this path, since the placement of the condition check call
> ignores the comment a few lines up:
>
>       * Once we acquire the lock, we do NOT check for an interrupt before
>       * returning.  The caller needs to be able to record ownership of the lock
>       * before any interrupt can be accepted.
>
> We could possibly fix all this with a redesigned API contract for
> PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
> with it at all.  We never needed to get it involved with deadlock check
> handling, and I don't see why that needs to change for lock timeouts.
>
> One very good reason why monkeying with PGSemaphoreLock is wrong is that
> on some platforms a SIGALRM interrupt won't interrupt the semop() call,
> and thus control would never reach the "checker" anyway.  If we're going
> to throw an error, it must be thrown from the interrupt handler.

Then please, explain to me, how on Earth can the current deadlock_timeout
can report the error? Sure, I can see the PG_TRY() ... PG_END_TRY() block in
lock.c but as far as I can see, nothing in the CheckDeadLock() -> DeadLockCheck()
-> DeadLockCheckRecurse() path diverts the code to return to a different
address from the signal handler, i.e. there is no elog(ERROR) or ereport(ERROR)
even in the DS_HARD_DEADLOCK case, so nothing calls siglongjmp(). So logically
the code shouldn't end up in the PG_CATCH() branch.

semop() will only get an errno = EINTR when returning from the signal handler
and would loop again. Then what makes it return beside being able to lock the
semaphore? The conditions in ProcSleep() that e.g. print the lock stats work somehow.

> The whole lmgrtimeout module seems to me to be far more mechanism than is
> warranted, for too little added functionality.  In the first place, there
> is nothing on the horizon suggesting that we need to let any plug-in code
> get control here, and if anything the delicacy of what's going on leads me
> to not wish to expose such a possibility.  In the second place, it isn't
> adding any actually useful functionality, it's just agglomerating some
> checks.  The minimum thing I would want it to do is avoid calling
> timeout.c multiple times, which is what would happen right now (leading
> to four extra syscalls per lock acquisition, which is enough new overhead
> to constitute a strong objection to committing this patch at all).
>
> On the whole I think we could forget lmgrtimeout and just hardwire the
> lock timeout and deadlock check cases.  But in any case we're going to
> need support in timeout.c for enabling/disabling multiple timeouts at
> once without extra setitimer calls.

OK, so you prefer the previous hardcoding PGSemaphoreTimedLock()
that makes every LockAcquire() check its return code and the detailed
error message about failed to lock the given object?

I will add new functions to timeout.c to remove many timeout sources
at once to decrease the amount of syscalls needed.

> I'm also not thrilled about the way in which the existing deadlock
> checking code has been hacked up.  As an example, you added this to
> DeadLockReport():
>
> +    if (!DeadLockTimeoutCondition())
> +        return;
>
> which again causes it to violate its contract, namely to report a
> deadlock, in the most fundamental way -- existing callers aren't
> expecting it to return *at all*.

Existing caller*s*? There is only one caller. The reasoning behind this
change was that if the code reaches ReportLmgrTimeoutError() in lock.c
then at least one timeout triggered and one *Report function in the chain
will do ereport(ERROR). *THAT* would trigger te siglongjmp() ending
up in PG_CATCH(). The added lines in DeadLockReport() ensures that
the deadlock error is not reported if it didn't trigger.

>    Surely we can decouple the deadlock
> and lock timeout cases better than that; or at least if we can't it's
> a delusion to propose anything like lmgrtimeout in the first place.
>
> There's considerable lack of attention to updating comments, too.
> For instance in WaitOnLock you only bothered to update the comment
> immediately adjacent to the changed code, and not the two comment
> blocks above that, which both have specific references to deadlocks
> being the reason for failure.
>
> Also, the "per statement" mode for lock timeout doesn't seem to be
> any such thing, because it's implemented like this:
>
> +            case LOCK_TIMEOUT_PER_STMT:
> +                enable_timeout_at(LOCK_TIMEOUT,
> +                            TimestampTzPlusMilliseconds(
> +                                    GetCurrentStatementStartTimestamp(),
> +                                    LockTimeout));
> +                break;
>
> That doesn't provide anything like "you can spend at most N milliseconds
> waiting for locks during a statement".  What it is is "if you happen to be
> waiting for a lock N milliseconds after the statement starts, or if you
> attempt to acquire any lock more than N milliseconds after the statement
> starts, you lose instantly".  I don't think that definition actually adds
> any useful functionality compared to setting statement_timeout to N
> milliseconds, and it's certainly wrongly documented.

OK, I will add the code to "lose instantly" if the code happens to be executed
after the specified deadline, this is another oversight on my side. Where can
I get brown paper bags in large amounts? :-)

But I certainly don't agree with you that the "per statement" variant of the
lock timeout is nothing more than statement timeout. What happens if the
statement doesn't spend its timeout i.e. in the lockless SELECT case or anything
that has to lock but it can get all the locks under the specified time?

statement_timeout can trigger the error while the server is already returning
the data to the client and this is not useful. Thanks to the single-row processing
mode introduced in 9.2, the client can at least get partial data and an error
before the next row.

I have certainly seen use cases that said: I only allow the statement X time
to wait on locks but otherwise give me all data.

>    To do what the
> documentation implies would require tracking and adding up the time spent
> waiting for locks during a statement.  Which might be a good thing to do,
> especially if the required gettimeofday() calls could be shared with what
> timeout.c probably has to do anyway at start and stop of a lock wait.

I will look into it.

> But this code doesn't do it.
>
> Lastly, I'm not sure where is the best place to be adding the control
> logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
> somewhere under storage/lmgr/, no?

Indeed.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: patch: shared session variables
Next
From: Heikki Linnakangas
Date:
Subject: Re: Configuration include directory