Re: [PATCH] lock_timeout and common SIGALRM framework - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject Re: [PATCH] lock_timeout and common SIGALRM framework
Date
Msg-id 4FA80C3D.6060605@cybertec.at
Whole thread Raw
In response to Re: [PATCH] lock_timeout and common SIGALRM framework  (Boszormenyi Zoltan <zb@cybertec.at>)
Responses Re: [PATCH] lock_timeout and common SIGALRM framework
List pgsql-hackers
Hi,

2012-04-24 09:00 keltezéssel, Boszormenyi Zoltan írta:
> 2012-04-23 15:08 keltezéssel, Marc Cousin írta:
>> On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
>>> 2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
>>>> 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
>>>>> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>>>>>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>>>>>> I think this patch is doing two things: first touching infrastructure
>>>>>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>>>>>> split the patch in two pieces?
>>>>>>>
>>>>>> Sure. Attached is the split version.
>>>>>>
>>>>>> Best regards,
>>>>>> Zoltán Böszörményi
>>>>>>
>>>>> Hi,
>>>>>
>>>>> I've started looking at and testing both patches.
>>>>>
>>>>> Technically speaking, I think the source looks much better than the
>>>>> first version of lock timeout, and may help adding other timeouts in the
>>>>> future. I haven't tested it in depth though, because I encountered the
>>>>> following problem:
>>>>>
>>>>> While testing the patch, I found a way to crash PG. But what's weird is
>>>>> that it crashes also with an unpatched git version.
>>>>>
>>>>> Here is the way to reproduce it (I have done it with a pgbench schema):
>>>>>
>>>>> - Set a small statement_timeout (just to save time during the tests)
>>>>>
>>>>> Session1:
>>>>> =#BEGIN;
>>>>> =#lock TABLE pgbench_accounts ;
>>>>>
>>>>> Session 2:
>>>>> =#BEGIN;
>>>>> =#lock TABLE pgbench_accounts ;
>>>>> ERROR:  canceling statement due to statement timeout
>>>>> =# lock TABLE pgbench_accounts ;
>>>>>
>>>>> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
>>>>> done with a rollback to savepoint of course.
>>>>>
>>>>> Session 2 crashes with this : TRAP : FailedAssertion(«
>>>>> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
>>>>> 749).
>>>>>
>>>>> It can also be done without a statement_timeout, and a control-C on the
>>>>> second lock table.
>>>>>
>>>>> I didn't touch anything but this. It occurs everytime, when asserts are
>>>>> activated.
>>>>>
>>>>> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
>>>>> of events. So maybe it's something introduced since ? Or is the assert
>>>>> still valid ?
>>>>>
>>>>> Cheers
>>>>>
>>>> Attached are the new patches. I rebased them to current GIT and
>>>> they are expected to be applied after Robert Haas' patch in the
>>>> "bug in fast-path locking" thread.
>>>>
>>>> Now it survives the above scenario.
>>>>
>>>> Best regards,
>>>> Zoltán Böszörményi
>>> New patch attached, rebased to today's GIT.
>>>
>>> Best regards,
>>> Zoltán Böszörményi
>>>
>> Ok, I've done what was missing from the review (from when I had a bug in
>> locking the other day), so here is the full review. By the way, this
>> patch doesn't belong to current commitfest, but to the next one.
>
> It was added to 2012-Next when I posted it, 2012-01 was already
> closed for new additions.
>
>> Is the patch in context diff format?
>> Yes
>>
>> Does it apply cleanly to the current git master?
>> Yes
>>
>> Does it include reasonable tests, necessary doc patches, etc?
>> The new lock_timeout GUC is documented. There are regression tests.
>>
>> Read what the patch is supposed to do, and consider:
>> Does the patch actually implement that?
>> Yes
>>
>> Do we want that?
>> I do. Mostly for administrative jobs which could lock the whole
>> application. It would be much easier to run reindexes, vacuum full, etc…
>> without worrying about bringing application down because of lock
>> contention.
>>
>> Do we already have it?
>> No.
>>
>> Does it follow SQL spec, or the community-agreed behavior?
>> I don't know if there is a consensus on this new GUC. statement_timeout
>> is obviously not in the SQL spec.
>>
>> Does it include pg_dump support (if applicable)?
>> Not applicable
>>
>> Are there dangers?
>> Yes, as it rewrites all the timeout code. I feel it is much cleaner this
>> way, as there is a generic set of function managing all sigalarm code,
>> but it heavily touches this part.
>>
>> Have all the bases been covered?
>> I tried all sql statements I could think of (select, insert, update,
>> delete, truncate, drop, create index, adding constraint, lock.
>>
>> I tried having statement_timeout, lock_timeout and deadlock_timeout at
>> very short and close or equal values. It worked too.
>>
>> Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
>>
>> Other timeouts such as archive_timeout and checkpoint_timeout still
>> work.
>>
>> Does the feature work as advertised?
>> Yes
>>
>> Are there corner cases the author has failed to consider?
>> I didn't find any.
>>
>> Are there any assertion failures or crashes?
>> No.
>>
>> Does the patch slow down simple tests?
>> No
>>
>> If it claims to improve performance, does it?
>> Not applicable
>>
>> Does it slow down other things?
>> No
>>
>> Does it follow the project coding guidelines?
>> I think so
>>
>> Are there portability issues?
>> No, all the portable code (acquiring locks and manipulating sigalarm) is
>> the same as before.
>>
>> Will it work on Windows/BSD etc?
>> It should. I couldn't test it though.
>>
>> Are the comments sufficient and accurate?
>> Yes
>>
>> Does it do what it says, correctly?
>> Yes
>>
>> Does it produce compiler warnings?
>> No
>>
>> Can you make it crash?
>> Not anymore
>>
>> Is everything done in a way that fits together coherently with other
>> features/modules?
>> Yes, I think so. The new way of handling sigalarm seems more robust to
>> me.
>>
>> Are there interdependencies that can cause problems?
>> I don't see any.
>>
>> Regards,
>>
>> Marc
>
> Thanks for the review.
>
> Best regards,
> Zoltán Böszörményi
>

New patches are attached, rebased to current GIT to avoid
a (trivial) reject. No other changes.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: "unexpected EOF" messages
Next
From: Simon Riggs
Date:
Subject: Re: Latch for the WAL writer - further reducing idle wake-ups.