Thread: [PATCH] lock_timeout and common SIGALRM framework

[PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.

The framework is using a new internal API for timeouts:

bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);

A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:

typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);

typedef struct {
         TimeoutName     index;
         bool            resched_next;
         timeout_init    timeout_init;
         timeout_destroy timeout_destroy;
         timeout_check   timeout_check;
         timeout_start   timeout_start;
         TimestampTz     fin_time;
} timeout_params;

This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.

And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.

The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.

Documentation and extensive comments are included.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
> Hi,
>
> attached is a patch to implement a framework to simplify and
> correctly nest multiplexing more than two timeout sources
> into the same SIGALRM signal handler.
>
> The framework is using a new internal API for timeouts:
>
> bool enable_timeout(TimeoutName tn, int delayms);
> bool disable_timeout(TimeoutName tn, bool keep_indicator);
> bool disable_all_timeouts(bool keep_indicators);
>
> A timeout source has these features to allow different initialization,
> cleanup and check functions and rescheduling:
>
> typedef void (*timeout_init)(TimestampTz, TimestampTz);
> typedef void (*timeout_destroy)(bool);
> typedef bool (*timeout_check)(void);
> typedef TimestampTz (*timeout_start)(void);
>
> typedef struct {
>         TimeoutName     index;
>         bool            resched_next;
>         timeout_init    timeout_init;
>         timeout_destroy timeout_destroy;
>         timeout_check   timeout_check;
>         timeout_start   timeout_start;
>         TimestampTz     fin_time;
> } timeout_params;
>
> This makes it possible to differentiate between the standby and
> statement timeouts, regular deadlock and standby deadlock using
> the same signal handler function.
>
> And finally, this makes it possible to implement the lock_timeout
> feature that we at Cybertec implemented more than 2 years ago.
>
> The patch also adds two new tests into prepared_xacts.sql to trigger
> the lock_timeout instead of statement_timeout.
>
> Documentation and extensive comments are included.

Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.

>
> Best regards,
> Zoltán Böszörményi


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:
Hi,

2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,

attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.

The framework is using a new internal API for timeouts:

bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);

A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:

typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);

typedef struct {
        TimeoutName     index;
        bool            resched_next;
        timeout_init    timeout_init;
        timeout_destroy timeout_destroy;
        timeout_check   timeout_check;
        timeout_start   timeout_start;
        TimestampTz     fin_time;
} timeout_params;

This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.

And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.

The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.

Documentation and extensive comments are included.

Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.

3rd and (for now) final version. Tidied comments, the time checks in
Check*() functions and function order in timeout.c.



Best regards,
Zoltán Böszörményi






-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de    http://www.postgresql.at/
Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-04 16:22 keltezéssel, Boszormenyi Zoltan írta:
2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:
Hi,

2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:
Hi,

attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.

The framework is using a new internal API for timeouts:

bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);

A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:

typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);

typedef struct {
        TimeoutName     index;
        bool            resched_next;
        timeout_init    timeout_init;
        timeout_destroy timeout_destroy;
        timeout_check   timeout_check;
        timeout_start   timeout_start;
        TimestampTz     fin_time;
} timeout_params;

This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.

And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.

The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.

Documentation and extensive comments are included.

Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.

3rd and (for now) final version.

I lied. This is the final one. I fixed a typo in the documentation
and replaced timeout_start_time (previously static to proc.c)
with get_timeout_start(DEADLOCK_TIMEOUT). This one should
have happened in the second version.

Tidied comments, the time checks in
Check*() functions and function order in timeout.c.



Best regards,
Zoltán Böszörményi






-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de    http://www.postgresql.at/




-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de    http://www.postgresql.at/
Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-04 17:12 keltezéssel, Boszormenyi Zoltan írta: <blockquote cite="mid:4F7C64F2.5070905@cybertec.at"
type="cite">2012-04-04 16:22 keltezéssel, Boszormenyi Zoltan írta: <blockquote cite="mid:4F7C5925.107@cybertec.at"
type="cite">2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta: <blockquote cite="mid:4F7C49DA.7080008@cybertec.at"
type="cite">Hi,<br /><br /> 2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta: <br /><blockquote type="cite">Hi,
<br/><br /> attached is a patch to implement a framework to simplify and <br /> correctly nest multiplexing more than
twotimeout sources <br /> into the same SIGALRM signal handler. <br /><br /> The framework is using a new internal API
fortimeouts: <br /><br /> bool enable_timeout(TimeoutName tn, int delayms); <br /> bool disable_timeout(TimeoutName tn,
boolkeep_indicator); <br /> bool disable_all_timeouts(bool keep_indicators); <br /><br /> A timeout source has these
featuresto allow different initialization, <br /> cleanup and check functions and rescheduling: <br /><br /> typedef
void(*timeout_init)(TimestampTz, TimestampTz); <br /> typedef void (*timeout_destroy)(bool); <br /> typedef bool
(*timeout_check)(void);<br /> typedef TimestampTz (*timeout_start)(void); <br /><br /> typedef struct { <br />        
TimeoutName    index; <br />         bool            resched_next; <br />         timeout_init    timeout_init; <br />
       timeout_destroy timeout_destroy; <br />         timeout_check   timeout_check; <br />         timeout_start  
timeout_start;<br />         TimestampTz     fin_time; <br /> } timeout_params; <br /><br /> This makes it possible to
differentiatebetween the standby and <br /> statement timeouts, regular deadlock and standby deadlock using <br /> the
samesignal handler function. <br /><br /> And finally, this makes it possible to implement the lock_timeout <br />
featurethat we at Cybertec implemented more than 2 years ago. <br /><br /> The patch also adds two new tests into
prepared_xacts.sqlto trigger <br /> the lock_timeout instead of statement_timeout. <br /><br /> Documentation and
extensivecomments are included. <br /></blockquote><br /> Second version. Every timeout-related functions are now in a
separate<br /> source file, src/backend/storage/timeout.c with accessor functions. <br /> There are no related global
variablesanymore, only the GUCs. <br /></blockquote><br /> 3rd and (for now) final version.</blockquote><br /> I lied.
Thisis the final one. I fixed a typo in the documentation<br /> and replaced timeout_start_time (previously static to
proc.c)<br/> with get_timeout_start(DEADLOCK_TIMEOUT). This one should<br /> have happened in the second version.<br
/><br/><blockquote cite="mid:4F7C5925.107@cybertec.at" type="cite"> Tidied comments, the time checks in<br /> Check*()
functionsand function order in timeout.c.<br /><br /><blockquote cite="mid:4F7C49DA.7080008@cybertec.at"
type="cite"><br/><blockquote type="cite"><br /> Best regards, <br /> Zoltán Böszörményi <br
/></blockquote></blockquote></blockquote></blockquote><br/> One comment for testers: all the timeout GUC values are
givenin<br /> milliseconds, the kernel interface (setitimer) and TimestampTz uses<br /> microseconds.<br /><br /> The
transactionthat locks is inherently a read/write one and by the time<br /> the code reaches ProcSleep(), at least a few
tensof microseconds has<br /> already passed since the start of the statement even on 3GHz+ CPUs.<br /><br /> Not to
mentionthat computers nowadays have high precision timers<br /> and OSs running on them utilitize those. So, the time
returnedby<br /> GetCurrentStatementStartTimestamp() will certainly be different from<br /> GetCurrentTimestamp(). This
meansthat the timeouts' fin_time will also<br /> be different.<br /><br /> Long story short, using the same value for
statement_timeoutand<br /> lock_timeout (or deadlock_timeout for that matter) means that<br /> statement_timeout will
triggerfirst. The story is different only on<br /> a combination of a fast CPU and an OS with
greater-then-millisecond<br/> timer resolution.<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br /><pre
class="moz-signature"cols="90">--  
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> 
</pre>

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
> I think this patch is doing two things: first touching infrastructure
> stuff and then adding lock_timeout on top of that.  Would it work to
> split the patch in two pieces?
>

Sure. Attached is the split version.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Cousin Marc
Date:
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>> I think this patch is doing two things: first touching infrastructure
>> stuff and then adding lock_timeout on top of that.  Would it work to
>> split the patch in two pieces?
>>
>
> Sure. Attached is the split version.
>
> Best regards,
> Zoltán Böszörményi
>
Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.

I didn't touch anything but this. It occurs everytime, when asserts are
activated.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?

Cheers


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>> I think this patch is doing two things: first touching infrastructure
>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>> split the patch in two pieces?
>>>
>> Sure. Attached is the split version.
>>
>> Best regards,
>> Zoltán Böszörményi
>>
> Hi,
>
> I've started looking at and testing both patches.
>
> Technically speaking, I think the source looks much better than the
> first version of lock timeout, and may help adding other timeouts in the
> future.

Thanks.

>   I haven't tested it in depth though, because I encountered the
> following problem:
>
> While testing the patch, I found a way to crash PG. But what's weird is
> that it crashes also with an unpatched git version.
>
> Here is the way to reproduce it (I have done it with a pgbench schema):
>
> - Set a small statement_timeout (just to save time during the tests)
>
> Session1:
> =#BEGIN;
> =#lock TABLE pgbench_accounts ;
>
> Session 2:
> =#BEGIN;
> =#lock TABLE pgbench_accounts ;
> ERROR:  canceling statement due to statement timeout
> =# lock TABLE pgbench_accounts ;
>
> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
> done with a rollback to savepoint of course.
>
> Session 2 crashes with this : TRAP : FailedAssertion(«
> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
> 749).
>
> It can also be done without a statement_timeout, and a control-C on the
> second lock table.

Indeed, the unpatched GIT version crashes if you enter  =#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:

Session 1:

$ psql
psql (9.2devel)
Type "help" for help.

zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#

Session 2:

zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR:  canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!>

Server log after the second lock table:

TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File: "lock.c", Line: 749)
LOG:  server process (PID 12978) was terminated by signal 6: Aborted

Best regards,
Zoltán Böszörményi

>
> I didn't touch anything but this. It occurs everytime, when asserts are
> activated.
>
> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
> of events. So maybe it's something introduced since ? Or is the assert
> still valid ?
>
> Cheers
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-08 11:24 keltezéssel, Boszormenyi Zoltan írta:
> 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
>> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>>> I think this patch is doing two things: first touching infrastructure
>>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>>> split the patch in two pieces?
>>>>
>>> Sure. Attached is the split version.
>>>
>>> Best regards,
>>> Zoltán Böszörményi
>>>
>> Hi,
>>
>> I've started looking at and testing both patches.
>>
>> Technically speaking, I think the source looks much better than the
>> first version of lock timeout, and may help adding other timeouts in the
>> future.
>
> Thanks.
>
>>   I haven't tested it in depth though, because I encountered the
>> following problem:
>>
>> While testing the patch, I found a way to crash PG. But what's weird is
>> that it crashes also with an unpatched git version.
>>
>> Here is the way to reproduce it (I have done it with a pgbench schema):
>>
>> - Set a small statement_timeout (just to save time during the tests)
>>
>> Session1:
>> =#BEGIN;
>> =#lock TABLE pgbench_accounts ;
>>
>> Session 2:
>> =#BEGIN;
>> =#lock TABLE pgbench_accounts ;
>> ERROR:  canceling statement due to statement timeout
>> =# lock TABLE pgbench_accounts ;
>>
>> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
>> done with a rollback to savepoint of course.
>>
>> Session 2 crashes with this : TRAP : FailedAssertion(«
>> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
>> 749).
>>
>> It can also be done without a statement_timeout, and a control-C on the
>> second lock table.
>
> Indeed, the unpatched GIT version crashes if you enter
>   =#lock TABLE pgbench_accounts ;
> the second time in session 2 after the first one failed. Also,
> manually spelling it out:
>
> Session 1:
>
> $ psql
> psql (9.2devel)
> Type "help" for help.
>
> zozo=# begin;
> BEGIN
> zozo=# lock table pgbench_accounts;
> LOCK TABLE
> zozo=#
>
> Session 2:
>
> zozo=# begin;
> BEGIN
> zozo=# savepoint a;
> SAVEPOINT
> zozo=# lock table pgbench_accounts;
> ERROR:  canceling statement due to statement timeout
> zozo=# rollback to a;
> ROLLBACK
> zozo=# savepoint b;
> SAVEPOINT
> zozo=# lock table pgbench_accounts;
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> Server log after the second lock table:
>
> TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File: "lock.c", Line: 749)
> LOG:  server process (PID 12978) was terminated by signal 6: Aborted
>
> Best regards,
> Zoltán Böszörményi

Robert, the Assert triggering with the above procedure
is in your "fast path" locking code with current GIT.

Best regards,
Zoltán Böszörményi

>
>>
>> I didn't touch anything but this. It occurs everytime, when asserts are
>> activated.
>>
>> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
>> of events. So maybe it's something introduced since ? Or is the assert
>> still valid ?
>>
>> Cheers
>>
>
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-06 14:47 keltezéssel, Cousin Marc írta:
> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>> I think this patch is doing two things: first touching infrastructure
>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>> split the patch in two pieces?
>>>
>> Sure. Attached is the split version.
>>
>> Best regards,
>> Zoltán Böszörményi
>>
> Hi,
>
> I've started looking at and testing both patches.
>
> Technically speaking, I think the source looks much better than the
> first version of lock timeout, and may help adding other timeouts in the
> future. I haven't tested it in depth though, because I encountered the
> following problem:
>
> While testing the patch, I found a way to crash PG. But what's weird is
> that it crashes also with an unpatched git version.
>
> Here is the way to reproduce it (I have done it with a pgbench schema):
>
> - Set a small statement_timeout (just to save time during the tests)
>
> Session1:
> =#BEGIN;
> =#lock TABLE pgbench_accounts ;
>
> Session 2:
> =#BEGIN;
> =#lock TABLE pgbench_accounts ;
> ERROR:  canceling statement due to statement timeout
> =# lock TABLE pgbench_accounts ;
>
> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
> done with a rollback to savepoint of course.
>
> Session 2 crashes with this : TRAP : FailedAssertion(«
> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
> 749).
>
> It can also be done without a statement_timeout, and a control-C on the
> second lock table.
>
> I didn't touch anything but this. It occurs everytime, when asserts are
> activated.
>
> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
> of events. So maybe it's something introduced since ? Or is the assert
> still valid ?
>
> Cheers
>

Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.

Now it survives the above scenario.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
> 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
>> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>>> I think this patch is doing two things: first touching infrastructure
>>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>>> split the patch in two pieces?
>>>>
>>> Sure. Attached is the split version.
>>>
>>> Best regards,
>>> Zoltán Böszörményi
>>>
>> Hi,
>>
>> I've started looking at and testing both patches.
>>
>> Technically speaking, I think the source looks much better than the
>> first version of lock timeout, and may help adding other timeouts in the
>> future. I haven't tested it in depth though, because I encountered the
>> following problem:
>>
>> While testing the patch, I found a way to crash PG. But what's weird is
>> that it crashes also with an unpatched git version.
>>
>> Here is the way to reproduce it (I have done it with a pgbench schema):
>>
>> - Set a small statement_timeout (just to save time during the tests)
>>
>> Session1:
>> =#BEGIN;
>> =#lock TABLE pgbench_accounts ;
>>
>> Session 2:
>> =#BEGIN;
>> =#lock TABLE pgbench_accounts ;
>> ERROR:  canceling statement due to statement timeout
>> =# lock TABLE pgbench_accounts ;
>>
>> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
>> done with a rollback to savepoint of course.
>>
>> Session 2 crashes with this : TRAP : FailedAssertion(«
>> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
>> 749).
>>
>> It can also be done without a statement_timeout, and a control-C on the
>> second lock table.
>>
>> I didn't touch anything but this. It occurs everytime, when asserts are
>> activated.
>>
>> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
>> of events. So maybe it's something introduced since ? Or is the assert
>> still valid ?
>>
>> Cheers
>>
>
> Attached are the new patches. I rebased them to current GIT and
> they are expected to be applied after Robert Haas' patch in the
> "bug in fast-path locking" thread.
>
> Now it survives the above scenario.
>
> Best regards,
> Zoltán Böszörményi

New patch attached, rebased to today's GIT.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Marc Cousin
Date:
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
> 2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
> > 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
> >> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
> >>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
> >>>> I think this patch is doing two things: first touching infrastructure
> >>>> stuff and then adding lock_timeout on top of that.  Would it work to
> >>>> split the patch in two pieces?
> >>>>
> >>> Sure. Attached is the split version.
> >>>
> >>> Best regards,
> >>> Zoltán Böszörményi
> >>>
> >> Hi,
> >>
> >> I've started looking at and testing both patches.
> >>
> >> Technically speaking, I think the source looks much better than the
> >> first version of lock timeout, and may help adding other timeouts in the
> >> future. I haven't tested it in depth though, because I encountered the
> >> following problem:
> >>
> >> While testing the patch, I found a way to crash PG. But what's weird is
> >> that it crashes also with an unpatched git version.
> >>
> >> Here is the way to reproduce it (I have done it with a pgbench schema):
> >>
> >> - Set a small statement_timeout (just to save time during the tests)
> >>
> >> Session1:
> >> =#BEGIN;
> >> =#lock TABLE pgbench_accounts ;
> >>
> >> Session 2:
> >> =#BEGIN;
> >> =#lock TABLE pgbench_accounts ;
> >> ERROR:  canceling statement due to statement timeout
> >> =# lock TABLE pgbench_accounts ;
> >>
> >> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
> >> done with a rollback to savepoint of course.
> >>
> >> Session 2 crashes with this : TRAP : FailedAssertion(«
> >> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
> >> 749).
> >>
> >> It can also be done without a statement_timeout, and a control-C on the
> >> second lock table.
> >>
> >> I didn't touch anything but this. It occurs everytime, when asserts are
> >> activated.
> >>
> >> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
> >> of events. So maybe it's something introduced since ? Or is the assert
> >> still valid ?
> >>
> >> Cheers
> >>
> >
> > Attached are the new patches. I rebased them to current GIT and
> > they are expected to be applied after Robert Haas' patch in the
> > "bug in fast-path locking" thread.
> >
> > Now it survives the above scenario.
> >
> > Best regards,
> > Zoltán Böszörményi
> 
> New patch attached, rebased to today's GIT.
> 
> Best regards,
> Zoltán Böszörményi
> 

Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.



Is the patch in context diff format?
Yes

Does it apply cleanly to the current git master?
Yes

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.

Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.

I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.

Rollback to savepoint while holding locks dont crash PostgreSQL anymore.

Other timeouts such as archive_timeout and checkpoint_timeout still
work.

Does the feature work as advertised?
Yes

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No

If it claims to improve performance, does it?
Not applicable

Does it slow down other things?
No

Does it follow the project coding guidelines?
I think so

Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes

Does it do what it says, correctly?
Yes

Does it produce compiler warnings?
No

Can you make it crash?
Not anymore

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.

Are there interdependencies that can cause problems?
I don't see any.

Regards,

Marc





Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-04-23 15:08 keltezéssel, Marc Cousin írta:
> On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
>> 2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
>>> 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
>>>> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>>>>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>>>>> I think this patch is doing two things: first touching infrastructure
>>>>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>>>>> split the patch in two pieces?
>>>>>>
>>>>> Sure. Attached is the split version.
>>>>>
>>>>> Best regards,
>>>>> Zoltán Böszörményi
>>>>>
>>>> Hi,
>>>>
>>>> I've started looking at and testing both patches.
>>>>
>>>> Technically speaking, I think the source looks much better than the
>>>> first version of lock timeout, and may help adding other timeouts in the
>>>> future. I haven't tested it in depth though, because I encountered the
>>>> following problem:
>>>>
>>>> While testing the patch, I found a way to crash PG. But what's weird is
>>>> that it crashes also with an unpatched git version.
>>>>
>>>> Here is the way to reproduce it (I have done it with a pgbench schema):
>>>>
>>>> - Set a small statement_timeout (just to save time during the tests)
>>>>
>>>> Session1:
>>>> =#BEGIN;
>>>> =#lock TABLE pgbench_accounts ;
>>>>
>>>> Session 2:
>>>> =#BEGIN;
>>>> =#lock TABLE pgbench_accounts ;
>>>> ERROR:  canceling statement due to statement timeout
>>>> =# lock TABLE pgbench_accounts ;
>>>>
>>>> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
>>>> done with a rollback to savepoint of course.
>>>>
>>>> Session 2 crashes with this : TRAP : FailedAssertion(«
>>>> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
>>>> 749).
>>>>
>>>> It can also be done without a statement_timeout, and a control-C on the
>>>> second lock table.
>>>>
>>>> I didn't touch anything but this. It occurs everytime, when asserts are
>>>> activated.
>>>>
>>>> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
>>>> of events. So maybe it's something introduced since ? Or is the assert
>>>> still valid ?
>>>>
>>>> Cheers
>>>>
>>> Attached are the new patches. I rebased them to current GIT and
>>> they are expected to be applied after Robert Haas' patch in the
>>> "bug in fast-path locking" thread.
>>>
>>> Now it survives the above scenario.
>>>
>>> Best regards,
>>> Zoltán Böszörményi
>> New patch attached, rebased to today's GIT.
>>
>> Best regards,
>> Zoltán Böszörményi
>>
> Ok, I've done what was missing from the review (from when I had a bug in
> locking the other day), so here is the full review. By the way, this
> patch doesn't belong to current commitfest, but to the next one.

It was added to 2012-Next when I posted it, 2012-01 was already
closed for new additions.

> Is the patch in context diff format?
> Yes
>
> Does it apply cleanly to the current git master?
> Yes
>
> Does it include reasonable tests, necessary doc patches, etc?
> The new lock_timeout GUC is documented. There are regression tests.
>
> Read what the patch is supposed to do, and consider:
> Does the patch actually implement that?
> Yes
>
> Do we want that?
> I do. Mostly for administrative jobs which could lock the whole
> application. It would be much easier to run reindexes, vacuum full, etc…
> without worrying about bringing application down because of lock
> contention.
>
> Do we already have it?
> No.
>
> Does it follow SQL spec, or the community-agreed behavior?
> I don't know if there is a consensus on this new GUC. statement_timeout
> is obviously not in the SQL spec.
>
> Does it include pg_dump support (if applicable)?
> Not applicable
>
> Are there dangers?
> Yes, as it rewrites all the timeout code. I feel it is much cleaner this
> way, as there is a generic set of function managing all sigalarm code,
> but it heavily touches this part.
>
> Have all the bases been covered?
> I tried all sql statements I could think of (select, insert, update,
> delete, truncate, drop, create index, adding constraint, lock.
>
> I tried having statement_timeout, lock_timeout and deadlock_timeout at
> very short and close or equal values. It worked too.
>
> Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
>
> Other timeouts such as archive_timeout and checkpoint_timeout still
> work.
>
> Does the feature work as advertised?
> Yes
>
> Are there corner cases the author has failed to consider?
> I didn't find any.
>
> Are there any assertion failures or crashes?
> No.
>
> Does the patch slow down simple tests?
> No
>
> If it claims to improve performance, does it?
> Not applicable
>
> Does it slow down other things?
> No
>
> Does it follow the project coding guidelines?
> I think so
>
> Are there portability issues?
> No, all the portable code (acquiring locks and manipulating sigalarm) is
> the same as before.
>
> Will it work on Windows/BSD etc?
> It should. I couldn't test it though.
>
> Are the comments sufficient and accurate?
> Yes
>
> Does it do what it says, correctly?
> Yes
>
> Does it produce compiler warnings?
> No
>
> Can you make it crash?
> Not anymore
>
> Is everything done in a way that fits together coherently with other
> features/modules?
> Yes, I think so. The new way of handling sigalarm seems more robust to
> me.
>
> Are there interdependencies that can cause problems?
> I don't see any.
>
> Regards,
>
> Marc

Thanks for the review.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

2012-04-24 09:00 keltezéssel, Boszormenyi Zoltan írta:
> 2012-04-23 15:08 keltezéssel, Marc Cousin írta:
>> On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
>>> 2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
>>>> 2012-04-06 14:47 keltezéssel, Cousin Marc írta:
>>>>> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
>>>>>> 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
>>>>>>> I think this patch is doing two things: first touching infrastructure
>>>>>>> stuff and then adding lock_timeout on top of that.  Would it work to
>>>>>>> split the patch in two pieces?
>>>>>>>
>>>>>> Sure. Attached is the split version.
>>>>>>
>>>>>> Best regards,
>>>>>> Zoltán Böszörményi
>>>>>>
>>>>> Hi,
>>>>>
>>>>> I've started looking at and testing both patches.
>>>>>
>>>>> Technically speaking, I think the source looks much better than the
>>>>> first version of lock timeout, and may help adding other timeouts in the
>>>>> future. I haven't tested it in depth though, because I encountered the
>>>>> following problem:
>>>>>
>>>>> While testing the patch, I found a way to crash PG. But what's weird is
>>>>> that it crashes also with an unpatched git version.
>>>>>
>>>>> Here is the way to reproduce it (I have done it with a pgbench schema):
>>>>>
>>>>> - Set a small statement_timeout (just to save time during the tests)
>>>>>
>>>>> Session1:
>>>>> =#BEGIN;
>>>>> =#lock TABLE pgbench_accounts ;
>>>>>
>>>>> Session 2:
>>>>> =#BEGIN;
>>>>> =#lock TABLE pgbench_accounts ;
>>>>> ERROR:  canceling statement due to statement timeout
>>>>> =# lock TABLE pgbench_accounts ;
>>>>>
>>>>> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
>>>>> done with a rollback to savepoint of course.
>>>>>
>>>>> Session 2 crashes with this : TRAP : FailedAssertion(«
>>>>> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
>>>>> 749).
>>>>>
>>>>> It can also be done without a statement_timeout, and a control-C on the
>>>>> second lock table.
>>>>>
>>>>> I didn't touch anything but this. It occurs everytime, when asserts are
>>>>> activated.
>>>>>
>>>>> I tried it on 9.1.3, and I couldn't make it crash with the same sequence
>>>>> of events. So maybe it's something introduced since ? Or is the assert
>>>>> still valid ?
>>>>>
>>>>> Cheers
>>>>>
>>>> Attached are the new patches. I rebased them to current GIT and
>>>> they are expected to be applied after Robert Haas' patch in the
>>>> "bug in fast-path locking" thread.
>>>>
>>>> Now it survives the above scenario.
>>>>
>>>> Best regards,
>>>> Zoltán Böszörményi
>>> New patch attached, rebased to today's GIT.
>>>
>>> Best regards,
>>> Zoltán Böszörményi
>>>
>> Ok, I've done what was missing from the review (from when I had a bug in
>> locking the other day), so here is the full review. By the way, this
>> patch doesn't belong to current commitfest, but to the next one.
>
> It was added to 2012-Next when I posted it, 2012-01 was already
> closed for new additions.
>
>> Is the patch in context diff format?
>> Yes
>>
>> Does it apply cleanly to the current git master?
>> Yes
>>
>> Does it include reasonable tests, necessary doc patches, etc?
>> The new lock_timeout GUC is documented. There are regression tests.
>>
>> Read what the patch is supposed to do, and consider:
>> Does the patch actually implement that?
>> Yes
>>
>> Do we want that?
>> I do. Mostly for administrative jobs which could lock the whole
>> application. It would be much easier to run reindexes, vacuum full, etc…
>> without worrying about bringing application down because of lock
>> contention.
>>
>> Do we already have it?
>> No.
>>
>> Does it follow SQL spec, or the community-agreed behavior?
>> I don't know if there is a consensus on this new GUC. statement_timeout
>> is obviously not in the SQL spec.
>>
>> Does it include pg_dump support (if applicable)?
>> Not applicable
>>
>> Are there dangers?
>> Yes, as it rewrites all the timeout code. I feel it is much cleaner this
>> way, as there is a generic set of function managing all sigalarm code,
>> but it heavily touches this part.
>>
>> Have all the bases been covered?
>> I tried all sql statements I could think of (select, insert, update,
>> delete, truncate, drop, create index, adding constraint, lock.
>>
>> I tried having statement_timeout, lock_timeout and deadlock_timeout at
>> very short and close or equal values. It worked too.
>>
>> Rollback to savepoint while holding locks dont crash PostgreSQL anymore.
>>
>> Other timeouts such as archive_timeout and checkpoint_timeout still
>> work.
>>
>> Does the feature work as advertised?
>> Yes
>>
>> Are there corner cases the author has failed to consider?
>> I didn't find any.
>>
>> Are there any assertion failures or crashes?
>> No.
>>
>> Does the patch slow down simple tests?
>> No
>>
>> If it claims to improve performance, does it?
>> Not applicable
>>
>> Does it slow down other things?
>> No
>>
>> Does it follow the project coding guidelines?
>> I think so
>>
>> Are there portability issues?
>> No, all the portable code (acquiring locks and manipulating sigalarm) is
>> the same as before.
>>
>> Will it work on Windows/BSD etc?
>> It should. I couldn't test it though.
>>
>> Are the comments sufficient and accurate?
>> Yes
>>
>> Does it do what it says, correctly?
>> Yes
>>
>> Does it produce compiler warnings?
>> No
>>
>> Can you make it crash?
>> Not anymore
>>
>> Is everything done in a way that fits together coherently with other
>> features/modules?
>> Yes, I think so. The new way of handling sigalarm seems more robust to
>> me.
>>
>> Are there interdependencies that can cause problems?
>> I don't see any.
>>
>> Regards,
>>
>> Marc
>
> Thanks for the review.
>
> Best regards,
> Zoltán Böszörményi
>

New patches are attached, rebased to current GIT to avoid
a (trivial) reject. No other changes.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

another rebasing and applied the GIT changes in
ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
Windows implementation of PGSemaphoreTimedLock.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
> Hi,
>
> another rebasing and applied the GIT changes in
> ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
> Windows implementation of PGSemaphoreTimedLock.

Hi,

I gave the framework patch a look.  One thing I'm not sure about is the
way you've defined the API.  It seems a bit strange to have a nice and
clean, generic interface in timeout.h; and then have the internal
implementation of the API cluttered with details such as what to do when
the deadlock timeout expires.  Wouldn't it be much better if we could
keep the details of CheckDeadLock itself within proc.c, for example?

Same for the other specific Check*Timeout functions.  It seems to me
that the only timeout.c specific requirement is to be able to poke
base_timeouts[].indicator and fin_time.  Maybe that can get done in
timeout.c and then have the generic checker call the module-specific
checker.  ... I see that you have things to do before and after setting
"indicator".  Maybe you could split the module-specific Check functions
in two and have timeout.c call each in turn.  Other than that I don't
see that this should pose any difficulty.

Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
more than once per signal if you have multiple timeouts enabled.  Maybe
it'd be better to call it in once handle_sig_alarm and then pass the
value down, if necessary (AFAICS if you restructure the code as I
suggest above, you don't need to get the value down the the
module-specific code).

As for the Init*Timeout() and Destroy*Timeout() functions, they seem
a bit pointless -- I mean if they just always call the generic
InitTimeout and DestroyTimeout functions, why not just define the struct
in a way that makes the specific functions unnecessary?  Maybe you even
already have all that's necessary ... I think you just change
base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
so on.

Minor nitpick: the "Interface:" comment in timeout.c seems useless.
That kind of thing tends to get overlooked and obsolete over time.  We
have examples of such things all over the place.  I'd just rip it and
have timeout.h be the interface documentation.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:

BTW it doesn't seem that datatype/timestamp.h is really necessary in
timeout.h.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
>> Hi,
>>
>> another rebasing and applied the GIT changes in
>> ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
>> Windows implementation of PGSemaphoreTimedLock.
> Hi,
>
> I gave the framework patch a look.  One thing I'm not sure about is the
> way you've defined the API.  It seems a bit strange to have a nice and
> clean, generic interface in timeout.h; and then have the internal
> implementation of the API cluttered with details such as what to do when
> the deadlock timeout expires.  Wouldn't it be much better if we could
> keep the details of CheckDeadLock itself within proc.c, for example?

Do you mean adding a callback function argument to for enable_timeout()
would be better?

> Same for the other specific Check*Timeout functions.  It seems to me
> that the only timeout.c specific requirement is to be able to poke
> base_timeouts[].indicator and fin_time.  Maybe that can get done in
> timeout.c and then have the generic checker call the module-specific
> checker.

Or instead of static functions, Check* functions can be external
to timeout.c. It seemed to be a good idea to move all the timeout
related functions to timeout.c.

>    ... I see that you have things to do before and after setting
> "indicator".  Maybe you could split the module-specific Check functions
> in two and have timeout.c call each in turn.  Other than that I don't
> see that this should pose any difficulty.
>
> Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
> more than once per signal if you have multiple timeouts enabled.

Actually, GetCurrentTimestamp() is not called multiple times,
because only the first timeout source in the queue can get triggered.

>    Maybe
> it'd be better to call it in once handle_sig_alarm and then pass the
> value down, if necessary (AFAICS if you restructure the code as I
> suggest above, you don't need to get the value down the the
> module-specific code).

But yes, this way it can be cleaner.

>
> As for the Init*Timeout() and Destroy*Timeout() functions, they seem
> a bit pointless -- I mean if they just always call the generic
> InitTimeout and DestroyTimeout functions, why not just define the struct
> in a way that makes the specific functions unnecessary?  Maybe you even
> already have all that's necessary ... I think you just change
> base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
> so on.

OK, I will experiment with it.

> Minor nitpick: the "Interface:" comment in timeout.c seems useless.
> That kind of thing tends to get overlooked and obsolete over time.  We
> have examples of such things all over the place.  I'd just rip it and
> have timeout.h be the interface documentation.

OK.

> BTW it doesn't seem that datatype/timestamp.h is really necessary in
> timeout.h.

You are wrong. get_timeout_start() returns TimestampTz and it's
defined in datatype/timestamp.h.

Thanks for the review, I will send the new code soon.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

attached are the new patches.

The base patch is now simplified and is about 10K smaller:
- no more individual Init* and Destroy* functions
- Check* functions don't check for "now", it's done centrally by the signal handler
- CheckDeadLock() is moved back to proc.c and made public for timeout.c
- a new header storage/proctimeout.h is introduced, so timeout.c can know
    about CheckDeadLock()

The lock_timeout patch gained a new feature and enum GUC:

SET lock_timeout_option = { 'per_lock' | 'per_statement' } ;

'per_lock' is the default and carries the previous behaviour: the timeout value
applies to all locks individually. The statement may take up to N*timeout.

'per_statement' behaves like statement_timeout. The important difference is
that statement_timeout may trigger during the executor phase when a long
result set is already being returned to the client and the transfer is cut because
of the timeout. On the other hand, lock_timeout may only trigger during locking
the objects and if all locks were granted under the specified time, the result set
is then returned to the client.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mar jun 19 04:44:35 -0400 2012:
>
> 2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
> > Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
> >> Hi,
> >>
> >> another rebasing and applied the GIT changes in
> >> ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
> >> Windows implementation of PGSemaphoreTimedLock.
> > Hi,
> >
> > I gave the framework patch a look.  One thing I'm not sure about is the
> > way you've defined the API.  It seems a bit strange to have a nice and
> > clean, generic interface in timeout.h; and then have the internal
> > implementation of the API cluttered with details such as what to do when
> > the deadlock timeout expires.  Wouldn't it be much better if we could
> > keep the details of CheckDeadLock itself within proc.c, for example?
>
> Do you mean adding a callback function argument to for enable_timeout()
> would be better?

Well, that's not exactly what I was thinking, but maybe your idea is
better than what I had in mind.

> > Same for the other specific Check*Timeout functions.  It seems to me
> > that the only timeout.c specific requirement is to be able to poke
> > base_timeouts[].indicator and fin_time.  Maybe that can get done in
> > timeout.c and then have the generic checker call the module-specific
> > checker.
>
> Or instead of static functions, Check* functions can be external
> to timeout.c. It seemed to be a good idea to move all the timeout
> related functions to timeout.c.

Yeah, except that they play with members of the timeout_params struct,
which hopefully does not need to be exported.  So if you can just move
(that is, put back in their original places) the portions that do not
touch that strcut, it'd be great.

> > Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
> > more than once per signal if you have multiple timeouts enabled.
>
> Actually, GetCurrentTimestamp() is not called multiple times,
> because only the first timeout source in the queue can get triggered.

I see.

> > BTW it doesn't seem that datatype/timestamp.h is really necessary in
> > timeout.h.
>
> You are wrong. get_timeout_start() returns TimestampTz and it's
> defined in datatype/timestamp.h.

Oh, right.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-19 16:54 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of mar jun 19 04:44:35 -0400 2012:
>> 2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
>>> Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
>>>> Hi,
>>>>
>>>> another rebasing and applied the GIT changes in
>>>> ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
>>>> Windows implementation of PGSemaphoreTimedLock.
>>> Hi,
>>>
>>> I gave the framework patch a look.  One thing I'm not sure about is the
>>> way you've defined the API.  It seems a bit strange to have a nice and
>>> clean, generic interface in timeout.h; and then have the internal
>>> implementation of the API cluttered with details such as what to do when
>>> the deadlock timeout expires.  Wouldn't it be much better if we could
>>> keep the details of CheckDeadLock itself within proc.c, for example?
>> Do you mean adding a callback function argument to for enable_timeout()
>> would be better?
> Well, that's not exactly what I was thinking, but maybe your idea is
> better than what I had in mind.
>
>>> Same for the other specific Check*Timeout functions.  It seems to me
>>> that the only timeout.c specific requirement is to be able to poke
>>> base_timeouts[].indicator and fin_time.  Maybe that can get done in
>>> timeout.c and then have the generic checker call the module-specific
>>> checker.
>> Or instead of static functions, Check* functions can be external
>> to timeout.c. It seemed to be a good idea to move all the timeout
>> related functions to timeout.c.
> Yeah, except that they play with members of the timeout_params struct,
> which hopefully does not need to be exported.  So if you can just move
> (that is, put back in their original places) the portions that do not
> touch that strcut, it'd be great.

OK, all 4 Check* functions are now moved back into proc.c,
nothing outside of timeout.c touches anything in it. New patches
are attached.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-19 18:44 keltezéssel, Boszormenyi Zoltan írta:
> 2012-06-19 16:54 keltezéssel, Alvaro Herrera írta:
>> Excerpts from Boszormenyi Zoltan's message of mar jun 19 04:44:35 -0400 2012:
>>> 2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
>>>> Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
>>>>> Hi,
>>>>>
>>>>> another rebasing and applied the GIT changes in
>>>>> ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
>>>>> Windows implementation of PGSemaphoreTimedLock.
>>>> Hi,
>>>>
>>>> I gave the framework patch a look.  One thing I'm not sure about is the
>>>> way you've defined the API.  It seems a bit strange to have a nice and
>>>> clean, generic interface in timeout.h; and then have the internal
>>>> implementation of the API cluttered with details such as what to do when
>>>> the deadlock timeout expires.  Wouldn't it be much better if we could
>>>> keep the details of CheckDeadLock itself within proc.c, for example?
>>> Do you mean adding a callback function argument to for enable_timeout()
>>> would be better?
>> Well, that's not exactly what I was thinking, but maybe your idea is
>> better than what I had in mind.
>>
>>>> Same for the other specific Check*Timeout functions.  It seems to me
>>>> that the only timeout.c specific requirement is to be able to poke
>>>> base_timeouts[].indicator and fin_time.  Maybe that can get done in
>>>> timeout.c and then have the generic checker call the module-specific
>>>> checker.
>>> Or instead of static functions, Check* functions can be external
>>> to timeout.c. It seemed to be a good idea to move all the timeout
>>> related functions to timeout.c.
>> Yeah, except that they play with members of the timeout_params struct,
>> which hopefully does not need to be exported.  So if you can just move
>> (that is, put back in their original places) the portions that do not
>> touch that strcut, it'd be great.
>
> OK, all 4 Check* functions are now moved back into proc.c,
> nothing outside of timeout.c touches anything in it. New patches
> are attached.

OK, one last post for today. The #included headers are cleaned up
in timeout.c, miscadmin.h and storage/standby.h are not needed.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mar jun 19 12:44:04 -0400 2012:

> OK, all 4 Check* functions are now moved back into proc.c,
> nothing outside of timeout.c touches anything in it. New patches
> are attached.

Yeah, I like this one better, thanks.

It seems to me that the "check" functions are no longer "check" anymore,
right?  I mean, they don't check whether the time has expired.  It can
be argued that CheckDeadLock is well named, because it does check
whether there is a deadlock before doing anything else; but
CheckStandbyTimeout is no longer a check -- it just sends a signal.
Do we need to rename these functions?

Why are you using the deadlock timeout for authentication?  Wouldn't it
make more sense to have a separate TimeoutName, just to keep things
clean?

The "NB:" comment here doesn't seem to be useful anymore:
+ /*****************************************************************************
+  * Init, Destroy and Check functions for different timeouts.
+  *
+  * NB: all Check* functions are run inside a signal handler, so be very wary
+  * about what is done in them or in called routines.
+  *****************************************************************************/


In base_timeouts you don't initialize fin_time for any of the members.
This is probably unimportant but then why initialize start_time?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-19 19:19 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of mar jun 19 12:44:04 -0400 2012:
>
>> OK, all 4 Check* functions are now moved back into proc.c,
>> nothing outside of timeout.c touches anything in it. New patches
>> are attached.
> Yeah, I like this one better, thanks.

Thanks. It seems I am on the right track then. :-)

> It seems to me that the "check" functions are no longer "check" anymore,
> right?  I mean, they don't check whether the time has expired.  It can
> be argued that CheckDeadLock is well named, because it does check
> whether there is a deadlock before doing anything else; but
> CheckStandbyTimeout is no longer a check -- it just sends a signal.
> Do we need to rename these functions?

Well, maybe. How about *Function() instead of Check*()?

> Why are you using the deadlock timeout for authentication?

Because it originally also used the deadlock timeout.
I agree that it was abusing code that's written for something else.

>    Wouldn't it
> make more sense to have a separate TimeoutName, just to keep things
> clean?

Yes, sure. Can you tell me a way to test authentication timeout?
"pre_auth_delay" is not good for testing it.

Changing authentication_timeout to 1sec both in the GUC list and the initial value of it in
postmaster.c plus adding this code below after enabling the auth timeout didn't help.
---8<------8<------8<------8<---
if (AuthenticationTimeout > 0)
     pg_usleep((AuthenticationTimeout + 2) * 1000000L);
---8<------8<------8<------8<---

I got this despite authentication_timeout being 1 second:

---8<------8<------8<------8<---
============== removing existing temp installation    ==============
============== creating temporary installation        ==============
============== initializing database system           ==============
============== starting postmaster                    ==============

pg_regress: postmaster did not respond within 60 seconds
Examine .../postgresql.1/src/test/regress/log/postmaster.log for the reason
---8<------8<------8<------8<---

Anyway, it doesn't seem to me that the code after enabling auth timeout
actually uses anything from the timeout code but the side effect of
a signal interrupting read() on the client connection socket and logs
"incomplete startup packet" or "invalid length of startup packet".
If that's true, then it can use its own (dummy) timeout and I modified
postmaster.c and timeout.c accordingly. It survives "make check".

> The "NB:" comment here doesn't seem to be useful anymore:
> + /*****************************************************************************
> +  * Init, Destroy and Check functions for different timeouts.
> +  *
> +  * NB: all Check* functions are run inside a signal handler, so be very wary
> +  * about what is done in them or in called routines.
> +  *****************************************************************************/

I removed it.

> In base_timeouts you don't initialize fin_time for any of the members.
> This is probably unimportant but then why initialize start_time?

The answer is "out of habit" and "leftover".
But now the checks are only done for active timeouts,
I think neither of these are needed to be initialized now.
Thanks for spotting it. It survives "make check".

Attached are the new patches with these changes.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
I cleaned up the framework patch a bit.  My version's attached.  Mainly,
returning false for failure in some code paths that are only going to
have the caller elog(FATAL) is rather pointless -- it seems much better
to just have the code itself do the elog(FATAL).  In any case, I
searched for reports of those error messages being reported in the wild
and saw none.

There are other things but they are in the nitpick department.  (A
reference to "->check_timeout" remains that needs to be fixed too).

There is one thing that still bothers me on this whole framework patch,
but I'm not sure it's easily fixable.  Basically the API to timeout.c is
the whole list of timeouts and their whole behaviors.  If you want to
add a new one, you can't just call into the entry points, you have to
modify timeout.c itself (as well as timeout.h as well as whatever code
it is that you want to add timeouts to).  This may be good enough, but I
don't like it.  I think it boils down to proctimeout.h is cheating.

The interface I would actually like to have is something that lets the
modules trying to get a timeout register the timeout-checking function
as a callback.  API-wise, it would be much cleaner.  However, I'm not
raelly sure that code-wise it would be any cleaner at all.  In fact I
think it'd be rather cumbersome.  However, if you could give that idea
some thought, it'd be swell.

I haven't looked at the second patch at all yet.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:
> I cleaned up the framework patch a bit.  My version's attached.  Mainly,
> returning false for failure in some code paths that are only going to
> have the caller elog(FATAL) is rather pointless -- it seems much better
> to just have the code itself do the elog(FATAL).  In any case, I
> searched for reports of those error messages being reported in the wild
> and saw none.

OK. The cleanups are always good.

One nitpick, though. Your version doesn't contain the timeout.h
and compilation fails because of it. :-) You might have forgotten
to do "git commit -a".

> There are other things but they are in the nitpick department.  (A
> reference to "->check_timeout" remains that needs to be fixed too).

Yes, it's called ->timeout_func() now.

> There is one thing that still bothers me on this whole framework patch,
> but I'm not sure it's easily fixable.  Basically the API to timeout.c is
> the whole list of timeouts and their whole behaviors.  If you want to
> add a new one, you can't just call into the entry points, you have to
> modify timeout.c itself (as well as timeout.h as well as whatever code
> it is that you want to add timeouts to).  This may be good enough, but I
> don't like it.  I think it boils down to proctimeout.h is cheating.
>
> The interface I would actually like to have is something that lets the
> modules trying to get a timeout register the timeout-checking function
> as a callback.  API-wise, it would be much cleaner.  However, I'm not
> raelly sure that code-wise it would be any cleaner at all.  In fact I
> think it'd be rather cumbersome.  However, if you could give that idea
> some thought, it'd be swell.

Well, I can make the registration interface similar to how LWLocks
are treated, but that doesn't avoid modification of the base_timeouts
array in case a new internal use case arises. Say:

#define USER_TIMEOUTS    4

int    n_timeouts = TIMEOUT_MAX;
static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

and register_timeout() adds data after TIMEOUT_MAX.

> I haven't looked at the second patch at all yet.

This is the whole point of the first patch. But as I said above for
registering a new timeout source, it's a new internal use case.
One that touches a lot of places which cannot simply be done
by registering a new timeout source.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Robert Haas
Date:
On Tue, Jun 26, 2012 at 3:59 AM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
> Well, I can make the registration interface similar to how LWLocks
> are treated, but that doesn't avoid modification of the base_timeouts
> array in case a new internal use case arises. Say:
>
> #define USER_TIMEOUTS    4
>
> int    n_timeouts = TIMEOUT_MAX;
> static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

Since timeouts - unlike lwlocks - do not need to touch shared memory,
there's no need for a hard-coded limit here.  You can just allocate
the array using MemoryContextAlloc(TopMemoryContext, ...) and enlarge
it as necessary.  To avoid needing to modify the base_timeouts array,
you can just have internal callers push their entries into the array
during process startup using the same function call that an external
module would use.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-26 13:50 keltezéssel, Robert Haas írta:
> On Tue, Jun 26, 2012 at 3:59 AM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
>> Well, I can make the registration interface similar to how LWLocks
>> are treated, but that doesn't avoid modification of the base_timeouts
>> array in case a new internal use case arises. Say:
>>
>> #define USER_TIMEOUTS    4
>>
>> int    n_timeouts = TIMEOUT_MAX;
>> static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];
> Since timeouts - unlike lwlocks - do not need to touch shared memory,
> there's no need for a hard-coded limit here.  You can just allocate
> the array using MemoryContextAlloc(TopMemoryContext, ...) and enlarge
> it as necessary.  To avoid needing to modify the base_timeouts array,
> you can just have internal callers push their entries into the array
> during process startup using the same function call that an external
> module would use.

I know, but it doesn't feel right to "register" static functionality.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Robert Haas
Date:
On Tue, Jun 26, 2012 at 8:03 AM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
> I know, but it doesn't feel right to "register" static functionality.

We do it elsewhere.  The overhead is pretty minimal compared to other
things we already do during startup, and avoiding the need for the
array to have a fixed-size seems worth it, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 26, 2012 at 8:03 AM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
>> I know, but it doesn't feel right to "register" static functionality.

> We do it elsewhere.  The overhead is pretty minimal compared to other
> things we already do during startup, and avoiding the need for the
> array to have a fixed-size seems worth it, IMHO.

It's not even clear that the array needs to be dynamically resizable (yet).
Compare for instance syscache invalidation callbacks, which have so far
gotten along fine with a fixed-size array to hold registrations.  It's
foreseeable that we'll need something better someday, but that day
hasn't arrived, and might never arrive.

I agree with the feeling that this patch isn't done if the "core"
timeout code has to know specifically about each usage.  We have that
situation already.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mar jun 26 03:59:06 -0400 2012:
>
> 2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:
> > I cleaned up the framework patch a bit.  My version's attached.  Mainly,
> > returning false for failure in some code paths that are only going to
> > have the caller elog(FATAL) is rather pointless -- it seems much better
> > to just have the code itself do the elog(FATAL).  In any case, I
> > searched for reports of those error messages being reported in the wild
> > and saw none.
>
> OK. The cleanups are always good.
>
> One nitpick, though. Your version doesn't contain the timeout.h
> and compilation fails because of it. :-) You might have forgotten
> to do "git commit -a".

Oops.  Attached. It's pretty much the same you had, except some "bools"
are changed to void.

> > There is one thing that still bothers me on this whole framework patch,
> > but I'm not sure it's easily fixable.  Basically the API to timeout.c is
> > the whole list of timeouts and their whole behaviors.  If you want to
> > add a new one, you can't just call into the entry points, you have to
> > modify timeout.c itself (as well as timeout.h as well as whatever code
> > it is that you want to add timeouts to).  This may be good enough, but I
> > don't like it.  I think it boils down to proctimeout.h is cheating.
> >
> > The interface I would actually like to have is something that lets the
> > modules trying to get a timeout register the timeout-checking function
> > as a callback.  API-wise, it would be much cleaner.  However, I'm not
> > raelly sure that code-wise it would be any cleaner at all.  In fact I
> > think it'd be rather cumbersome.  However, if you could give that idea
> > some thought, it'd be swell.
>
> Well, I can make the registration interface similar to how LWLocks
> are treated, but that doesn't avoid modification of the base_timeouts
> array in case a new internal use case arises. Say:
>
> #define USER_TIMEOUTS    4
>
> int    n_timeouts = TIMEOUT_MAX;
> static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];
>
> and register_timeout() adds data after TIMEOUT_MAX.

Well, I don't expect that we're going to have many external uses.  My
point about registration is so that internal callers use it as well.  I
was thinking we could do something like xact.c does, which is to have
somewhere in proc.c a call like this:

TimeoutRegister(DEADLOCK_TIMEOUT, 1, true, CheckDeadLock, GetCurrentTimestamp);

at process startup (the magic value 1 is the priority.  Maybe there's a
better way to handle this).  That way, timeout.c or timeout.h do not
need to know anything about proc.c at all; we don't need proctimeout.h
at all.  The only thing it (the timeout module) needs to know is that
there is a symbolic constant named DEADLOCK_TIMEOUT.  Similarly for
statement timeout, etc.

When you call enable_timeout you first have to ensure that
DEADLOCK_TIMEOUT has been registered; and if it's not, die on an
Assert().  That way you ensure that there are no bugs where you try to
enable a timeout that hasn't been registered.

(If we later find the need to let external modules add timeouts, which I
find unlikely, we can have TimeoutRegister return TimeoutName and have
this value be dynamically allocated.  But for now I think that would be
useless complication).

The difference with lwlocks is that each lwlock is just a number.  The
lwlock.c module doesn't need to know anything about how each lwlock is
actually used.  It's not the same thing here -- which is why I think it
would be better to have all modules, even the hardcoded ones, use a
registering interface.

... Now, it could be argued that it would be even better to have
TimeoutRegister not take the TimeoutName argument, and instead generate
it dynamically and pass it back to the caller -- that way you don't need
the enum in timeout.h.  The problem I have with that idea is that it
would force the caller modules to have a global variable to keep track
of that, which seems worse to me.

> > I haven't looked at the second patch at all yet.
>
> This is the whole point of the first patch.

I know that.  But I want to get the details of the framework right
before we move on to add more stuff to it.

> But as I said above for
> registering a new timeout source, it's a new internal use case.
> One that touches a lot of places which cannot simply be done
> by registering a new timeout source.

Sure.  That's going to be the case for any other timeout we want to add
(which is why I think we don't really need dynamic timeouts).

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
<div class="moz-cite-prefix">2012-06-26 18:12 keltezéssel, Alvaro Herrera írta:<br /></div><blockquote
cite="mid:1340726100-sup-3131@alvh.no-ip.org"type="cite"><pre wrap="">Excerpts from Boszormenyi Zoltan's message of mar
jun26 03:59:06 -0400 2012: 
</pre><blockquote type="cite"><pre wrap="">
2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:
</pre><blockquote type="cite"><pre wrap="">I cleaned up the framework patch a bit.  My version's attached.  Mainly,
returning false for failure in some code paths that are only going to
have the caller elog(FATAL) is rather pointless -- it seems much better
to just have the code itself do the elog(FATAL).  In any case, I
searched for reports of those error messages being reported in the wild
and saw none.
</pre></blockquote><pre wrap="">
OK. The cleanups are always good.

One nitpick, though. Your version doesn't contain the timeout.h
and compilation fails because of it. :-) You might have forgotten
to do "git commit -a".
</pre></blockquote><pre wrap="">
Oops.  Attached. It's pretty much the same you had, except some "bools"
are changed to void.

</pre><blockquote type="cite"><blockquote type="cite"><pre wrap="">There is one thing that still bothers me on this
wholeframework patch, 
but I'm not sure it's easily fixable.  Basically the API to timeout.c is
the whole list of timeouts and their whole behaviors.  If you want to
add a new one, you can't just call into the entry points, you have to
modify timeout.c itself (as well as timeout.h as well as whatever code
it is that you want to add timeouts to).  This may be good enough, but I
don't like it.  I think it boils down to proctimeout.h is cheating.

The interface I would actually like to have is something that lets the
modules trying to get a timeout register the timeout-checking function
as a callback.  API-wise, it would be much cleaner.  However, I'm not
raelly sure that code-wise it would be any cleaner at all.  In fact I
think it'd be rather cumbersome.  However, if you could give that idea
some thought, it'd be swell.
</pre></blockquote><pre wrap="">
Well, I can make the registration interface similar to how LWLocks
are treated, but that doesn't avoid modification of the base_timeouts
array in case a new internal use case arises. Say:

#define USER_TIMEOUTS    4

int    n_timeouts = TIMEOUT_MAX;
static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

and register_timeout() adds data after TIMEOUT_MAX.
</pre></blockquote><pre wrap="">
Well, I don't expect that we're going to have many external uses.  My
point about registration is so that internal callers use it as well.  I
was thinking we could do something like xact.c does, which is to have
somewhere in proc.c a call like this:

TimeoutRegister(DEADLOCK_TIMEOUT, 1, true, CheckDeadLock, GetCurrentTimestamp);

at process startup (the magic value 1 is the priority.  Maybe there's a
better way to handle this).  That way, timeout.c or timeout.h do not
need to know anything about proc.c at all; we don't need proctimeout.h
at all.  The only thing it (the timeout module) needs to know is that
there is a symbolic constant named DEADLOCK_TIMEOUT.  Similarly for
statement timeout, etc.

When you call enable_timeout you first have to ensure that
DEADLOCK_TIMEOUT has been registered; and if it's not, die on an
Assert().  That way you ensure that there are no bugs where you try to
enable a timeout that hasn't been registered.</pre></blockquote><br /> Currently, TimeoutName (as an index value) and
"priority"is the same<br /> semantically.<br /><br /> I would also add an Assert into register_timeout() to avoid
doubleregistration<br /> of the same to prevent overriding he callback function pointer. If that's in<br /> place, the
TimeoutNamevalue may still work as is: an index into base_timeouts[].<br /><br /><blockquote
cite="mid:1340726100-sup-3131@alvh.no-ip.org"type="cite"><pre wrap="">(If we later find the need to let external
modulesadd timeouts, which I 
find unlikely, we can have TimeoutRegister return TimeoutName and have
this value be dynamically allocated.  But for now I think that would be
useless complication).

The difference with lwlocks is that each lwlock is just a number.</pre></blockquote><br /> Strictly speaking, just as
TimeoutName.<br/><br /><blockquote cite="mid:1340726100-sup-3131@alvh.no-ip.org" type="cite"><pre wrap="">  The 
lwlock.c module doesn't need to know anything about how each lwlock is
actually used.  It's not the same thing here -- which is why I think it
would be better to have all modules, even the hardcoded ones, use a
registering interface.</pre></blockquote><br /> OK.<br /><br /><blockquote
cite="mid:1340726100-sup-3131@alvh.no-ip.org"type="cite"><pre wrap="">... Now, it could be argued that it would be even
betterto have 
TimeoutRegister not take the TimeoutName argument, and instead generate
it dynamically and pass it back to the caller -- that way you don't need
the enum in timeout.h.</pre></blockquote><br /> This was what I had in mind at first ...<br /><br /><blockquote
cite="mid:1340726100-sup-3131@alvh.no-ip.org"type="cite"><pre wrap="">  The problem I have with that idea is that it 
would force the caller modules to have a global variable to keep track
of that, which seems worse to me.</pre></blockquote><br /> ... and realized this as well.<br /><br /> So, should I keep
theenum TimeoutName? Are global variables for<br /> keeping dynamically assigned values preferred over the enum?<br />
Currentlywe have 5 timeout sources in total, 3 of them are used by<br /> regular backends, the remaining 2 are used by
replicationstandby.<br /> We can have a fixed size array (say with 8 or 16 elements) for future use<br /> and this
wouldbe plenty.<br /><br /> Opinions?<br /><br /><blockquote cite="mid:1340726100-sup-3131@alvh.no-ip.org"
type="cite"><prewrap=""> 

</pre><blockquote type="cite"><blockquote type="cite"><pre wrap="">I haven't looked at the second patch at all yet.
</pre></blockquote><pre wrap="">
This is the whole point of the first patch.
</pre></blockquote><pre wrap="">
I know that.  But I want to get the details of the framework right
before we move on to add more stuff to it.

</pre><blockquote type="cite"><pre wrap="">But as I said above for
registering a new timeout source, it's a new internal use case.
One that touches a lot of places which cannot simply be done
by registering a new timeout source.
</pre></blockquote><pre wrap="">
Sure.  That's going to be the case for any other timeout we want to add
(which is why I think we don't really need dynamic timeouts).

</pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap="">
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> 
</pre>

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012:

> So, should I keep the enum TimeoutName? Are global variables for
> keeping dynamically assigned values preferred over the enum?
> Currently we have 5 timeout sources in total, 3 of them are used by
> regular backends, the remaining 2 are used by replication standby.
> We can have a fixed size array (say with 8 or 16 elements) for future use
> and this would be plenty.
>
> Opinions?

My opinion is that the fixed size array is fine.

I'll go set the patch "waiting on author".  Also, remember to review
some other people's patches.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-26 18:49 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012:
>
>> So, should I keep the enum TimeoutName? Are global variables for
>> keeping dynamically assigned values preferred over the enum?
>> Currently we have 5 timeout sources in total, 3 of them are used by
>> regular backends, the remaining 2 are used by replication standby.
>> We can have a fixed size array (say with 8 or 16 elements) for future use
>> and this would be plenty.
>>
>> Opinions?
> My opinion is that the fixed size array is fine.

Attached is the version which uses a registration interface.

Also, to further minimize knowledge of timeouts in timeout.c,
all GUCs are moved back to proc.c

> I'll go set the patch "waiting on author".  Also, remember to review
> some other people's patches.

I will look into it.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-06-27 10:34 keltezéssel, Boszormenyi Zoltan írta:
> 2012-06-26 18:49 keltezéssel, Alvaro Herrera írta:
>> Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012:
>>
>>> So, should I keep the enum TimeoutName? Are global variables for
>>> keeping dynamically assigned values preferred over the enum?
>>> Currently we have 5 timeout sources in total, 3 of them are used by
>>> regular backends, the remaining 2 are used by replication standby.
>>> We can have a fixed size array (say with 8 or 16 elements) for future use
>>> and this would be plenty.
>>>
>>> Opinions?
>> My opinion is that the fixed size array is fine.
>
> Attached is the version which uses a registration interface.
>
> Also, to further minimize knowledge of timeouts in timeout.c,
> all GUCs are moved back to proc.c
>
>> I'll go set the patch "waiting on author".  Also, remember to review
>> some other people's patches.
>
> I will look into it.
>
> Best regards,
> Zoltán Böszörményi

Does anyone have a little time to look at the latest timeout framework
with the registration interface and the 2nd patch too? I am at work
until Friday next week, after that I will be on vacation for two weeks.
Just in case there is anything that needs tweaking to make it more
acceptable.

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:

> Does anyone have a little time to look at the latest timeout framework
> with the registration interface and the 2nd patch too? I am at work
> until Friday next week, after that I will be on vacation for two weeks.
> Just in case there is anything that needs tweaking to make it more
> acceptable.

I cleaned up this a bit more and now I think it's ready to commit --
as soon as somebody tests that the standby bits still work.

I still have not looked at the second patch.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:

I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
surely you need a bool return to let the caller know whether the
acquisition succeeded or failed?  AFAICS you are relying on
get_timeout_indicator() but this seems to me the wrong thing to do ...
(not to mention how ugly it is to percolate through two levels of
abstraction)

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-03 23:31 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:
>
>> Does anyone have a little time to look at the latest timeout framework
>> with the registration interface and the 2nd patch too? I am at work
>> until Friday next week, after that I will be on vacation for two weeks.
>> Just in case there is anything that needs tweaking to make it more
>> acceptable.
> I cleaned up this a bit more and now I think it's ready to commit --
> as soon as somebody tests that the standby bits still work.

You just broke initdb with this cleanup. :-)

---8<------8<------8<------8<------8<------8<------8<---
$ cat src/test/regress/log/initdb.log
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "zozo".
This user must also own the server process.

The database cluster will be initialized with locales  COLLATE:  hu_HU.utf8  CTYPE:    hu_HU.utf8  MESSAGES: C
MONETARY:hu_HU.utf8  NUMERIC:  hu_HU.utf8  TIME:     hu_HU.utf8 
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "hungarian".

creating directory
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data/base/1 ... ok
initializing pg_authid ... TRAP: FailedAssertion("!(base_timeouts_initialized)", File:
"timeout.c", Line: 217)
sh: line 1: 29872 Aborted                 (core dumped)

"/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/tmp_check/install/home/zozo/pgc92dev-locktimeout/bin/postgres"

--single -F -O -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134
initdb: data directory
"/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data" not
removed at user's request
---8<------8<------8<------8<------8<------8<------8<---

initdb starts postgres --single, that doesn't do BackendInitialize(),
only PostgresMain(). So, you need InitializeTimeouts() before
the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
must not be in InitializeTimeouts() if called twice.


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-04 12:09 keltezéssel, Boszormenyi Zoltan írta:
> 2012-07-03 23:31 keltezéssel, Alvaro Herrera írta:
>> Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:
>>
>>> Does anyone have a little time to look at the latest timeout framework
>>> with the registration interface and the 2nd patch too? I am at work
>>> until Friday next week, after that I will be on vacation for two weeks.
>>> Just in case there is anything that needs tweaking to make it more
>>> acceptable.
>> I cleaned up this a bit more and now I think it's ready to commit --
>> as soon as somebody tests that the standby bits still work.
>
> You just broke initdb with this cleanup. :-)
>
> ---8<------8<------8<------8<------8<------8<------8<---
> $ cat src/test/regress/log/initdb.log
> Running in noclean mode.  Mistakes will not be cleaned up.
> The files belonging to this database system will be owned by user "zozo".
> This user must also own the server process.
>
> The database cluster will be initialized with locales
>   COLLATE:  hu_HU.utf8
>   CTYPE:    hu_HU.utf8
>   MESSAGES: C
>   MONETARY: hu_HU.utf8
>   NUMERIC:  hu_HU.utf8
>   TIME:     hu_HU.utf8
> The default database encoding has accordingly been set to "UTF8".
> The default text search configuration will be set to "hungarian".
>
> creating directory
> /home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 100
> selecting default shared_buffers ... 32MB
> creating configuration files ... ok
> creating template1 database in
> /home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data/base/1 ... ok
> initializing pg_authid ... TRAP: FailedAssertion("!(base_timeouts_initialized)", File:
> "timeout.c", Line: 217)
> sh: line 1: 29872 Aborted                 (core dumped)
>
"/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/tmp_check/install/home/zozo/pgc92dev-locktimeout/bin/postgres"

> --single -F -O -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
> child process exited with exit code 134
> initdb: data directory
> "/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data" not
> removed at user's request
> ---8<------8<------8<------8<------8<------8<------8<---
>
> initdb starts postgres --single, that doesn't do BackendInitialize(),
> only PostgresMain(). So, you need InitializeTimeouts() before
> the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
> must not be in InitializeTimeouts() if called twice.
>
>

Attached is the fix for this problem. PostgresMain() has a new
argument: bool single_user. This way, InitializeTimeouts() can
keep its elog(PANIC) if called twice and "postgres --single"
doesn't fail its Assert() in RegisterTimeout().

Comments?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:
>
> I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
> surely you need a bool return to let the caller know whether the
> acquisition succeeded or failed?

Well, this is the same interface PGSemaphoreTryLock() uses.
By this reasoning, it's also broken.

>    AFAICS you are relying on
> get_timeout_indicator() but this seems to me the wrong thing to do ...
> (not to mention how ugly it is to percolate through two levels of
> abstraction)

What other way do you suggest? EINTR may come from
a different signal, which may also be ignored or not. Ctrl-C
is handled and leads to elog(ERROR) but an ignored signal
technically calls a NOP handler deep inside the OS runtime
libraries but the signal *is* delivered to the backend which
in turn interrupts semop() or whatever the platform equivalent is.

I can add a flag to timeout.c that is set whenever SIGALRM
is delivered but checking that would be another "abstraction
violation" as calling get_timeout_indicator() in your opinion.

The original coding of PGSemaphoreTryLock() used
semtimedop(), sem_timedwait() and the timeout value applied
to WaitForMultipleObjectsEx(). This was quickly shot down
as using the SIGALRM signal and its behaviour to interrupt the
locking operation is be better and fits the PostgreSQL portability
features. Also, OS X at the time didn't support sem_timedwait().

I am not complaining, just recalling the different details.

How about not hardcoding get_timeout_indicator(LOCK_TIMEOUT)
into PGSemaphoreTimedLock()? Passing TimeoutName to it would
make it more generic and usable for other timeout sources.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mié jul 04 06:32:46 -0400 2012:
> 2012-07-04 12:09 keltezéssel, Boszormenyi Zoltan írta:

> > You just broke initdb with this cleanup. :-)

Ouch.

> > initdb starts postgres --single, that doesn't do BackendInitialize(),
> > only PostgresMain(). So, you need InitializeTimeouts() before
> > the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
> > must not be in InitializeTimeouts() if called twice.
>
> Attached is the fix for this problem. PostgresMain() has a new
> argument: bool single_user. This way, InitializeTimeouts() can
> keep its elog(PANIC) if called twice and "postgres --single"
> doesn't fail its Assert() in RegisterTimeout().

Hmm.  Maybe it's better to leave InitializeTimeouts to be called twice
after all.  The fix seems a lot uglier than the disease it's curing.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of mié jul 04 07:03:44 -0400 2012:
>
> 2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:
> >
> > I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
> > surely you need a bool return to let the caller know whether the
> > acquisition succeeded or failed?
>
> Well, this is the same interface PGSemaphoreTryLock() uses.
> By this reasoning, it's also broken.

Eh, no -- as far as I can see, PGSemaphoreTryLock returns a boolean,
which is precisely my point.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-04 17:25 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of mié jul 04 07:03:44 -0400 2012:
>> 2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:
>>> I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
>>> surely you need a bool return to let the caller know whether the
>>> acquisition succeeded or failed?
>> Well, this is the same interface PGSemaphoreTryLock() uses.
>> By this reasoning, it's also broken.
> Eh, no -- as far as I can see, PGSemaphoreTryLock returns a boolean,
> which is precisely my point.

You're right. I blame the heat for not being able to properly
read my own code.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-04 17:20 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of mié jul 04 06:32:46 -0400 2012:
>> 2012-07-04 12:09 keltezéssel, Boszormenyi Zoltan írta:
>>> You just broke initdb with this cleanup. :-)
> Ouch.
>
>>> initdb starts postgres --single, that doesn't do BackendInitialize(),
>>> only PostgresMain(). So, you need InitializeTimeouts() before
>>> the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
>>> must not be in InitializeTimeouts() if called twice.
>> Attached is the fix for this problem. PostgresMain() has a new
>> argument: bool single_user. This way, InitializeTimeouts() can
>> keep its elog(PANIC) if called twice and "postgres --single"
>> doesn't fail its Assert() in RegisterTimeout().
> Hmm.  Maybe it's better to leave InitializeTimeouts to be called twice
> after all.  The fix seems a lot uglier than the disease it's curing.

Attached are the refreshed patches. InitializeTimeouts() can be called
twice and PGSemaphoreTimedLock() returns bool now. This saves
two calls to get_timeout_indicator().

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
> Attached are the refreshed patches. InitializeTimeouts() can be called
> twice and PGSemaphoreTimedLock() returns bool now. This saves
> two calls to get_timeout_indicator().

I'm starting to look at this patch now.  There are a number of cosmetic
things I don't care for, the biggest one being the placement of
timeout.c under storage/lmgr/.  That seems an entirely random place,
since the functionality provided has got nothing to do with storage
let alone locks.  I'm inclined to think that utils/misc/ is about
the best option in the existing backend directory hierarchy.  Anybody
object to that, or have a better idea?

Another thing that needs some discussion is the handling of
InitializeTimeouts.  As designed, I think it's completely unsafe,
the reason being that if a process using timeouts forks off another
one, the child will inherit the parent's timeout reasons and be unable
to reset them.  Right now this might not be such a big problem because
the postmaster doesn't need any timeouts, but what if it does in the
future?  So I think we should drop the base_timeouts_initialized
"protection", and that means we need a pretty consistent scheme for
where to call InitializeTimeouts.  But we already have the same issue
with respect to on_proc_exit callbacks, so we can just add
InitializeTimeouts calls in the same places as on_exit_reset().

Comments?

I'll work up a revised patch and post it.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of mié jul 11 15:47:47 -0400 2012:
>
> Boszormenyi Zoltan <zb@cybertec.at> writes:
> > Attached are the refreshed patches. InitializeTimeouts() can be called
> > twice and PGSemaphoreTimedLock() returns bool now. This saves
> > two calls to get_timeout_indicator().
>
> I'm starting to look at this patch now.  There are a number of cosmetic
> things I don't care for, the biggest one being the placement of
> timeout.c under storage/lmgr/.  That seems an entirely random place,
> since the functionality provided has got nothing to do with storage
> let alone locks.  I'm inclined to think that utils/misc/ is about
> the best option in the existing backend directory hierarchy.  Anybody
> object to that, or have a better idea?

I agree with the proposed new location.

> Another thing that needs some discussion is the handling of
> InitializeTimeouts.  As designed, I think it's completely unsafe,
> the reason being that if a process using timeouts forks off another
> one, the child will inherit the parent's timeout reasons and be unable
> to reset them.  Right now this might not be such a big problem because
> the postmaster doesn't need any timeouts, but what if it does in the
> future?  So I think we should drop the base_timeouts_initialized
> "protection", and that means we need a pretty consistent scheme for
> where to call InitializeTimeouts.  But we already have the same issue
> with respect to on_proc_exit callbacks, so we can just add
> InitializeTimeouts calls in the same places as on_exit_reset().

I do agree that InitializeTimeouts is not optimally placed.  We
discussed this upthread.

Some of the calls of on_exit_reset() are placed in code that's about to
die.  Surely we don't need InitializeTimeouts() then.  Maybe we should
have another routine, say InitializeProcess (noting we already
InitProcess so maybe some name would be good), that calls both
on_exit_reset and InitializeTimeouts.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
I wrote:
> I'm starting to look at this patch now.

After reading this further, I think that the "sched_next" option is a
bad idea and we should get rid of it.  AFAICT, what it is meant to do
is (if !sched_next) automatically do "disable_all_timeouts(true)" if
the particular timeout happens to fire.  But there is no reason the
timeout's callback function couldn't do that; and doing it in the
callback is more flexible since you could have logic about whether to do
it or not, rather than freezing the decision at RegisterTimeout time.
Moreover, it does not seem to me to be a particularly good idea to
encourage timeouts to have such behavior, anyway.  Each time we add
another timeout we'd have to look to see if it's still sane for each
existing timeout to use !sched_next.  It would likely be better, in
most cases, for individual callbacks to explicitly disable any other
individual timeout reasons that should no longer be fired.

I am also underwhelmed by the "timeout_start" callback function concept.
In the first place, that's broken enable_timeout, which incorrectly
assumes that the value it gets must be "now" (see its schedule_alarm
call).  In the second place, it seems fairly likely that callers of
get_timeout_start would likewise want the clock time at which the
timeout was enabled, not the timeout_start reference time.  (If they
did want the latter, why couldn't they get it from wherever the callback
function had gotten it?)  I'm inclined to propose that we drop the
timeout_start concept and instead provide two functions for scheduling
interrupts:
enable_timeout_after(TimeoutName tn, int delay_ms);enable_timeout_at(TimeoutName tn, TimestampTz fin_time);

where you use the former if you want the standard GetCurrentTimestamp +
n msec calculation, but if you want the stop time calculated in some
other way, you calculate it yourself and use the second function.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of mié jul 11 15:47:47 -0400 2012:
>> ... that means we need a pretty consistent scheme for
>> where to call InitializeTimeouts.  But we already have the same issue
>> with respect to on_proc_exit callbacks, so we can just add
>> InitializeTimeouts calls in the same places as on_exit_reset().

> I do agree that InitializeTimeouts is not optimally placed.  We
> discussed this upthread.

> Some of the calls of on_exit_reset() are placed in code that's about to
> die.  Surely we don't need InitializeTimeouts() then.  Maybe we should
> have another routine, say InitializeProcess (noting we already
> InitProcess so maybe some name would be good), that calls both
> on_exit_reset and InitializeTimeouts.

Yeah, I was wondering about that too, but it seems a bit ad-hoc from a
modularity standpoint.  I gave some consideration to the idea of putting
these calls directly into fork_process(), but we'd have to be very sure
that there would never be a case where it was incorrect to do them after
forking.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:

* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
"pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.

* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.

* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the "TimeoutName" nomenclature, because to me "name"
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.

* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.

I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.

I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.

            regards, tom lane


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-11 21:47 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> Attached are the refreshed patches. InitializeTimeouts() can be called
>> twice and PGSemaphoreTimedLock() returns bool now. This saves
>> two calls to get_timeout_indicator().
> I'm starting to look at this patch now.  There are a number of cosmetic
> things I don't care for, the biggest one being the placement of
> timeout.c under storage/lmgr/.  That seems an entirely random place,
> since the functionality provided has got nothing to do with storage
> let alone locks.  I'm inclined to think that utils/misc/ is about
> the best option in the existing backend directory hierarchy.  Anybody
> object to that, or have a better idea?

Good idea, storage/lmgr/timeout.c was chosen simply because
it was born out of files living there.

> Another thing that needs some discussion is the handling of
> InitializeTimeouts.  As designed, I think it's completely unsafe,
> the reason being that if a process using timeouts forks off another
> one, the child will inherit the parent's timeout reasons and be unable
> to reset them.  Right now this might not be such a big problem because
> the postmaster doesn't need any timeouts, but what if it does in the
> future?  So I think we should drop the base_timeouts_initialized
> "protection", and that means we need a pretty consistent scheme for
> where to call InitializeTimeouts.  But we already have the same issue
> with respect to on_proc_exit callbacks, so we can just add
> InitializeTimeouts calls in the same places as on_exit_reset().
>
> Comments?
>
> I'll work up a revised patch and post it.
>
>             regards, tom lane
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-11 22:59 keltezéssel, Tom Lane írta:
> I wrote:
>> I'm starting to look at this patch now.
> After reading this further, I think that the "sched_next" option is a
> bad idea and we should get rid of it.  AFAICT, what it is meant to do
> is (if !sched_next) automatically do "disable_all_timeouts(true)" if
> the particular timeout happens to fire.  But there is no reason the
> timeout's callback function couldn't do that; and doing it in the
> callback is more flexible since you could have logic about whether to do
> it or not, rather than freezing the decision at RegisterTimeout time.
> Moreover, it does not seem to me to be a particularly good idea to
> encourage timeouts to have such behavior, anyway.  Each time we add
> another timeout we'd have to look to see if it's still sane for each
> existing timeout to use !sched_next.  It would likely be better, in
> most cases, for individual callbacks to explicitly disable any other
> individual timeout reasons that should no longer be fired.

+1

> I am also underwhelmed by the "timeout_start" callback function concept.

It was generalized out of "static TimestampTz timeout_start_time;"
in proc.c which is valid if deadlock_timeout is activated. It is used
in ProcSleep() for logging the time difference between the time when
the timeout was activated and "now" at several places in that function.

> In the first place, that's broken enable_timeout, which incorrectly
> assumes that the value it gets must be "now" (see its schedule_alarm
> call).

You're right, it must be fixed.

>    In the second place, it seems fairly likely that callers of
> get_timeout_start would likewise want the clock time at which the
> timeout was enabled, not the timeout_start reference time.  (If they
> did want the latter, why couldn't they get it from wherever the callback
> function had gotten it?)  I'm inclined to propose that we drop the
> timeout_start concept and instead provide two functions for scheduling
> interrupts:
>
>     enable_timeout_after(TimeoutName tn, int delay_ms);
>     enable_timeout_at(TimeoutName tn, TimestampTz fin_time);
>
> where you use the former if you want the standard GetCurrentTimestamp +
> n msec calculation, but if you want the stop time calculated in some
> other way, you calculate it yourself and use the second function.
>
>             regards, tom lane
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-12 19:05 keltezéssel, Tom Lane írta:
> Here is a revised version of the timeout-infrastructure patch.
> I whacked it around quite a bit, notably:
>
> * I decided that the most convenient way to handle the initialization
> issue was to combine establishment of the signal handler with resetting
> of the per-process variables.  So handle_sig_alarm is no longer global,
> and InitializeTimeouts is called at the places where we used to do
> "pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
> because any subprocess that was intending to use SIGALRM must have
> called that before establishing any timeouts.

OK.

> * BTW, doing that exposed the fact that walsender processes were failing
> to establish a SIGALRM signal handler, which is a pre-existing bug,
> because they run a normal authentication transaction during InitPostgres
> and hence need to be able to cope with deadlock and statement timeouts.
> I will do something about back-patching a fix for that.

Wow, my work uncovered a pre-existing bug in PostgreSQL. :-)

> * I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
> and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
> done in walsender and autovacuum processes.  I'm not totally satisfied
> with that, but for sure it didn't work to only establish them in
> regular backends.
>
> * I didn't like the "TimeoutName" nomenclature, because to me "name"
> suggests that the value is a string, not just an enum.  So I renamed
> that to TimeoutId.

OK.

> * I whacked around the logic in timeout.c a fair amount, because it
> had race conditions if SIGALRM happened while enabling or disabling
> a timeout.  I believe the attached coding is safe, but I'm not totally
> happy with it from a performance standpoint, because it will do two
> setitimer calls (a disable and then a re-enable) in many cases where
> the old code did only one.

Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in
the original code called setitimer() twice if statement timeout
was still in effect, it was done in CheckStatementTimeout().
Considering this, I think there is no performance problem with
the new code you came up with.

> I think what we ought to do is go ahead and apply this, so that we
> can have the API nailed down, and then we can revisit the internals
> of timeout.c to see if we can't get the performance back up.
> It's clearly a much cleaner design than the old spaghetti logic in
> proc.c, so I think we ought to go ahead with this independently of
> whether the second patch gets accepted.

There is one tiny bit you might have broken. You wrote previously:

> I am also underwhelmed by the "timeout_start" callback function concept.
> In the first place, that's broken enable_timeout, which incorrectly
> assumes that the value it gets must be "now" (see its schedule_alarm
> call).  In the second place, it seems fairly likely that callers of
> get_timeout_start would likewise want the clock time at which the
> timeout was enabled, not the timeout_start reference time.  (If they
> did want the latter, why couldn't they get it from wherever the callback
> function had gotten it?)  I'm inclined to propose that we drop the
> timeout_start concept and instead provide two functions for scheduling
> interrupts:
>
>     enable_timeout_after(TimeoutName tn, int delay_ms);
>     enable_timeout_at(TimeoutName tn, TimestampTz fin_time);
>
> where you use the former if you want the standard GetCurrentTimestamp +
> n msec calculation, but if you want the stop time calculated in some
> other way, you calculate it yourself and use the second function.

It's okay, but you haven't really followed it with STATEMENT_TIMEOUT:

-----8<----------8<----------8<----------8<----------8<-----
*** 2396,2404 ****                /* Set statement timeout running, if any */                /* NB: this mustn't be
enableduntil we are within an xact */                if (StatementTimeout > 0) 
!                       enable_sig_alarm(StatementTimeout, true);                else
!                       cancel_from_timeout = false;
                xact_started = true;        }
--- 2397,2405 ----                /* Set statement timeout running, if any */                /* NB: this mustn't be
enableduntil we are within an xact */                if (StatementTimeout > 0) 
!                       enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);                else
!                       disable_timeout(STATEMENT_TIMEOUT, false);
                xact_started = true;        }
-----8<----------8<----------8<----------8<----------8<-----

It means that StatementTimeout losts its precision. It would trigger
in the future counting from "now" instead of counting from
GetCurrentStatementStartTimestamp(). It should be:

enable_timeout_at(STATEMENT_TIMEOUT,
TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), StatementTimeout));

> I haven't really looked at the second patch yet, but at minimum that
> will need some rebasing to match the API tweaks here.

Yes, I will do that.

Thanks for your review and work.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-13 22:32 keltezéssel, Boszormenyi Zoltan írta:
> 2012-07-12 19:05 keltezéssel, Tom Lane írta:
>
>> I haven't really looked at the second patch yet, but at minimum that
>> will need some rebasing to match the API tweaks here.
>
> Yes, I will do that.

While doing it, I discovered another bug you introduced.
enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
Try SET deadlock_timeout = 0;

Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
it enables a huge timeout that wouldn't possibly trigger for short
transactions but it's a bug nevertheless.

>
> Thanks for your review and work.
>
> Best regards,
> Zoltán Böszörményi
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
> It means that StatementTimeout losts its precision. It would trigger
> in the future counting from "now" instead of counting from
> GetCurrentStatementStartTimestamp().

That is, in fact, better not worse.  Note the comment in the existing
code:
        * Begin statement-level timeout        *        * Note that we compute statement_fin_time with reference to the
      * statement_timestamp, but apply the specified delay without any        * correction; that is, we ignore whatever
timehas elapsed since        * statement_timestamp was set.  In the normal case only a small        * interval will
haveelapsed and so this doesn't matter, but there        * are corner cases (involving multi-statement query strings
with       * embedded COMMIT or ROLLBACK) where we might re-initialize the        * statement timeout long after
initialreceipt of the message. In        * such cases the enforcement of the statement timeout will be a bit        *
inconsistent. This annoyance is judged not worth the cost of        * performing an additional gettimeofday() here.
 

That is, measuring from GetCurrentStatementStartTimestamp is a hack to
save one gettimeofday call, at the cost of making the timeout less
accurate, sometimes significantly so.  In the new model there isn't any
good way to duplicate this kluge (in particular, there's no point in
using enable_timeout_at, because that will still make a gettimeofday
call).  That doesn't bother me too much.  I'd rather try to buy back
whatever performance was lost by seeing if we can reduce the number of
setitimer calls.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
> While doing it, I discovered another bug you introduced.
> enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
> Try SET deadlock_timeout = 0;

Hm.  I don't think it's a bug for enable_timeout_after(..., 0) to cause
a timeout ... but we'll have to change the calling code.  Thanks for
catching that.

> Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
> it enables a huge timeout

No, it should cause an immediate interrupt, or at least after 1
microsecond.  Look at TimestampDifference.
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-13 23:51 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> While doing it, I discovered another bug you introduced.
>> enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
>> Try SET deadlock_timeout = 0;
> Hm.  I don't think it's a bug for enable_timeout_after(..., 0) to cause
> a timeout ... but we'll have to change the calling code.  Thanks for
> catching that.

You're welcome. This caused a segfault in my second patch,
the code didn't expect enable_timeout_after(..., 0);
to set up a timer.

So, the calling code should check for the value and not call
enable_timeout_*() when it shouldn't, right? It's making the code
more obvious for the casual reader, I agree it's better that way.

Will you post a new version with callers checking their *Timeout settings
or commit it with this change? I can then post a new second patch.

Regarding the lock_timeout functionality: the patch can be reduced to
about half of its current size and it would be a lot less intrusive if the
LockAcquire() callers don't need to report the individual object types
and names or OIDs. Do you prefer the verbose ereport()s or a
generic one about "lock timeout triggered" in ProcSleep()?

>> Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
>> it enables a huge timeout
> No, it should cause an immediate interrupt, or at least after 1
> microsecond.  Look at TimestampDifference.

Okay.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Boszormenyi Zoltan's message of vie jul 13 18:11:27 -0400 2012:

> Regarding the lock_timeout functionality: the patch can be reduced to
> about half of its current size and it would be a lot less intrusive if the
> LockAcquire() callers don't need to report the individual object types
> and names or OIDs. Do you prefer the verbose ereport()s or a
> generic one about "lock timeout triggered" in ProcSleep()?

For what it's worth, I would appreciate it if you would post the lock
timeout patch for the upcoming commitfest.  This one is already almost a
month long now.  That way we can close this CF item soon and concentrate
on the remaining patches.  This one has received its fair share of
committer attention already, ISTM.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
>>> Try SET deadlock_timeout = 0;

Actually, when I try that I get

ERROR:  0 is outside the valid range for parameter "deadlock_timeout" (1 .. 2147483647)

So I don't see any bug here.  The places that are unconditionally doing
"enable_timeout_after(..., DeadlockTimeout);" are perfectly fine.  The
only place that might need an if-test has already got one:
         if (StatementTimeout > 0)
!             enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);         else
!             disable_timeout(STATEMENT_TIMEOUT, false);

        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> For what it's worth, I would appreciate it if you would post the lock
> timeout patch for the upcoming commitfest.

+1.  I think it's reasonable to get the infrastructure patch in now,
but we are running out of time in this commitfest (and there's still
a lot of patches that haven't been reviewed at all).
        regards, tom lane


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-14 00:36 keltezéssel, Tom Lane írta:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> For what it's worth, I would appreciate it if you would post the lock
>> timeout patch for the upcoming commitfest.
> +1.  I think it's reasonable to get the infrastructure patch in now,
> but we are running out of time in this commitfest (and there's still
> a lot of patches that haven't been reviewed at all).
>
>             regards, tom lane
>

OK, I will do that.

Thanks for the review.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of vie jul 13 18:23:31 -0400 2012:
>
> Boszormenyi Zoltan <zb@cybertec.at> writes:
> >>> Try SET deadlock_timeout = 0;
>
> Actually, when I try that I get
>
> ERROR:  0 is outside the valid range for parameter "deadlock_timeout" (1 .. 2147483647)
>
> So I don't see any bug here.

I committed this patch without changing this.  If this needs patched,
please speak up.  I also considered adding a comment on
enable_timeout_after about calling it with a delay of 0, but refrained;
if anybody thinks it's necessary, suggestions are welcome.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-07-17 06:32 keltezéssel, Alvaro Herrera írta:
> Excerpts from Tom Lane's message of vie jul 13 18:23:31 -0400 2012:
>> Boszormenyi Zoltan <zb@cybertec.at> writes:
>>>>> Try SET deadlock_timeout = 0;
>> Actually, when I try that I get
>>
>> ERROR:  0 is outside the valid range for parameter "deadlock_timeout" (1 .. 2147483647)
>>
>> So I don't see any bug here.
> I committed this patch without changing this.  If this needs patched,
> please speak up.  I also considered adding a comment on
> enable_timeout_after about calling it with a delay of 0, but refrained;
> if anybody thinks it's necessary, suggestions are welcome.

Thanks for committing this part.

Attached is the revised (and a lot leaner, more generic) lock timeout patch,
which introduces new functionality for the timeout registration framework.
The new functionality is called "extra timeouts", better naming is welcome.
Instead of only the previously defined (deadlock and statement) timeouts,
the "extra" timeouts can also be activated from within ProcSleep() in a linked
way.

The lock timeout is a special case of this functionality. To show this, this patch
is split into two again to make reviewing easier.

This way, the timeout framework is really usable for external modules, as
envisioned by you guys

Also, rewriting statement and deadlock timeouts using this functionality
and unifying the two registration interfaces may be possible later. But
messing up proven and working code is not in the scope of this patch or
my job at this point.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

new version with a lot more cleanup is attached.

2012-07-22 22:03 keltezéssel, Boszormenyi Zoltan írta:
> Attached is the revised (and a lot leaner, more generic) lock timeout patch,
> which introduces new functionality for the timeout registration framework.
> The new functionality is called "extra timeouts", better naming is welcome.
> Instead of only the previously defined (deadlock and statement) timeouts,
> the "extra" timeouts can also be activated from within ProcSleep() in a linked
> way.

This "mini-framework" is now called "lock manager timeouts" and
both deadlock timeout and the new lock timeout belong to it.
The little piece of standalone code managing these are in
storage/lmgr/lmgrtimeout.c.

There is no PGSemaphoreTimedLock() any more. Instead,
PGSemaphoreLock() gained a new function argument for
checking timeouts. This has three advantages:
- There is only one PGSemaphoreLock() implementation and bug fixes
   like ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec don't need to
   touch several places.
- There is no layering violation between pg_sema.c and proc.c.
- The extra function can check other type of conditions from different
   callers, should the need arise.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
> new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.

To start at the bottom level, the changes to PGSemaphoreLock broke it,
and seem probably unnecessary anyway.  As coded, calling the "condition
checker" breaks handling of error returns from semop(), unless the checker
is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
shouldn't need to be anyway).  More, if the checker does return true,
it causes PGSemaphoreLock to utterly violate its contract: it returns to
the caller without having acquired the semaphore, and without even telling
the caller so.  Worse, if we *did* acquire the semaphore, we might still
exit via this path, since the placement of the condition check call
ignores the comment a few lines up:
    * Once we acquire the lock, we do NOT check for an interrupt before    * returning.  The caller needs to be able to
recordownership of the lock    * before any interrupt can be accepted.
 

We could possibly fix all this with a redesigned API contract for
PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
with it at all.  We never needed to get it involved with deadlock check
handling, and I don't see why that needs to change for lock timeouts.

One very good reason why monkeying with PGSemaphoreLock is wrong is that
on some platforms a SIGALRM interrupt won't interrupt the semop() call,
and thus control would never reach the "checker" anyway.  If we're going
to throw an error, it must be thrown from the interrupt handler.

The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  In the first place, there
is nothing on the horizon suggesting that we need to let any plug-in code
get control here, and if anything the delicacy of what's going on leads me
to not wish to expose such a possibility.  In the second place, it isn't
adding any actually useful functionality, it's just agglomerating some
checks.  The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).

On the whole I think we could forget lmgrtimeout and just hardwire the
lock timeout and deadlock check cases.  But in any case we're going to
need support in timeout.c for enabling/disabling multiple timeouts at
once without extra setitimer calls.

I'm also not thrilled about the way in which the existing deadlock
checking code has been hacked up.  As an example, you added this to
DeadLockReport():

+    if (!DeadLockTimeoutCondition())
+        return;

which again causes it to violate its contract, namely to report a
deadlock, in the most fundamental way -- existing callers aren't
expecting it to return *at all*.  Surely we can decouple the deadlock
and lock timeout cases better than that; or at least if we can't it's
a delusion to propose anything like lmgrtimeout in the first place.

There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which both have specific references to deadlocks
being the reason for failure.

Also, the "per statement" mode for lock timeout doesn't seem to be
any such thing, because it's implemented like this:

+            case LOCK_TIMEOUT_PER_STMT:
+                enable_timeout_at(LOCK_TIMEOUT,
+                            TimestampTzPlusMilliseconds(
+                                    GetCurrentStatementStartTimestamp(),
+                                    LockTimeout));
+                break;

That doesn't provide anything like "you can spend at most N milliseconds
waiting for locks during a statement".  What it is is "if you happen to be
waiting for a lock N milliseconds after the statement starts, or if you
attempt to acquire any lock more than N milliseconds after the statement
starts, you lose instantly".  I don't think that definition actually adds
any useful functionality compared to setting statement_timeout to N
milliseconds, and it's certainly wrongly documented.  To do what the
documentation implies would require tracking and adding up the time spent
waiting for locks during a statement.  Which might be a good thing to do,
especially if the required gettimeofday() calls could be shared with what
timeout.c probably has to do anyway at start and stop of a lock wait.
But this code doesn't do it.

Lastly, I'm not sure where is the best place to be adding the control
logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
somewhere under storage/lmgr/, no?
        regards, tom lane



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-09-22 20:49 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> new version with a lot more cleanup is attached.
> I looked at this patch, and frankly I'm rather dismayed.  It's a mess.

Thank you for the kind words. :-)

> To start at the bottom level, the changes to PGSemaphoreLock broke it,
> and seem probably unnecessary anyway.  As coded, calling the "condition
> checker" breaks handling of error returns from semop(), unless the checker
> is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
> shouldn't need to be anyway).

I would call it an oversight, nothing more.

>    More, if the checker does return true,
> it causes PGSemaphoreLock to utterly violate its contract: it returns to
> the caller without having acquired the semaphore, and without even telling
> the caller so.  Worse, if we *did* acquire the semaphore, we might still
> exit via this path, since the placement of the condition check call
> ignores the comment a few lines up:
>
>       * Once we acquire the lock, we do NOT check for an interrupt before
>       * returning.  The caller needs to be able to record ownership of the lock
>       * before any interrupt can be accepted.
>
> We could possibly fix all this with a redesigned API contract for
> PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
> with it at all.  We never needed to get it involved with deadlock check
> handling, and I don't see why that needs to change for lock timeouts.
>
> One very good reason why monkeying with PGSemaphoreLock is wrong is that
> on some platforms a SIGALRM interrupt won't interrupt the semop() call,
> and thus control would never reach the "checker" anyway.  If we're going
> to throw an error, it must be thrown from the interrupt handler.

Then please, explain to me, how on Earth can the current deadlock_timeout
can report the error? Sure, I can see the PG_TRY() ... PG_END_TRY() block in
lock.c but as far as I can see, nothing in the CheckDeadLock() -> DeadLockCheck()
-> DeadLockCheckRecurse() path diverts the code to return to a different
address from the signal handler, i.e. there is no elog(ERROR) or ereport(ERROR)
even in the DS_HARD_DEADLOCK case, so nothing calls siglongjmp(). So logically
the code shouldn't end up in the PG_CATCH() branch.

semop() will only get an errno = EINTR when returning from the signal handler
and would loop again. Then what makes it return beside being able to lock the
semaphore? The conditions in ProcSleep() that e.g. print the lock stats work somehow.

> The whole lmgrtimeout module seems to me to be far more mechanism than is
> warranted, for too little added functionality.  In the first place, there
> is nothing on the horizon suggesting that we need to let any plug-in code
> get control here, and if anything the delicacy of what's going on leads me
> to not wish to expose such a possibility.  In the second place, it isn't
> adding any actually useful functionality, it's just agglomerating some
> checks.  The minimum thing I would want it to do is avoid calling
> timeout.c multiple times, which is what would happen right now (leading
> to four extra syscalls per lock acquisition, which is enough new overhead
> to constitute a strong objection to committing this patch at all).
>
> On the whole I think we could forget lmgrtimeout and just hardwire the
> lock timeout and deadlock check cases.  But in any case we're going to
> need support in timeout.c for enabling/disabling multiple timeouts at
> once without extra setitimer calls.

OK, so you prefer the previous hardcoding PGSemaphoreTimedLock()
that makes every LockAcquire() check its return code and the detailed
error message about failed to lock the given object?

I will add new functions to timeout.c to remove many timeout sources
at once to decrease the amount of syscalls needed.

> I'm also not thrilled about the way in which the existing deadlock
> checking code has been hacked up.  As an example, you added this to
> DeadLockReport():
>
> +    if (!DeadLockTimeoutCondition())
> +        return;
>
> which again causes it to violate its contract, namely to report a
> deadlock, in the most fundamental way -- existing callers aren't
> expecting it to return *at all*.

Existing caller*s*? There is only one caller. The reasoning behind this
change was that if the code reaches ReportLmgrTimeoutError() in lock.c
then at least one timeout triggered and one *Report function in the chain
will do ereport(ERROR). *THAT* would trigger te siglongjmp() ending
up in PG_CATCH(). The added lines in DeadLockReport() ensures that
the deadlock error is not reported if it didn't trigger.

>    Surely we can decouple the deadlock
> and lock timeout cases better than that; or at least if we can't it's
> a delusion to propose anything like lmgrtimeout in the first place.
>
> There's considerable lack of attention to updating comments, too.
> For instance in WaitOnLock you only bothered to update the comment
> immediately adjacent to the changed code, and not the two comment
> blocks above that, which both have specific references to deadlocks
> being the reason for failure.
>
> Also, the "per statement" mode for lock timeout doesn't seem to be
> any such thing, because it's implemented like this:
>
> +            case LOCK_TIMEOUT_PER_STMT:
> +                enable_timeout_at(LOCK_TIMEOUT,
> +                            TimestampTzPlusMilliseconds(
> +                                    GetCurrentStatementStartTimestamp(),
> +                                    LockTimeout));
> +                break;
>
> That doesn't provide anything like "you can spend at most N milliseconds
> waiting for locks during a statement".  What it is is "if you happen to be
> waiting for a lock N milliseconds after the statement starts, or if you
> attempt to acquire any lock more than N milliseconds after the statement
> starts, you lose instantly".  I don't think that definition actually adds
> any useful functionality compared to setting statement_timeout to N
> milliseconds, and it's certainly wrongly documented.

OK, I will add the code to "lose instantly" if the code happens to be executed
after the specified deadline, this is another oversight on my side. Where can
I get brown paper bags in large amounts? :-)

But I certainly don't agree with you that the "per statement" variant of the
lock timeout is nothing more than statement timeout. What happens if the
statement doesn't spend its timeout i.e. in the lockless SELECT case or anything
that has to lock but it can get all the locks under the specified time?

statement_timeout can trigger the error while the server is already returning
the data to the client and this is not useful. Thanks to the single-row processing
mode introduced in 9.2, the client can at least get partial data and an error
before the next row.

I have certainly seen use cases that said: I only allow the statement X time
to wait on locks but otherwise give me all data.

>    To do what the
> documentation implies would require tracking and adding up the time spent
> waiting for locks during a statement.  Which might be a good thing to do,
> especially if the required gettimeofday() calls could be shared with what
> timeout.c probably has to do anyway at start and stop of a lock wait.

I will look into it.

> But this code doesn't do it.
>
> Lastly, I'm not sure where is the best place to be adding the control
> logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
> somewhere under storage/lmgr/, no?

Indeed.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

2012-09-22 20:49 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> new version with a lot more cleanup is attached.
> I looked at this patch, and frankly I'm rather dismayed.  It's a mess.

I hope you won't find this one a mess. I tried to address all your complaints.

> [rather long diatribe on modifying PGSemaphoreLock improperly]

I have returned to the previous version that used PGSemaphoreTimedLock
and this time I save and restore errno around calling the timeout condition
checker.

> The whole lmgrtimeout module seems to me to be far more mechanism than is
> warranted, for too little added functionality.  [...]

lmgrtimeout is no more, back to hardcoding things.

>    The minimum thing I would want it to do is avoid calling
> timeout.c multiple times, which is what would happen right now (leading
> to four extra syscalls per lock acquisition, which is enough new overhead
> to constitute a strong objection to committing this patch at all).

I have added enable_multiple_timeouts() and disable_multiple_timeouts()
that minimize the number setitimer() calls.

> There's considerable lack of attention to updating comments, too.
> For instance in WaitOnLock you only bothered to update the comment
> immediately adjacent to the changed code, and not the two comment
> blocks above that, which both have specific references to deadlocks
> being the reason for failure.

I modified the comment in question. I hope the wording is right.

> Also, the "per statement" mode for lock timeout doesn't seem to be
> any such thing, because it's implemented like this:
>
> +            case LOCK_TIMEOUT_PER_STMT:
> +                enable_timeout_at(LOCK_TIMEOUT,
> +                            TimestampTzPlusMilliseconds(
> +                                    GetCurrentStatementStartTimestamp(),
> +                                    LockTimeout));
> +                break;
>
> That doesn't provide anything like "you can spend at most N milliseconds
> waiting for locks during a statement".  What it is is "if you happen to be
> waiting for a lock N milliseconds after the statement starts, or if you
> attempt to acquire any lock more than N milliseconds after the statement
> starts, you lose instantly".  I don't think that definition actually adds
> any useful functionality compared to setting statement_timeout to N
> milliseconds, and it's certainly wrongly documented.  To do what the
> documentation implies would require tracking and adding up the time spent
> waiting for locks during a statement.  Which might be a good thing to do,
> especially if the required gettimeofday() calls could be shared with what
> timeout.c probably has to do anyway at start and stop of a lock wait.
> But this code doesn't do it.

The code now properly accumulates the time spent in waiting for
LOCK_TIMEOUT. This means that if STATEMENT_TIMEOUT and
LOCK_TIMEOUT are set to the same value, STATEMENT_TIMEOUT
will trigger because it considers the time as one contiguous unit,
LOCK_TIMEOUT only accounts the time spent in waiting, not the time
spent with useful work. This means that LOCK_TIMEOUT doesn't
need any special code in its handler function, it's a NOP. The
relation between timeouts is only handled by the timeout.c module.

> Lastly, I'm not sure where is the best place to be adding the control
> logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
> somewhere under storage/lmgr/, no?

The above change means that there is no control logic outside of
storage/lmgr now.

I temporarily abandoned the idea of detailed error reporting on
the object type and name/ID. WaitOnLock() reports "canceling statement
due to lock timeout" and LockAcquire() kept its previous semantics.
This can be quickly revived in case of demand, it would be another ~15K patch.

I hope you can find another time slot in this CF to review this one.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
Hi,

this is the latest one, fixing a bug in the accounting
of per-statement lock timeout handling and tweaking
some comments.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

Re: [PATCH] lock_timeout and common SIGALRM framework

From
Alvaro Herrera
Date:
Boszormenyi Zoltan escribió:
> Hi,
>
> this is the latest one, fixing a bug in the accounting
> of per-statement lock timeout handling and tweaking
> some comments.

Tom, are you able to give this patch some more time on this commitfest?

(If not, I think it would be fair to boot it to CF3; this is final in a
series, there's nothing that depends on it, and there's been good
movement on it; there's plenty of time before the devel cycle closes.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Boszormenyi Zoltan escribi�:
>> this is the latest one, fixing a bug in the accounting
>> of per-statement lock timeout handling and tweaking
>> some comments.

> Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.
        regards, tom lane



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-10-18 20:08 keltezéssel, Tom Lane írta:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Boszormenyi Zoltan escribió:
>>> this is the latest one, fixing a bug in the accounting
>>> of per-statement lock timeout handling and tweaking
>>> some comments.
>> Tom, are you able to give this patch some more time on this commitfest?
> I'm still hoping to get to it, but I've been spending a lot of time on
> bug fixing rather than patch review lately :-(.  If you're hoping to
> close out the current CF soon, maybe we should just slip it to the next
> one.

Fine by me. Thanks.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




Re: [PATCH] lock_timeout and common SIGALRM framework

From
Andres Freund
Date:
On 2012-10-18 22:40:15 +0200, Boszormenyi Zoltan wrote:
> 2012-10-18 20:08 keltezéssel, Tom Lane írta:
> >Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >>Boszormenyi Zoltan escribió:
> >>>this is the latest one, fixing a bug in the accounting
> >>>of per-statement lock timeout handling and tweaking
> >>>some comments.
> >>Tom, are you able to give this patch some more time on this commitfest?
> >I'm still hoping to get to it, but I've been spending a lot of time on
> >bug fixing rather than patch review lately :-(.  If you're hoping to
> >close out the current CF soon, maybe we should just slip it to the next
> >one.
>
> Fine by me. Thanks.

According to this the current state of the patch should be "Ready for
Committer" not "Needs Review" is that right? I changed the state for
now...

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: [PATCH] lock_timeout and common SIGALRM framework

From
Boszormenyi Zoltan
Date:
2012-12-08 15:30 keltezéssel, Andres Freund írta:
> On 2012-10-18 22:40:15 +0200, Boszormenyi Zoltan wrote:
>> 2012-10-18 20:08 keltezéssel, Tom Lane írta:
>>> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>>> Boszormenyi Zoltan escribió:
>>>>> this is the latest one, fixing a bug in the accounting
>>>>> of per-statement lock timeout handling and tweaking
>>>>> some comments.
>>>> Tom, are you able to give this patch some more time on this commitfest?
>>> I'm still hoping to get to it, but I've been spending a lot of time on
>>> bug fixing rather than patch review lately :-(.  If you're hoping to
>>> close out the current CF soon, maybe we should just slip it to the next
>>> one.
>> Fine by me. Thanks.
> According to this the current state of the patch should be "Ready for
> Committer" not "Needs Review" is that right? I changed the state for
> now...

Thanks. I just tried the patch on current GIT HEAD and
gives some offset warnings but no rejects. Also, it compiles
without warnings and still works as it should.
Should I post a new patch that applies cleanly?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




Re: [PATCH] lock_timeout and common SIGALRM framework

From
Tom Lane
Date:
Boszormenyi Zoltan <zb@cybertec.at> writes:
> Thanks. I just tried the patch on current GIT HEAD and
> gives some offset warnings but no rejects. Also, it compiles
> without warnings and still works as it should.
> Should I post a new patch that applies cleanly?

Offsets are not a problem --- if you tried to keep them exact you'd just
be making a lot of unnecessary updates.  If there were fuzz warnings I'd
be more concerned.
        regards, tom lane