Thread: Transaction timeout

Transaction timeout

From
Andrey Borodin
Date:
Hello,

We have statement_timeout, idle_in_transaction_timeout,
idle_session_timeout and many more! But we have no
transaction_timeout. I've skimmed thread [0,1] about existing timeouts
and found no contraindications to implement transaction_timeout.

Nikolay asked me if I can prototype the feature for testing by him,
and it seems straightforward. Please find attached. If it's not known
to be a bad idea - we'll work on it.

Thanks!

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/763A0689-F189-459E-946F-F0EC4458980B%40hotmail.com

Attachment

Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
On Fri, Dec 2, 2022 at 9:18 PM Andrey Borodin <amborodin86@gmail.com> wrote:
Hello,

We have statement_timeout, idle_in_transaction_timeout,
idle_session_timeout and many more! But we have no
transaction_timeout. I've skimmed thread [0,1] about existing timeouts
and found no contraindications to implement transaction_timeout.

Nikolay asked me if I can prototype the feature for testing by him,
and it seems straightforward. Please find attached. If it's not known
to be a bad idea - we'll work on it.

Thanks!! It was a super quick reaction to my proposal Honestly, I was thinking about it for several years, wondering why it's still not implemented.

The reasons to have it should be straightforward – here are a couple of them I can see.

First one. In the OLTP context, we usually have:
- a hard timeout set in application server
- a hard timeout set in HTTP server
- users not willing to wait more than several seconds – and almost never being ready to wait for more than 30-60s.

If Postgres allows longer transactions (it does since we cannot reliably limit their duration now, it's always virtually unlimited), it might be doing the work that nobody is waiting / is needing anymore, speeding resources, affecting health, etc.

Why we cannot limit transaction duration reliably? The existing timeouts (namely, statement_timeout + idle_session_timeout) don't protect from having transactions consisting of a series of small statements and short pauses between them. If such behavior happens (e.g., a long series of fast UPDATEs in a loop). It can be dangerous, affecting general DB health (bloat issues). This is reason number two – DBAs might want to decide to minimize the cases of long transactions, setting transaction limits globally (and allowing to set it locally for particular sessions or for some users in rare cases).

Speaking of the patch – I just tested it (gitpod: https://gitpod.io/#https://gitlab.com/NikolayS/postgres/tree/transaction_timeout); it applies, works as expected for single-statement transactions:

postgres=# set transaction_timeout to '2s';
SET
postgres=# select pg_sleep(3);
ERROR:  canceling transaction due to transaction timeout

But it fails in the "worst" case I've described above – a series of small statements:

postgres=# set transaction_timeout to '2s';
SET
postgres=# begin; select pg_sleep(1); select pg_sleep(1); select pg_sleep(1); select pg_sleep(1); select pg_sleep(1); commit;
BEGIN
 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

COMMIT
postgres=# 

Re: Transaction timeout

From
Andrey Borodin
Date:
On Fri, Dec 2, 2022 at 10:59 PM Nikolay Samokhvalov
<samokhvalov@gmail.com> wrote:
>
> But it fails in the "worst" case I've described above – a series of small statements:

Fixed. Added test for this.

Open questions:
1. Docs
2. Order of reporting if happened lock_timeout, statement_timeout, and
transaction_timeout simultaneously. Currently there's a lot of code
around this...

Thanks!

Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
On Sat, Dec 3, 2022 at 9:41 AM Andrey Borodin <amborodin86@gmail.com> wrote:
Fixed. Added test for this.
 

works as expected.
 

Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

Tested, works as expected;

documentation is not yet added

Re: Transaction timeout

From
Andres Freund
Date:
Hi,

On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote:
> @@ -2720,6 +2723,7 @@ finish_xact_command(void)
>  
>      if (xact_started)
>      {
> +
>          CommitTransactionCommand();
>  
>  #ifdef MEMORY_CONTEXT_CHECKING

Spurious newline added.


> @@ -4460,6 +4473,10 @@ PostgresMain(const char *dbname, const char *username)
>                      enable_timeout_after(IDLE_SESSION_TIMEOUT,
>                                           IdleSessionTimeout);
>                  }
> +
> +
> +                if (get_timeout_active(TRANSACTION_TIMEOUT))
> +                    disable_timeout(TRANSACTION_TIMEOUT, false);
>              }
>  
>              /* Report any recently-changed GUC options */

Too many newlines added.


I'm a bit worried about adding evermore branches and function calls for
the processing of single statements. We already spend a noticable
percentage of the cycles for a single statement in PostgresMain(), this
adds additional overhead.

I'm somewhat inclined to think that we need some redesign here before we
add more overhead.


> @@ -1360,6 +1363,16 @@ IdleInTransactionSessionTimeoutHandler(void)
>      SetLatch(MyLatch);
>  }
>  
> +static void
> +TransactionTimeoutHandler(void)
> +{
> +#ifdef HAVE_SETSID
> +    /* try to signal whole process group */
> +    kill(-MyProcPid, SIGINT);
> +#endif
> +    kill(MyProcPid, SIGINT);
> +}
> +

Why does this use signals instead of just setting the latch like
IdleInTransactionSessionTimeoutHandler() etc?



> diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
> index 0081873a72..5229fe3555 100644
> --- a/src/bin/pg_dump/pg_backup_archiver.c
> +++ b/src/bin/pg_dump/pg_backup_archiver.c
> @@ -3089,6 +3089,7 @@ _doSetFixedOutputState(ArchiveHandle *AH)
>      ahprintf(AH, "SET statement_timeout = 0;\n");
>      ahprintf(AH, "SET lock_timeout = 0;\n");
>      ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
> +    ahprintf(AH, "SET transaction_timeout = 0;\n");

Hm - why is that the right thing to do?



> diff --git a/src/test/isolation/specs/timeouts.spec b/src/test/isolation/specs/timeouts.spec
> index c747b4ae28..a7f27811c7 100644
> --- a/src/test/isolation/specs/timeouts.spec
> +++ b/src/test/isolation/specs/timeouts.spec
> @@ -23,6 +23,9 @@ step sto    { SET statement_timeout = '10ms'; }
>  step lto    { SET lock_timeout = '10ms'; }
>  step lsto    { SET lock_timeout = '10ms'; SET statement_timeout = '10s'; }
>  step slto    { SET lock_timeout = '10s'; SET statement_timeout = '10ms'; }
> +step tto    { SET transaction_timeout = '10ms'; }
> +step sleep0    { SELECT pg_sleep(0.0001) }
> +step sleep10    { SELECT pg_sleep(0.01) }
>  step locktbl    { LOCK TABLE accounts; }
>  step update    { DELETE FROM accounts WHERE accountid = 'checking'; }
>  teardown    { ABORT; }
> @@ -47,3 +50,5 @@ permutation wrtbl lto update(*)
>  permutation wrtbl lsto update(*)
>  # statement timeout expires first, row-level lock
>  permutation wrtbl slto update(*)
> +# transaction timeout
> +permutation tto sleep0 sleep0 sleep10(*)
> \ No newline at end of file

I don't think this is quite sufficient. I think the test should verify
that transaction timeout interacts correctly with statement timeout /
idle in tx timeout.

Greetings,

Andres Freund



Re: Transaction timeout

From
Andrey Borodin
Date:
Thanks for looking into this Andres!

On Mon, Dec 5, 2022 at 3:07 PM Andres Freund <andres@anarazel.de> wrote:
>
> I'm a bit worried about adding evermore branches and function calls for
> the processing of single statements. We already spend a noticable
> percentage of the cycles for a single statement in PostgresMain(), this
> adds additional overhead.
>
> I'm somewhat inclined to think that we need some redesign here before we
> add more overhead.
>
We can cap statement_timeout\idle_session_timeout by the budget of
transaction_timeout left.
Either way we can do batch function enable_timeouts() instead
enable_timeout_after().

Does anything of it make sense?

>
> > @@ -1360,6 +1363,16 @@ IdleInTransactionSessionTimeoutHandler(void)
> >       SetLatch(MyLatch);
> >  }
> >
> > +static void
> > +TransactionTimeoutHandler(void)
> > +{
> > +#ifdef HAVE_SETSID
> > +     /* try to signal whole process group */
> > +     kill(-MyProcPid, SIGINT);
> > +#endif
> > +     kill(MyProcPid, SIGINT);
> > +}
> > +
>
> Why does this use signals instead of just setting the latch like
> IdleInTransactionSessionTimeoutHandler() etc?

I just copied statement_timeout behaviour. As I understand this
implementation is prefered if the timeout can catch the backend
running at full steam.

> > diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
> > index 0081873a72..5229fe3555 100644
> > --- a/src/bin/pg_dump/pg_backup_archiver.c
> > +++ b/src/bin/pg_dump/pg_backup_archiver.c
> > @@ -3089,6 +3089,7 @@ _doSetFixedOutputState(ArchiveHandle *AH)
> >       ahprintf(AH, "SET statement_timeout = 0;\n");
> >       ahprintf(AH, "SET lock_timeout = 0;\n");
> >       ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
> > +     ahprintf(AH, "SET transaction_timeout = 0;\n");
>
> Hm - why is that the right thing to do?
Because transaction_timeout has effects of statement_timeout.

Thank you!

Best regards, Andrey Borodin.



Re: Transaction timeout

From
Andres Freund
Date:
Hi,

On 2022-12-05 15:41:29 -0800, Andrey Borodin wrote:
> Thanks for looking into this Andres!
>
> On Mon, Dec 5, 2022 at 3:07 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > I'm a bit worried about adding evermore branches and function calls for
> > the processing of single statements. We already spend a noticable
> > percentage of the cycles for a single statement in PostgresMain(), this
> > adds additional overhead.
> >
> > I'm somewhat inclined to think that we need some redesign here before we
> > add more overhead.
> >
> We can cap statement_timeout\idle_session_timeout by the budget of
> transaction_timeout left.

I don't know what you mean by that.


> @@ -3277,6 +3282,7 @@ ProcessInterrupts(void)
>           */
>          lock_timeout_occurred = get_timeout_indicator(LOCK_TIMEOUT, true);
>          stmt_timeout_occurred = get_timeout_indicator(STATEMENT_TIMEOUT, true);
> +        tx_timeout_occurred = get_timeout_indicator(TRANSACTION_TIMEOUT, true);
>  
>          /*
>           * If both were set, we want to report whichever timeout completed

This doesn't update the preceding comment, btw, which now reads oddly:

        /*
         * If LOCK_TIMEOUT and STATEMENT_TIMEOUT indicators are both set, we
         * need to clear both, so always fetch both.
         */



> > > @@ -1360,6 +1363,16 @@ IdleInTransactionSessionTimeoutHandler(void)
> > >       SetLatch(MyLatch);
> > >  }
> > >
> > > +static void
> > > +TransactionTimeoutHandler(void)
> > > +{
> > > +#ifdef HAVE_SETSID
> > > +     /* try to signal whole process group */
> > > +     kill(-MyProcPid, SIGINT);
> > > +#endif
> > > +     kill(MyProcPid, SIGINT);
> > > +}
> > > +
> >
> > Why does this use signals instead of just setting the latch like
> > IdleInTransactionSessionTimeoutHandler() etc?
>
> I just copied statement_timeout behaviour. As I understand this
> implementation is prefered if the timeout can catch the backend
> running at full steam.

Hm. I'm not particularly convinced by that code. Be that as it may, I
don't think it's a good idea to have one more copy of this code. At
least the patch should wrap the signalling code in a helper.


FWIW, the HAVE_SETSID code originates in:

commit 3ad0728c817bf8abd2c76bd11d856967509b307c
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   2006-11-21 20:59:53 +0000

    On systems that have setsid(2) (which should be just about everything except
    Windows), arrange for each postmaster child process to be its own process
    group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
    process group not only the direct child process.  This provides saner behavior
    for archive and recovery scripts; in particular, it's possible to shut down a
    warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
    of SIGQUIT to the startup subprocess will result in killing the waiting
    recovery_command.  Also, this makes Query Cancel and statement_timeout apply
    to scripts being run from backends via system().  (There is no support in the
    core backend for that, but it's widely done using untrusted PLs.)  Per gripe
    from Stephen Harris and subsequent discussion.



> > > diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
> > > index 0081873a72..5229fe3555 100644
> > > --- a/src/bin/pg_dump/pg_backup_archiver.c
> > > +++ b/src/bin/pg_dump/pg_backup_archiver.c
> > > @@ -3089,6 +3089,7 @@ _doSetFixedOutputState(ArchiveHandle *AH)
> > >       ahprintf(AH, "SET statement_timeout = 0;\n");
> > >       ahprintf(AH, "SET lock_timeout = 0;\n");
> > >       ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
> > > +     ahprintf(AH, "SET transaction_timeout = 0;\n");
> >
> > Hm - why is that the right thing to do?
> Because transaction_timeout has effects of statement_timeout.

I guess it's just following precedent - but it seems a bit presumptuous
to just disable safety settings a DBA might have set up. That makes some
sense for e.g. idle_in_transaction_session_timeout, because I think
e.g. parallel backup can lead to a connection being idle for a bit.


A few more review comments:

> Either way we can do batch function enable_timeouts() instead
> enable_timeout_after().

> Does anything of it make sense?

I'm at least as worried about the various calls *after* the execution of
a statement.


> +        if (tx_timeout_occurred)
> +        {
> +            LockErrorCleanup();
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_TRANSACTION_TIMEOUT),
> +                     errmsg("canceling transaction due to transaction timeout")));
> +        }

The number of calls to LockErrorCleanup() here feels wrong - there's
already 8 calls in ProcessInterrupts(). Besides the code duplication I
also think it's not a sane idea to rely on having LockErrorCleanup()
before all the relevant ereport(ERROR)s.

Greetings,

Andres Freund



Re: Transaction timeout

From
Kyotaro Horiguchi
Date:
At Mon, 5 Dec 2022 15:07:47 -0800, Andres Freund <andres@anarazel.de> wrote in 
> I'm a bit worried about adding evermore branches and function calls for
> the processing of single statements. We already spend a noticable
> percentage of the cycles for a single statement in PostgresMain(), this
> adds additional overhead.
> 
> I'm somewhat inclined to think that we need some redesign here before we
> add more overhead.

insert_timeout() and remove_timeout_index() move 40*(# of several
timeouts) bytes at every enabling/disabling a timeout. This is far
frequent than actually any timeout fires. schedule_alarm() is
interested only in the nearest timeout.

So, we can get rid of the insertion sort in
insert_timeout/remove_timeout_index then let them just search for the
nearest one and remember it. Finding the nearest should be faster than
the insertion sort. Instead we need to scan over the all timeouts
instead of the a few first ones, but that's overhead is not a matter
when a timeout fires.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Transaction timeout

From
Andres Freund
Date:
Hi,

On 2022-12-06 09:44:01 +0900, Kyotaro Horiguchi wrote:
> At Mon, 5 Dec 2022 15:07:47 -0800, Andres Freund <andres@anarazel.de> wrote in
> > I'm a bit worried about adding evermore branches and function calls for
> > the processing of single statements. We already spend a noticable
> > percentage of the cycles for a single statement in PostgresMain(), this
> > adds additional overhead.
> >
> > I'm somewhat inclined to think that we need some redesign here before we
> > add more overhead.
>
> insert_timeout() and remove_timeout_index() move 40*(# of several
> timeouts) bytes at every enabling/disabling a timeout. This is far
> frequent than actually any timeout fires. schedule_alarm() is
> interested only in the nearest timeout.

> So, we can get rid of the insertion sort in
> insert_timeout/remove_timeout_index then let them just search for the
> nearest one and remember it. Finding the nearest should be faster than
> the insertion sort. Instead we need to scan over the all timeouts
> instead of the a few first ones, but that's overhead is not a matter
> when a timeout fires.

I'm most concerned about the overhead when the timeouts are *not*
enabled. And this adds a branch to start_xact_command() and a function
call for get_timeout_active(TRANSACTION_TIMEOUT) in that case. On its
own, that's not a whole lot, but it does add up. There's 10+ function
calls for timeout and ps_display purposes for every single statement.

But it's definitely also worth optimizing the timeout enabled paths. And
you're right, it looks like there's a fair bit of optimization
potential.

Greetings,

Andres Freund



Re: Transaction timeout

From
Kyotaro Horiguchi
Date:
At Mon, 5 Dec 2022 17:10:50 -0800, Andres Freund <andres@anarazel.de> wrote in 
> I'm most concerned about the overhead when the timeouts are *not*
> enabled. And this adds a branch to start_xact_command() and a function
> call for get_timeout_active(TRANSACTION_TIMEOUT) in that case. On its
> own, that's not a whole lot, but it does add up. There's 10+ function
> calls for timeout and ps_display purposes for every single statement.

That path seems like existing just for robustness. I inserted
"Assert(0)" just before the disable_timeout(), but make check-world
didn't fail [1]. Couldn't we get rid of that path, adding an assertion
instead?  I'm not sure about other timeouts yet, though.

About disabling side, we cannot rely on StatementTimeout.

[1]
# 032_apply_delay.pl fails for me so I don't know any of the later
# tests fails.

> But it's definitely also worth optimizing the timeout enabled paths. And
> you're right, it looks like there's a fair bit of optimization
> potential.

Thanks. I'll work on that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Transaction timeout

From
Andres Freund
Date:
Hi,

On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote:
> Fixed. Added test for this.

The tests don't pass: https://cirrus-ci.com/build/4811553145356288

[00:54:35.337](1.251s) not ok 1 - no parameters missing from postgresql.conf.sample
[00:54:35.338](0.000s) #   Failed test 'no parameters missing from postgresql.conf.sample'
#   at /tmp/cirrus-ci-build/src/test/modules/test_misc/t/003_check_guc.pl line 81.
[00:54:35.338](0.000s) #          got: '1'
#     expected: '0'


I am just looking through a bunch of failing CF entries, so I'm perhaps
over-sensitized right now. But I'm a bit confused why there's so many
occasions of the tests clearly not having been run...


Michael, any reason 003_check_guc doesn't show the missing GUCs? It's not
particularly helpful to see "0 is different from 1". Seems that even just
something like
  is_deeply(\@missing_from_file, [], "no parameters missing from postgresql.conf.sample");
would be a decent improvement?

Greetings,

Andres Freund



Re: Transaction timeout

From
Andrey Borodin
Date:
On Wed, Dec 7, 2022 at 10:23 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote:
> > Fixed. Added test for this.
>
> The tests don't pass: https://cirrus-ci.com/build/4811553145356288
>
oops, sorry. Here's the fix. I hope to address other feedback on the
weekend. Thank you!

Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Andrey Borodin
Date:
On Wed, Dec 7, 2022 at 1:30 PM Andrey Borodin <amborodin86@gmail.com> wrote:
> I hope to address other feedback on the weekend.

Andres, here's my progress on working with your review notes.

> > @@ -3277,6 +3282,7 @@ ProcessInterrupts(void)
> >                */
> >               lock_timeout_occurred = get_timeout_indicator(LOCK_TIMEOUT, true);
> >               stmt_timeout_occurred = get_timeout_indicator(STATEMENT_TIMEOUT, true);
> > +             tx_timeout_occurred = get_timeout_indicator(TRANSACTION_TIMEOUT, true);
> >
> >               /*
> >                * If both were set, we want to report whichever timeout completed
>
> This doesn't update the preceding comment, btw, which now reads oddly:

I've rewritten this part to correctly report all timeouts that did
happen. However there's now a tricky comma-formatting code which was
tested only manually.

> > > > @@ -1360,6 +1363,16 @@ IdleInTransactionSessionTimeoutHandler(void)
> > > >       SetLatch(MyLatch);
> > > >  }
> > > >
> > > > +static void
> > > > +TransactionTimeoutHandler(void)
> > > > +{
> > > > +#ifdef HAVE_SETSID
> > > > +     /* try to signal whole process group */
> > > > +     kill(-MyProcPid, SIGINT);
> > > > +#endif
> > > > +     kill(MyProcPid, SIGINT);
> > > > +}
> > > > +
> > >
> > > Why does this use signals instead of just setting the latch like
> > > IdleInTransactionSessionTimeoutHandler() etc?
> >
> > I just copied statement_timeout behaviour. As I understand this
> > implementation is prefered if the timeout can catch the backend
> > running at full steam.
>
> Hm. I'm not particularly convinced by that code. Be that as it may, I
> don't think it's a good idea to have one more copy of this code. At
> least the patch should wrap the signalling code in a helper.

Done, now there is a single CancelOnTimeoutHandler() handler.

> > > > diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
> > > > index 0081873a72..5229fe3555 100644
> > > > --- a/src/bin/pg_dump/pg_backup_archiver.c
> > > > +++ b/src/bin/pg_dump/pg_backup_archiver.c
> > > > @@ -3089,6 +3089,7 @@ _doSetFixedOutputState(ArchiveHandle *AH)
> > > >       ahprintf(AH, "SET statement_timeout = 0;\n");
> > > >       ahprintf(AH, "SET lock_timeout = 0;\n");
> > > >       ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
> > > > +     ahprintf(AH, "SET transaction_timeout = 0;\n");
> > >
> > > Hm - why is that the right thing to do?
> > Because transaction_timeout has effects of statement_timeout.
>
> I guess it's just following precedent - but it seems a bit presumptuous
> to just disable safety settings a DBA might have set up. That makes some
> sense for e.g. idle_in_transaction_session_timeout, because I think
> e.g. parallel backup can lead to a connection being idle for a bit.

I do not know. My reasoning - everywhere we turn off
statement_timeout, we should turn off transaction_timeout too.
But I have no strong opinion here. I left this code as is in the patch
so far. For the same reason I did not change anything in
pg_backup_archiver.c.

> > Either way we can do batch function enable_timeouts() instead
> > enable_timeout_after().
>
> > Does anything of it make sense?
>
> I'm at least as worried about the various calls *after* the execution of
> a statement.

I think this code is just a one bit check
if (get_timeout_active(TRANSACTION_TIMEOUT))
inside of get_timeout_active(). With all 14 timeouts we have, I don't
see a good way to optimize stuff so far.

> > +             if (tx_timeout_occurred)
> > +             {
> > +                     LockErrorCleanup();
> > +                     ereport(ERROR,
> > +                                     (errcode(ERRCODE_TRANSACTION_TIMEOUT),
> > +                                      errmsg("canceling transaction due to transaction timeout")));
> > +             }
>
> The number of calls to LockErrorCleanup() here feels wrong - there's
> already 8 calls in ProcessInterrupts(). Besides the code duplication I
> also think it's not a sane idea to rely on having LockErrorCleanup()
> before all the relevant ereport(ERROR)s.

I've refactored that code down to 7 calls of LockErrorCleanup() :)
Logic behind various branches is not clear for me, e.g. why we do not
LockErrorCleanup() when reading commands from a client?
So I did not risk refactoring further.

> I think the test should verify
> that transaction timeout interacts correctly with statement timeout /
> idle in tx timeout.

I've added tests that check statement_timeout vs transaction_timeout.
However I could not produce stable tests with
idle_in_transaction_timeout vs transaction_timeout so far. But I'll
look into this more.
Actually, stabilizing statement_timeout vs transaction_timeout was
tricky on Windows too. I had to remove the second call to
pg_sleep(0.0001) because it was triggering 10ьs timeout from time to
time. Also, test timeout was increased to 30ms, because unlike others
in spec it's not supposed to happen at the very first SQL statement.

Thank you!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Nathan Bossart
Date:
On Sun, Dec 18, 2022 at 12:53:31PM -0800, Andrey Borodin wrote:
> I've rewritten this part to correctly report all timeouts that did
> happen. However there's now a tricky comma-formatting code which was
> tested only manually.

I suspect this will make translation difficult.

>> > > > +     ahprintf(AH, "SET transaction_timeout = 0;\n");
>> > >
>> > > Hm - why is that the right thing to do?
>> > Because transaction_timeout has effects of statement_timeout.
>>
>> I guess it's just following precedent - but it seems a bit presumptuous
>> to just disable safety settings a DBA might have set up. That makes some
>> sense for e.g. idle_in_transaction_session_timeout, because I think
>> e.g. parallel backup can lead to a connection being idle for a bit.
> 
> I do not know. My reasoning - everywhere we turn off
> statement_timeout, we should turn off transaction_timeout too.
> But I have no strong opinion here. I left this code as is in the patch
> so far. For the same reason I did not change anything in
> pg_backup_archiver.c.

From 8383486's commit message:

    We disable statement_timeout and lock_timeout during dump and restore,
    to prevent any global settings that might exist from breaking routine
    backups.

I imagine changing this could disrupt existing servers that depend on these
overrides during backups, although I think Andres has a good point about
disabling safety settings.  This might be a good topic for another thread.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Transaction timeout

From
Andrey Borodin
Date:
On Thu, Jan 12, 2023 at 11:24 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Sun, Dec 18, 2022 at 12:53:31PM -0800, Andrey Borodin wrote:
> > I've rewritten this part to correctly report all timeouts that did
> > happen. However there's now a tricky comma-formatting code which was
> > tested only manually.
>
> I suspect this will make translation difficult.
I use special functions for this like _()

char* lock_reason = lock_timeout_occurred ? _("lock timeout") : "";

and then
ereport(ERROR, (errcode(err_code),
 errmsg("canceling statement due to %s%s%s%s%s", lock_reason, comma1,
 stmt_reason, comma2, tx_reason)));

I hope it will be translatable...

> >> > > > +     ahprintf(AH, "SET transaction_timeout = 0;\n");
> >> > >
> >> > > Hm - why is that the right thing to do?
> >> > Because transaction_timeout has effects of statement_timeout.
> >>
> >> I guess it's just following precedent - but it seems a bit presumptuous
> >> to just disable safety settings a DBA might have set up. That makes some
> >> sense for e.g. idle_in_transaction_session_timeout, because I think
> >> e.g. parallel backup can lead to a connection being idle for a bit.
> >
> > I do not know. My reasoning - everywhere we turn off
> > statement_timeout, we should turn off transaction_timeout too.
> > But I have no strong opinion here. I left this code as is in the patch
> > so far. For the same reason I did not change anything in
> > pg_backup_archiver.c.
>
> From 8383486's commit message:
>
>         We disable statement_timeout and lock_timeout during dump and restore,
>         to prevent any global settings that might exist from breaking routine
>         backups.
>
> I imagine changing this could disrupt existing servers that depend on these
> overrides during backups, although I think Andres has a good point about
> disabling safety settings.  This might be a good topic for another thread.
>
+1.

Thanks for the review!

Best regards, Andrey Borodin.



Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
On Thu, Jan 12, 2023 at 11:47 AM Andrey Borodin <amborodin86@gmail.com> wrote:
On Thu, Jan 12, 2023 at 11:24 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Sun, Dec 18, 2022 at 12:53:31PM -0800, Andrey Borodin wrote:
> > I've rewritten this part to correctly report all timeouts that did
> > happen. However there's now a tricky comma-formatting code which was
> > tested only manually.

Testing it again, a couple of questions

1) The current test set has only 2 simple cases – I'd suggest adding one more (that one that didn't work in v1):

gitpod=# set transaction_timeout to '20ms';
SET
gitpod=# begin; select pg_sleep(.01); select pg_sleep(.01); select pg_sleep(.01); commit;
BEGIN
 pg_sleep
----------
 
(1 row)

ERROR:  canceling statement due to transaction timeout


gitpod=# set statement_timeout to '20ms'; set transaction_timeout to 0; -- to test value for statement_timeout and see that it doesn't fail
SET
SET
gitpod=# begin; select pg_sleep(.01); select pg_sleep(.01); select pg_sleep(.01); commit;
BEGIN
 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

 pg_sleep
----------
 
(1 row)

COMMIT


2) Testing for a longer transaction (2 min), in a gitpod VM (everything is local, no network involved)

// not sure what's happening here, maybe some overheads that are not related to the implementation,
// but the goal was to see how precise the limiting is for longer transactions

gitpod=# set transaction_timeout to '2min';
SET
gitpod=# begin;
BEGIN
gitpod=*# select now(), clock_timestamp(), pg_sleep(3) \watch 1
                Fri 13 Jan 2023 03:49:24 PM UTC (every 1s)

              now              |        clock_timestamp        | pg_sleep
-------------------------------+-------------------------------+----------
 2023-01-13 15:49:22.906924+00 | 2023-01-13 15:49:24.088728+00 |
(1 row)

[...]

                Fri 13 Jan 2023 03:51:18 PM UTC (every 1s)

              now              |        clock_timestamp        | pg_sleep
-------------------------------+-------------------------------+----------
 2023-01-13 15:49:22.906924+00 | 2023-01-13 15:51:18.179579+00 |
(1 row)

ERROR:  canceling statement due to transaction timeout

gitpod=!#
gitpod=!# rollback;
ROLLBACK
gitpod=# select timestamptz '2023-01-13 15:51:18.179579+00' - '2023-01-13 15:49:22.906924+00';
    ?column?    
-----------------
 00:01:55.272655
(1 row)

gitpod=# select interval '2min' - '00:01:55.272655';
    ?column?    
-----------------
 00:00:04.727345
(1 row)

gitpod=# select interval '2min' - '00:01:55.272655' - '4s';
    ?column?    
-----------------
 00:00:00.727345
(1 row)

– it seems we could (should) have one more successful "1s wait, 3s sleep" iteration here, ~727ms somehow wasted in a loop, quite a lot.

Re: Transaction timeout

From
Andrey Borodin
Date:
Thanks for the review Nikolay!

On Fri, Jan 13, 2023 at 8:03 AM Nikolay Samokhvalov
<samokhvalov@gmail.com> wrote:
>
> 1) The current test set has only 2 simple cases – I'd suggest adding one more (that one that didn't work in v1):
>
> gitpod=# set transaction_timeout to '20ms';
> SET
> gitpod=# begin; select pg_sleep(.01); select pg_sleep(.01); select pg_sleep(.01); commit;
I tried exactly these tests - tests were unstable on Windows. Maybe
that OS has a more coarse-grained timer resolution.
It's a tradeoff between time spent on tests, strength of the test and
probability of false failure. I chose small time without false alarms.

> – it seems we could (should) have one more successful "1s wait, 3s sleep" iteration here, ~727ms somehow wasted in a
loop,quite a lot. 

I think big chunk from these 727ms were spent between "BEGIN" and
"select now(), clock_timestamp(), pg_sleep(3) \watch 1". I doubt patch
really contains arithmetic errors.

Many thanks for looking into this!


Best regards, Andrey Borodin.



Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
On Fri, Jan 13, 2023 at 10:16 AM Andrey Borodin <amborodin86@gmail.com> wrote:
> – it seems we could (should) have one more successful "1s wait, 3s sleep" iteration here, ~727ms somehow wasted in a loop, quite a lot.

I think big chunk from these 727ms were spent between "BEGIN" and
"select now(), clock_timestamp(), pg_sleep(3) \watch 1". 

Not really – there was indeed ~2s delay between BEGIN and the first pg_sleep query, but those ~727ms is something else.
 
here we measure the remainder between the beginning of the transaction measured by "now()' and the the beginning of the last successful pg_sleep() query:

gitpod=# select timestamptz '2023-01-13 15:51:18.179579+00' - '2023-01-13 15:49:22.906924+00';
    ?column?    
-----------------
 00:01:55.272655
(1 row)

It already includes all delays that we had from the beginning of our transaction.

The problem with my question was that I didn't take into attention that '2023-01-13 15:51:18.179579+00' is when the last successful query *started*. So the remainder of our 2-min quota – 00:00:04.727345 – includes the last successful loop (3s of successful query + 1s of waiting), and then we have failed after ~700ms.

In other words, there are no issues here, all good.

> Many thanks for looking into this!

many thanks for implementing it

Re: Transaction timeout

From
Peter Eisentraut
Date:
On 12.01.23 20:46, Andrey Borodin wrote:
>> On Sun, Dec 18, 2022 at 12:53:31PM -0800, Andrey Borodin wrote:
>>> I've rewritten this part to correctly report all timeouts that did
>>> happen. However there's now a tricky comma-formatting code which was
>>> tested only manually.
>> I suspect this will make translation difficult.
> I use special functions for this like _()
> 
> char* lock_reason = lock_timeout_occurred ? _("lock timeout") : "";
> 
> and then
> ereport(ERROR, (errcode(err_code),
>   errmsg("canceling statement due to %s%s%s%s%s", lock_reason, comma1,
>   stmt_reason, comma2, tx_reason)));
> 
> I hope it will be translatable...

No, you can't do that.  You have to write out all the strings separately.



Re: Transaction timeout

From
Fujii Masao
Date:

On 2022/12/19 5:53, Andrey Borodin wrote:
> On Wed, Dec 7, 2022 at 1:30 PM Andrey Borodin <amborodin86@gmail.com> wrote:
>> I hope to address other feedback on the weekend.

Thanks for implementing this feature!

While testing v4 patch, I noticed it doesn't handle the COMMIT AND CHAIN case correctly.
When COMMIT AND CHAIN is executed, I believe the transaction timeout counter should reset
and start from zero with the next transaction. However, it appears that the current
v4 patch doesn't reset the counter in this scenario. Can you confirm this?

With the v4 patch, I found that timeout errors no longer occur during the idle in
transaction phase. Instead, they occur when the next statement is executed. Is this
the intended behavior? I thought some users might want to use the transaction timeout
feature to prevent prolonged transactions and promptly release resources (e.g., locks)
in case of a timeout, similar to idle_in_transaction_session_timeout.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:
Thanks for looking into this!

> On 6 Sep 2023, at 13:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> While testing v4 patch, I noticed it doesn't handle the COMMIT AND CHAIN case correctly.
> When COMMIT AND CHAIN is executed, I believe the transaction timeout counter should reset
> and start from zero with the next transaction. However, it appears that the current
> v4 patch doesn't reset the counter in this scenario. Can you confirm this?
Yes, I was not aware of this feature. I'll test and fix this.

> With the v4 patch, I found that timeout errors no longer occur during the idle in
> transaction phase. Instead, they occur when the next statement is executed. Is this
> the intended behavior?
AFAIR I had been testing that behaviour of "idle in transaction" was intact. I'll check that again.

> I thought some users might want to use the transaction timeout
> feature to prevent prolonged transactions and promptly release resources (e.g., locks)
> in case of a timeout, similar to idle_in_transaction_session_timeout.
Yes, this is exactly how I was expecting the feature to behave: empty up max_connections slots for long-hanging
transactions.

Thanks for your findings, I'll check and post new version!


Best regards, Andrey Borodin.


Re: Transaction timeout

From
bt23nguyent
Date:
On 2023-09-06 20:32, Andrey M. Borodin wrote:
> Thanks for looking into this!
> 
>> On 6 Sep 2023, at 13:16, Fujii Masao <masao.fujii@oss.nttdata.com> 
>> wrote:
>> 
>> While testing v4 patch, I noticed it doesn't handle the COMMIT AND 
>> CHAIN case correctly.
>> When COMMIT AND CHAIN is executed, I believe the transaction timeout 
>> counter should reset
>> and start from zero with the next transaction. However, it appears 
>> that the current
>> v4 patch doesn't reset the counter in this scenario. Can you confirm 
>> this?
> Yes, I was not aware of this feature. I'll test and fix this.
> 
>> With the v4 patch, I found that timeout errors no longer occur during 
>> the idle in
>> transaction phase. Instead, they occur when the next statement is 
>> executed. Is this
>> the intended behavior?
> AFAIR I had been testing that behaviour of "idle in transaction" was
> intact. I'll check that again.
> 
>> I thought some users might want to use the transaction timeout
>> feature to prevent prolonged transactions and promptly release 
>> resources (e.g., locks)
>> in case of a timeout, similar to idle_in_transaction_session_timeout.
> Yes, this is exactly how I was expecting the feature to behave: empty
> up max_connections slots for long-hanging transactions.
> 
> Thanks for your findings, I'll check and post new version!
> 
> 
> Best regards, Andrey Borodin.
Hi,

Thank you for implementing this nice feature!
I tested the v4 patch in the interactive transaction mode with 3 
following cases:

1. Start a transaction with transaction_timeout=0 (i.e., timeout 
disabled), and then change the timeout value to more than 0 during the 
transaction.

=# SET transaction_timeout TO 0;
=# BEGIN;    //timeout is not enabled
=# SELECT pg_sleep(5);
=# SET transaction_timeout TO '1s';
=# SELECT pg_sleep(10);    //timeout is enabled with 1s
In this case, the transaction timeout happens during pg_sleep(10).

2. Start a transaction with transaction_timeout>0 (i.e., timeout 
enabled), and then change the timeout value to more than 0 during the 
transaction.

=# SET transaction_timeout TO '1000s';
=# BEGIN;    //timeout is enabled with 1000s
=# SELECT pg_sleep(5);
=# SET transaction_timeout TO '1s';
=# SELECT pg_sleep(10);    //timeout is not restarted and still running 
with 1000s
In this case, the transaction timeout does NOT happen during 
pg_sleep(10).

3. Start a transaction with transaction_timeout>0 (i.e., timeout 
enabled), and then change the timeout value to 0 during the transaction.

=# SET transaction_timeout TO '10s';
=# BEGIN;    //timeout is enabled with 10s
=# SELECT pg_sleep(5);
=# SET transaction_timeout TO 0;
=# SELECT pg_sleep(10);    //timeout is NOT disabled and still running 
with 10s
In this case, the transaction timeout happens during pg_sleep(10).

The first case where transaction_timeout is disabled before the 
transaction begins is totally fine. However, in the second and third 
cases, where transaction_timeout is enabled before the transaction 
begins, since the timeout has already enabled with a certain value, it 
will not be enabled again with a new setting value.

Furthermore, let's say I want to set a transaction_timeout value for all 
transactions in postgresql.conf file so it would affect all sessions. 
The same behavior happened but for all 3 cases, here is one example with 
the second case:

=# BEGIN; SHOW transaction_timeout; select pg_sleep(10); SHOW 
transaction_timeout; COMMIT;
BEGIN
  transaction_timeout
---------------------
  15s
(1 row)

2023-09-07 11:52:50.510 JST [23889] LOG:  received SIGHUP, reloading 
configuration files
2023-09-07 11:52:50.510 JST [23889] LOG:  parameter 
"transaction_timeout" changed to "5000"
  pg_sleep
----------

(1 row)

  transaction_timeout
---------------------
  5s
(1 row)

COMMIT

I am of the opinion that these behaviors might lead to confusion among 
users. Could you confirm if these are the intended behaviors?

Additionally, I think the short description should be "Sets the maximum 
allowed time to commit a transaction." or "Sets the maximum allowed time 
to wait before aborting a transaction." so that it could be more clear 
and consistent with other %_timeout descriptions.

Also, there is a small whitespace error here:
src/backend/tcop/postgres.c:3373: space before tab in indent.
+                                                               
stmt_reason, comma2, tx_reason)));

On a side note, while testing the patch with pgbench, it came to my 
attention that in scenarios involving the execution of multiple 
concurrent transactions within a high contention environment and with 
relatively short timeout durations, there is a potential for cascading 
blocking. This phenomenon can lead to multiple transactions exceeding 
their designated timeouts, consequently resulting in a degradation of 
transaction processing performance. No?
Do you think this feature should be co-implemented with the existing 
concurrency control protocol to maintain the transaction performance 
(e.g. a transaction scheduling mechanism based on transaction timeout)?

Regards,
Tung Nguyen



Re: Transaction timeout

From
Nikolay Samokhvalov
Date:
On Wed, Sep 6, 2023 at 1:16 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> With the v4 patch, I found that timeout errors no longer occur during the idle in
> transaction phase. Instead, they occur when the next statement is executed. Is this
> the intended behavior? I thought some users might want to use the transaction timeout
> feature to prevent prolonged transactions and promptly release resources (e.g., locks)
> in case of a timeout, similar to idle_in_transaction_session_timeout.

I agree – it seems reasonable to interrupt transaction immediately
when the timeout occurs. This was the idea – to determine the maximum
possible time for all transactions that is allowed on a server, to
avoid too long-lasting locking and not progressing xmin horizon.

That being said, I also think this wording in the docs:

+        Setting <varname>transaction_timeout</varname> in
+        <filename>postgresql.conf</filename> is not recommended
because it would
+        affect all sessions.

It was inherited from statement_timeout, where I also find this
wording too one-sided. There are certain situations where we do want
global setting to be set – actually, any large OLTP case (to be on
lower risk side; those users who need longer timeout, can set it when
needed, but by default we do need very restrictive timeouts, usually <
1 minute, like we do in HTTP or application servers). I propose this:
> Setting transaction_timeout in postgresql.conf should be done with caution because it affects all sessions.

Looking at the v4 of the patch, a couple of more comments that might
be helpful for v5 (which is planned, as I understand):

1)  it might be beneficial to add tests for more complex scenarios,
e.g., subtransactions

2) In the error message:

+ errmsg("canceling statement due to %s%s%s%s%s", lock_reason, comma1,
+ stmt_reason, comma2, tx_reason)));

– it seems we can have excessive commas here

3) Perhaps, we should say that we cancel the transaction, not
statement (especially in the case when it is happening in the
idle-in-transaction state).

Thanks for working on this feature!



Re: Transaction timeout

From
邱宇航
Date:
I test the V4 patch and found the backend does't process SIGINT while it's in secure_read.
And it seems not a good choice to report ERROR during secure_read, which will turns into
FATAL "terminating connection because protocol synchronization was lost".

It might be much easier to terminate the backend rather than cancel the backend just like
idle_in_transaction_session_timeout and idle_session_timeout did. But the name of the GUC
might be transaction_session_timeout.

And what about 2PC transaction? The hanging 2PC transaction also hurts server a lot. It’s
active transaction but not active backend. Can we cancel the 2PC transaction and how we
cancel it.

--
Yuhang Qiu




Re: Transaction timeout

From
"Andrey M. Borodin"
Date:
> On 20 Nov 2023, at 06:33, 邱宇航 <iamqyh@gmail.com> wrote:

Nikolay, Peter, Fujii, Tung, Yuhang, thank you for reviewing this.
I'll address feedback soon, this patch has been for a long time on my TODO list.
I've started with fixing problem of COMMIT AND CHAIN by restarting timeout counter.
Tomorrow I plan to fix raising of the timeout when the transaction is idle.
Renaming transaction_timeout to something else (to avoid confusion with prepared xacts) also seems correct to me.

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 30 Nov 2023, at 20:06, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
> Tomorrow I plan to fix raising of the timeout when the transaction is idle.
> Renaming transaction_timeout to something else (to avoid confusion with prepared xacts) also seems correct to me.


Here's a v6 version of the feature. Changes:
1. Now transaction_timeout will break connection with FATAL instead of hanging in "idle in transaction (aborted)"
2. It will kill equally idle and active transactions
3. New isolation tests are slightly more complex: isolation tester does not like when the connection is forcibly
killed,thus there must be only 1 permutation with killed connection. 

TODO: as Yuhang pointed out prepared transactions must not be killed, thus name "transaction_timeout" is not correct. I
thinkthe name must be like "session_transaction_timeout", but I'd like to have an opinion of someone more experienced
ingiving names to GUCs than me. Or, perhaps, a native speaker? 


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Wed, 06 Dec 2023 at 21:05, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 30 Nov 2023, at 20:06, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>>
>> Tomorrow I plan to fix raising of the timeout when the transaction is idle.
>> Renaming transaction_timeout to something else (to avoid confusion with prepared xacts) also seems correct to me.
>
>
> Here's a v6 version of the feature. Changes:
> 1. Now transaction_timeout will break connection with FATAL instead of hanging in "idle in transaction (aborted)"
> 2. It will kill equally idle and active transactions
> 3. New isolation tests are slightly more complex: isolation tester does not like when the connection is forcibly
killed,thus there must be only 1 permutation with killed connection.
 
>

Greate. If idle_in_transaction_timeout is bigger than transaction_timeout,
the idle-in-transaction timeout don't needed, right?

> TODO: as Yuhang pointed out prepared transactions must not be killed, thus name "transaction_timeout" is not correct.
Ithink the name must be like "session_transaction_timeout", but I'd like to have an opinion of someone more experienced
ingiving names to GUCs than me. Or, perhaps, a native speaker?
 
>
How about transaction_session_timeout? Similar to idle_session_timeout.

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
邱宇航
Date:
Hi,

I read the V6 patch and found something needs to be improved.

Prepared transactions should also be documented.
         A value of zero (the default) disables the timeout.
+        This timeout is not applied to prepared transactions. Only transactions
+        with user connections are affected.

Missing 'time'.
-                       gettext_noop("Sets the maximum allowed in a transaction."),
+                       gettext_noop("Sets the maximum allowed time in a transaction."),

16 is already released. It's 17 now.
-       if (AH->remoteVersion >= 160000)
+       if (AH->remoteVersion >= 170000)
                ExecuteSqlStatement(AH, "SET transaction_timeout = 0");

And I test the V6 patch and it works as expected.

--
Yuhang Qiu

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:
Thanks Yuhang!

On 7 Dec 2023, at 13:39, 邱宇航 <iamqyh@gmail.com> wrote:

I read the V6 patch and found something needs to be improved.

Fixed. PFA v7.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:


On 7 Dec 2023, at 06:25, Japin Li <japinli@hotmail.com> wrote:

 If idle_in_transaction_timeout is bigger than transaction_timeout,
the idle-in-transaction timeout don't needed, right?
Yes, I think so.


TODO: as Yuhang pointed out prepared transactions must not be killed, thus name "transaction_timeout" is not correct. I think the name must be like "session_transaction_timeout", but I'd like to have an opinion of someone more experienced in giving names to GUCs than me. Or, perhaps, a native speaker?

How about transaction_session_timeout? Similar to idle_session_timeout.

Well, Yuhang also suggested this name...

Honestly, I still have a gut feeling that transaction_timeout is a good name, despite being not exactly precise.

Thanks!


Best regards, Andrey Borodin.
PS Sorry for posting twice to the same thread, i noticed your message only after answering to Yuhang's review.

Re: Transaction timeout

From
Japin Li
Date:
On Thu, 07 Dec 2023 at 20:40, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 7 Dec 2023, at 06:25, Japin Li <japinli@hotmail.com> wrote:
>>
>>  If idle_in_transaction_timeout is bigger than transaction_timeout,
>> the idle-in-transaction timeout don't needed, right?
> Yes, I think so.
>

Should we disable the idle_in_transaction_timeout in this case? Of cursor, I'm
not strongly insist on this.

I think you forget disable transaction_timeout in AutoVacWorkerMain().
If not, can you elaborate on why you don't disable it?

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 8 Dec 2023, at 12:59, Japin Li <japinli@hotmail.com> wrote:
>
>
> On Thu, 07 Dec 2023 at 20:40, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>> On 7 Dec 2023, at 06:25, Japin Li <japinli@hotmail.com> wrote:
>>>
>>> If idle_in_transaction_timeout is bigger than transaction_timeout,
>>> the idle-in-transaction timeout don't needed, right?
>> Yes, I think so.
>>
>
> Should we disable the idle_in_transaction_timeout in this case? Of cursor, I'm
> not strongly insist on this.
Good idea!

> I think you forget disable transaction_timeout in AutoVacWorkerMain().
> If not, can you elaborate on why you don't disable it?

Seems like code in autovacuum.c was copied, but patch was not updated. I’ve fixed this oversight.

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Fri, 08 Dec 2023 at 18:08, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 8 Dec 2023, at 12:59, Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Thu, 07 Dec 2023 at 20:40, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>>> On 7 Dec 2023, at 06:25, Japin Li <japinli@hotmail.com> wrote:
>>>>
>>>> If idle_in_transaction_timeout is bigger than transaction_timeout,
>>>> the idle-in-transaction timeout don't needed, right?
>>> Yes, I think so.
>>>
>>
>> Should we disable the idle_in_transaction_timeout in this case? Of cursor, I'm
>> not strongly insist on this.
> Good idea!
>
>> I think you forget disable transaction_timeout in AutoVacWorkerMain().
>> If not, can you elaborate on why you don't disable it?
>
> Seems like code in autovacuum.c was copied, but patch was not updated. I’ve fixed this oversight.
>

Thanks for updating the patch. LGTM.

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 8 Dec 2023, at 15:29, Japin Li <japinli@hotmail.com> wrote:
>
> Thanks for updating the patch. LGTM.

PFA v9. Changes:
1. Added tests for idle_in_transaction_timeout
2. Suppress statement_timeout if it’s shorter than transaction_timeout

Consider changing status of the commitfest entry if you think it’s ready for committer.

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Fri, 15 Dec 2023 at 17:51, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 8 Dec 2023, at 15:29, Japin Li <japinli@hotmail.com> wrote:
>>
>> Thanks for updating the patch. LGTM.
>
> PFA v9. Changes:
> 1. Added tests for idle_in_transaction_timeout
> 2. Suppress statement_timeout if it’s shorter than transaction_timeout
>
+       if (StatementTimeout > 0
+               && IdleInTransactionSessionTimeout < TransactionTimeout)
                   ^

Should be StatementTimeout?


Maybe we should add documentation to describe this behavior.

> Consider changing status of the commitfest entry if you think it’s ready for committer.
>

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 16 Dec 2023, at 05:58, Japin Li <japinli@hotmail.com> wrote:
>
>
> On Fri, 15 Dec 2023 at 17:51, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>> On 8 Dec 2023, at 15:29, Japin Li <japinli@hotmail.com> wrote:
>>>
>>> Thanks for updating the patch. LGTM.
>>
>> PFA v9. Changes:
>> 1. Added tests for idle_in_transaction_timeout
>> 2. Suppress statement_timeout if it’s shorter than transaction_timeout
>>
> +       if (StatementTimeout > 0
> +               && IdleInTransactionSessionTimeout < TransactionTimeout)
>                   ^
>
> Should be StatementTimeout?
Yes, that’s an oversight. I’ve adjusted tests so they catch this problem.

> Maybe we should add documentation to describe this behavior.

I've added a paragraph about it to config.sgml, but I'm not sure about the comprehensiveness of the wording.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Mon, 18 Dec 2023 at 13:49, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 16 Dec 2023, at 05:58, Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Fri, 15 Dec 2023 at 17:51, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>>> On 8 Dec 2023, at 15:29, Japin Li <japinli@hotmail.com> wrote:
>>>>
>>>> Thanks for updating the patch. LGTM.
>>>
>>> PFA v9. Changes:
>>> 1. Added tests for idle_in_transaction_timeout
>>> 2. Suppress statement_timeout if it’s shorter than transaction_timeout
>>>
>> +       if (StatementTimeout > 0
>> +               && IdleInTransactionSessionTimeout < TransactionTimeout)
>>                   ^
>>
>> Should be StatementTimeout?
> Yes, that’s an oversight. I’ve adjusted tests so they catch this problem.
>
>> Maybe we should add documentation to describe this behavior.
>
> I've added a paragraph about it to config.sgml, but I'm not sure about the comprehensiveness of the wording.
>

Thanks for updating the patch, no objections.

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 18 Dec 2023, at 14:32, Japin Li <japinli@hotmail.com> wrote:
>
>
> Thanks for updating the patch

Sorry for the noise, but commitfest bot found one more bug in handling statement timeout. PFA v11.


Best regards, Andrey Borodin.



Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Mon, 18 Dec 2023 at 17:40, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 18 Dec 2023, at 14:32, Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> Thanks for updating the patch
>
> Sorry for the noise, but commitfest bot found one more bug in handling statement timeout. PFA v11.
>

On Windows, there still have an error:

diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
--- C:/cirrus/src/test/isolation/expected/timeouts.out    2023-12-18 10:22:21.772537200 +0000
+++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out    2023-12-18 10:26:08.039831800 +0000
@@ -103,24 +103,7 @@
     0
 (1 row)

-step stt2_check: SELECT 1;
-FATAL:  terminating connection due to transaction timeout
-server closed the connection unexpectedly
+PQconsumeInput failed: server closed the connection unexpectedly
     This probably means the server terminated abnormally
     before or while processing the request.

-step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout =
'10s';SET transaction_timeout = '10s';
 
-step itt4_begin: BEGIN ISOLATION LEVEL READ COMMITTED;
-step sleep_there: SELECT pg_sleep(0.01);
-pg_sleep
---------
-
-(1 row)
-
-step stt3_check_itt4: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/itt4'
<waiting...>
 
-step stt3_check_itt4: <... completed>
-count
------
-    0
-(1 row)
-


--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 19 Dec 2023, at 06:25, Japin Li <japinli@hotmail.com> wrote:
>
> On Windows, there still have an error:

Uhhmm, yes. Connection termination looks different on windows machine.
I’ve checked how this looks in relication slot tests and removed select that was observing connection failure.
I don’t have Windows machine, so I hope CF bot will pick this.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> I don’t have Windows machine, so I hope CF bot will pick this.

I used Github CI to produce version of tests that seems to be is stable on Windows.
Sorry for the noise.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>> I don’t have Windows machine, so I hope CF bot will pick this.
>
> I used Github CI to produce version of tests that seems to be is stable on Windows.

It still failed on Windows Server 2019 [1].

diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
--- C:/cirrus/src/test/isolation/expected/timeouts.out    2023-12-19 10:34:30.354721100 +0000
+++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out    2023-12-19 10:38:25.877981600 +0000
@@ -100,7 +100,7 @@
 step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2'
 count
 -----
-    0
+    1
 (1 row)

 step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout =
'10s';SET transaction_timeout = '10s'; 

[1]
https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Junwang Zhao
Date:
On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > I don’t have Windows machine, so I hope CF bot will pick this.
>
> I used Github CI to produce version of tests that seems to be is stable on Windows.
> Sorry for the noise.
>
>
> Best regards, Andrey Borodin.

+       <para>
+        If <varname>transaction_timeout</varname> is shorter than
+        <varname>idle_in_transaction_session_timeout</varname> or
<varname>statement_timeout</varname>
+        <varname>transaction_timeout</varname> will invalidate longer timeout.
+       </para>

When transaction_timeout is *equal* to idle_in_transaction_session_timeout
or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
will also be invalidated, the logic in the code seems right, though
this document
is a little bit inaccurate.

--
Regards
Junwang Zhao



Re: Transaction timeout

From
Junwang Zhao
Date:
On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> >
> >
> > > On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > > I don’t have Windows machine, so I hope CF bot will pick this.
> >
> > I used Github CI to produce version of tests that seems to be is stable on Windows.
> > Sorry for the noise.
> >
> >
> > Best regards, Andrey Borodin.
>
> +       <para>
> +        If <varname>transaction_timeout</varname> is shorter than
> +        <varname>idle_in_transaction_session_timeout</varname> or
> <varname>statement_timeout</varname>
> +        <varname>transaction_timeout</varname> will invalidate longer timeout.
> +       </para>
>
> When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
> will also be invalidated, the logic in the code seems right, though
> this document
> is a little bit inaccurate.
>
       <para>
        Unlike <varname>statement_timeout</varname>, this timeout can only occur
        while waiting for locks.  Note that if
<varname>statement_timeout</varname>
        is nonzero, it is rather pointless to set
<varname>lock_timeout</varname> to
        the same or larger value, since the statement timeout would always
        trigger first.  If <varname>log_min_error_statement</varname> is set to
        <literal>ERROR</literal> or lower, the statement that timed out will be
        logged.
       </para>

There is a note about statement_timeout and lock_timeout, set both
and lock_timeout >= statement_timeout is pointless, but this logic seems not
implemented in the code. I am wondering if lock_timeout >= transaction_timeout,
should we invalidate lock_timeout? Or maybe just document this.

> --
> Regards
> Junwang Zhao



--
Regards
Junwang Zhao



回复: Transaction timeout

From
Thomas wen
Date:
Hi Junwang Zhao
     #should we invalidate lock_timeout? Or maybe just document this.
I think you mean when lock_time is greater than trasaction-time invalidate lock_timeout or  needs to be logged ?




Best whish 

发件人: Junwang Zhao <zhjwpku@gmail.com>
发送时间: 2023年12月20日 9:48
收件人: Andrey M. Borodin <x4mmm@yandex-team.ru>
抄送: Japin Li <japinli@hotmail.com>; 邱宇航 <iamqyh@gmail.com>; Fujii Masao <masao.fujii@oss.nttdata.com>; Andrey Borodin <amborodin86@gmail.com>; Andres Freund <andres@anarazel.de>; Michael Paquier <michael.paquier@gmail.com>; Nikolay Samokhvalov <samokhvalov@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
主题: Re: Transaction timeout
 
On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> >
> >
> > > On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > > I don’t have Windows machine, so I hope CF bot will pick this.
> >
> > I used Github CI to produce version of tests that seems to be is stable on Windows.
> > Sorry for the noise.
> >
> >
> > Best regards, Andrey Borodin.
>
> +       <para>
> +        If <varname>transaction_timeout</varname> is shorter than
> +        <varname>idle_in_transaction_session_timeout</varname> or
> <varname>statement_timeout</varname>
> +        <varname>transaction_timeout</varname> will invalidate longer timeout.
> +       </para>
>
> When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
> will also be invalidated, the logic in the code seems right, though
> this document
> is a little bit inaccurate.
>
       <para>
        Unlike <varname>statement_timeout</varname>, this timeout can only occur
        while waiting for locks.  Note that if
<varname>statement_timeout</varname>
        is nonzero, it is rather pointless to set
<varname>lock_timeout</varname> to
        the same or larger value, since the statement timeout would always
        trigger first.  If <varname>log_min_error_statement</varname> is set to
        <literal>ERROR</literal> or lower, the statement that timed out will be
        logged.
       </para>

There is a note about statement_timeout and lock_timeout, set both
and lock_timeout >= statement_timeout is pointless, but this logic seems not
implemented in the code. I am wondering if lock_timeout >= transaction_timeout,
should we invalidate lock_timeout? Or maybe just document this.

> --
> Regards
> Junwang Zhao



--
Regards
Junwang Zhao


Re: Transaction timeout

From
Junwang Zhao
Date:
On Wed, Dec 20, 2023 at 9:58 AM Thomas wen
<Thomas_valentine_365@outlook.com> wrote:
>
> Hi Junwang Zhao
>      #should we invalidate lock_timeout? Or maybe just document this.
> I think you mean when lock_time is greater than trasaction-time invalidate lock_timeout or  needs to be logged ?
>
I mean the interleaving of the gucs, which is lock_timeout and the new
introduced transaction_timeout,
if lock_timeout >= transaction_timeout, seems no need to enable lock_timeout.
>
>
>
> Best whish
> ________________________________
> 发件人: Junwang Zhao <zhjwpku@gmail.com>
> 发送时间: 2023年12月20日 9:48
> 收件人: Andrey M. Borodin <x4mmm@yandex-team.ru>
> 抄送: Japin Li <japinli@hotmail.com>; 邱宇航 <iamqyh@gmail.com>; Fujii Masao <masao.fujii@oss.nttdata.com>; Andrey Borodin
<amborodin86@gmail.com>;Andres Freund <andres@anarazel.de>; Michael Paquier <michael.paquier@gmail.com>; Nikolay
Samokhvalov<samokhvalov@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>; pgsql-hackers@lists.postgresql.org
<pgsql-hackers@lists.postgresql.org>
> 主题: Re: Transaction timeout
>
> On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
> >
> > On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > >
> > >
> > > > On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > > >
> > > > I don’t have Windows machine, so I hope CF bot will pick this.
> > >
> > > I used Github CI to produce version of tests that seems to be is stable on Windows.
> > > Sorry for the noise.
> > >
> > >
> > > Best regards, Andrey Borodin.
> >
> > +       <para>
> > +        If <varname>transaction_timeout</varname> is shorter than
> > +        <varname>idle_in_transaction_session_timeout</varname> or
> > <varname>statement_timeout</varname>
> > +        <varname>transaction_timeout</varname> will invalidate longer timeout.
> > +       </para>
> >
> > When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> > or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
> > will also be invalidated, the logic in the code seems right, though
> > this document
> > is a little bit inaccurate.
> >
>        <para>
>         Unlike <varname>statement_timeout</varname>, this timeout can only occur
>         while waiting for locks.  Note that if
> <varname>statement_timeout</varname>
>         is nonzero, it is rather pointless to set
> <varname>lock_timeout</varname> to
>         the same or larger value, since the statement timeout would always
>         trigger first.  If <varname>log_min_error_statement</varname> is set to
>         <literal>ERROR</literal> or lower, the statement that timed out will be
>         logged.
>        </para>
>
> There is a note about statement_timeout and lock_timeout, set both
> and lock_timeout >= statement_timeout is pointless, but this logic seems not
> implemented in the code. I am wondering if lock_timeout >= transaction_timeout,
> should we invalidate lock_timeout? Or maybe just document this.
>
> > --
> > Regards
> > Junwang Zhao
>
>
>
> --
> Regards
> Junwang Zhao
>
>


--
Regards
Junwang Zhao



Re: Transaction timeout

From
wenhui qiu
Date:
Hi Junwang Zhao
   Agree +1

Best whish

Junwang Zhao <zhjwpku@gmail.com> 于2023年12月20日周三 10:35写道:
On Wed, Dec 20, 2023 at 9:58 AM Thomas wen
<Thomas_valentine_365@outlook.com> wrote:
>
> Hi Junwang Zhao
>      #should we invalidate lock_timeout? Or maybe just document this.
> I think you mean when lock_time is greater than trasaction-time invalidate lock_timeout or  needs to be logged ?
>
I mean the interleaving of the gucs, which is lock_timeout and the new
introduced transaction_timeout,
if lock_timeout >= transaction_timeout, seems no need to enable lock_timeout.
>
>
>
> Best whish
> ________________________________
> 发件人: Junwang Zhao <zhjwpku@gmail.com>
> 发送时间: 2023年12月20日 9:48
> 收件人: Andrey M. Borodin <x4mmm@yandex-team.ru>
> 抄送: Japin Li <japinli@hotmail.com>; 邱宇航 <iamqyh@gmail.com>; Fujii Masao <masao.fujii@oss.nttdata.com>; Andrey Borodin <amborodin86@gmail.com>; Andres Freund <andres@anarazel.de>; Michael Paquier <michael.paquier@gmail.com>; Nikolay Samokhvalov <samokhvalov@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
> 主题: Re: Transaction timeout
>
> On Tue, Dec 19, 2023 at 10:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
> >
> > On Tue, Dec 19, 2023 at 6:27 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > >
> > >
> > > > On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > > >
> > > > I don’t have Windows machine, so I hope CF bot will pick this.
> > >
> > > I used Github CI to produce version of tests that seems to be is stable on Windows.
> > > Sorry for the noise.
> > >
> > >
> > > Best regards, Andrey Borodin.
> >
> > +       <para>
> > +        If <varname>transaction_timeout</varname> is shorter than
> > +        <varname>idle_in_transaction_session_timeout</varname> or
> > <varname>statement_timeout</varname>
> > +        <varname>transaction_timeout</varname> will invalidate longer timeout.
> > +       </para>
> >
> > When transaction_timeout is *equal* to idle_in_transaction_session_timeout
> > or statement_timeout, idle_in_transaction_session_timeout and statement_timeout
> > will also be invalidated, the logic in the code seems right, though
> > this document
> > is a little bit inaccurate.
> >
>        <para>
>         Unlike <varname>statement_timeout</varname>, this timeout can only occur
>         while waiting for locks.  Note that if
> <varname>statement_timeout</varname>
>         is nonzero, it is rather pointless to set
> <varname>lock_timeout</varname> to
>         the same or larger value, since the statement timeout would always
>         trigger first.  If <varname>log_min_error_statement</varname> is set to
>         <literal>ERROR</literal> or lower, the statement that timed out will be
>         logged.
>        </para>
>
> There is a note about statement_timeout and lock_timeout, set both
> and lock_timeout >= statement_timeout is pointless, but this logic seems not
> implemented in the code. I am wondering if lock_timeout >= transaction_timeout,
> should we invalidate lock_timeout? Or maybe just document this.
>
> > --
> > Regards
> > Junwang Zhao
>
>
>
> --
> Regards
> Junwang Zhao
>
>


--
Regards
Junwang Zhao


Re: Transaction timeout

From
Japin Li
Date:
On Tue, 19 Dec 2023 at 22:06, Japin Li <japinli@hotmail.com> wrote:
> On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>>
>>> I don’t have Windows machine, so I hope CF bot will pick this.
>>
>> I used Github CI to produce version of tests that seems to be is stable on Windows.
>
> It still failed on Windows Server 2019 [1].
>
> diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
> --- C:/cirrus/src/test/isolation/expected/timeouts.out    2023-12-19 10:34:30.354721100 +0000
> +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out    2023-12-19 10:38:25.877981600 +0000
> @@ -100,7 +100,7 @@
>  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2'
>  count
>  -----
> -    0
> +    1
>  (1 row)
>
>  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout =
'10s';SET transaction_timeout = '10s'; 
>
> [1]
https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs

Hi,

I try to split the test for transaction timeout, and all passed on my CI [1].

OTOH, I find if I set transaction_timeout in a transaction, it will not take
effect immediately.  For example:

[local]:2049802 postgres=# BEGIN;
BEGIN
[local]:2049802 postgres=*# SET transaction_timeout TO '1s';
SET
[local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 10s
   relname
--------------
 pg_statistic
(1 row)

[local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

It looks odd.  Does this is expected? I'm not read all the threads,
am I missing something?

[1] https://cirrus-ci.com/build/6574686130143232

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.


Attachment

Re: Transaction timeout

From
Junwang Zhao
Date:
On Fri, Dec 22, 2023 at 1:39 PM Japin Li <japinli@hotmail.com> wrote:
>
>
> On Tue, 19 Dec 2023 at 22:06, Japin Li <japinli@hotmail.com> wrote:
> > On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>>
> >>> I don’t have Windows machine, so I hope CF bot will pick this.
> >>
> >> I used Github CI to produce version of tests that seems to be is stable on Windows.
> >
> > It still failed on Windows Server 2019 [1].
> >
> > diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
> > --- C:/cirrus/src/test/isolation/expected/timeouts.out        2023-12-19 10:34:30.354721100 +0000
> > +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out  2023-12-19 10:38:25.877981600 +0000
> > @@ -100,7 +100,7 @@
> >  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2'
> >  count
> >  -----
> > -    0
> > +    1
> >  (1 row)
> >
> >  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout =
'10s';SET transaction_timeout = '10s'; 
> >
> > [1]
https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs
>
> Hi,
>
> I try to split the test for transaction timeout, and all passed on my CI [1].
>
> OTOH, I find if I set transaction_timeout in a transaction, it will not take
> effect immediately.  For example:
>
> [local]:2049802 postgres=# BEGIN;
> BEGIN
> [local]:2049802 postgres=*# SET transaction_timeout TO '1s';
when this execute, TransactionTimeout is still 0, this command will
not set timeout
> SET
> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 10s
when this command get execute, start_xact_command will enable the timer
>    relname
> --------------
>  pg_statistic
> (1 row)
>
> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
> FATAL:  terminating connection due to transaction timeout
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> It looks odd.  Does this is expected? I'm not read all the threads,
> am I missing something?

I think this is by design, if you debug statement_timeout, it's the same
behaviour, the timeout will be set for each command after the second
command was called, you just aren't aware of this.

I doubt people will set this in a transaction.
>
> [1] https://cirrus-ci.com/build/6574686130143232
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.
>


--
Regards
Junwang Zhao



Re: Transaction timeout

From
Japin Li
Date:
On Fri, 22 Dec 2023 at 20:29, Junwang Zhao <zhjwpku@gmail.com> wrote:
> On Fri, Dec 22, 2023 at 1:39 PM Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Tue, 19 Dec 2023 at 22:06, Japin Li <japinli@hotmail.com> wrote:
>> > On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> >>> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> >>>
>> >>> I don’t have Windows machine, so I hope CF bot will pick this.
>> >>
>> >> I used Github CI to produce version of tests that seems to be is stable on Windows.
>> >
>> > It still failed on Windows Server 2019 [1].
>> >
>> > diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
>> > --- C:/cirrus/src/test/isolation/expected/timeouts.out        2023-12-19 10:34:30.354721100 +0000
>> > +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out  2023-12-19 10:38:25.877981600 +0000
>> > @@ -100,7 +100,7 @@
>> >  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2'
>> >  count
>> >  -----
>> > -    0
>> > +    1
>> >  (1 row)
>> >
>> >  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout =
'10s';SET transaction_timeout = '10s'; 
>> >
>> > [1]
https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs
>>
>> Hi,
>>
>> I try to split the test for transaction timeout, and all passed on my CI [1].
>>
>> OTOH, I find if I set transaction_timeout in a transaction, it will not take
>> effect immediately.  For example:
>>
>> [local]:2049802 postgres=# BEGIN;
>> BEGIN
>> [local]:2049802 postgres=*# SET transaction_timeout TO '1s';
> when this execute, TransactionTimeout is still 0, this command will
> not set timeout
>> SET
>> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 10s
> when this command get execute, start_xact_command will enable the timer

Thanks for your exaplantion, got it.

>>    relname
>> --------------
>>  pg_statistic
>> (1 row)
>>
>> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
>> FATAL:  terminating connection due to transaction timeout
>> server closed the connection unexpectedly
>>         This probably means the server terminated abnormally
>>         before or while processing the request.
>> The connection to the server was lost. Attempting reset: Succeeded.
>>
>> It looks odd.  Does this is expected? I'm not read all the threads,
>> am I missing something?
>
> I think this is by design, if you debug statement_timeout, it's the same
> behaviour, the timeout will be set for each command after the second
> command was called, you just aren't aware of this.
>

I try to set idle_in_transaction_session_timeout after begin transaction,
it changes immediately, so I think transaction_timeout should also be take
immediately.

> I doubt people will set this in a transaction.

Maybe not,


--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Junwang Zhao
Date:
On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>
>
> On Fri, 22 Dec 2023 at 20:29, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > On Fri, Dec 22, 2023 at 1:39 PM Japin Li <japinli@hotmail.com> wrote:
> >>
> >>
> >> On Tue, 19 Dec 2023 at 22:06, Japin Li <japinli@hotmail.com> wrote:
> >> > On Tue, 19 Dec 2023 at 18:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >> >>> On 19 Dec 2023, at 13:26, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >> >>>
> >> >>> I don’t have Windows machine, so I hope CF bot will pick this.
> >> >>
> >> >> I used Github CI to produce version of tests that seems to be is stable on Windows.
> >> >
> >> > It still failed on Windows Server 2019 [1].
> >> >
> >> > diff -w -U3 C:/cirrus/src/test/isolation/expected/timeouts.out
C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out
> >> > --- C:/cirrus/src/test/isolation/expected/timeouts.out        2023-12-19 10:34:30.354721100 +0000
> >> > +++ C:/cirrus/build/testrun/isolation/isolation/results/timeouts.out  2023-12-19 10:38:25.877981600 +0000
> >> > @@ -100,7 +100,7 @@
> >> >  step stt3_check_stt2: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/stt2'
> >> >  count
> >> >  -----
> >> > -    0
> >> > +    1
> >> >  (1 row)
> >> >
> >> >  step itt4_set: SET idle_in_transaction_session_timeout = '1ms'; SET statement_timeout = '10s'; SET lock_timeout
='10s'; SET transaction_timeout = '10s'; 
> >> >
> >> > [1]
https://api.cirrus-ci.com/v1/artifact/task/4707530400595968/testrun/build/testrun/isolation/isolation/regression.diffs
> >>
> >> Hi,
> >>
> >> I try to split the test for transaction timeout, and all passed on my CI [1].
> >>
> >> OTOH, I find if I set transaction_timeout in a transaction, it will not take
> >> effect immediately.  For example:
> >>
> >> [local]:2049802 postgres=# BEGIN;
> >> BEGIN
> >> [local]:2049802 postgres=*# SET transaction_timeout TO '1s';
> > when this execute, TransactionTimeout is still 0, this command will
> > not set timeout
> >> SET
> >> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;  -- wait 10s
> > when this command get execute, start_xact_command will enable the timer
>
> Thanks for your exaplantion, got it.
>
> >>    relname
> >> --------------
> >>  pg_statistic
> >> (1 row)
> >>
> >> [local]:2049802 postgres=*# SELECT relname FROM pg_class LIMIT 1;
> >> FATAL:  terminating connection due to transaction timeout
> >> server closed the connection unexpectedly
> >>         This probably means the server terminated abnormally
> >>         before or while processing the request.
> >> The connection to the server was lost. Attempting reset: Succeeded.
> >>
> >> It looks odd.  Does this is expected? I'm not read all the threads,
> >> am I missing something?
> >
> > I think this is by design, if you debug statement_timeout, it's the same
> > behaviour, the timeout will be set for each command after the second
> > command was called, you just aren't aware of this.
> >
>
> I try to set idle_in_transaction_session_timeout after begin transaction,
> it changes immediately, so I think transaction_timeout should also be take
> immediately.

Ah, right, idle_in_transaction_session_timeout is set after the set
command finishes and before the backend send *ready for query*
to the client, so the value of the GUC is already set before
next command.

I bet you must have checked this ;)

>
> > I doubt people will set this in a transaction.
>
> Maybe not,
>
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



--
Regards
Junwang Zhao



Re: Transaction timeout

From
Japin Li
Date:
On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
> On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>> I try to set idle_in_transaction_session_timeout after begin transaction,
>> it changes immediately, so I think transaction_timeout should also be take
>> immediately.
>
> Ah, right, idle_in_transaction_session_timeout is set after the set
> command finishes and before the backend send *ready for query*
> to the client, so the value of the GUC is already set before
> next command.
>

I mean, is it possible to set transaction_timeout before next comand?


--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Junwang Zhao
Date:
On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
>
>
> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
> >> I try to set idle_in_transaction_session_timeout after begin transaction,
> >> it changes immediately, so I think transaction_timeout should also be take
> >> immediately.
> >
> > Ah, right, idle_in_transaction_session_timeout is set after the set
> > command finishes and before the backend send *ready for query*
> > to the client, so the value of the GUC is already set before
> > next command.
> >
>
> I mean, is it possible to set transaction_timeout before next comand?
>
Yeah, it's possible, set transaction_timeout in the when it first
goes into *idle in transaction* mode, see the attached files.

>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



--
Regards
Junwang Zhao

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>> >> I try to set idle_in_transaction_session_timeout after begin transaction,
>> >> it changes immediately, so I think transaction_timeout should also be take
>> >> immediately.
>> >
>> > Ah, right, idle_in_transaction_session_timeout is set after the set
>> > command finishes and before the backend send *ready for query*
>> > to the client, so the value of the GUC is already set before
>> > next command.
>> >
>>
>> I mean, is it possible to set transaction_timeout before next comand?
>>
> Yeah, it's possible, set transaction_timeout in the when it first
> goes into *idle in transaction* mode, see the attached files.
>

Thanks for updating the patch, LGTM.

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Japin Li
Date:
On Sat, 23 Dec 2023 at 08:32, Japin Li <japinli@hotmail.com> wrote:
> On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
>>>
>>>
>>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>>> >> I try to set idle_in_transaction_session_timeout after begin transaction,
>>> >> it changes immediately, so I think transaction_timeout should also be take
>>> >> immediately.
>>> >
>>> > Ah, right, idle_in_transaction_session_timeout is set after the set
>>> > command finishes and before the backend send *ready for query*
>>> > to the client, so the value of the GUC is already set before
>>> > next command.
>>> >
>>>
>>> I mean, is it possible to set transaction_timeout before next comand?
>>>
>> Yeah, it's possible, set transaction_timeout in the when it first
>> goes into *idle in transaction* mode, see the attached files.
>>
>
> Thanks for updating the patch, LGTM.

Sorry for the noise!

Read the previous threads, I find why the author enable transaction_timeout
in start_xact_command().

The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:

SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND CHAIN; SELECT 2, pg_sleep(1); COMMIT;

The transaction_timeout do not reset when executing COMMIT AND CHAIN.

[1] https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Junwang Zhao
Date:
On Sat, Dec 23, 2023 at 10:40 AM Japin Li <japinli@hotmail.com> wrote:
>
>
> On Sat, 23 Dec 2023 at 08:32, Japin Li <japinli@hotmail.com> wrote:
> > On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
> >>>
> >>>
> >>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
> >>> >> I try to set idle_in_transaction_session_timeout after begin transaction,
> >>> >> it changes immediately, so I think transaction_timeout should also be take
> >>> >> immediately.
> >>> >
> >>> > Ah, right, idle_in_transaction_session_timeout is set after the set
> >>> > command finishes and before the backend send *ready for query*
> >>> > to the client, so the value of the GUC is already set before
> >>> > next command.
> >>> >
> >>>
> >>> I mean, is it possible to set transaction_timeout before next comand?
> >>>
> >> Yeah, it's possible, set transaction_timeout in the when it first
> >> goes into *idle in transaction* mode, see the attached files.
> >>
> >
> > Thanks for updating the patch, LGTM.
>
> Sorry for the noise!
>
> Read the previous threads, I find why the author enable transaction_timeout
> in start_xact_command().
>
> The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:

I didn't read the previous threads, sorry for that, let's stick to v14.

>
> SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND CHAIN; SELECT 2, pg_sleep(1); COMMIT;
>
> The transaction_timeout do not reset when executing COMMIT AND CHAIN.
>
> [1] https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com
>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.



--
Regards
Junwang Zhao



Re: Transaction timeout

From
Japin Li
Date:
a
On Sat, 23 Dec 2023 at 10:40, Japin Li <japinli@hotmail.com> wrote:
> On Sat, 23 Dec 2023 at 08:32, Japin Li <japinli@hotmail.com> wrote:
>> On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
>>>>
>>>>
>>>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>>>> >> I try to set idle_in_transaction_session_timeout after begin transaction,
>>>> >> it changes immediately, so I think transaction_timeout should also be take
>>>> >> immediately.
>>>> >
>>>> > Ah, right, idle_in_transaction_session_timeout is set after the set
>>>> > command finishes and before the backend send *ready for query*
>>>> > to the client, so the value of the GUC is already set before
>>>> > next command.
>>>> >
>>>>
>>>> I mean, is it possible to set transaction_timeout before next comand?
>>>>
>>> Yeah, it's possible, set transaction_timeout in the when it first
>>> goes into *idle in transaction* mode, see the attached files.
>>>
>>
>> Thanks for updating the patch, LGTM.
>
> Sorry for the noise!
>
> Read the previous threads, I find why the author enable transaction_timeout
> in start_xact_command().
>
> The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:
>
> SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND CHAIN; SELECT 2, pg_sleep(1); COMMIT;
>
> The transaction_timeout do not reset when executing COMMIT AND CHAIN.
>
> [1] https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com

Attach v16 to solve this. Any suggestions?

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.


Attachment

Re: Transaction timeout

From
Junwang Zhao
Date:
On Sat, Dec 23, 2023 at 11:17 AM Japin Li <japinli@hotmail.com> wrote:
>
> a
> On Sat, 23 Dec 2023 at 10:40, Japin Li <japinli@hotmail.com> wrote:
> > On Sat, 23 Dec 2023 at 08:32, Japin Li <japinli@hotmail.com> wrote:
> >> On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >>> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
> >>>>
> >>>>
> >>>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >>>> > On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
> >>>> >> I try to set idle_in_transaction_session_timeout after begin transaction,
> >>>> >> it changes immediately, so I think transaction_timeout should also be take
> >>>> >> immediately.
> >>>> >
> >>>> > Ah, right, idle_in_transaction_session_timeout is set after the set
> >>>> > command finishes and before the backend send *ready for query*
> >>>> > to the client, so the value of the GUC is already set before
> >>>> > next command.
> >>>> >
> >>>>
> >>>> I mean, is it possible to set transaction_timeout before next comand?
> >>>>
> >>> Yeah, it's possible, set transaction_timeout in the when it first
> >>> goes into *idle in transaction* mode, see the attached files.
> >>>
> >>
> >> Thanks for updating the patch, LGTM.
> >
> > Sorry for the noise!
> >
> > Read the previous threads, I find why the author enable transaction_timeout
> > in start_xact_command().
> >
> > The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:
> >
> > SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND CHAIN; SELECT 2, pg_sleep(1); COMMIT;
> >
> > The transaction_timeout do not reset when executing COMMIT AND CHAIN.
> >
> > [1] https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com
>
> Attach v16 to solve this. Any suggestions?

I've checked this with *COMMIT AND CHAIN* and *ABORT AND CHAIN*,
both work as expected. Thanks for the update.

>
> --
> Regrads,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.
>


--
Regards
Junwang Zhao



Re: Transaction timeout

From
Li Japin
Date:

> 在 2023年12月23日,11:35,Junwang Zhao <zhjwpku@gmail.com> 写道:
> 
> On Sat, Dec 23, 2023 at 11:17 AM Japin Li <japinli@hotmail.com> wrote:
>> 
>> a
>>> On Sat, 23 Dec 2023 at 10:40, Japin Li <japinli@hotmail.com> wrote:
>>> On Sat, 23 Dec 2023 at 08:32, Japin Li <japinli@hotmail.com> wrote:
>>>> On Fri, 22 Dec 2023 at 23:30, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>>>> On Fri, Dec 22, 2023 at 10:44 PM Japin Li <japinli@hotmail.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> On Fri, 22 Dec 2023 at 22:37, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>>>>>> On Fri, Dec 22, 2023 at 10:25 PM Japin Li <japinli@hotmail.com> wrote:
>>>>>>>> I try to set idle_in_transaction_session_timeout after begin transaction,
>>>>>>>> it changes immediately, so I think transaction_timeout should also be take
>>>>>>>> immediately.
>>>>>>> 
>>>>>>> Ah, right, idle_in_transaction_session_timeout is set after the set
>>>>>>> command finishes and before the backend send *ready for query*
>>>>>>> to the client, so the value of the GUC is already set before
>>>>>>> next command.
>>>>>>> 
>>>>>> 
>>>>>> I mean, is it possible to set transaction_timeout before next comand?
>>>>>> 
>>>>> Yeah, it's possible, set transaction_timeout in the when it first
>>>>> goes into *idle in transaction* mode, see the attached files.
>>>>> 
>>>> 
>>>> Thanks for updating the patch, LGTM.
>>> 
>>> Sorry for the noise!
>>> 
>>> Read the previous threads, I find why the author enable transaction_timeout
>>> in start_xact_command().
>>> 
>>> The v15 patch cannot handle COMMIT AND CHAIN, see [1]. For example:
>>> 
>>> SET transaction_timeout TO '2s'; BEGIN; SELECT 1, pg_sleep(1); COMMIT AND CHAIN; SELECT 2, pg_sleep(1); COMMIT;
>>> 
>>> The transaction_timeout do not reset when executing COMMIT AND CHAIN.
>>> 
>>> [1] https://www.postgresql.org/message-id/a906dea1-76a1-4f26-76c5-a7efad3ef5b8%40oss.nttdata.com
>> 
>> Attach v16 to solve this. Any suggestions?
> 
> I've checked this with *COMMIT AND CHAIN* and *ABORT AND CHAIN*,
> both work as expected. Thanks for the update.
> 

Thanks for your testing and reviewing!

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 22 Dec 2023, at 10:39, Japin Li <japinli@hotmail.com> wrote:
>
>
> I try to split the test for transaction timeout, and all passed on my CI [1].


I like the refactoring you did in timeout.spec. I thought it is impossible, because permutations would try to
reinitializeFATALed sessions. But, obviously, tests work the way you refactored it. 
However I don't think ignoring test failures on Windows without understanding root cause is a good idea.
Let's get back to v13 version of tests, understand why it failed, apply your test refactorings afterwards. BTW are you
surethat v14 refactorings are functional equivalent of v13 tests? 

To go with this plan I attach slightly modified version of v13 tests in v16 patchset. The only change is timing in
"sleep_there"step. I suspect that failure was induced by more coarse timer granularity on Windows. Tests were giving
only9 milliseconds for a timeout to entirely wipe away backend from pg_stat_activity. This saves testing time, but
mightinduce false positive test flaps. So I've raised wait times to 100ms. This seems too much, but I do not have other
ideashow to ensure tests stability. Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now. I'm
temptedto say that 200ms for timeouts worth it. 


As to 2nd step "Try to enable transaction_timeout during transaction", I think this makes sense. But if we are doing
so,shouldn't we also allow to enable idle_in_transaction timeout in a same manner? Currently we only allow to disable
othertimeouts... Also, if we are already in transaction, shouldn't we also subtract current transaction span from
timeout?
I think making this functionality as another step of the patchset was a good idea.

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Junwang Zhao
Date:
Hi Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 22 Dec 2023, at 10:39, Japin Li <japinli@hotmail.com> wrote:
> >
> >
> > I try to split the test for transaction timeout, and all passed on my CI [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it is impossible, because permutations would try to
reinitializeFATALed sessions. But, obviously, tests work the way you refactored it. 
> However I don't think ignoring test failures on Windows without understanding root cause is a good idea.
> Let's get back to v13 version of tests, understand why it failed, apply your test refactorings afterwards. BTW are
yousure that v14 refactorings are functional equivalent of v13 tests? 
>
> To go with this plan I attach slightly modified version of v13 tests in v16 patchset. The only change is timing in
"sleep_there"step. I suspect that failure was induced by more coarse timer granularity on Windows. Tests were giving
only9 milliseconds for a timeout to entirely wipe away backend from pg_stat_activity. This saves testing time, but
mightinduce false positive test flaps. So I've raised wait times to 100ms. This seems too much, but I do not have other
ideashow to ensure tests stability. Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now. I'm
temptedto say that 200ms for timeouts worth it. 
>
>
> As to 2nd step "Try to enable transaction_timeout during transaction", I think this makes sense. But if we are doing
so,shouldn't we also allow to enable idle_in_transaction timeout in a same manner? Currently we only allow to disable
othertimeouts... Also, if we are already in transaction, shouldn't we also subtract current transaction span from
timeout?
idle_in_transaction_session_timeout is already the behavior Japin suggested,
it is enabled before backend sends *read for query* to client.

> I think making this functionality as another step of the patchset was a good idea.
>
> Thanks!
>
>
> Best regards, Andrey Borodin.



--
Regards
Junwang Zhao



Re: Transaction timeout

From
Japin Li
Date:
On Sun, 24 Dec 2023 at 01:14, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 22 Dec 2023, at 10:39, Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> I try to split the test for transaction timeout, and all passed on my CI [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it is impossible, because permutations would try to
reinitializeFATALed sessions. But, obviously, tests work the way you refactored it.
 
> However I don't think ignoring test failures on Windows without understanding root cause is a good idea.

Yeah.

> Let's get back to v13 version of tests, understand why it failed, apply your test refactorings afterwards. BTW are
yousure that v14 refactorings are functional equivalent of v13 tests?
 
>
I think it is equivalent.  Maybe I missing something.  Please let me known
if they are not equivalent.

> To go with this plan I attach slightly modified version of v13 tests in v16 patchset. The only change is timing in
"sleep_there"step. I suspect that failure was induced by more coarse timer granularity on Windows. Tests were giving
only9 milliseconds for a timeout to entirely wipe away backend from pg_stat_activity. This saves testing time, but
mightinduce false positive test flaps. So I've raised wait times to 100ms. This seems too much, but I do not have other
ideashow to ensure tests stability. Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now. I'm
temptedto say that 200ms for timeouts worth it.
 
>
So this is caused by Windows timer granularity?

> As to 2nd step "Try to enable transaction_timeout during transaction", I think this makes sense. But if we are doing
so,shouldn't we also allow to enable idle_in_transaction timeout in a same manner?
 

I think the current idle_in_transaction_session_timeout work correctly.

> Currently we only allow to disable other timeouts... Also, if we are already in transaction, shouldn't we also
subtractcurrent transaction span from timeout?
 

Agreed.

> I think making this functionality as another step of the patchset was a good idea.
>

--
Regrads,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: Transaction timeout

From
Junwang Zhao
Date:
Hey Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 22 Dec 2023, at 10:39, Japin Li <japinli@hotmail.com> wrote:
> >
> >
> > I try to split the test for transaction timeout, and all passed on my CI [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it is impossible, because permutations would try to
reinitializeFATALed sessions. But, obviously, tests work the way you refactored it. 
> However I don't think ignoring test failures on Windows without understanding root cause is a good idea.
> Let's get back to v13 version of tests, understand why it failed, apply your test refactorings afterwards. BTW are
yousure that v14 refactorings are functional equivalent of v13 tests? 
>
> To go with this plan I attach slightly modified version of v13 tests in v16 patchset. The only change is timing in
"sleep_there"step. I suspect that failure was induced by more coarse timer granularity on Windows. Tests were giving
only9 milliseconds for a timeout to entirely wipe away backend from pg_stat_activity. This saves testing time, but
mightinduce false positive test flaps. So I've raised wait times to 100ms. This seems too much, but I do not have other
ideashow to ensure tests stability. Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now. I'm
temptedto say that 200ms for timeouts worth it. 
>
>
> As to 2nd step "Try to enable transaction_timeout during transaction", I think this makes sense. But if we are doing
so,shouldn't we also allow to enable idle_in_transaction timeout in a same manner? Currently we only allow to disable
othertimeouts... Also, if we are already in transaction, shouldn't we also subtract current transaction span from
timeout?
> I think making this functionality as another step of the patchset was a good idea.
>
> Thanks!
Seems V5~V17 doesn't work as expected for Nikolay's case:

postgres=# set transaction_timeout to '2s';
SET
postgres=# begin; select pg_sleep(1); select pg_sleep(1); select
pg_sleep(1); select pg_sleep(1); select pg_sleep(1); commit;
BEGIN

The reason for this seems the timer has been refreshed for each
command, xact_started along can not indicate it's a new
transaction or not, there is a TransactionState contains some
infos.

So I propose the following change, what do you think?

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a2611cf8e6..cffd2c44d0 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2746,7 +2746,7 @@ start_xact_command(void)
                StartTransactionCommand();

                /* Schedule or reschedule transaction timeout */
-               if (TransactionTimeout > 0)
+               if (TransactionTimeout > 0 &&
!get_timeout_active(TRANSACTION_TIMEOUT))
                        enable_timeout_after(TRANSACTION_TIMEOUT,
TransactionTimeout);

                xact_started = true;

>
>
> Best regards, Andrey Borodin.



--
Regards
Junwang Zhao



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 28 Dec 2023, at 21:02, Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> Seems V5~V17 doesn't work as expected for Nikolay's case:
>

Yeah, that's a problem.
> So I propose the following change, what do you think?
This breaks COMMIT AND CHAIN.

PFA v18: I've added a test for Nik's case and for COMMIT AND CHAIN. Now we need to fix stuff to pass this tests (I've
craftedoutput). 
We also need test for patchset step "Try to enable transaction_timeout before next command".

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Junwang Zhao
Date:
On Fri, Dec 29, 2023 at 6:00 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 28 Dec 2023, at 21:02, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >
> > Seems V5~V17 doesn't work as expected for Nikolay's case:
> >
>
> Yeah, that's a problem.
> > So I propose the following change, what do you think?
> This breaks COMMIT AND CHAIN.
>
> PFA v18: I've added a test for Nik's case and for COMMIT AND CHAIN. Now we need to fix stuff to pass this tests (I've
craftedoutput). 
> We also need test for patchset step "Try to enable transaction_timeout before next command".
>
> Thanks!

After exploring the code, I found scheduling the timeout in
`StartTransaction` might be a reasonable idea, all the chain
commands will call this function.

What concerns me is that it is also called by StartParallelWorkerTransaction,
I'm not sure if we should enable this timeout for parallel execution.

Thought?

>
>
> Best regards, Andrey Borodin.



--
Regards
Junwang Zhao

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 29 Dec 2023, at 16:00, Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> After exploring the code, I found scheduling the timeout in
> `StartTransaction` might be a reasonable idea, all the chain
> commands will call this function.
>
> What concerns me is that it is also called by StartParallelWorkerTransaction,
> I'm not sure if we should enable this timeout for parallel execution.

I think for parallel workers we should mimic statement_timeout. Because these workers have per-statemenent lifetime.


Best regards, Andrey Borodin.


Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 29 Dec 2023, at 16:15, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

PFA v20. Code steps are intact.

Further refactored tests:
    1. Check termination of active and idle queries (previously tests from Li were testing only termination of idle
query)
    2. Check timeout reschedule (even when last active query was 'SET transaction_timeout')
    3. Check that timeout is not rescheduled by new queries (Nik's case)


Do we have any other open items?
I've left 'make check-timeouts' in isolation directory, it's for development purposes. I think we should remove this
beforecommitting. Obviously, all patch steps are expected to be squashed before commit. 


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:


On 1 Jan 2024, at 19:28, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

 3. Check that timeout is not rescheduled by new queries (Nik's case)

The test of Nik's case was not stable enough together with COMMIT AND CHAIN. So I've separated these cases into different permutations.
Looking through CI logs it seems variation in sleeps and actual timeouts easily reach 30+ms. I'm not entirely sure we can reach 100% stable tests without too big timeouts.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:


On 3 Jan 2024, at 11:39, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:



On 1 Jan 2024, at 19:28, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

 3. Check that timeout is not rescheduled by new queries (Nik's case)

The test of Nik's case was not stable enough together with COMMIT AND CHAIN. So I've separated these cases into different permutations.
Looking through CI logs it seems variation in sleeps and actual timeouts easily reach 30+ms. I'm not entirely sure we can reach 100% stable tests without too big timeouts.


Best regards, Andrey Borodin.

<v21-0001-Introduce-transaction_timeout.patch>
<v21-0002-Add-better-tests-for-transaction_timeout.patch>
<v21-0003-Try-to-enable-transaction_timeout-before-next-co.patch>
<v21-0004-fix-reschedule-timeout-for-each-commmand.patch>
 I do not understand why, but mailing list did not pick patches that I sent. I'll retry.

Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 3 Jan 2024, at 16:46, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>  I do not understand why, but mailing list did not pick patches that I sent. I'll retry.


Sorry for the noise. Seems like Apple updated something in Mail.App couple of days ago and it started to use strange
"Apple-Mail"stuff by default. 
I see patches were attached, but were not recognized by mailing list archives and CFbot.
Now I've flipped everything to "plain text by default" everywhere. Hope that helps.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Wed, 03 Jan 2024 at 20:04, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 3 Jan 2024, at 16:46, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>>  I do not understand why, but mailing list did not pick patches that I sent. I'll retry.
>
>
> Sorry for the noise. Seems like Apple updated something in Mail.App couple of days ago and it started to use strange
"Apple-Mail"stuff by default.
 
> I see patches were attached, but were not recognized by mailing list archives and CFbot.
> Now I've flipped everything to "plain text by default" everywhere. Hope that helps.
>

Thanks for updating the patch, I find the test on Debian with mason failed [1].

Does the timeout is too short for testing? I see the timeouts for lock_timeout
and statement_timeout is more bigger than transaction_timeout.

[1]
https://api.cirrus-ci.com/v1/artifact/task/5490718928535552/testrun/build-32/testrun/isolation/isolation/regression.diffs



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 4 Jan 2024, at 07:14, Japin Li <japinli@hotmail.com> wrote:
>
> Does the timeout is too short for testing? I see the timeouts for lock_timeout
> and statement_timeout is more bigger than transaction_timeout.

Makes sense. Done. I've also put some effort into fine-tuning timeouts Nik's case tests. To have 100ms gap between
check,false positive and actual bug we had I had to use transaction_timeout = 300ms. Currently all tests take more than
1000ms!
But I do not see a way to make these tests both stable and short.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Peter Smith
Date:
2024-01 Commitfest.

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there was a CFbot test failure last time it was run [2]. Please have a
look and post an updated version if necessary.

======
[1] https://commitfest.postgresql.org/46/4040/
[2] https://cirrus-ci.com/task/4721191139672064

Kind Regards,
Peter Smith.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 22 Jan 2024, at 11:23, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi, This patch has a CF status of "Needs Review" [1], but it seems
> there was a CFbot test failure last time it was run [2]. Please have a
> look and post an updated version if necessary.
Thanks Peter!

I’ve inspected CI fails and they were caused by two different problems:
1. It’s unsafe for isaoltion tester to await transaction_timeout within a query. Usually it gets
FATAL: terminating connection due to transaction timeout
But if VM is a bit slow it can get occasional
PQconsumeInput failed: server closed the connection unexpectedly
So, currently all tests use “passive waiting”, in a session that will not timeout.

2. In some cases pg_sleep(0.1) were sleeping up to 200 ms. That was making s7 and s8 fail, because they rely on this
margin.
I’ve separated these tests into different test timeouts-long and increased margin to 300ms. Now tests run horrible 2431
ms.Moreover I’m afraid that on buildfarm we can have much randomly-slower machines so this test might be excluded. 
This test checks COMMIT AND CHAIN and flow of small queries (Nik’s case).

Also I’ve verified that every "enable_timeout_after(TRANSACTION_TIMEOUT)” and “disable_timeout(TRANSACTION_TIMEOUT)” is
necessaryand found that case of aborting "idle in transaction (aborted)” is not covered by tests. I’m not sure we need
atest for this. 
Japin, Junwang, what do you think?

Thanks!


Best regards, Andrey Borodin.


Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 26 Jan 2024, at 11:44, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
> 1. It’s unsafe for isaoltion tester to await transaction_timeout within a query. Usually it gets
> FATAL: terminating connection due to transaction timeout
> But if VM is a bit slow it can get occasional
> PQconsumeInput failed: server closed the connection unexpectedly
> So, currently all tests use “passive waiting”, in a session that will not timeout.


Oops, sorry, I’ve accidentally sent version without this fix.
Here it is.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Fri, 26 Jan 2024 at 14:44, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 22 Jan 2024, at 11:23, Peter Smith <smithpb2250@gmail.com> wrote:
>>
>> Hi, This patch has a CF status of "Needs Review" [1], but it seems
>> there was a CFbot test failure last time it was run [2]. Please have a
>> look and post an updated version if necessary.
> Thanks Peter!
>

Thanks for updating the patch.  Here are some comments for v24.

+       <para>
+        Terminate any session that spans longer than the specified amount of
+        time in transaction. The limit applies both to explicit transactions
+        (started with <command>BEGIN</command>) and to implicitly started
+        transaction corresponding to single statement. But this limit is not
+        applied to prepared transactions.
+        If this value is specified without units, it is taken as milliseconds.
+        A value of zero (the default) disables the timeout.
+       </para>
The sentence "But this limit is not applied to prepared transactions" is redundant,
since we have a paragraph to describe this later.

+
+       <para>
+        If <varname>transaction_timeout</varname> is shorter than
+        <varname>idle_in_transaction_session_timeout</varname> or <varname>statement_timeout</varname>
+        <varname>transaction_timeout</varname> will invalidate longer timeout.
+       </para>
+

Since we are already try to disable the timeouts, should we try to disable
them even if they are equal.

+
+       <para>
+        Prepared transactions are not subject for this timeout.
+       </para>

Maybe wrap this with <note> is a good idea.

> I’ve inspected CI fails and they were caused by two different problems:
> 1. It’s unsafe for isaoltion tester to await transaction_timeout within a query. Usually it gets
> FATAL: terminating connection due to transaction timeout
> But if VM is a bit slow it can get occasional
> PQconsumeInput failed: server closed the connection unexpectedly
> So, currently all tests use “passive waiting”, in a session that will not timeout.
>
> 2. In some cases pg_sleep(0.1) were sleeping up to 200 ms. That was making s7 and s8 fail, because they rely on this
margin.

I'm curious why this happened.

> I’ve separated these tests into different test timeouts-long and increased margin to 300ms. Now tests run horrible
2431ms. Moreover I’m afraid that on buildfarm we can have much randomly-slower machines so this test might be excluded. 
> This test checks COMMIT AND CHAIN and flow of small queries (Nik’s case).
>
> Also I’ve verified that every "enable_timeout_after(TRANSACTION_TIMEOUT)” and “disable_timeout(TRANSACTION_TIMEOUT)”
isnecessary and found that case of aborting "idle in transaction (aborted)” is not covered by tests. I’m not sure we
needa test for this. 

I see there is a test about idle_in_transaction_timeout and transaction_timeout.

Both of them only check the session, but don't check the reason, so we cannot
distinguish the reason they are terminated.  Right?

> Japin, Junwang, what do you think?

However, checking the reason on the timeout session may cause regression test
failed (as you point in 1), I don't strongly insist on it.

--
Best regards,
Japin Li.



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 26 Jan 2024, at 19:58, Japin Li <japinli@hotmail.com> wrote:
>
> Thanks for updating the patch.  Here are some comments for v24.
>
> +       <para>
> +        Terminate any session that spans longer than the specified amount of
> +        time in transaction. The limit applies both to explicit transactions
> +        (started with <command>BEGIN</command>) and to implicitly started
> +        transaction corresponding to single statement. But this limit is not
> +        applied to prepared transactions.
> +        If this value is specified without units, it is taken as milliseconds.
> +        A value of zero (the default) disables the timeout.
> +       </para>
> The sentence "But this limit is not applied to prepared transactions" is redundant,
> since we have a paragraph to describe this later.
Fixed.
>
> +
> +       <para>
> +        If <varname>transaction_timeout</varname> is shorter than
> +        <varname>idle_in_transaction_session_timeout</varname> or <varname>statement_timeout</varname>
> +        <varname>transaction_timeout</varname> will invalidate longer timeout.
> +       </para>
> +
>
> Since we are already try to disable the timeouts, should we try to disable
> them even if they are equal.

Well, we disable timeouts on equality. Fixed docs.

>
> +
> +       <para>
> +        Prepared transactions are not subject for this timeout.
> +       </para>
>
> Maybe wrap this with <note> is a good idea.
Done.

>
>> I’ve inspected CI fails and they were caused by two different problems:
>> 1. It’s unsafe for isaoltion tester to await transaction_timeout within a query. Usually it gets
>> FATAL: terminating connection due to transaction timeout
>> But if VM is a bit slow it can get occasional
>> PQconsumeInput failed: server closed the connection unexpectedly
>> So, currently all tests use “passive waiting”, in a session that will not timeout.
>>
>> 2. In some cases pg_sleep(0.1) were sleeping up to 200 ms. That was making s7 and s8 fail, because they rely on this
margin.
>
> I'm curious why this happened.
I think pg_sleep() cannot provide guarantees on when next query will be executed. In our case we need that isolation
testersee that sleep is over and continue in other session... 

>> I’ve separated these tests into different test timeouts-long and increased margin to 300ms. Now tests run horrible
2431ms. Moreover I’m afraid that on buildfarm we can have much randomly-slower machines so this test might be excluded. 
>> This test checks COMMIT AND CHAIN and flow of small queries (Nik’s case).
>>
>> Also I’ve verified that every "enable_timeout_after(TRANSACTION_TIMEOUT)” and “disable_timeout(TRANSACTION_TIMEOUT)”
isnecessary and found that case of aborting "idle in transaction (aborted)” is not covered by tests. I’m not sure we
needa test for this. 
>
> I see there is a test about idle_in_transaction_timeout and transaction_timeout.
>
> Both of them only check the session, but don't check the reason, so we cannot
> distinguish the reason they are terminated.  Right?
Yes.
>
>> Japin, Junwang, what do you think?
>
> However, checking the reason on the timeout session may cause regression test
> failed (as you point in 1), I don't strongly insist on it.

Indeed, if we check a reason of FATAL timeouts - we get flaky tests.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Japin Li
Date:
On Tue, 30 Jan 2024 at 14:22, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 26 Jan 2024, at 19:58, Japin Li <japinli@hotmail.com> wrote:
>>
>> Thanks for updating the patch.  Here are some comments for v24.
>>
>> +       <para>
>> +        Terminate any session that spans longer than the specified amount of
>> +        time in transaction. The limit applies both to explicit transactions
>> +        (started with <command>BEGIN</command>) and to implicitly started
>> +        transaction corresponding to single statement. But this limit is not
>> +        applied to prepared transactions.
>> +        If this value is specified without units, it is taken as milliseconds.
>> +        A value of zero (the default) disables the timeout.
>> +       </para>
>> The sentence "But this limit is not applied to prepared transactions" is redundant,
>> since we have a paragraph to describe this later.
> Fixed.
>>
>> +
>> +       <para>
>> +        If <varname>transaction_timeout</varname> is shorter than
>> +        <varname>idle_in_transaction_session_timeout</varname> or <varname>statement_timeout</varname>
>> +        <varname>transaction_timeout</varname> will invalidate longer timeout.
>> +       </para>
>> +
>>
>> Since we are already try to disable the timeouts, should we try to disable
>> them even if they are equal.
>
> Well, we disable timeouts on equality. Fixed docs.
>
>>
>> +
>> +       <para>
>> +        Prepared transactions are not subject for this timeout.
>> +       </para>
>>
>> Maybe wrap this with <note> is a good idea.
> Done.
>

Thanks for updating the patch.  LGTM.

If there is no other objections, I'll change it to ready for committer
next Monday.



Re: Transaction timeout

From
Andrey Borodin
Date:

> On 31 Jan 2024, at 14:27, Japin Li <japinli@hotmail.com> wrote:
>
> LGTM.
>
> If there is no other objections, I'll change it to ready for committer
> next Monday.

I think we have a quorum, so I decided to go ahead and flipped status to RfC. Thanks!


Best regards, Andrey Borodin.


Re: Transaction timeout

From
Alexander Korotkov
Date:
Hi!

On Wed, Jan 31, 2024 at 11:57 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > On 31 Jan 2024, at 14:27, Japin Li <japinli@hotmail.com> wrote:
> >
> > LGTM.
> >
> > If there is no other objections, I'll change it to ready for committer
> > next Monday.
>
> I think we have a quorum, so I decided to go ahead and flipped status to RfC. Thanks!

I checked this patch.  Generally I look good.  I've slightly revised that.

I think there is one unaddressed concern by Andres Freund [1] about
the overhead of this patch by adding extra branches and function calls
in the case transaction_timeout is disabled.  I tried to measure the
overhead of this patch using a pgbench script containing 20 semicolons
(20 empty statements in 20 empty transactions).  I didn't manage to
find measurable overhead or change of performance profile (I used
XCode Instruments on my x86 MacBook).  One thing, which I still found
possible to do is to avoid unconditional calls to
get_timeout_active(TRANSACTION_TIMEOUT).  Instead I put responsibility
for disabling timeout after GUC disables the transaction_timeout
assign hook.

I removed the TODO comment from _doSetFixedOutputState().  I think
backup restore is the operation where slow commands and slow
transactions are expected, and it's natural to disable
transaction_timeout among other timeouts there.  And the existing
comment clarifies that.

Also I made some grammar fixes to docs and comments.

I'm going to push this if there are no objections.

Links.
1. https://www.postgresql.org/message-id/20221206011050.s6hapukjqha35hud%40alap3.anarazel.de

------
Regards,
Alexander Korotkov

Attachment

Re: Transaction timeout

From
Andres Freund
Date:
Hi,

On 2024-02-13 23:42:35 +0200, Alexander Korotkov wrote:
> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index 464858117e0..a124ba59330 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -2139,6 +2139,10 @@ StartTransaction(void)
>       */
>      s->state = TRANS_INPROGRESS;
>  
> +    /* Schedule transaction timeout */
> +    if (TransactionTimeout > 0)
> +        enable_timeout_after(TRANSACTION_TIMEOUT, TransactionTimeout);
> +
>      ShowTransactionState("StartTransaction");
>  }

Isn't it a problem that all uses of StartTransaction() trigger a timeout, but
transaction commit/abort don't?  What if e.g. logical replication apply starts
a transaction, commits it, and then goes idle? The timer would still be
active, afaict?

I don't think it works well to enable timeouts in xact.c and to disable them
in PostgresMain().


> @@ -4491,12 +4511,18 @@ PostgresMain(const char *dbname, const char *username)
>                  pgstat_report_activity(STATE_IDLEINTRANSACTION_ABORTED, NULL);
>  
>                  /* Start the idle-in-transaction timer */
> -                if (IdleInTransactionSessionTimeout > 0)
> +                if (IdleInTransactionSessionTimeout > 0
> +                    && (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
>                  {
>                      idle_in_transaction_timeout_enabled = true;
>                      enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
>                                           IdleInTransactionSessionTimeout);
>                  }
> +
> +                /* Schedule or reschedule transaction timeout */
> +                if (TransactionTimeout > 0 && !get_timeout_active(TRANSACTION_TIMEOUT))
> +                    enable_timeout_after(TRANSACTION_TIMEOUT,
> +                                         TransactionTimeout);
>              }
>              else if (IsTransactionOrTransactionBlock())
>              {
> @@ -4504,12 +4530,18 @@ PostgresMain(const char *dbname, const char *username)
>                  pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
>  
>                  /* Start the idle-in-transaction timer */
> -                if (IdleInTransactionSessionTimeout > 0)
> +                if (IdleInTransactionSessionTimeout > 0
> +                    && (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
>                  {
>                      idle_in_transaction_timeout_enabled = true;
>                      enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
>                                           IdleInTransactionSessionTimeout);
>                  }
> +
> +                /* Schedule or reschedule transaction timeout */
> +                if (TransactionTimeout > 0 && !get_timeout_active(TRANSACTION_TIMEOUT))
> +                    enable_timeout_after(TRANSACTION_TIMEOUT,
> +                                         TransactionTimeout);
>              }
>              else
>              {

Why do we need to do anything in these cases if the timer is started in
StartTransaction()?


> new file mode 100644
> index 00000000000..ce2c9a43011
> --- /dev/null
> +++ b/src/test/isolation/specs/timeouts-long.spec
> @@ -0,0 +1,35 @@
> +# Tests for transaction timeout that require long wait times
> +
> +session s7
> +step s7_begin
> +{
> +    BEGIN ISOLATION LEVEL READ COMMITTED;
> +    SET transaction_timeout = '1s';
> +}
> +step s7_commit_and_chain { COMMIT AND CHAIN; }
> +step s7_sleep    { SELECT pg_sleep(0.6); }
> +step s7_abort    { ABORT; }
> +
> +session s8
> +step s8_begin
> +{
> +    BEGIN ISOLATION LEVEL READ COMMITTED;
> +    SET transaction_timeout = '900ms';
> +}
> +# to test that quick query does not restart transaction_timeout
> +step s8_select_1 { SELECT 1; }
> +step s8_sleep    { SELECT pg_sleep(0.6); }
> +
> +session checker
> +step checker_sleep    { SELECT pg_sleep(0.3); }

Isn't this test going to be very fragile on busy / slow machines? What if the
pg_sleep() takes one second, because there were other tasks to schedule?  I'd
be surprised if this didn't fail under valgrind, for example.


Greetings,

Andres Freund



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:
Alexander, thanks for pushing this! This is small but very awaited feature.

> On 16 Feb 2024, at 02:08, Andres Freund <andres@anarazel.de> wrote:
>
> Isn't this test going to be very fragile on busy / slow machines? What if the
> pg_sleep() takes one second, because there were other tasks to schedule?  I'd
> be surprised if this didn't fail under valgrind, for example.

Even more robust tests that were bullet-proof in CI previously exhibited some failures on buildfarm. Currently there
are5 failures through this weekend. 
Failing tests are testing interaction of idle_in_transaction_session_timeout vs transaction_timeout(5), and
reschedulingtransaction_timeout(6). 
Symptoms:

[0] transaction timeout occurs when it is being scheduled. Seems like SET was running to long.
 step s6_begin: BEGIN ISOLATION LEVEL READ COMMITTED;
 step s6_tt: SET statement_timeout = '1s'; SET transaction_timeout = '10ms';
+s6: FATAL:  terminating connection due to transaction timeout
 step checker_sleep: SELECT pg_sleep(0.1);

[1] transaction timeout 10ms is not detected after 1s
step s6_check: SELECT count(*) FROM pg_stat_activity WHERE application_name = 'isolation/timeouts/s6';
 count
 -----
-    0
+    1

[2] transaction timeout is not detected in both session 5 and session 6.

So far not signle animal reported failures twice, so it's hard to say anything about frequency. But it seems to be
significantsource of failures. 

So far I have these ideas:

1. Remove test sessions 5 and 6. But it seems a little strange that session 3 did  not fail at all (it is testing
interactionof statement_timeout and transaction_timeout). This test is very similar to test sessiont 5... 
2. Increase wait times.
step checker_sleep    { SELECT pg_sleep(0.1); }
Seems not enough to observe backend timed out from pg_stat_activity. But this won't help from [0].
3. Reuse waiting INJECTION_POINT from [3] to make timeout tests deterministic and safe from race conditions. With
waitinginjection points we can wait as much as needed in current environment. 

Any advices are welcome.


Best regards, Andrey Borodin.


[0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-16%2020%3A06%3A51
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-02-16%2001%3A45%3A10
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2024-02-17%2001%3A55%3A45
[3] https://www.postgresql.org/message-id/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru


Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 18 Feb 2024, at 22:16, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> But it seems a little strange that session 3 did  not fail at all
It was only coincidence. Any test that verifies FATALing out in 100ms can fail, see new failure here [0].

In a nearby thread Michael is proposing injections points that can wait and be awaken. So I propose following course of
action:
1. Remove all tests that involve pg_stat_activity test of FATALed session (any permutation with checker_sleep step)
2. Add idle_in_transaction_session_timeout, statement_timeout and transaction_timeout tests when injection points
featuresget committed. 

Alexander, what do you think?

Best regards, Andrey Borodin.


[0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2024-02-18%2022%3A23%3A45


Re: Transaction timeout

From
Japin Li
Date:
On Mon, 19 Feb 2024 at 17:14, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>> On 18 Feb 2024, at 22:16, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>> But it seems a little strange that session 3 did  not fail at all
> It was only coincidence. Any test that verifies FATALing out in 100ms can fail, see new failure here [0].
>
> In a nearby thread Michael is proposing injections points that can wait and be awaken. So I propose following course
ofaction:
 
> 1. Remove all tests that involve pg_stat_activity test of FATALed session (any permutation with checker_sleep step)
> 2. Add idle_in_transaction_session_timeout, statement_timeout and transaction_timeout tests when injection points
featuresget committed.
 
>

+1

> Alexander, what do you think?
>



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 19 Feb 2024, at 15:17, Japin Li <japinli@hotmail.com> wrote:
>
>
> +1

PFA patch set of 4 patches:
1. remove all potential flaky tests. BTW recently we had a bingo when 3 of them failed together [0]
2-3. waiting injection points patchset by Michael Paquier, intact v2 from nearby thread.
4. prototype of simple TAP tests for timeouts.

I did not add a test for statement_timeout, because it still have good coverage in isolation tests. But added test for
idle_sessoin_timeout.
Maybe these tests could be implemented with NOTICE injection points (not requiring steps 2-3), but I'm afraid that they
mightbe flaky too: FATALed connection might not send information necesary for test success (we will see something like
"PQconsumeInputfailed: server closed the connection unexpectedly" as in [1]). 


Best regards, Andrey Borodin.

[0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-20%2010%3A20%3A13
[1] https://www.postgresql.org/message-id/flat/CAAhFRxiQsRs2Eq5kCo9nXE3HTugsAAJdSQSmxncivebAxdmBjQ%40mail.gmail.com


Attachment

Re: Transaction timeout

From
Alexander Korotkov
Date:
Hi, Andrey!

On Thu, Feb 22, 2024 at 7:23 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > On 19 Feb 2024, at 15:17, Japin Li <japinli@hotmail.com> wrote:
> >
> >
> > +1
>
> PFA patch set of 4 patches:
> 1. remove all potential flaky tests. BTW recently we had a bingo when 3 of them failed together [0]
> 2-3. waiting injection points patchset by Michael Paquier, intact v2 from nearby thread.
> 4. prototype of simple TAP tests for timeouts.
>
> I did not add a test for statement_timeout, because it still have good coverage in isolation tests. But added test
foridle_sessoin_timeout. 
> Maybe these tests could be implemented with NOTICE injection points (not requiring steps 2-3), but I'm afraid that
theymight be flaky too: FATALed connection might not send information necesary for test success (we will see something
like"PQconsumeInput failed: server closed the connection unexpectedly" as in [1]). 

Thank you for the patches.  I've pushed the 0001 patch to avoid
further failures on buildfarm.  Let 0004 wait till injections points
by Mechael are committed.

------
Regards,
Alexander Korotkov



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 25 Feb 2024, at 21:50, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Thank you for the patches.  I've pushed the 0001 patch to avoid
> further failures on buildfarm.  Let 0004 wait till injections points
> by Mechael are committed.

Thanks!

All prerequisites are committed. I propose something in a line with this patch.


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Alexander Korotkov
Date:
On Wed, Mar 6, 2024 at 10:22 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > On 25 Feb 2024, at 21:50, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > Thank you for the patches.  I've pushed the 0001 patch to avoid
> > further failures on buildfarm.  Let 0004 wait till injections points
> > by Mechael are committed.
>
> Thanks!
>
> All prerequisites are committed. I propose something in a line with this patch.

Thank you.  I took a look at the patch.  Should we also check the
relevant message after the timeout is fired?  We could check it in
psql stderr or log for that.

------
Regards,
Alexander Korotkov



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 7 Mar 2024, at 00:55, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 10:22 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>> On 25 Feb 2024, at 21:50, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>>
>>> Thank you for the patches.  I've pushed the 0001 patch to avoid
>>> further failures on buildfarm.  Let 0004 wait till injections points
>>> by Mechael are committed.
>>
>> Thanks!
>>
>> All prerequisites are committed. I propose something in a line with this patch.
>
> Thank you.  I took a look at the patch.  Should we also check the
> relevant message after the timeout is fired?  We could check it in
> psql stderr or log for that.

PFA version which checks log output.
But I could not come up with a proper use of BackgroundPsql->query_until() to check outputs. And there are multiple
possibleerrors. 

We can copy test from src/bin/psql/t/001_basic.pl:

# test behavior and output on server crash
my ($ret, $out, $err) = $node->psql('postgres',
"SELECT 'before' AS running;\n"
. "SELECT pg_terminate_backend(pg_backend_pid());\n"
. "SELECT 'AFTER' AS not_running;\n");

is($ret, 2, 'server crash: psql exit code');
like($out, qr/before/, 'server crash: output before crash');
ok($out !~ qr/AFTER/, 'server crash: no output after crash');
is( $err,
'psql:<stdin>:2: FATAL: terminating connection due to administrator command
psql:<stdin>:2: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:<stdin>:2: error: connection to server was lost',
'server crash: error message’);

But I do not see much value in this.
What do you think?


Best regards, Andrey Borodin.


Attachment

Re: Transaction timeout

From
Alexander Korotkov
Date:
On Mon, Mar 11, 2024 at 12:53 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > On 7 Mar 2024, at 00:55, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > On Wed, Mar 6, 2024 at 10:22 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>> On 25 Feb 2024, at 21:50, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >>>
> >>> Thank you for the patches.  I've pushed the 0001 patch to avoid
> >>> further failures on buildfarm.  Let 0004 wait till injections points
> >>> by Mechael are committed.
> >>
> >> Thanks!
> >>
> >> All prerequisites are committed. I propose something in a line with this patch.
> >
> > Thank you.  I took a look at the patch.  Should we also check the
> > relevant message after the timeout is fired?  We could check it in
> > psql stderr or log for that.
>
> PFA version which checks log output.
> But I could not come up with a proper use of BackgroundPsql->query_until() to check outputs. And there are multiple
possibleerrors. 
>
> We can copy test from src/bin/psql/t/001_basic.pl:
>
> # test behavior and output on server crash
> my ($ret, $out, $err) = $node->psql('postgres',
> "SELECT 'before' AS running;\n"
> . "SELECT pg_terminate_backend(pg_backend_pid());\n"
> . "SELECT 'AFTER' AS not_running;\n");
>
> is($ret, 2, 'server crash: psql exit code');
> like($out, qr/before/, 'server crash: output before crash');
> ok($out !~ qr/AFTER/, 'server crash: no output after crash');
> is( $err,
> 'psql:<stdin>:2: FATAL: terminating connection due to administrator command
> psql:<stdin>:2: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> psql:<stdin>:2: error: connection to server was lost',
> 'server crash: error message’);
>
> But I do not see much value in this.
> What do you think?

I think if checking psql stderr is problematic, checking just logs is
fine.  Could we wait for the relevant log messages one by one with
$node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?

------
Regards,
Alexander Korotkov



Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 11 Mar 2024, at 16:18, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> 
> I think if checking psql stderr is problematic, checking just logs is
> fine.  Could we wait for the relevant log messages one by one with
> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?

PFA version with $node->wait_for_log()

Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Alexander Korotkov
Date:
On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > On 11 Mar 2024, at 16:18, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > I think if checking psql stderr is problematic, checking just logs is
> > fine.  Could we wait for the relevant log messages one by one with
> > $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
>
> PFA version with $node->wait_for_log()

I've slightly revised the patch.  I'm going to push it if no objections.

------
Regards,
Alexander Korotkov

Attachment

Re: Transaction timeout

From
"Andrey M. Borodin"
Date:

> On 13 Mar 2024, at 05:23, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>> On 11 Mar 2024, at 16:18, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>>
>>> I think if checking psql stderr is problematic, checking just logs is
>>> fine.  Could we wait for the relevant log messages one by one with
>>> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
>>
>> PFA version with $node->wait_for_log()
>
> I've slightly revised the patch.  I'm going to push it if no objections.

One small note: log_offset was updated, but was not used.

Thanks!


Best regards, Andrey Borodin.

Attachment

Re: Transaction timeout

From
Alexander Korotkov
Date:
On Wed, Mar 13, 2024 at 7:56 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> > On 13 Mar 2024, at 05:23, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>> On 11 Mar 2024, at 16:18, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >>>
> >>> I think if checking psql stderr is problematic, checking just logs is
> >>> fine.  Could we wait for the relevant log messages one by one with
> >>> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
> >>
> >> PFA version with $node->wait_for_log()
> >
> > I've slightly revised the patch.  I'm going to push it if no objections.
>
> One small note: log_offset was updated, but was not used.

Thank you. This is the updated version.

------
Regards,
Alexander Korotkov

Attachment