Re: [HACKERS] Walsender timeouts and large transactions - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [HACKERS] Walsender timeouts and large transactions
Date
Msg-id CAMsr+YF8kTgR3Mz7LcORrj4jF5jP9NBtaSd54Eu5-5fqM3x02g@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Walsender timeouts and large transactions  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On 9 August 2017 at 21:23, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
On 02/08/17 19:35, Yura Sokolov wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:       not tested
> Spec compliant:           not tested
> Documentation:            not tested
>
> There is no check for (last_reply_timestamp <= 0 || wal_sender_timeout <= 0) as in other places
> (in WalSndKeepaliveIfNecessary for example).
>
> I don't think moving update of 'now' down to end of loop body is correct:
> there are calls to ProcessConfigFile with SyncRepInitConfig, ProcessRepliesIfAny that can
> last non-negligible time. It could lead to over sleeping due to larger computed sleeptime.
> Though I could be mistaken.
>
> I'm not sure about moving `if (!pg_is_send_pending())` in a body loop after WalSndKeepaliveIfNecessary.
> Is it necessary? But it looks harmless at least.
>

We also need to do actual timeout handing so that the timeout is not
deferred to the end of the transaction (Which is why I moved `if
(!pg_is_send_pending())` under WalSndCheckTimeOut() and
WalSndKeepaliveIfNecessary() calls).

> Could patch be reduced to check after first `if (!pg_is_sendpending())` ? like:
>
>       if (!pq_is_send_pending())
> -             return;
> +     {
> +             if (last_reply_timestamp <= 0 || wal_sender_timeout <= 0)
> +             {
> +                     CHECK_FOR_INTERRUPTS();
> +                     return;
> +             }
> +             if (now <= TimestampTzPlusMilliseconds(last_reply_timestamp, wal_sender_timeout / 2))
> +                     return;
> +     }
>
> If not, what problem prevents?

We should do CHECK_FOR_INTERRUPTS() independently of pq_is_send_pending
so that it's possible to stop walsender while it's processing large
transaction.


Is there any chance of getting this bugfix into Pg 10?

We've just cut back branches, so it'd be a sensible time. 



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was:[BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)
Next
From: Vaishnavi Prabakaran
Date:
Subject: Re: [HACKERS] PATCH: Batch/pipelining support for libpq