Thread: [HACKERS] Optional message to user when terminating/cancelling backend

[HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

19 June 2017, 21:24:43

When terminating, or cancelling, a backend it’s currently not possible to let
the signalled session know *why* it was dropped.  This has nagged me in the
past and now it happened to come up again, so I took a stab at this.  The
attached patch implements the ability to pass an optional text message to the
signalled session which is included in the error message:

  SELECT pg_terminate_backend(<pid> [, message]);
  SELECT pg_cancel_backend(<pid> [, message]);

Right now the message is simply appended on the error message, not sure if
errdetail or errhint would be better? Calling:

  select pg_terminate_backend(<pid>, 'server rebooting');

..leads to:

  FATAL:  terminating connection due to administrator command: "server rebooting"

Omitting the message invokes the command just like today.

The message is stored in a new shmem area which is checked when the session is
aborted.  To keep things simple a small buffer is kept per backend for the
message.  If deemed too costly, keeping a central buffer from which slabs are
allocated can be done (but seemed rather complicated for little gain compared
to the quite moderate memory spend.)

cheers ./daniel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

terminate_msg_v2.patch

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Pavel Stehule

Date:

19 June 2017, 21:40:35

2017-06-19 20:24 GMT+02:00 Daniel Gustafsson <daniel@yesql.se>:

When terminating, or cancelling, a backend it’s currently not possible to let
the signalled session know *why* it was dropped. This has nagged me in the
past and now it happened to come up again, so I took a stab at this. The
attached patch implements the ability to pass an optional text message to the
signalled session which is included in the error message:

SELECT pg_terminate_backend(<pid> [, message]);
SELECT pg_cancel_backend(<pid> [, message]);

Right now the message is simply appended on the error message, not sure if
errdetail or errhint would be better? Calling:

select pg_terminate_backend(<pid>, 'server rebooting');

..leads to:

FATAL: terminating connection due to administrator command: "server rebooting"

Omitting the message invokes the command just like today.

The message is stored in a new shmem area which is checked when the session is
aborted. To keep things simple a small buffer is kept per backend for the
message. If deemed too costly, keeping a central buffer from which slabs are
allocated can be done (but seemed rather complicated for little gain compared
to the quite moderate memory spend.)

cheers ./daniel

very good idea

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Satyanarayana Narlapuram

Date:

19 June 2017, 22:08:37

+1.

This really helps PostgreSQL Azure service as well. When we are doing the upgrades to the service, instead of abruptly terminating the sessions we can provide this message.

Thanks,

Satya

From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Pavel Stehule
Sent: Monday, June 19, 2017 11:41 AM
To: Daniel Gustafsson <daniel@yesql.se>
Cc: PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Optional message to user when terminating/cancelling backend

2017-06-19 20:24 GMT+02:00 Daniel Gustafsson <daniel@yesql.se>:

When terminating, or cancelling, a backend it’s currently not possible to let
the signalled session know *why* it was dropped. This has nagged me in the
past and now it happened to come up again, so I took a stab at this. The
attached patch implements the ability to pass an optional text message to the
signalled session which is included in the error message:

SELECT pg_terminate_backend(<pid> [, message]);
SELECT pg_cancel_backend(<pid> [, message]);

Right now the message is simply appended on the error message, not sure if
errdetail or errhint would be better? Calling:

select pg_terminate_backend(<pid>, 'server rebooting');

..leads to:

FATAL: terminating connection due to administrator command: "server rebooting"

Omitting the message invokes the command just like today.

The message is stored in a new shmem area which is checked when the session is
aborted. To keep things simple a small buffer is kept per backend for the
message. If deemed too costly, keeping a central buffer from which slabs are
allocated can be done (but seemed rather complicated for little gain compared
to the quite moderate memory spend.)

cheers ./daniel

very good idea

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Alvaro Herrera

Date:

20 June 2017, 21:54:25

Satyanarayana Narlapuram wrote:
> +1.
> 
> This really helps PostgreSQL Azure service as well. When we are doing
> the upgrades to the service, instead of abruptly terminating the
> sessions we can provide this message.

I think you mean "in addition to" rather than "instead of".

Unless you have a lot of users running psql manually, I don't see how
this is actually very useful or actionable.  What would the user do with
the information?  Hopefully your users already trust that you'd keep the
downtime to the minimum possible.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

"David G. Johnston"

Date:

20 June 2017, 22:43:39

On Tue, Jun 20, 2017 at 11:54 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Satyanarayana Narlapuram wrote:

> Unless you have a lot of users running psql manually, I don't see how
> this is actually very useful or actionable.  What would the user do with
> the information?  Hopefully your users already trust that you'd keep the
> downtime to the minimum possible.
>

Why wouldn't this be useful in application logs?  Spurious dropped
connections during application execution would be alarming.  Seeing a
message from the DBA when looking into those would be a painless and
quick way to alleviate stress.

pg_cancel_backend(<pid>, 'please try not to leave sessions in an "idle
in transaction" state...') would also seem like a useful message to
communicate; to user or application.  Sure, some of this can, and
maybe would also need to, be done out-of-band but this communication
channel seems worthy enough to at least evaluate the provided
implementation.

David J.

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Satyanarayana Narlapuram

Date:

20 June 2017, 22:49:59

Agree with David on the general usefulness of this channel. Not that Azure has this implementation or proposal today,
butas a managed service this channel of communication is worth. For example, DBA / service can set a policy that if
certainsession exceeds the resource usage DBA can kill it and communicate the same. For example, too many locks, lot of
IOactivity, large open transactions etc. The messages will help application developer to tune their workloads
appropriately.

Thanks,
Satya

-----Original Message-----
From: David G. Johnston [mailto:david.g.johnston@gmail.com] 
Sent: Tuesday, June 20, 2017 12:44 PM
To: Alvaro Herrera <alvherre@2ndquadrant.com>
Cc: Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com>; Pavel Stehule <pavel.stehule@gmail.com>; Daniel
Gustafsson<daniel@yesql.se>; PostgreSQL mailing lists <pgsql-hackers@postgresql.org>

Subject: Re: [HACKERS] Optional message to user when terminating/cancelling backend

On Tue, Jun 20, 2017 at 11:54 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Satyanarayana Narlapuram wrote:

> Unless you have a lot of users running psql manually, I don't see how 
> this is actually very useful or actionable.  What would the user do 
> with the information?  Hopefully your users already trust that you'd 
> keep the downtime to the minimum possible.
>

Why wouldn't this be useful in application logs?  Spurious dropped connections during application execution would be
alarming. Seeing a message from the DBA when looking into those would be a painless and quick way to alleviate stress.

pg_cancel_backend(<pid>, 'please try not to leave sessions in an "idle in transaction" state...') would also seem like
auseful message to communicate; to user or application.  Sure, some of this can, and maybe would also need to, be done
out-of-bandbut this communication channel seems worthy enough to at least evaluate the provided implementation.

David J.

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Andres Freund

Date:

20 June 2017, 23:01:35

Hi,

On 2017-06-19 20:24:43 +0200, Daniel Gustafsson wrote:
> When terminating, or cancelling, a backend it’s currently not possible to let
> the signalled session know *why* it was dropped.  This has nagged me in the
> past and now it happened to come up again, so I took a stab at this.  The
> attached patch implements the ability to pass an optional text message to the
> signalled session which is included in the error message:
> 
>   SELECT pg_terminate_backend(<pid> [, message]);
>   SELECT pg_cancel_backend(<pid> [, message]);
> 
> Right now the message is simply appended on the error message, not sure if
> errdetail or errhint would be better? Calling:
> 
>   select pg_terminate_backend(<pid>, 'server rebooting');
> 
> ..leads to:
> 
>   FATAL:  terminating connection due to administrator command: "server rebooting"
> 
> Omitting the message invokes the command just like today.

For extensions it'd also be useful if it'd be possible to overwrite the
error code.  E.g. for citus there's a distributed deadlock detector,
running out of process because there's no way to interrupt lock waits
locally, and we've to do some ugly hacking to generate proper error
messages and code from another session.

- Andres

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Michael Paquier

Date:

21 June 2017, 06:06:33

On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> The message is stored in a new shmem area which is checked when the session is
> aborted.  To keep things simple a small buffer is kept per backend for the
> message.  If deemed too costly, keeping a central buffer from which slabs are
> allocated can be done (but seemed rather complicated for little gain compared
> to the quite moderate memory spend.)

I think that you are right to take the approach with a per-backend
slot. This will avoid complications related to entry removals and
locking issues. There would be scaling issues as well if things get
very signaled for a lot of backends.

+#define MAX_CANCEL_MSG 128
That looks enough.

+           LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE);
+
+           strlcpy(slot->message, message, sizeof(slot->message));
+           slot->len = strlen(message);
Why not using one spin lock per slot and save it in BackendCancelShmemStruct?

+   pid_t       pid = PG_GETARG_INT32(0);
+   char       *msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
+
+   PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg));
It would be more solid to add some error handling for messages that
are too long, or at least truncate the message if too long.
-- 
Michael

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Yugo Nagata

Date:

21 June 2017, 16:57:40

Hi,

Here are some comments for the patch.

+Datum
+pg_cancel_backend(PG_FUNCTION_ARGS)
+{
+   PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL));
+}
+Datum
+pg_cancel_backend_msg(PG_FUNCTION_ARGS)
+{
+   pid_t       pid = PG_GETARG_INT32(0);
+   char       *msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
+
+   PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg));
+}

It would be better to insert a blank line between these functions.

+/*
+ * Sets a cancellation message for the backend with the specified pid and
+ * returns the length of message actually created. If the returned length
+ * is equal to the length of the message parameter, truncation may have
+ * occurred. If the backend wasn't found and no message was set, -1 is
+ * returned.
+ */

It seems to me that this comment is incorrect.
"If the returned length is not equal to the length of the message parameter,"

is right, isn't it?

In addition, the last statement would be 
"If the backend wasn't found, -1 is returned. Otherwize, if no message was set, 0 is returned."


+           strlcpy(slot->message, message, sizeof(slot->message));
+           slot->len = strlen(message);
+
+           LWLockRelease(BackendCancelLock);
+           return slot->len;

If SetBackendCancelMessage() has to return the length of message actually created,
slot->len should be strlen(slot->message) instead of strlen(message).
In the current code, when the return value and slot->len is always set
to the length of the passed message parameter.


Regards,

On Mon, 19 Jun 2017 20:24:43 +0200
Daniel Gustafsson <daniel@yesql.se> wrote:

> When terminating, or cancelling, a backend it’s currently not possible to let
> the signalled session know *why* it was dropped.  This has nagged me in the
> past and now it happened to come up again, so I took a stab at this.  The
> attached patch implements the ability to pass an optional text message to the
> signalled session which is included in the error message:
> 
>   SELECT pg_terminate_backend(<pid> [, message]);
>   SELECT pg_cancel_backend(<pid> [, message]);
> 
> Right now the message is simply appended on the error message, not sure if
> errdetail or errhint would be better? Calling:
> 
>   select pg_terminate_backend(<pid>, 'server rebooting');
> 
> ..leads to:
> 
>   FATAL:  terminating connection due to administrator command: "server rebooting"
> 
> Omitting the message invokes the command just like today.
> 
> The message is stored in a new shmem area which is checked when the session is
> aborted.  To keep things simple a small buffer is kept per backend for the
> message.  If deemed too costly, keeping a central buffer from which slabs are
> allocated can be done (but seemed rather complicated for little gain compared
> to the quite moderate memory spend.) 
> 
> cheers ./daniel
> 


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Yugo Nagata

Date:

21 June 2017, 17:30:48

On Wed, 21 Jun 2017 12:06:33 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> > The message is stored in a new shmem area which is checked when the session is
> > aborted.  To keep things simple a small buffer is kept per backend for the
> > message.  If deemed too costly, keeping a central buffer from which slabs are
> > allocated can be done (but seemed rather complicated for little gain compared
> > to the quite moderate memory spend.)
> 
> I think that you are right to take the approach with a per-backend
> slot. This will avoid complications related to entry removals and
> locking issues. There would be scaling issues as well if things get
> very signaled for a lot of backends.
> 
> +#define MAX_CANCEL_MSG 128
> That looks enough.
> 
> +           LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE);
> +
> +           strlcpy(slot->message, message, sizeof(slot->message));
> +           slot->len = strlen(message);
> Why not using one spin lock per slot and save it in BackendCancelShmemStruct?

+1

I found an example that a spin lock is used during strlcpy in WalReceiverMain().

> 
> +   pid_t       pid = PG_GETARG_INT32(0);
> +   char       *msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
> +
> +   PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg));
> It would be more solid to add some error handling for messages that
> are too long, or at least truncate the message if too long.

I agree that error handling for too long messages is needed.
Although long messages are truncated in SetBackendCancelMessage(),
it is better to inform users that the message they can read was
truncated one. Or, maybe we should prohibit too long message
is passed in pg_teminate_backend()


> -- 
> Michael
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

21 June 2017, 17:42:47

> On 21 Jun 2017, at 16:30, Yugo Nagata <nagata@sraoss.co.jp> wrote:
>
> On Wed, 21 Jun 2017 12:06:33 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>
>> On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> The message is stored in a new shmem area which is checked when the session is
>>> aborted.  To keep things simple a small buffer is kept per backend for the
>>> message.  If deemed too costly, keeping a central buffer from which slabs are
>>> allocated can be done (but seemed rather complicated for little gain compared
>>> to the quite moderate memory spend.)
>>
>> I think that you are right to take the approach with a per-backend
>> slot. This will avoid complications related to entry removals and
>> locking issues. There would be scaling issues as well if things get
>> very signaled for a lot of backends.
>>
>> +#define MAX_CANCEL_MSG 128
>> That looks enough.
>>
>> +           LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE);
>> +
>> +           strlcpy(slot->message, message, sizeof(slot->message));
>> +           slot->len = strlen(message);
>> Why not using one spin lock per slot and save it in BackendCancelShmemStruct?
>
> +1
>
> I found an example that a spin lock is used during strlcpy in WalReceiverMain().

Yeah I agree as well, will fix.

>> +   pid_t       pid = PG_GETARG_INT32(0);
>> +   char       *msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
>> +
>> +   PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg));
>> It would be more solid to add some error handling for messages that
>> are too long, or at least truncate the message if too long.
>
> I agree that error handling for too long messages is needed.
> Although long messages are truncated in SetBackendCancelMessage(),
> it is better to inform users that the message they can read was
> truncated one. Or, maybe we should prohibit too long message
> is passed in pg_teminate_backend()

The message is truncated in SetBackendCancelMessage() for safety, but
pg_{cancel|terminate}_backend() could throw an error on too long message, or
warning truncation, to the caller as well.  Personally I think a warning is the
appropriate response, but I don’t really have a strong opinion.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Michael Paquier

Date:

22 June 2017, 03:24:54

On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
> The message is truncated in SetBackendCancelMessage() for safety, but
> pg_{cancel|terminate}_backend() could throw an error on too long message, or
> warning truncation, to the caller as well.  Personally I think a warning is the
> appropriate response, but I don’t really have a strong opinion.

And a NOTICE? That's what happens for relation name truncation. You
are right that having a check in SetBackendCancelMessage() makes the
most sense as bgworkers could just call the low level API. Isn't the
concept actually closer to just a backend message? This slot could be
used for other purposes than cancellation.
--
Michael

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Yugo Nagata

Date:

22 June 2017, 11:24:57

On Thu, 22 Jun 2017 09:24:54 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
> > The message is truncated in SetBackendCancelMessage() for safety, but
> > pg_{cancel|terminate}_backend() could throw an error on too long message, or
> > warning truncation, to the caller as well.  Personally I think a warning is the
> > appropriate response, but I don’t really have a strong opinion.
> 
> And a NOTICE? That's what happens for relation name truncation. You
> are right that having a check in SetBackendCancelMessage() makes the
> most sense as bgworkers could just call the low level API. Isn't the
> concept actually closer to just a backend message? This slot could be
> used for other purposes than cancellation.

+1 for NOTICE. The message truncation seems to be a kind of helpful
information rather than a likely problem as long as pg_terminated_backend
exits successfully.

https://www.postgresql.org/docs/10/static/runtime-config-logging.html#runtime-config-severity-levels

> -- 
> Michael


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

22 June 2017, 16:48:27

> On 22 Jun 2017, at 10:24, Yugo Nagata <nagata@sraoss.co.jp> wrote:
>
> On Thu, 22 Jun 2017 09:24:54 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>
>> On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> The message is truncated in SetBackendCancelMessage() for safety, but
>>> pg_{cancel|terminate}_backend() could throw an error on too long message, or
>>> warning truncation, to the caller as well.  Personally I think a warning is the
>>> appropriate response, but I don’t really have a strong opinion.
>>
>> And a NOTICE? That's what happens for relation name truncation. You
>> are right that having a check in SetBackendCancelMessage() makes the
>> most sense as bgworkers could just call the low level API. Isn't the
>> concept actually closer to just a backend message? This slot could be
>> used for other purposes than cancellation.
>
> +1 for NOTICE. The message truncation seems to be a kind of helpful
> information rather than a likely problem as long as pg_terminated_backend
> exits successfully.
>
> https://www.postgresql.org/docs/10/static/runtime-config-logging.html#runtime-config-severity-levels

Good point.  I’ve attached a new version which issues a NOTICE on truncation
and also addresses all other comments so far in this thread.  Thanks a lot for
the early patch reviews!

cheers ./daniel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

terminate_msg_v3.patch

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Dilip Kumar

Date:

22 June 2017, 18:52:22

On Thu, Jun 22, 2017 at 7:18 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
> Good point.  I’ve attached a new version which issues a NOTICE on truncation
> and also addresses all other comments so far in this thread.  Thanks a lot for
> the early patch reviews!
>
> cheers ./daniel

I have done an initial review of the patch. I have some comments/suggestions.


+int
+GetCancelMessage(char **buffer, size_t buf_len)
+{
+ volatile BackendCancelShmemStruct *slot = MyCancelSlot;
+ int msg_length = 0;
+

Returned value of this function is never used, better to convert it to
just void.

-------

+bool
+HasCancelMessage(void)
+{
+ volatile BackendCancelShmemStruct *slot = MyCancelSlot;


+/*
+ * Return the configured cancellation message and its length. If the returned
+ * length is greater than the size of the passed buffer, truncation has been
+ * performed. The message is cleared on reading.
+ */
+int
+GetCancelMessage(char **buffer, size_t buf_len)

I think it will be good to merge these two function where
GetCancelMessage will first check whether it has the message or not
if it does then allocate the buffer of size slot->len and copy the
slot message to allocated buffer otherwise just return NULL.

So it will look like "char *GetCancelMessage()"

---------

+ SpinLockAcquire(&slot->mutex);
+ strlcpy(*buffer, (const char *) slot->message, buf_len);

strlcpy(*buffer, (const char *) slot->message, slot->len);

Isn't it better to copy only upto slot->len, seems like it will always
be <= buf_len, and if not then
we can do min(buf_len, slot->len)

----

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

26 June 2017, 10:44:06

> On 22 Jun 2017, at 17:52, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jun 22, 2017 at 7:18 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> Good point.  I’ve attached a new version which issues a NOTICE on truncation
>> and also addresses all other comments so far in this thread.  Thanks a lot for
>> the early patch reviews!
>>
>> cheers ./daniel
>
> I have done an initial review of the patch. I have some comments/suggestions.

Thanks!

> +int
> +GetCancelMessage(char **buffer, size_t buf_len)
> +{
> + volatile BackendCancelShmemStruct *slot = MyCancelSlot;
> + int msg_length = 0;
> +
>
> Returned value of this function is never used, better to convert it to
> just void.

You’re probably right, I was thinking that someone might be interested in
knowing about truncation when extracting the message but I can’t really think
of a callsite which wouldn’t just pass in a large enough buffer in the first
place.

> +bool
> +HasCancelMessage(void)
> +{
> + volatile BackendCancelShmemStruct *slot = MyCancelSlot;
>
> +/*
> + * Return the configured cancellation message and its length. If the returned
> + * length is greater than the size of the passed buffer, truncation has been
> + * performed. The message is cleared on reading.
> + */
> +int
> +GetCancelMessage(char **buffer, size_t buf_len)
>
> I think it will be good to merge these two function where
> GetCancelMessage will first check whether it has the message or not
> if it does then allocate the buffer of size slot->len and copy the
> slot message to allocated buffer otherwise just return NULL.
>
> So it will look like "char *GetCancelMessage()”

It doesn’t seem like a good idea to perform memory allocation inside a spinlock
in a signalled backend, that would probably at least require an LWLock wouldn’t
it?  It seems safer to leave memory management to the signalled backend but it
may be paranoia on my part.

> + SpinLockAcquire(&slot->mutex);
> + strlcpy(*buffer, (const char *) slot->message, buf_len);
>
> strlcpy(*buffer, (const char *) slot->message, slot->len);
>
> Isn't it better to copy only upto slot->len, seems like it will always
> be <= buf_len, and if not then
> we can do min(buf_len, slot->len)

strlcpy(3) is defined as taking the size of the passed buffer and not the
copied string.  Since we guarantee that slot->message is NUL terminated it
seems wise to stick to the API. Or did I misunderstand your comment?

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Joel Jacobson

Date:

26 June 2017, 17:15:39

+1

On Tue, Jun 20, 2017 at 8:54 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Unless you have a lot of users running psql manually, I don't see how
> this is actually very useful or actionable.  What would the user do with
> the information?  Hopefully your users already trust that you'd keep the
> downtime to the minimum possible.

I think this feature would be useful for PgTerminator
(https://github.com/trustly/pgterminator)
a tool which automatically kills unprotected processes that could
potentially be the reason why
>X number of protected important processes have been waiting for >Y seconds.

When I'm guilty of locking this in the production DB and get killed by
PgTerminator,
it would be nice to know the reason, e.g. that it was PgTerminator
that killed me
and what process I was blocking.

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

"Joshua D. Drake"

Date:

26 June 2017, 17:39:00

On 06/26/2017 07:15 AM, Joel Jacobson wrote:
> +1
> 
> On Tue, Jun 20, 2017 at 8:54 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Unless you have a lot of users running psql manually, I don't see how
>> this is actually very useful or actionable.  What would the user do with
>> the information?  Hopefully your users already trust that you'd keep the
>> downtime to the minimum possible.
> 
> I think this feature would be useful for PgTerminator
> (https://github.com/trustly/pgterminator)
> a tool which automatically kills unprotected processes that could
> potentially be the reason why
>> X number of protected important processes have been waiting for >Y seconds.
> 
> When I'm guilty of locking this in the production DB and get killed by
> PgTerminator,
> it would be nice to know the reason, e.g. that it was PgTerminator
> that killed me
> and what process I was blocking.

And not just the pid but literally "what".

jD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Thomas Munro

Date:

02 September 2017, 03:21:30

On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> Good point.  I’ve attached a new version which issues a NOTICE on truncation
> and also addresses all other comments so far in this thread.  Thanks a lot for
> the early patch reviews!

FYI this no longer builds because commit 81c5e46c490e just stole your OIDs:

make[3]: Entering directory
`/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog'
cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids
772
972
make[3]: *** [postgres.bki] Error 1

--
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

03 September 2017, 23:47:10

> On 02 Sep 2017, at 02:21, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> Good point.  I’ve attached a new version which issues a NOTICE on truncation
>> and also addresses all other comments so far in this thread.  Thanks a lot for
>> the early patch reviews!
>
> FYI this no longer builds because commit 81c5e46c490e just stole your OIDs:
>
> make[3]: Entering directory
> `/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog'
> cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids
> 772
> 972
> make[3]: *** [postgres.bki] Error 1

Thanks, I hadn’t spotted that yet.  Attached is an updated patch using new OIDs
to make it compile again.

cheers ./daniel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

terminate_msg_v4.patch

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Yugo Nagata

Date:

28 September 2017, 15:55:24

On Sun, 3 Sep 2017 22:47:10 +0200
Daniel Gustafsson <daniel@yesql.se> wrote:

I have reviewed your latest patch. 

I can apply this to the master branch and build this successfully,
and the behavior is as expected. 

However, here are some comments and suggestions.

> 135 +               char   *buffer = palloc0(MAX_CANCEL_MSG);
> 136 +
> 137 +               GetCancelMessage(&buffer, MAX_CANCEL_MSG);
> 138 +               ereport(ERROR,
> 139 +                       (errcode(ERRCODE_QUERY_CANCELED),
> 140 +                        errmsg("canceling statement due to user request: \"%s\"",
> 141 +                               buffer)));

The memory for buffer is allocated here, but I think we can do this
in GetCancelMessage. Since the size of allocated memory is fixed
to MAX_CANCEL_MSG, it isn't neccesary to pass this to the function.
In addition, how about returning the message as the return value?
That is, we can define GetCancelMessage as following;
 char * GetCancelMessage(void)

> 180 +       r = SetBackendCancelMessage(pid, msg);
> 181 +
> 182 +       if (r != -1 && r != strlen(msg))
> 183 +           ereport(NOTICE,
> 184 +                   (errmsg("message is too long and has been truncated")));
> 185 +   }

We can this error handling in SetBackendCancelMessage. I think this is better
because the truncation of message is done in this function. In addition, 
we should use pg_mbcliplen for this truncation as done in truncate_identifier. 
Else, multibyte character boundary is broken, and noises can slip in log
messages.

> 235 -   int         r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
> 236 +   int r = pg_signal_backend(pid, SIGTERM, msg);

This line includes an unnecessary indentation change.

> 411 + * returns the length of message actually created. If the returned length

"the length of message" might be "the length of the message"

> 413 + * If the backend wasn't found and no message was set, -1 is returned. If two

This comment is incorrect since this function returns 0 when no message was set.

> 440 +           strlcpy(slot->message, message, sizeof(slot->message));
> 441 +           slot->len = strlen(slot->message);
> 442 +           message_len = slot->len;
> 443 +           SpinLockRelease(&slot->mutex);
> 444 +
> 445 +           return message_len;

This can return slot->len directly and the variable message_len is
unnecessary. However, if we handle the "too long message" error
in this function as suggested above, this does not have to
return anything.

Regards,

-- 
Yugo Nagata <nagata@sraoss.co.jp>

> > On 02 Sep 2017, at 02:21, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> > 
> > On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> >> Good point.  I’ve attached a new version which issues a NOTICE on truncation
> >> and also addresses all other comments so far in this thread.  Thanks a lot for
> >> the early patch reviews!"
> > 
> > FYI this no longer builds because commit 81c5e46c490e just stole your OIDs:
> > 
> > make[3]: Entering directory
> > `/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog'
> > cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids
> > 772
> > 972
> > make[3]: *** [postgres.bki] Error 1
> 
> Thanks, I hadn’t spotted that yet.  Attached is an updated patch using new OIDs
> to make it compile again.
> 
> cheers ./daniel
> 

-- 
Yugo Nagata <nagata@sraoss.co.jp>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

02 October 2017, 03:33:12

> On 28 Sep 2017, at 14:55, Yugo Nagata <nagata@sraoss.co.jp> wrote:
>
> On Sun, 3 Sep 2017 22:47:10 +0200
> Daniel Gustafsson <daniel@yesql.se> wrote:
>
> I have reviewed your latest patch.
>
> I can apply this to the master branch and build this successfully,
> and the behavior is as expected.
>
> However, here are some comments and suggestions.

Thanks for the review!  As I haven’t had time to address the comments for a new
versin of this patch, I’m closing this as Returned with Feedback and will
re-submit for the next commitfest.

cheers ./daniel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

24 January 2018, 16:50:30

> On 28 Sep 2017, at 14:55, Yugo Nagata <nagata@sraoss.co.jp> wrote:
>
> On Sun, 3 Sep 2017 22:47:10 +0200
> Daniel Gustafsson <daniel@yesql.se> wrote:
>
> I have reviewed your latest patch.

Thanks!

> I can apply this to the master branch and build this successfully,
> and the behavior is as expected.
>
> However, here are some comments and suggestions.
>
>> 135 +               char   *buffer = palloc0(MAX_CANCEL_MSG);
>> 136 +
>> 137 +               GetCancelMessage(&buffer, MAX_CANCEL_MSG);
>> 138 +               ereport(ERROR,
>> 139 +                       (errcode(ERRCODE_QUERY_CANCELED),
>> 140 +                        errmsg("canceling statement due to user request: \"%s\"",
>> 141 +                               buffer)));
>
> The memory for buffer is allocated here, but I think we can do this
> in GetCancelMessage. Since the size of allocated memory is fixed
> to MAX_CANCEL_MSG, it isn't neccesary to pass this to the function.
> In addition, how about returning the message as the return value?
> That is, we can define GetCancelMessage as following;
>
>  char * GetCancelMessage(void)

I agree that it would be a much neater API, but it would mean pallocing inside
the spinlock wouldn’t it? That would be a no-no.

>> 180 +       r = SetBackendCancelMessage(pid, msg);
>> 181 +
>> 182 +       if (r != -1 && r != strlen(msg))
>> 183 +           ereport(NOTICE,
>> 184 +                   (errmsg("message is too long and has been truncated")));
>> 185 +   }
>
> We can this error handling in SetBackendCancelMessage. I think this is better
> because the truncation of message is done in this function. In addition,
> we should use pg_mbcliplen for this truncation as done in truncate_identifier.
> Else, multibyte character boundary is broken, and noises can slip in log
> messages.

Agreed on both points.  I did however leave the returnvalue as before even
though it’s not read in this coding.

>> 235 -   int         r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
>> 236 +   int r = pg_signal_backend(pid, SIGTERM, msg);
>
> This line includes an unnecessary indentation change.

That’s embarrasing, fixed.

>> 411 + * returns the length of message actually created. If the returned length
>
> "the length of message" might be "the length of the message”

Fixed.

>> 413 + * If the backend wasn't found and no message was set, -1 is returned. If two
>
> This comment is incorrect since this function returns 0 when no message was set.

Fixed.

>> 440 +           strlcpy(slot->message, message, sizeof(slot->message));
>> 441 +           slot->len = strlen(slot->message);
>> 442 +           message_len = slot->len;
>> 443 +           SpinLockRelease(&slot->mutex);
>> 444 +
>> 445 +           return message_len;
>
> This can return slot->len directly and the variable message_len is
> unnecessary. However, if we handle the "too long message" error
> in this function as suggested above, this does not have to
> return anything.

That would mean returning a variable which was set while holding the lock, when
the lock has been released.  That doesn’t seem like a good idea, even though it
may be of more theoretical importance here.

Attached patch is rebased over HEAD and addresses the above review comments.
Adding to the next commitfest as it was returned in a previous one.

cheers ./daniel

Attachment

terminate_msg_v5.patch

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Alvaro Herrera

Date:

24 January 2018, 18:45:48

A quick suggestion from a passer-by -- would you put the new code in
src/backend/storage/ipc/backend_signal.c instead?  Sounds like a better
place than utils/misc/; and "signal" is more general in nature than just
"cancel".  A bunch of stuff from utils/adt/misc.c (???) can migrate to
the new file -- everything from line 201 to 362 at least, that is:

pg_signal_backend
pg_cancel_backend
pg_terminate_backend
pg_reload_conf
pg_rotate_logfile

Maybe have two patches, 0001 creates the files moving the contents over,
then 0002 adds your new stuff on top.

/me wonders if there's anything that would suggest to make this
extensive to processes other than backends ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

25 January 2018, 04:10:39

On Wed, Jan 24, 2018 at 12:45:48PM -0300, Alvaro Herrera wrote:
> /me wonders if there's anything that would suggest to make this
> extensive to processes other than backends ...

That's a good thought. Now ProcessInterrupts() is not used by
non-backend processes. For example the WAL receiver has its own logic to
handle interrupts but I guess that we could have an interface which can
be plugged in for any processes, which is by default enabled for
bgworkers.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

26 January 2018, 02:05:39

> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> A quick suggestion from a passer-by -- would you put the new code in
> src/backend/storage/ipc/backend_signal.c instead?  Sounds like a better
> place than utils/misc/; and "signal" is more general in nature than just
> "cancel".  A bunch of stuff from utils/adt/misc.c (???) can migrate to
> the new file -- everything from line 201 to 362 at least, that is:
>
> pg_signal_backend
> pg_cancel_backend
> pg_terminate_backend
> pg_reload_conf
> pg_rotate_logfile

+1, this makes a lot of sense.  When writing this I didn’t find a good fit
anywhere so I modelled it after utils/misc/backend_random.c, not because it was
a good fit but it was the least bad one I could come up with.  This seems a lot
cleaner.

> Maybe have two patches, 0001 creates the files moving the contents over,
> then 0002 adds your new stuff on top.

The two attached patches implements this.

> /me wonders if there's anything that would suggest to make this
> extensive to processes other than backends ...

Perhaps, do you have an API in mind?

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

06 March 2018, 00:20:17

> On 26 Jan 2018, at 00:05, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

>> Maybe have two patches, 0001 creates the files moving the contents over,
>> then 0002 adds your new stuff on top.
>
> The two attached patches implements this.

Attached are rebased patches to cope with the recent pgproc changes.  I also
made the function cope with NULL messages, not because it’s a sensible value
but I can see this function being fed from a sub-SELECT which could return
NULL.

As per the previous mail, 0001 refactors the signal code according to Alvaros
suggestion and 0002 implements $subject on top of the refactoring.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Eren Başak

Date:

20 March 2018, 14:12:17

Hi,

I reviewed the patch. Here are my notes:

I can confirm that the patches apply and pass the tests as of 03/19/2018 @ 11:00am (UTC).

I didn't test whether the docs compile or look good.

I have briefly tested and can confirm that the patch does what is intended. Here are my comments about the patch:

The patch does not add new tests (maybe isolation tests) for the new feature. Are there any opportunities to have tests for confirming the new behavior? At least some regression tests like the following:

SELECT pg_cancel_backend(pg_backend_pid());

ERROR: 57014: canceling statement due to user request

SELECT pg_cancel_backend(pg_backend_pid(), 'message');

ERROR: 57014: canceling statement due to user request: "message"

Not introduced with this patch, but the spacing between the functions is not consistent in src/backend/storage/ipc/backend_signal.c. There are 2 newlines after some functions (like pg_cancel_backend_msg) and 1 newline after others (like pg_terminate_backend_msg). It would be nice to fix these while refactoring.

I also thought about whether the patch should allow the message to be completely overwritten, instead of appending to the existing one and I think it is fine. Also, adding the admin message to the HINT or DETAIL part of the error message would make sense but these are ignored by some clients so it is fine this way.

Another thing is that, in a similar manner, we could allow changing the error code which might be useful for extensions. For example, Citus could use it to cancel remote backends when it detects a distributed deadlock and changes the error code to something retryable while doing so. For reference, see the hack that Citus is currently using:

https://github.com/citusdata/citus/blob/81cbb7c223f2eb9cfa8903f1d28869b6f972ded1/src/backend/distributed/shared_library_init.c#L237

+ If the optional <literal>message</literal> parameter is set, the text

+ will be appended to the error message returned to the signalled backend.

I think providing more detail would be useful. For example we can provide an example about how the error message looks like in its final form or what will happen if the message is too long.

-pg_signal_backend(int pid, int sig)

+pg_signal_backend(int pid, int sig, char *msg)

The new parameter (msg) is not mentioned in the function header comment. This applies to pg_signal_backend, pg_cancel_backend_internal, pg_terminate_backend_internal functions.

+Datum

+pg_cancel_backend(PG_FUNCTION_ARGS)

+ PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL));

+Datum

+pg_cancel_backend_msg(PG_FUNCTION_ARGS)

+ pid_t pid;

+ char *msg = NULL;

+ if (PG_ARGISNULL(0))

+ ereport(ERROR,

+ (errmsg("pid cannot be NULL")));

+ pid = PG_GETARG_INT32(0);

+ if (PG_NARGS() == 2 && !PG_ARGISNULL(1))

+ msg = text_to_cstring(PG_GETARG_TEXT_PP(1));

+ PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg));

The first function seem redundant here because the second one covers all the cases.

+ memset(slot->message, '\0', sizeof(slot->message));

SetBackendCancelMessage uses memset while BackendCancelShmemInit uses MemSet. Not a big deal but it would be nice to be consistent and use postgres macro versions for such calls.

+int

+SetBackendCancelMessage(pid_t backend, char *message)

+ BackendCancelShmemStruct *slot;

The variable "slot" is declared outside of the foor loop but only used in the inside, therefore it would be nicer to declare it inside the loop.

+ BackendCancelInit(MyBackendId);

Is the "normal multiuser case" the best place to initialize cancellation? For example, can't we cancel background workers? If this is the right place, maybe we should justify why this is the best place to initialize backend cancellation memory part with a comment.

+ char *buffer = palloc0(MAX_CANCEL_MSG);

Why not use a static array like char[MAX_CANCEL_MSG], instead of pallocing?

+/*

+ * Return the configured cancellation message and its length. If the returned

+ * length is greater than the size of the passed buffer, truncation has been

+ * performed. The message is cleared on reading.

+ */

+int

+GetCancelMessage(char **buffer, size_t buf_len)

+ volatile BackendCancelShmemStruct *slot = MyCancelSlot;

+ int msg_length = 0;

+ if (slot != NULL && slot->len > 0)

+ {

+ SpinLockAcquire(&slot->mutex);

+ strlcpy(*buffer, (const char *) slot->message, buf_len);

+ msg_length = slot->len;

+ slot->len = 0;

+ slot->message[0] = '\0';

+ SpinLockRelease(&slot->mutex);

+ }

+ return msg_length;

We never change what the buffer points to, therefore is it necessary to declare `buffer` as char** instead of char*?

+extern int SetBackendCancelMessage(pid_t backend, char *message);

Maybe rename backend to something like backend_pid.

+/*

+ * Return the configured cancellation message and its length. If the returned

+ * length is greater than the size of the passed buffer, truncation has been

+ * performed. The message is cleared on reading.

+ */

+int

+GetCancelMessage(char **buffer, size_t buf_len)

I think a function named GetStuff should not clear after getting the stuff. Therefore either the function should be named something like ConsumeCancelMessage or we should separate the "get" and "clear" concerns to different functions.

+/*

+ * The following routines handle registering an optional message when

+ * cancelling, or terminating a backend. The message will be stored in

+ * shared memory and is limited to MAX_CANCEL_MSG characters including

+ * the NULL terminator.

+ *

+ * Access to the message slots is protected by spinlocks.

+ */

I don't think providing a single header comment for the functions below this (CancelBackendMsgShmemSize, BackendCancelShmemInit, BackendCancelInit, CleanupCancelBackend) is enough. More detailed comments about what each function does would be useful.

+BackendCancelInit(int backend_id)

It took me some time to get used to the function names. The confusion was about whether the function name tells "cancel the initialization process" or "initialize backend cancellation memory".

--- /dev/null

+++ b/src/include/storage/backend_signal.h

First, I thought that this split should be done in the first patch, but I realized that everything that has been moved from src/backend/utils/adt/misc.c to src/backend/storage/ipc/backend_signal.c has been declared in src/backend/utils/fmgrprotos.h

Best,

Eren

On Tue, Mar 6, 2018 at 12:20 AM Daniel Gustafsson <daniel@yesql.se> wrote:

> On 26 Jan 2018, at 00:05, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

>> Maybe have two patches, 0001 creates the files moving the contents over,
>> then 0002 adds your new stuff on top.
>
> The two attached patches implements this.

Attached are rebased patches to cope with the recent pgproc changes. I also
made the function cope with NULL messages, not because it’s a sensible value
but I can see this function being fed from a sub-SELECT which could return
NULL.

As per the previous mail, 0001 refactors the signal code according to Alvaros
suggestion and 0002 implements $subject on top of the refactoring.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Christoph Berg

Date:

20 March 2018, 15:41:42

Re: Eren Başak 2018-03-20 <CAFNTstPcstV8Brqkg00a84V72b_FfnLinhu2C2Top+QssmwFhg@mail.gmail.com>
> Another thing is that, in a similar manner, we could allow changing the
> error code which might be useful for extensions. For example, Citus could
> use it to cancel remote backends when it detects a distributed deadlock and
> changes the error code to something retryable while doing so.

Another useful thing to do on top of this patch would be to include
messages when the termination comes from postgres itself, e.g. on a
server shutdown. Possibly, the message for pg_terminate_backend() itself could
say that someone invoke that, unless overridden.

FATAL:  57P01: terminating connection due to administrator command: server shutting down
FATAL:  57P01: terminating connection due to administrator command: restarting because of a crash of another server
process
FATAL:  57P01: terminating connection due to administrator command: terminated by pg_terminate_backend()

Christoph

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

21 March 2018, 02:58:11

> On 20 Mar 2018, at 12:12, Eren Başak <eren@citusdata.com> wrote:

> I reviewed the patch. Here are my notes:

Thanks!

> The patch does not add new tests (maybe isolation tests) for the new feature. Are there any opportunities to have
testsfor confirming the new behavior? 

Good point, not sure why I’ve forgotten to add this.  No existing suite seemed
to match well, so I added a new one for system administration functions like
this one.

> Not introduced with this patch, but the spacing between the functions is not consistent in
src/backend/storage/ipc/backend_signal.c.There are 2 newlines after some functions (like pg_cancel_backend_msg) and 1
newlineafter others (like pg_terminate_backend_msg). It would be nice to fix these while refactoring. 

Fixed.

> Another thing is that, in a similar manner, we could allow changing the error code which might be useful for
extensions.For example, Citus could use it to cancel remote backends when it detects a distributed deadlock and changes
theerror code to something retryable while doing so. For reference, see the hack that Citus is currently using:  
>
https://github.com/citusdata/citus/blob/81cbb7c223f2eb9cfa8903f1d28869b6f972ded1/src/backend/distributed/shared_library_init.c#L237

In 20170620200134.ubnv4sked5seolyk@alap3.anarazel.de, Andres suggested the same
thing.  I don’t disagree, but I’m also not sure what the API would look like so
I’d prefer to address that in a follow-up patch should this one get accepted.

> +    If the optional <literal>message</literal> parameter is set, the text
> +    will be appended to the error message returned to the signalled backend.
>
> I think providing more detail would be useful. For example we can provide an example about how the error message
lookslike in its final form or what will happen if the message is too long. 

Expanded the documentation a little and added a (contrived) example.

> -pg_signal_backend(int pid, int sig)
> +pg_signal_backend(int pid, int sig, char *msg)
>
> The new parameter (msg) is not mentioned in the function header comment. This applies to pg_signal_backend,
pg_cancel_backend_internal,pg_terminate_backend_internal functions. 

Fixed.

> +Datum
> +pg_cancel_backend(PG_FUNCTION_ARGS)
> +{
> +    PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL));
> +}
> +
> +Datum
> +pg_cancel_backend_msg(PG_FUNCTION_ARGS)
> +{
> +    pid_t        pid;
> +    char        *msg = NULL;
> +
> +    if (PG_ARGISNULL(0))
> +        ereport(ERROR,
> +                (errmsg("pid cannot be NULL")));
> +
> +    pid = PG_GETARG_INT32(0);
> +
> +    if (PG_NARGS() == 2 && !PG_ARGISNULL(1))
> +        msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
> +
> +    PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg));
> +}
>
> The first function seem redundant here because the second one covers all the cases.

pg_cancel_backend() is defined proisstrict, while pg_cancel_backend_msg() is
not.  I think we must be able to perform the cancellation if the message is
NULL, so made it two functions.

Thinking more about this, I think pg_cancel_backend_msg() should handle a NULL
pid in the same way as pg_cancel_backend() so fixed that as well.

> +            memset(slot->message, '\0', sizeof(slot->message));
>
> SetBackendCancelMessage uses memset while BackendCancelShmemInit uses MemSet. Not a big deal but it would be nice to
beconsistent and use postgres macro versions for such calls.  

Good point, moved to MemSet() for both.

> +int
> +SetBackendCancelMessage(pid_t backend, char *message)
> +{
> +    BackendCancelShmemStruct *slot;
>
> The variable "slot" is declared outside of the foor loop but only used in the inside, therefore it would be nicer to
declareit inside the loop. 

Moved to inside the loop.

> +        BackendCancelInit(MyBackendId);
>
> Is the "normal multiuser case" the best place to initialize cancellation? For example, can't we cancel background
workers?If this is the right place, maybe we should justify why this is the best place to initialize backend
cancellationmemory part with a comment. 

I didn’t envision this being in another setting than the multiuser case, but
it’s been clear throughout the thread that others have had good ideas around
extended uses of this.  Background workers can still be terminated/canceled,
this only affects setting the message, and that is to me a multiuser feature.

Renamed the functions BackendCancelMessage*() for clarity.

> +                char   *buffer = palloc0(MAX_CANCEL_MSG);
>
> Why not use a static array like char[MAX_CANCEL_MSG], instead of pallocing?

No specific reason, changed.

> +/*
> + * Return the configured cancellation message and its length. If the returned
> + * length is greater than the size of the passed buffer, truncation has been
> + * performed. The message is cleared on reading.
> + */
> +int
> +GetCancelMessage(char **buffer, size_t buf_len)
...
> We never change what the buffer points to, therefore is it necessary to declare `buffer` as char** instead of char*?

Fixed

> +extern int SetBackendCancelMessage(pid_t backend, char *message);
>
> Maybe rename backend to something like backend_pid.

Makes sense, changed.

> I think a function named GetStuff should not clear after getting the stuff. Therefore either the function should be
namedsomething like ConsumeCancelMessage or we should separate the "get" and "clear" concerns to different functions. 

Good point, changed to ConsumeCancelMessage()

> I don't think providing a single header comment for the functions below this (CancelBackendMsgShmemSize,
BackendCancelShmemInit,BackendCancelInit, CleanupCancelBackend) is enough. More detailed comments about what each
functiondoes would be useful. 

Comments expanded.

> It took me some time to get used to the function names. The confusion was about whether the function name tells
"cancelthe initialization process" or "initialize backend cancellation memory”. 

Naming is hard.  I’m not wed to any name used, and am open to any suggestions
that can improve code clarity.

Attached patches are rebased over HEAD with all of the above addressed.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Eren Başak

Date:

27 March 2018, 14:50:35

Hi,

I have reviewed the v8 patches.

I can confirm that the patches apply and pass the tests as of 03/27/2018 11:00am (UTC).

I didn't test whether the docs compile but they look good.

I have briefly tested the patch again and can confirm that the patch does what is intended.

The v8 patches address almost all of my review notes on v7 patch, thanks Daniel for that.

I have some more questions, notes and views on the patch but have no strong opinions at this moment. So I am fine with whatever decision is made:

I see you have removed extra newlines from backend_signal.c. However, I realized that there is still one extra newline after pg_reload_conf.

> In 20170620200134.ubnv4sked5seolyk@alap3.anarazel.de, Andres suggested the same
> thing. I don’t disagree, but I’m also not sure what the API would look like so
> I’d prefer to address that in a follow-up patch should this one get accepted.

That's fine for me, although I would prefer to get the ability to specify the error code sooner than later. My main question is that I am not sure whether the community prefers to ship two similar (in use case) changes on an API in a single patch or fine with two patches. Can that be a problem if the subsequent patch is released in a different postgres version? I am not sure.

> pg_cancel_backend() is defined proisstrict, while pg_cancel_backend_msg() is
> not. I think we must be able to perform the cancellation if the message is
> NULL, so made it two functions.

I agree that we should preserve the old usage as well. What I don't understand is that, can't we remove proisstrict from pg_cancel_backend and copy the body of pg_cancel_backend_msg into pg_cancel_backend. This way, I think we can accomplish the same thing without introducing two new functions.

+Datum

+pg_terminate_backend_msg(PG_FUNCTION_ARGS)

+ pid_t pid;

+ char *msg = NULL;

+ if (PG_ARGISNULL(0))

+ ereport(ERROR,

+ (errmsg("pid cannot be NULL")));

pg_terminate_backend_msg errors out if the pid is null but pg_cancel_backend_msg returns null and I think we should have consistency between these two in this regard.

Best,

Eren

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

29 March 2018, 17:54:42

> On 27 Mar 2018, at 13:50, Eren Başak <eren@citusdata.com> wrote:

> > pg_cancel_backend() is defined proisstrict, while pg_cancel_backend_msg() is
> > not.  I think we must be able to perform the cancellation if the message is
> > NULL, so made it two functions.
>
> I agree that we should preserve the old usage as well. What I don't understand is that, can't we remove proisstrict
frompg_cancel_backend and copy the body of pg_cancel_backend_msg into pg_cancel_backend. This way, I think we can
accomplishthe same thing without introducing two new functions. 

If we do that, wouldn’t SELECT pg_cancel_backend(NULL) fail unless NULL is
explicitly casted to integer?  Also, I think that would cause make check to
fail the opr_sanity suite on the “multiple uses of the same internal function”
test.  Maybe I’m misunderstanding your point here?

> pg_terminate_backend_msg errors out if the pid is null but pg_cancel_backend_msg returns null and I think we should
haveconsistency between these two in this regard. 

Absolutely, that was an oversight in the v8 patch, they should both handle NULL
in the same way.

Whitespace and null handling fixed, as well as rebased over HEAD.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Peter Eisentraut

Date:

06 April 2018, 00:16:10

The documentation change suggests the error will be

ERROR:  canceling statement due to user request: "Cancellation message text"

Why the quotes?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

06 April 2018, 00:33:57

> On 05 Apr 2018, at 23:16, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
> The documentation change suggests the error will be
>
> ERROR:  canceling statement due to user request: "Cancellation message text"
>
> Why the quotes?

It seemed like a good idea at the time to indicate which part was submitted by
the user, but looking at it now the colon sign is a pretty clear indicator
already.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

06 April 2018, 05:49:57

On Thu, Apr 05, 2018 at 11:33:57PM +0200, Daniel Gustafsson wrote:
> It seemed like a good idea at the time to indicate which part was submitted by
> the user, but looking at it now the colon sign is a pretty clear indicator
> already.

Dropping the quotes is more consistent with other error messages.  We
don't bother about quoting things for example with system-related errors
generated by strerror().
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

06 April 2018, 12:18:34

> On 06 Apr 2018, at 04:49, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Apr 05, 2018 at 11:33:57PM +0200, Daniel Gustafsson wrote:
>> It seemed like a good idea at the time to indicate which part was submitted by
>> the user, but looking at it now the colon sign is a pretty clear indicator
>> already.
>
> Dropping the quotes is more consistent with other error messages.  We
> don't bother about quoting things for example with system-related errors
> generated by strerror().

Yep, I completely agree.  Attached are patches with the quotes removed and
rebased since Oids were taken etc.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

09 April 2018, 03:47:06

On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote:
> Yep, I completely agree.  Attached are patches with the quotes removed and
> rebased since Oids were taken etc.

I still find this idea interesting for plugin authors.  However, as
feature freeze for v11 is in effect, I am marking the patch as returned
with feedback.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Daniel Gustafsson

Date:

10 April 2018, 00:50:15

> On 09 Apr 2018, at 02:47, Michael Paquier <michael@paquier.xyz> wrote:
> 
> On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote:
>> Yep, I completely agree.  Attached are patches with the quotes removed and
>> rebased since Oids were taken etc.
> 
> I still find this idea interesting for plugin authors.  However, as
> feature freeze for v11 is in effect, I am marking the patch as returned
> with feedback.

Not sure what the feedback is though? The patch has been through thorough
review with all review comments addressed.

Will resubmit this for a v12 CF.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Andres Freund

Date:

10 April 2018, 00:55:09

Hi,

On 2017-06-20 13:01:35 -0700, Andres Freund wrote:
> For extensions it'd also be useful if it'd be possible to overwrite the
> error code.  E.g. for citus there's a distributed deadlock detector,
> running out of process because there's no way to interrupt lock waits
> locally, and we've to do some ugly hacking to generate proper error
> messages and code from another session.

What happened to this request?  Seems we're out of the crunch mode and
could round the feature out a littlebit more...

Greetings,

Andres Freund

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

13 June 2018, 21:54:03

> On 9 Apr 2018, at 23:55, Andres Freund <andres@anarazel.de> wrote:
> On 2017-06-20 13:01:35 -0700, Andres Freund wrote:
>> For extensions it'd also be useful if it'd be possible to overwrite the
>> error code.  E.g. for citus there's a distributed deadlock detector,
>> running out of process because there's no way to interrupt lock waits
>> locally, and we've to do some ugly hacking to generate proper error
>> messages and code from another session.
>
> What happened to this request?  Seems we're out of the crunch mode and
> could round the feature out a littlebit more…

Revisiting old patches, I took a stab at this request.

Since I don’t really have a use case for altering the sqlerrcode other than the
on that Citus..  cited, I modelled the API around that.  The slot now holds a
sqlerrcode as well as a message, with functions to just set the message keeping
the default sqlerrcode for when that is all one wants to do.  There is no
function for just altering the sqlerrcode without a message as that seems not
useful to me.

The combination of sqlerrcode and message was dubbed SignalFeedback for lack of
a better term.  With this I also addressed something that annoyed me; I had
called all the functions Cancel* which technically isn’t true since we also
support termination.

There are no new user facing changes in patch compared to the previous version.

This patchset still has the refactoring that Alvaro brought up upthread.

Parking this again the commitfest as it was returned with feedback from the
last one it was in (all review comments addressed, see upthread).

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Andres Freund

Date:

13 June 2018, 22:16:21

Hi,

Onder, I CCed you because it seems worthwhile to ensure that the
relevant Citus code could use this instead of the gross hack you and I
committed...

On 2018-06-13 20:54:03 +0200, Daniel Gustafsson wrote:
> > On 9 Apr 2018, at 23:55, Andres Freund <andres@anarazel.de> wrote:
> > On 2017-06-20 13:01:35 -0700, Andres Freund wrote:
> >> For extensions it'd also be useful if it'd be possible to overwrite the
> >> error code.  E.g. for citus there's a distributed deadlock detector,
> >> running out of process because there's no way to interrupt lock waits
> >> locally, and we've to do some ugly hacking to generate proper error
> >> messages and code from another session.
> > 
> > What happened to this request?  Seems we're out of the crunch mode and
> > could round the feature out a littlebit more…
> 
> Revisiting old patches, I took a stab at this request.
> 
> Since I don’t really have a use case for altering the sqlerrcode other than the
> on that Citus..  cited, I modelled the API around that.

Cool. Onder?


> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index b851fe023a..ad9763698f 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -18540,7 +18540,7 @@ SELECT set_config('log_statement_stats', 'off', false);
>       <tbody>
>        <row>
>         <entry>
> -        <literal><function>pg_cancel_backend(<parameter>pid</parameter> <type>int</type>)</function></literal>
> +        <literal><function>pg_cancel_backend(<parameter>pid</parameter> <type>int</type> [,
<parameter>message</parameter><type>text</type>])</function></literal>
 
>          </entry>
>         <entry><type>boolean</type></entry>
>         <entry>Cancel a backend's current query.  This is also allowed if the
> @@ -18565,7 +18565,7 @@ SELECT set_config('log_statement_stats', 'off', false);
>        </row>
>        <row>
>         <entry>
> -        <literal><function>pg_terminate_backend(<parameter>pid</parameter> <type>int</type>)</function></literal>
> +        <literal><function>pg_terminate_backend(<parameter>pid</parameter> <type>int</type> [,
<parameter>message</parameter><type>text</type>])</function></literal>
 
>          </entry>
>         <entry><type>boolean</type></entry>
>         <entry>Terminate a backend.  This is also allowed if the calling role
> @@ -18596,6 +18596,14 @@ SELECT set_config('log_statement_stats', 'off', false);
>      The role of an active backend can be found from the
>      <structfield>usename</structfield> column of the
>      <structname>pg_stat_activity</structname> view.
> +    If the optional <literal>message</literal> parameter is set, the text
> +    will be appended to the error message returned to the signalled backend.
> +    <literal>message</literal> is limited to 128 bytes, any longer text
> +    will be truncated. An example where we cancel our own backend:
> +<programlisting>
> +postgres=# SELECT pg_cancel_backend(pg_backend_pid(), 'Cancellation message text');
> +ERROR:  canceling statement due to user request: Cancellation message text
> +</programlisting>
>     </para>

I'm not sure I really like the appending bit. There's a security
argument to be made about doing so, but from a user POV that mostly
seems restrictive.  I wonder if that aspect would be better handled by
adding an error context that contains information about which process
did the cancellation (or DETAIL?)?


> +/*
> + * Structure for registering a feedback payload to be sent to a cancelled, or
> + * terminated backend. Each backend is registered per pid in the array which is
> + * indexed by Backend ID. Reading and writing the message is protected by a
> + * per-slot spinlock.
> + */
> +typedef struct
> +{
> +    pid_t    pid;

This is the pid of the receiving process? If so, why do we need this in
here? That's just duplicated data, no?


> +    slock_t    mutex;
> +    char    message[MAX_CANCEL_MSG];
> +    int        len;
> +    int        sqlerrcode;

I'd vote for including the pid of the process that did the cancelling in
here.

> +Datum
> +pg_cancel_backend(PG_FUNCTION_ARGS)
> +{
> +    PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL));
> +}
> +
> +Datum
> +pg_cancel_backend_msg(PG_FUNCTION_ARGS)
> +{
> +    pid_t        pid;
> +    char        *msg = NULL;
> +
> +    if (PG_ARGISNULL(0))
> +        PG_RETURN_NULL();
> +
> +    pid = PG_GETARG_INT32(0);
> +
> +    if (PG_NARGS() == 2 && !PG_ARGISNULL(1))
> +        msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
> +
> +    PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg));
> +}

Why isn't this just one function? Given that you already have a PG_NARGS
check, I don't quite see the point?

>  /*
>   * Signal to terminate a backend process.  This is allowed if you are a member
>   * of the role whose process is being terminated.
>   *
>   * Note that only superusers can signal superuser-owned processes.
>   */
> -Datum
> -pg_terminate_backend(PG_FUNCTION_ARGS)
> +static bool
> +pg_terminate_backend_internal(pid_t pid, char *msg)
>  {
> -    int            r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
> +    int        r = pg_signal_backend(pid, SIGTERM, msg);
>  
>      if (r == SIGNAL_BACKEND_NOSUPERUSER)
>          ereport(ERROR,
> @@ -146,7 +203,30 @@ pg_terminate_backend(PG_FUNCTION_ARGS)
>                  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                   (errmsg("must be a member of the role whose process is being terminated or member of
pg_signal_backend"))));
>  
> -    PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
> +    return (r == SIGNAL_BACKEND_SUCCESS);
> +}
> +
> +Datum
> +pg_terminate_backend(PG_FUNCTION_ARGS)
> +{
> +    PG_RETURN_BOOL(pg_terminate_backend_internal(PG_GETARG_INT32(0), NULL));
> +}
> +
> +Datum
> +pg_terminate_backend_msg(PG_FUNCTION_ARGS)
> +{
> +    pid_t        pid;
> +    char        *msg = NULL;
> +
> +    if (PG_ARGISNULL(0))
> +        PG_RETURN_NULL();
> +
> +    pid = PG_GETARG_INT32(0);
> +
> +    if (PG_NARGS() == 2 && !PG_ARGISNULL(1))
> +        msg = text_to_cstring(PG_GETARG_TEXT_PP(1));
> +
> +    PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg));
>  }
>  
>  /*
> @@ -168,7 +248,6 @@ pg_reload_conf(PG_FUNCTION_ARGS)
>      PG_RETURN_BOOL(true);
>  }

Same.


> +/*
> + * Set a message for the cancellation of the backend with the specified pid,
> + * using the default sqlerrcode.
> + */
> +int
> +SetBackendCancelMessage(pid_t backend_pid, char *message)
> +{
> +    return backend_feedback(backend_pid, message, ERRCODE_QUERY_CANCELED);
> +}
> +
> +/*
> + * Set a message for the termination of the backend with the specified pid,
> + * using the default sqlerrcode.
> + */
> +int
> +SetBackendTerminationMessage(pid_t backend_pid, char *message)
> +{
> +    return backend_feedback(backend_pid, message, ERRCODE_ADMIN_SHUTDOWN);
> +}
> +
> +/*
> + * Set both a message and a sqlerrcode for use when signalling the backend
> + * with the specified pid.
> + */
> +int
> +SetBackendSignalFeedback(pid_t backend_pid, char *message, int sqlerrcode)
> +{
> +    return backend_feedback(backend_pid, message, sqlerrcode);
> +}

I'm not quite seeing the point of these variants.  What advantage are
they over just doing the same on the caller level?


> +/*
> + * Sets a cancellation message for the backend with the specified pid, and
> + * returns the length of the message actually created. If the returned length
> + * is less than the length of the message parameter, truncation has occurred.
> + * If the backend isn't found, -1 is returned. If no message is passed, zero is
> + * returned. If two backends collide in setting a message, the existing message
> + * will be overwritten by the last one in.
> + */
> +static int
> +backend_feedback(pid_t backend_pid, char *message, int sqlerrcode)
> +{
> +    int        i;
> +    int        len;
> +
> +    if (!message)
> +        return 1;
> +
> +    len = pg_mbcliplen(message, strlen(message), MAX_CANCEL_MSG - 1);
> +
> +    for (i = 0; i < MaxBackends; i++)
> +    {
> +        BackendSignalFeedbackShmemStruct *slot = &BackendSignalFeedbackSlots[i];
> +
> +        if (slot->pid != 0 && slot->pid == backend_pid)
> +        {
> +            SpinLockAcquire(&slot->mutex);
> +            if (slot->pid != backend_pid)
> +            {
> +                SpinLockRelease(&slot->mutex);
> +                goto error;
> +            }
> +
> +            MemSet(slot->message, '\0', sizeof(slot->message));
> +            memcpy(slot->message, message, len);

This seems unnecessarily expensive.

> +            slot->len = len;
> +            slot->sqlerrcode = sqlerrcode;
> +            SpinLockRelease(&slot->mutex);
> +
> +            if (len != strlen(message))
> +                ereport(NOTICE,
> +                        (errmsg("message is too long and has been truncated")));
> +            return 0;
> +        }
> +    }

So we made cancellation a O(N) proposition :/. Probably not too bad, but
still...

> +error:
> +
> +    elog(LOG, "Cancellation message requested for missing backend %d by %d",
> +         (int) backend_pid, MyProcPid);
> +
> +    return 1;
> +}

So we now do log spam if processes exit in the wrong moment?


> +/*
> + * Test whether there is feedback registered for the current backend that can
> + * be consumed and presented to the user.
> + */
> +bool
> +HasBackendSignalFeedback(void)
> +{
> +    volatile BackendSignalFeedbackShmemStruct *slot = MyCancelSlot;
> +    bool     has_message = false;
> +
> +    if (slot != NULL)
> +    {
> +        SpinLockAcquire(&slot->mutex);
> +        has_message = ((slot->len > 0) && (slot->sqlerrcode != 0));
> +        SpinLockRelease(&slot->mutex);
> +    }
> +
> +    return has_message;
> +}

I'm somewhat uncomfortable with acquiring a mutex in an error path.  We
used to prcoess at least some interrupts in signal handlers in some
cases - I think I removed all of them, but if not, we'd be in deep
trouble here.  Wonder if we shouldn't try to get around needing that...



Greetings,

Andres Freund

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

15 June 2018, 00:56:05

> On 13 Jun 2018, at 21:16, Andres Freund <andres@anarazel.de> wrote:

Thanks for the review!

> I'm not sure I really like the appending bit. There's a security
> argument to be made about doing so, but from a user POV that mostly
> seems restrictive.  I wonder if that aspect would be better handled by
> adding an error context that contains information about which process
> did the cancellation (or DETAIL?)?

That doesn’t sound like a bad idea at all, I took a stab at that in the
attached patch to see what it would look like.  Is that along the lines of what
you had in mind?

It does however make testing harder as the .out file can’t have the matching
pid (commented out the test in the attached hoping that there is a way to test
it that I fail to see).

>> +/*
>> + * Structure for registering a feedback payload to be sent to a cancelled, or
>> + * terminated backend. Each backend is registered per pid in the array which is
>> + * indexed by Backend ID. Reading and writing the message is protected by a
>> + * per-slot spinlock.
>> + */
>> +typedef struct
>> +{
>> +    pid_t    pid;
>
> This is the pid of the receiving process? If so, why do we need this in
> here? That's just duplicated data, no?

Correct, we need that for finding the right slot in backend_feedback() unless
I’m missing something?

>> +    slock_t    mutex;
>> +    char    message[MAX_CANCEL_MSG];
>> +    int        len;
>> +    int        sqlerrcode;
>
> I'd vote for including the pid of the process that did the cancelling in
> here.

Did that for the above discussed log message change.

>> [ .. ]
>
> Why isn't this just one function? Given that you already have a PG_NARGS
> check, I don't quite see the point?

I was trying to retain STRICT on pg_cancel_backend(int4) for no food reason,
and I’m not entirely sure what the reason was anymore.  Fixed in the attached
version.

>> [ .. ]
>
> I'm not quite seeing the point of these variants.  What advantage are
> they over just doing the same on the caller level?

It seemed a cleaner API to provide these for when the caller just want to
provide messaging (which is where I started this a long time ago).  I’m not wed
to the idea at all though, they are clearly just for convenience.

>> +            MemSet(slot->message, '\0', sizeof(slot->message));
>> +            memcpy(slot->message, message, len);
>
> This seems unnecessarily expensive.

My goal was to safeguard against the risk of fragments from an undelivered
message being delivered to another backend, which might be overly cautious.
Using strlcpy() instead to guarantee termination is perhaps enough?

>> +            slot->len = len;
>> +            slot->sqlerrcode = sqlerrcode;
>> +            SpinLockRelease(&slot->mutex);
>> +
>> +            if (len != strlen(message))
>> +                ereport(NOTICE,
>> +                        (errmsg("message is too long and has been truncated")));
>> +            return 0;
>> +        }
>> +    }
>
> So we made cancellation a O(N) proposition :/. Probably not too bad, but
> still…

When setting a message to be passed to the backend yes, with an upper bound of
MaxBackends.

>> [ .. ]
>
> So we now do log spam if processes exit in the wrong moment?

Yeah, that was rather pointless. Removed in the attached version.

> I'm somewhat uncomfortable with acquiring a mutex in an error path.  We
> used to prcoess at least some interrupts in signal handlers in some
> cases - I think I removed all of them, but if not, we'd be in deep
> trouble here.  Wonder if we shouldn't try to get around needing that…

I’m all ears if you have a better idea, this was the only option I could come
up with.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Thomas Munro

Date:

06 July 2018, 03:18:02

On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> attached

Hi Daniel,

6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
6119  select pg_cancel_backend(pg_backend_pid(), NULL);
6120! ERROR:  canceling statement due to user request
6121--- 25,32 ----
6122
6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
6124  select pg_cancel_backend(pg_backend_pid(), NULL);
6125!  pg_cancel_backend
6126! -------------------
6127!  t

Apparently Windows can take or leave it as it pleases.

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

06 July 2018, 04:47:05

On Fri, Jul 06, 2018 at 12:18:02PM +1200, Thomas Munro wrote:
> 6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
> 6119  select pg_cancel_backend(pg_backend_pid(), NULL);
> 6120! ERROR:  canceling statement due to user request
> 6121--- 25,32 ----
> 6122
> 6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
> 6124  select pg_cancel_backend(pg_backend_pid(), NULL);
> 6125!  pg_cancel_backend
> 6126! -------------------
> 6127!  t
>
> Apparently Windows can take or leave it as it pleases.
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488

The test coverage looks adapted if it is possible to catch such
failures, so that's nice.

+select pg_cancel_backend();
+ERROR:  function pg_cancel_backend() does not exist
+LINE 1: select pg_cancel_backend();
This negative test is not really necessary.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Pavel Stehule

Date:

06 July 2018, 15:12:24

I am testing last patch and looks so truncating message and appending "..." doesn't work.

The slot->len hold trimmed length, not original length.

Regards

Pavel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

24 July 2018, 23:45:33

> On 6 Jul 2018, at 14:12, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I am testing last patch and looks so truncating message and appending "..." doesn't work.

Thanks for testing, and sorry about the late response.

> The slot->len hold trimmed length, not original length.

Fixed in the attached rebased patchset together with a few small touch-ups such
as fixed documentation.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

24 July 2018, 23:46:41

> On 6 Jul 2018, at 03:47, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jul 06, 2018 at 12:18:02PM +1200, Thomas Munro wrote:
>> 6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>> 6119  select pg_cancel_backend(pg_backend_pid(), NULL);
>> 6120! ERROR:  canceling statement due to user request
>> 6121--- 25,32 ----
>> 6122
>> 6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>> 6124  select pg_cancel_backend(pg_backend_pid(), NULL);
>> 6125!  pg_cancel_backend
>> 6126! -------------------
>> 6127!  t
>>
>> Apparently Windows can take or leave it as it pleases.
>>
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488
>
> The test coverage looks adapted if it is possible to catch such
> failures, so that's nice.
>
> +select pg_cancel_backend();
> +ERROR:  function pg_cancel_backend() does not exist
> +LINE 1: select pg_cancel_backend();
> This negative test is not really necessary.

I guess you’re right, it’s a bit superfluous.  Removed in the last sent
patchset.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

24 July 2018, 23:57:09

> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> attached
>
> Hi Daniel,
>
> 6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
> 6119  select pg_cancel_backend(pg_backend_pid(), NULL);
> 6120! ERROR:  canceling statement due to user request
> 6121--- 25,32 ----
> 6122
> 6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
> 6124  select pg_cancel_backend(pg_backend_pid(), NULL);
> 6125!  pg_cancel_backend
> 6126! -------------------
> 6127!  t
>
> Apparently Windows can take or leave it as it pleases.

Well played =)

> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488

That reads to me like it’s cancelling another backend than the current one,
which clearly isn’t right as we’re passing pg_backend_pid().  I can’t really
see what Windows specific bug was introduced by this patch though (or well, the
bug exhibits itself on Windows but it may well be generic of course).

Will continue to hunt.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

25 July 2018, 10:27:48

> On 24 Jul 2018, at 22:57, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>
>> On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> attached
>>
>> Hi Daniel,
>>
>> 6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>> 6119  select pg_cancel_backend(pg_backend_pid(), NULL);
>> 6120! ERROR:  canceling statement due to user request
>> 6121--- 25,32 ----
>> 6122
>> 6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>> 6124  select pg_cancel_backend(pg_backend_pid(), NULL);
>> 6125!  pg_cancel_backend
>> 6126! -------------------
>> 6127!  t
>>
>> Apparently Windows can take or leave it as it pleases.
>
> Well played =)
>
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488
>
> That reads to me like it’s cancelling another backend than the current one,
> which clearly isn’t right as we’re passing pg_backend_pid().  I can’t really
> see what Windows specific bug was introduced by this patch though (or well, the
> bug exhibits itself on Windows but it may well be generic of course).
>
> Will continue to hunt.

Seems the build of the updated patch built and tested Ok.  Still have no idea
why the previous one didn’t.

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.6695

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Heikki Linnakangas

Date:

06 August 2018, 10:47:53

Has there been any consideration to encodings? What happens if the 
message contains non-ASCII characters, and the sending backend is 
connected to database that uses a different encoding than the backend 
being signaled?

- Heikki

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

12 August 2018, 01:17:59

> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> Has there been any consideration to encodings?

Thats a good point, no =/

> What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses
adifferent encoding than the backend being signaled? 

In the current state of the patch, instead of the message you get:

    FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
           no equivalent in encoding “ISO_8859_5"

Thats clearly not good enough, but I’m not entirely sure what would be the best
way forward.  Restrict messages to only be in SQL_ASCII?  Store the encoding of
the message and check the encoding of the receiving backend before issuing it
for a valid conversion, falling back to no message in case there is none?
Neither seems terribly appealing, do you have any better suggestions?

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Pavel Stehule

Date:

12 August 2018, 08:42:50

2018-08-12 0:17 GMT+02:00 Daniel Gustafsson <daniel@yesql.se>:

> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> Has there been any consideration to encodings?

Thats a good point, no =/

> What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses a different encoding than the backend being signaled?

In the current state of the patch, instead of the message you get:

FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
no equivalent in encoding “ISO_8859_5"

Where this code fails? Isn't somewhere upper where string literals are translated? Then this message is ok.

Thats clearly not good enough, but I’m not entirely sure what would be the best
way forward. Restrict messages to only be in SQL_ASCII? Store the encoding of
the message and check the encoding of the receiving backend before issuing it
for a valid conversion, falling back to no message in case there is none?
Neither seems terribly appealing, do you have any better suggestions?

The client<->server encoding translation should do this work no?

Regards

Pavel

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

12 August 2018, 11:29:52

> On 12 Aug 2018, at 07:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2018-08-12 0:17 GMT+02:00 Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>>:
> > On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> wrote:

> > What happens if the message contains non-ASCII characters, and the sending backend is connected to database that
usesa different encoding than the backend being signaled? 
>
> In the current state of the patch, instead of the message you get:
>
>     FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
>            no equivalent in encoding “ISO_8859_5"
>
> Where this code fails? Isn't somewhere upper where string literals are translated? Then this message is ok.

This happens for example when a UTF-8 backend sends a message with japanese
characters to a backend using ISO_8859_5.  So the code works as expected, but
it’s not a very good user experience.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Pavel Stehule

Date:

12 August 2018, 11:54:55

2018-08-12 10:29 GMT+02:00 Daniel Gustafsson <daniel@yesql.se>:

> On 12 Aug 2018, at 07:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2018-08-12 0:17 GMT+02:00 Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>>:
> > On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> wrote:

> > What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses a different encoding than the backend being signaled?
>
> In the current state of the patch, instead of the message you get:
>
> FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
> no equivalent in encoding “ISO_8859_5"
>
> Where this code fails? Isn't somewhere upper where string literals are translated? Then this message is ok.

This happens for example when a UTF-8 backend sends a message with japanese
characters to a backend using ISO_8859_5. So the code works as expected, but
it’s not a very good user experience.

It is same like any other using of string literal.

Personally, I don't think so these functions should be a exception.

Pavel

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Andres Freund

Date:

12 August 2018, 12:01:13


On August 12, 2018 12:17:59 AM GMT+02:00, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> Has there been any consideration to encodings?
>
>Thats a good point, no =/
>
>> What happens if the message contains non-ASCII characters, and the
>sending backend is connected to database that uses a different encoding
>than the backend being signaled?
>
>In the current state of the patch, instead of the message you get:
>
>FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8"
>has
>           no equivalent in encoding “ISO_8859_5"
>
>Thats clearly not good enough, but I’m not entirely sure what would be
>the best
>way forward.  Restrict messages to only be in SQL_ASCII?  Store the
>encoding of
>the message and check the encoding of the receiving backend before
>issuing it
>for a valid conversion, falling back to no message in case there is
>none?
>Neither seems terribly appealing, do you have any better suggestions?

Restricting to ASCII seems reasonable. But note that sqlascii isn't that (it's essentially just arbitrary null
terminateddata). Easier to relax later. 

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

13 August 2018, 01:53:00

> On 12 Aug 2018, at 11:01, Andres Freund <andres@anarazel.de> wrote:
>
> On August 12, 2018 12:17:59 AM GMT+02:00, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>>
>>> Has there been any consideration to encodings?
>>
>> Thats a good point, no =/
>>
>>> What happens if the message contains non-ASCII characters, and the
>> sending backend is connected to database that uses a different encoding
>> than the backend being signaled?
>>
>> In the current state of the patch, instead of the message you get:
>>
>> FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8"
>> has
>>          no equivalent in encoding “ISO_8859_5"
>>
>> Thats clearly not good enough, but I’m not entirely sure what would be
>> the best
>> way forward.  Restrict messages to only be in SQL_ASCII?  Store the
>> encoding of
>> the message and check the encoding of the receiving backend before
>> issuing it
>> for a valid conversion, falling back to no message in case there is
>> none?
>> Neither seems terribly appealing, do you have any better suggestions?
>
> Restricting to ASCII seems reasonable.

It’s quite restrictive, but it’s the safe option.  I’ve hacked this into the
updated patch, but kept the backend_feedback() function using pg_mbstrlen() at
least for now since it seems the safe option should this be relaxed at some
point.  Also added a small test by copying text from a ja.po file in the tree.

> But note that sqlascii isn't that (it's essentially just arbitrary null terminated data). Easier to relax later.

Yeah, my fingers and brain were not in sync during typing, I meant to say ASCII
there. I blame a lack of coffee.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Thomas Munro

Date:

21 August 2018, 06:09:08

On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 24 Jul 2018, at 22:57, Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>>
>>> On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>>> attached
>>>
>>> Hi Daniel,
>>>
>>> 6118  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>>> 6119  select pg_cancel_backend(pg_backend_pid(), NULL);
>>> 6120! ERROR:  canceling statement due to user request
>>> 6121--- 25,32 ----
>>> 6122
>>> 6123  --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes');
>>> 6124  select pg_cancel_backend(pg_backend_pid(), NULL);
>>> 6125!  pg_cancel_backend
>>> 6126! -------------------
>>> 6127!  t
>>>
>>> Apparently Windows can take or leave it as it pleases.
>>
>> Well played =)
>>
>>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488
>>
>> That reads to me like it’s cancelling another backend than the current one,
>> which clearly isn’t right as we’re passing pg_backend_pid().  I can’t really
>> see what Windows specific bug was introduced by this patch though (or well, the
>> bug exhibits itself on Windows but it may well be generic of course).
>>
>> Will continue to hunt.
>
> Seems the build of the updated patch built and tested Ok.  Still have no idea
> why the previous one didn’t.

That problem apparently didn't go away.  cfbot tested it 7 times in
the past week, and it passed only once on Windows:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691

The other times all failed like this:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833

--
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Tom Lane

Date:

23 August 2018, 21:53:37

Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> Seems the build of the updated patch built and tested Ok.  Still have no idea
>> why the previous one didn’t.

> That problem apparently didn't go away.  cfbot tested it 7 times in
> the past week, and it passed only once on Windows:
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691
> The other times all failed like this:
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833

I think this is just a timing problem: the signal gets sent,
but it might or might not get received before the current statement ends.
It's possible that a signal-sent-to-self can be expected to be received
synchronously on all Unix platforms, but I wouldn't entirely bet on that
(in particular, the POSIX text for kill(2) does NOT guarantee it); and
our Windows signal implementation certainly doesn't guarantee anything
of the sort.

I don't think there's necessarily anything wrong with the code, but
you can't test it with such a simplistic test as this and expect
stable results.  If you feel an urgent need to have a test case,
try building an isolationtester script.

            regards, tom lane

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Thomas Munro

Date:

24 August 2018, 03:46:02

On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> Seems the build of the updated patch built and tested Ok.  Still have no idea
>>> why the previous one didn’t.
>
>> That problem apparently didn't go away.  cfbot tested it 7 times in
>> the past week, and it passed only once on Windows:
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691
>> The other times all failed like this:
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833
>
> I think this is just a timing problem: the signal gets sent,
> but it might or might not get received before the current statement ends.
> It's possible that a signal-sent-to-self can be expected to be received
> synchronously on all Unix platforms, but I wouldn't entirely bet on that
> (in particular, the POSIX text for kill(2) does NOT guarantee it); and
> our Windows signal implementation certainly doesn't guarantee anything
> of the sort.

How about we just wait forever if the function manages to return?

select case when pg_cancel_backend(pg_backend_pid(), '...') then
pg_sleep('infinity') end;

--
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Tom Lane

Date:

24 August 2018, 04:37:52

Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think this is just a timing problem: the signal gets sent,
>> but it might or might not get received before the current statement ends.

> How about we just wait forever if the function manages to return?
> select case when pg_cancel_backend(pg_backend_pid(), '...') then
> pg_sleep('infinity') end;

Hm, that might work.  I'd pick a long but not infinite timeout --- maybe
60 sec would be good.

            regards, tom lane

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

27 August 2018, 13:06:18

> On 24 Aug 2018, at 03:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think this is just a timing problem: the signal gets sent,
>>> but it might or might not get received before the current statement ends.
>
>> How about we just wait forever if the function manages to return?
>> select case when pg_cancel_backend(pg_backend_pid(), '...') then
>> pg_sleep('infinity') end;
>
> Hm, that might work.  I'd pick a long but not infinite timeout --- maybe
> 60 sec would be good.

I like that idea, so I’ve updated the patch to see what the cfbot thinks of it.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

27 August 2018, 13:09:47

> On 23 Aug 2018, at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think this is just a timing problem: the signal gets sent,
> but it might or might not get received before the current statement ends.
> It's possible that a signal-sent-to-self can be expected to be received
> synchronously on all Unix platforms, but I wouldn't entirely bet on that
> (in particular, the POSIX text for kill(2) does NOT guarantee it); and
> our Windows signal implementation certainly doesn't guarantee anything
> of the sort.

That makes a lot of sense, I was thinking about it in too simplistic terms.
Thanks for the clarification.

> I don't think there's necessarily anything wrong with the code, but
> you can't test it with such a simplistic test as this and expect
> stable results.  If you feel an urgent need to have a test case,
> try building an isolationtester script.

During hacking on it I preferred to have tests for it.  Iff this makes it in,
the tests can be omitted per the judgement of the committers of course.  Since
the printed pid will vary between runs it’s hard to test more interesting
scenarios so they are of questionable worth.

cheers ./daniel

Re: Optional message to user when terminating/cancelling backend

From

Pavel Stehule

Date:

29 September 2018, 10:29:34

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

I tested this patch, and all looks well and functional. I reread a discussion and I don't see any unresolved objection
againstthis patch.
 

There are not warning, crashes, all tests are passed. New behave is documented well.

I'll mark this patch as ready for commiters

The new status of this patch is: Ready for Committer

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

30 September 2018, 16:51:44

On Mon, Aug 27, 2018 at 12:06:18PM +0200, Daniel Gustafsson wrote:
> I like that idea, so I’ve updated the patch to see what the cfbot
> thinks of it.

+   when pg_cancel_backend(pg_backend_pid(), 'ソケットをブロッキングモードに設定できませんでした')

You could have chosen something less complicated, like "ホゲ", which is
the equivalent of "foo" in English.  Anyway, non-ASCII characters should
not be included in the final patch.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

01 October 2018, 02:19:01

On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote:
> You could have chosen something less complicated, like "ホゲ", which is
> the equivalent of "foo" in English.  Anyway, non-ASCII characters should
> not be included in the final patch.

Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a
better name for the new file?  There are already multiple examples of
this type, like logicalfuncs.c, slotfuncs.c, etc.

I have moved this patch set to the next commit fest for now.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

01 October 2018, 15:37:42

> On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote:
>> You could have chosen something less complicated, like "ホゲ", which is
>> the equivalent of "foo" in English.  Anyway, non-ASCII characters should
>> not be included in the final patch.

Fixed in the attached v16 revision.

> Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a
> better name for the new file?  There are already multiple examples of
> this type, like logicalfuncs.c, slotfuncs.c, etc.

I have no strong feelings on this, I was merely using the name that Alvaro
suggested when he brought up the refactoring as an extension of this patch.
signalfuncs.c is fine by me, so I did this rename in the attached revision.

> I have moved this patch set to the next commit fest for now.

Thanks.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

02 October 2018, 03:39:59

On Mon, Oct 01, 2018 at 02:37:42PM +0200, Daniel Gustafsson wrote:
>> On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote:
>> Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a
>> better name for the new file?  There are already multiple examples of
>> this type, like logicalfuncs.c, slotfuncs.c, etc.
>
> I have no strong feelings on this, I was merely using the name that Alvaro
> suggested when he brought up the refactoring as an extension of this patch.
> signalfuncs.c is fine by me, so I did this rename in the attached revision.

Indeed, I missed the previous part posted here:
https://www.postgresql.org/message-id/20180124154548.gmpyvkzlsijren7u@alvherre.pgsql

Alvaro, do you have a strong preference over one or the other?
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Thomas Munro

Date:

03 October 2018, 02:09:54

On Tue, Oct 2, 2018 at 1:37 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote:
> > On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote:
> >> You could have chosen something less complicated, like "ホゲ", which is
> >> the equivalent of "foo" in English.  Anyway, non-ASCII characters should
> >> not be included in the final patch.
>
> Fixed in the attached v16 revision.

Hi Daniel,

It looks like you missed another case that needs tolerance for late
signal delivery on Windows:

+select pg_cancel_backend(pg_backend_pid());

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263

--
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

04 October 2018, 10:59:23

On Wed, Oct 03, 2018 at 12:09:54PM +1300, Thomas Munro wrote:
> It looks like you missed another case that needs tolerance for late
> signal delivery on Windows:
>
> +select pg_cancel_backend(pg_backend_pid());
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263

Looking at 0001, why are the declarations needed in patch 0002 part of
0001 (see signalfuncs.h)?  I think that something like instead the
attached is enough for this part.  Daniel, could you confirm?
--
Michael

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

04 October 2018, 11:06:06

> On 4 Oct 2018, at 09:59, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Oct 03, 2018 at 12:09:54PM +1300, Thomas Munro wrote:
>> It looks like you missed another case that needs tolerance for late
>> signal delivery on Windows:
>>
>> +select pg_cancel_backend(pg_backend_pid());
>>
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263
>
> Looking at 0001, why are the declarations needed in patch 0002 part of
> 0001 (see signalfuncs.h)?  I think that something like instead the
> attached is enough for this part.  Daniel, could you confirm?

Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error
from when I renamed the file.  They are not present in the v15 patch but got
introduced in v16 when I clearly wasn’t caffeinated enough to rebase.

cheers ./daniel

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

04 October 2018, 14:00:51

On Thu, Oct 04, 2018 at 10:06:06AM +0200, Daniel Gustafsson wrote:
> Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error
> from when I renamed the file.  They are not present in the v15 patch but got
> introduced in v16 when I clearly wasn’t caffeinated enough to rebase.

No problem, thanks for confirming.  I have worked a bit more on the
thing, reducing the headers of the new file to a bare minimum, and I
have committed 0001.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

05 October 2018, 11:11:45

> On 4 Oct 2018, at 13:00, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Oct 04, 2018 at 10:06:06AM +0200, Daniel Gustafsson wrote:
>> Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error
>> from when I renamed the file.  They are not present in the v15 patch but got
>> introduced in v16 when I clearly wasn’t caffeinated enough to rebase.
>
> No problem, thanks for confirming.  I have worked a bit more on the
> thing, reducing the headers of the new file to a bare minimum, and I
> have committed 0001.

Thanks!  Attached is a v17 which rebases the former 0002 patch on top of
current master, along with the test fix for Windows that Thomas reported
upthread (no other changes introduced over earlier versions).

cheers ./daniel

Attachment

terminate_msg_v17.patch

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

09 October 2018, 05:38:35

On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote:
> Thanks!  Attached is a v17 which rebases the former 0002 patch on top of
> current master, along with the test fix for Windows that Thomas reported
> upthread (no other changes introduced over earlier versions).

Thanks for the new version.

In order to make a test with non-ASCII characters in the error message,
we could use a trick similar to xml.sql: use a function which ignores
the test case if server_encoding is not UTF-8 and use something like
chr() to pass down a messages with non-ASCII characters.  Then if the
function catches the error we are good.

+       *pid = slot->src_pid;
+       slot->orig_length = 0;
+       slot->message[0] = '\0';
Message consumption should be the part using memset(0), no?  system
calls are avoided within a spinlock section, but memset and strlcpy
should be fast enough.  Those are used in WalReceiverMain() for
example.

The facility to store messages should be in its own file, and
signalfuncs.c should depend on it, something like signal_message.c?

+typedef struct
+{
+   pid_t   dest_pid;       /* The pid of the process being signalled */
+   pid_t   src_pid;        /* The pid of the processing signalling */
+   slock_t mutex;          /* Per-slot protection */
+   char    message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */
+   int     orig_length;    /* Length of the message as passed by the user,
+                            * if this length exceeds MAX_CANCEL_MSG it will
+                            * be truncated but we store the original length
+                            * in order to be able to convey truncation
*/
+   int     sqlerrcode;     /* errcode to use when signalling backend */
+} BackendSignalFeedbackShmemStruct

A couple of thoughts here:
- More information could make this facility more generic: an elevel to
be able to fully manipulate the custom error message, and a breakpoint
definition.  As far as I can see you have two of them in the patch which
are the callers of ConsumeBackendSignalFeedback().  One event cancels
the query, and another terminates it.  In both cases, the breakpoint
implies the elevel.  So could we have more possible breakpoints possible
and should we try to make this API more pluggable from the start?
- Is orig_length really useful in the data stored?  Why not just
warning/noticing the caller defining the message and just store the
message.

So, looking at your patch, I am wondering also about splitting it into
a couple of pieces for clarity:
- The base facility to be able to register and consume messages which
can be attached to a backend slot, and then be accessed by other things.
Let's think about it as a base facility which is useful for extensions
and module developers.  If coupled with a hook, it would be actually
possible to scan a backend's slot for some information which could be
consumed afterwards for custom error messages...
- The set of breakpoints which can then be used to define if a given
code path should show up a custom error message.  I can think of three
of them: cancel, termination and extension, which is a custom path that
extensions can use.  The extension breakpoint would make sense as
something included from the start, could be stored as an integer and I
guess should not be an enum but a set of defined tables (see pgstat.h
for wait event categories for example).
- The set of SQL functions making use of it in-core, in your case these
are the SQL functions for cancellation and termination.
--
Michael

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

09 October 2018, 05:38:35

On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote:
> Thanks!  Attached is a v17 which rebases the former 0002 patch on top of
> current master, along with the test fix for Windows that Thomas reported
> upthread (no other changes introduced over earlier versions).

Thanks for the new version.

In order to make a test with non-ASCII characters in the error message,
we could use a trick similar to xml.sql: use a function which ignores
the test case if server_encoding is not UTF-8 and use something like
chr() to pass down a messages with non-ASCII characters.  Then if the
function catches the error we are good.

+       *pid = slot->src_pid;
+       slot->orig_length = 0;
+       slot->message[0] = '\0';
Message consumption should be the part using memset(0), no?  system
calls are avoided within a spinlock section, but memset and strlcpy
should be fast enough.  Those are used in WalReceiverMain() for
example.

The facility to store messages should be in its own file, and
signalfuncs.c should depend on it, something like signal_message.c?

+typedef struct
+{
+   pid_t   dest_pid;       /* The pid of the process being signalled */
+   pid_t   src_pid;        /* The pid of the processing signalling */
+   slock_t mutex;          /* Per-slot protection */
+   char    message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */
+   int     orig_length;    /* Length of the message as passed by the user,
+                            * if this length exceeds MAX_CANCEL_MSG it will
+                            * be truncated but we store the original length
+                            * in order to be able to convey truncation
*/
+   int     sqlerrcode;     /* errcode to use when signalling backend */
+} BackendSignalFeedbackShmemStruct

A couple of thoughts here:
- More information could make this facility more generic: an elevel to
be able to fully manipulate the custom error message, and a breakpoint
definition.  As far as I can see you have two of them in the patch which
are the callers of ConsumeBackendSignalFeedback().  One event cancels
the query, and another terminates it.  In both cases, the breakpoint
implies the elevel.  So could we have more possible breakpoints possible
and should we try to make this API more pluggable from the start?
- Is orig_length really useful in the data stored?  Why not just
warning/noticing the caller defining the message and just store the
message.

So, looking at your patch, I am wondering also about splitting it into
a couple of pieces for clarity:
- The base facility to be able to register and consume messages which
can be attached to a backend slot, and then be accessed by other things.
Let's think about it as a base facility which is useful for extensions
and module developers.  If coupled with a hook, it would be actually
possible to scan a backend's slot for some information which could be
consumed afterwards for custom error messages...
- The set of breakpoints which can then be used to define if a given
code path should show up a custom error message.  I can think of three
of them: cancel, termination and extension, which is a custom path that
extensions can use.  The extension breakpoint would make sense as
something included from the start, could be stored as an integer and I
guess should not be an enum but a set of defined tables (see pgstat.h
for wait event categories for example).
- The set of SQL functions making use of it in-core, in your case these
are the SQL functions for cancellation and termination.
--
Michael

Attachment

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

10 October 2018, 12:20:53

> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote:
>> Thanks!  Attached is a v17 which rebases the former 0002 patch on top of
>> current master, along with the test fix for Windows that Thomas reported
>> upthread (no other changes introduced over earlier versions).
>
> Thanks for the new version.

Thanks for reviewing!

> In order to make a test with non-ASCII characters in the error message,
> we could use a trick similar to xml.sql: use a function which ignores
> the test case if server_encoding is not UTF-8 and use something like
> chr() to pass down a messages with non-ASCII characters.  Then if the
> function catches the error we are good.

Thats one approach, do you think it’s worth adding to ensure clipping during
truncation?

> +       *pid = slot->src_pid;
> +       slot->orig_length = 0;
> +       slot->message[0] = '\0';
> Message consumption should be the part using memset(0), no?  system
> calls are avoided within a spinlock section, but memset and strlcpy
> should be fast enough.  Those are used in WalReceiverMain() for
> example.

Good point.  IIRC I added it in the setting codepath to avoid memsetting in
case there were no more messages set.  That’s an incorrect optimization, so
fixed.

> The facility to store messages should be in its own file, and
> signalfuncs.c should depend on it, something like signal_message.c?

It was originally in backend_signal.c, and was moved into (what is now)
signalfuncs.c in the refactoring that Alvaro suggested.  Re-reading his email
I’m not sure he actually proposed merging this code with the signal code.

Moved the new functionality to signal_message.{ch}.

> +typedef struct
> +{
> +   pid_t   dest_pid;       /* The pid of the process being signalled */
> +   pid_t   src_pid;        /* The pid of the processing signalling */
> +   slock_t mutex;          /* Per-slot protection */
> +   char    message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */
> +   int     orig_length;    /* Length of the message as passed by the user,
> +                            * if this length exceeds MAX_CANCEL_MSG it will
> +                            * be truncated but we store the original length
> +                            * in order to be able to convey truncation
> */
> +   int     sqlerrcode;     /* errcode to use when signalling backend */
> +} BackendSignalFeedbackShmemStruct
>
> A couple of thoughts here:
> - More information could make this facility more generic: an elevel to
> be able to fully manipulate the custom error message, and a breakpoint
> definition.  As far as I can see you have two of them in the patch which
> are the callers of ConsumeBackendSignalFeedback().  One event cancels
> the query, and another terminates it.  In both cases, the breakpoint
> implies the elevel.  So could we have more possible breakpoints possible
> and should we try to make this API more pluggable from the start?

I’m not sure I follow, can you elaborate on this?

> - Is orig_length really useful in the data stored?  Why not just
> warning/noticing the caller defining the message and just store the
> message.

The caller is being notified, but the original length is returned such that the
consumer can format the message with the truncation in mind.  In postgres.c we
currently do:

    ereport(FATAL,
            (errcode(sqlerrcode),
             errmsg("%s%s",
                    buffer, (len > sizeof(buffer) ? "..." : "")),
             errdetail("terminating connection due to administrator command by process %d",
                       pid)));

If that’s not deemed of value to keep, then orig_length can be dropped.

> So, looking at your patch, I am wondering also about splitting it into
> a couple of pieces for clarity:
> - The base facility to be able to register and consume messages which
> can be attached to a backend slot, and then be accessed by other things.
> Let's think about it as a base facility which is useful for extensions
> and module developers.  If coupled with a hook, it would be actually
> possible to scan a backend's slot for some information which could be
> consumed afterwards for custom error messages...
> - The set of breakpoints which can then be used to define if a given
> code path should show up a custom error message.  I can think of three
> of them: cancel, termination and extension, which is a custom path that
> extensions can use.  The extension breakpoint would make sense as
> something included from the start, could be stored as an integer and I
> guess should not be an enum but a set of defined tables (see pgstat.h
> for wait event categories for example).
> - The set of SQL functions making use of it in-core, in your case these
> are the SQL functions for cancellation and termination.

This does not sound like a bad idea to me.  Is that something you are planning
to do or do you want me to cut the patch up into pieces?  Just want to avoid
stepping on each others toes if you are already thinking/poking at this.

Attached is a v18 with the above changes.

cheers ./daniel

Attachment

terminate_msg_v18.patch

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

11 October 2018, 01:29:07

On Wed, Oct 10, 2018 at 02:20:53PM +0200, Daniel Gustafsson wrote:
>> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote:
>> In order to make a test with non-ASCII characters in the error message,
>> we could use a trick similar to xml.sql: use a function which ignores
>> the test case if server_encoding is not UTF-8 and use something like
>> chr() to pass down a messages with non-ASCII characters.  Then if the
>> function catches the error we are good.
>
> Thats one approach, do you think it’s worth adding to ensure clipping during
> truncation?

I am not exactly sure what you mean here.

> > The facility to store messages should be in its own file, and
> > signalfuncs.c should depend on it, something like signal_message.c?
>
> It was originally in backend_signal.c, and was moved into (what is now)
> signalfuncs.c in the refactoring that Alvaro suggested.  Re-reading
> his email I’m not sure he actually proposed merging this code with the
> signal code.

Indeed, you could understand the past suggestion both ways.  From what I
can see, there is merit in keeping the SQL-callable functions working on
signals into their own file.  The message capacity that you are
proposing here should on their own, and...

> Moved the new functionality to signal_message.{ch}.

That's actually a much better name than anything I could think of.

>> - More information could make this facility more generic: an elevel to
>> be able to fully manipulate the custom error message, and a breakpoint
>> definition.  As far as I can see you have two of them in the patch which
>> are the callers of ConsumeBackendSignalFeedback().  One event cancels
>> the query, and another terminates it.  In both cases, the breakpoint
>> implies the elevel.  So could we have more possible breakpoints possible
>> and should we try to make this API more pluggable from the start?
>
> I’m not sure I follow, can you elaborate on this?

Sure.  From what I can see in your patch is that you want to generate in
some code paths an ereport() call with a given message string.  As far
as I can see, you can decide three things with what's available:
- errcode
- errstring
- the moment you want the ereport() to be generated.

What I am suggesting here is to let callers extend the possibilities a
bit more with:
- elevel
- errdetail
This way, you can override code paths with a more custom experience.
The thing could break easily Postgres, as you should not really use a
non-FATAL elevel when terminating a session, but having a flexible API
will allow to not come back to it in the future.  We may also want to
have ConsumeBackendSignalFeedback() call ereport() by itself with all
the details available in the data registered.

The main take on your feature is that it is more flexible than the elog
hook, as you can enforce custom strings at function call, and don't need
to do some weird hacks in plugins in need of manipulating the error.

>> - Is orig_length really useful in the data stored?  Why not just
>> warning/noticing the caller defining the message and just store the
>> message.
>
> The caller is being notified, but the original length is returned such that the
> consumer can format the message with the truncation in mind.  In postgres.c we
> currently do:
>
>     ereport(FATAL,
>             (errcode(sqlerrcode),
>              errmsg("%s%s",
>                     buffer, (len > sizeof(buffer) ? "..." : "")),
>              errdetail("terminating connection due to administrator command by process %d",
>                        pid)));

The goal is to generate an elog() at the end, so I can see your point,
not sure if that's worth the complication though..

>> So, looking at your patch, I am wondering also about splitting it into
>> a couple of pieces for clarity:
>> - The base facility to be able to register and consume messages which
>> can be attached to a backend slot, and then be accessed by other things.
>> Let's think about it as a base facility which is useful for extensions
>> and module developers.  If coupled with a hook, it would be actually
>> possible to scan a backend's slot for some information which could be
>> consumed afterwards for custom error messages...
>> - The set of breakpoints which can then be used to define if a given
>> code path should show up a custom error message.  I can think of three
>> of them: cancel, termination and extension, which is a custom path that
>> extensions can use.  The extension breakpoint would make sense as
>> something included from the start, could be stored as an integer and I
>> guess should not be an enum but a set of defined tables (see pgstat.h
>> for wait event categories for example).
>> - The set of SQL functions making use of it in-core, in your case these
>> are the SQL functions for cancellation and termination.
>
> This does not sound like a bad idea to me.  Is that something you are
> planning to do or do you want me to cut the patch up into pieces?
> Just want to avoid stepping on each others toes if you are already
> thinking/poking at this.

Those are my first impressions about the patch after looking at the
code so I have not done any code diving.  The more pieces we are able to
break this stuff into, the more likely it is easy to review and to get
committed.  From what I can see the main idea can be split into more
sub-concepts.
--
Michael

Attachment

signature.asc

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Daniel Gustafsson

Date:

30 October 2018, 23:29:25

> On 11 Oct 2018, at 03:29, Michael Paquier <michael@paquier.xyz> wrote:

Hi!,

Thanks for reviewing this patch, and sorry for having been slow lately.

> On Wed, Oct 10, 2018 at 02:20:53PM +0200, Daniel Gustafsson wrote:
>>> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote:
>>> In order to make a test with non-ASCII characters in the error message,
>>> we could use a trick similar to xml.sql: use a function which ignores
>>> the test case if server_encoding is not UTF-8 and use something like
>>> chr() to pass down a messages with non-ASCII characters.  Then if the
>>> function catches the error we are good.
>>
>> Thats one approach, do you think it’s worth adding to ensure clipping during
>> truncation?
>
> I am not exactly sure what you mean here.

The test was added to the patch as a reviewer found a bug in an early version
where it didn’t treat mb characters properly, it was using strlen() and prone
to clipping an mb char in half.  The test was added to the patch to show the
fix to the patch review process, but that doesn’t necessarily mean it’s deemed
an interesting enough case to have a test in check/installcheck for.  I don’t
have strong opinions, other than trying to not add uninteresting tests as they
do carry a cost.

>>> The facility to store messages should be in its own file, and
>>> signalfuncs.c should depend on it, something like signal_message.c?
>>
>> It was originally in backend_signal.c, and was moved into (what is now)
>> signalfuncs.c in the refactoring that Alvaro suggested.  Re-reading
>> his email I’m not sure he actually proposed merging this code with the
>> signal code.
>
> Indeed, you could understand the past suggestion both ways.  From what I
> can see, there is merit in keeping the SQL-callable functions working on
> signals into their own file.  The message capacity that you are
> proposing here should on their own, and...
>
>> Moved the new functionality to signal_message.{ch}.
>
> That's actually a much better name than anything I could think of.

Actually it isn’t, it was your suggestion =)

>>> - More information could make this facility more generic: an elevel to
>>> be able to fully manipulate the custom error message, and a breakpoint
>>> definition.  As far as I can see you have two of them in the patch which
>>> are the callers of ConsumeBackendSignalFeedback().  One event cancels
>>> the query, and another terminates it.  In both cases, the breakpoint
>>> implies the elevel.  So could we have more possible breakpoints possible
>>> and should we try to make this API more pluggable from the start?
>>
>> I’m not sure I follow, can you elaborate on this?
>
> Sure.  From what I can see in your patch is that you want to generate in
> some code paths an ereport() call with a given message string.  As far
> as I can see, you can decide three things with what's available:
> - errcode
> - errstring
> - the moment you want the ereport() to be generated.
>
> What I am suggesting here is to let callers extend the possibilities a
> bit more with:
> - elevel
> - errdetail
> This way, you can override code paths with a more custom experience.
> The thing could break easily Postgres, as you should not really use a
> non-FATAL elevel when terminating a session, but having a flexible API
> will allow to not come back to it in the future.  We may also want to
> have ConsumeBackendSignalFeedback() call ereport() by itself with all
> the details available in the data registered.

I’ve added elevel to the API, but I’m not sure if keeping an additional buffer
for errdetail is worth it since it probably wouldn’t be used that often?

> The main take on your feature is that it is more flexible than the elog
> hook, as you can enforce custom strings at function call, and don't need
> to do some weird hacks in plugins in need of manipulating the error.
>
>>> - Is orig_length really useful in the data stored?  Why not just
>>> warning/noticing the caller defining the message and just store the
>>> message.
>>
>> The caller is being notified, but the original length is returned such that the
>> consumer can format the message with the truncation in mind.  In postgres.c we
>> currently do:
>>
>>    ereport(FATAL,
>>            (errcode(sqlerrcode),
>>             errmsg("%s%s",
>>                    buffer, (len > sizeof(buffer) ? "..." : "")),
>>             errdetail("terminating connection due to administrator command by process %d",
>>                       pid)));
>
> The goal is to generate an elog() at the end, so I can see your point,
> not sure if that's worth the complication though..

Since I wrote the code, I think it’s worth it as a nice-to-have rather than as
a pure necessity.  If noone else sees the value then we’ll rip it out of the
patch.

>>> So, looking at your patch, I am wondering also about splitting it into
>>> a couple of pieces for clarity:
>>> - The base facility to be able to register and consume messages which
>>> can be attached to a backend slot, and then be accessed by other things.
>>> Let's think about it as a base facility which is useful for extensions
>>> and module developers.  If coupled with a hook, it would be actually
>>> possible to scan a backend's slot for some information which could be
>>> consumed afterwards for custom error messages...
>>> - The set of breakpoints which can then be used to define if a given
>>> code path should show up a custom error message.  I can think of three
>>> of them: cancel, termination and extension, which is a custom path that
>>> extensions can use.  The extension breakpoint would make sense as
>>> something included from the start, could be stored as an integer and I
>>> guess should not be an enum but a set of defined tables (see pgstat.h
>>> for wait event categories for example).
>>> - The set of SQL functions making use of it in-core, in your case these
>>> are the SQL functions for cancellation and termination.

Is a hook for extensions needed, as they too can drain the message via the
Consume function? Where if so do you reckon it should go?

>> This does not sound like a bad idea to me.  Is that something you are
>> planning to do or do you want me to cut the patch up into pieces?
>> Just want to avoid stepping on each others toes if you are already
>> thinking/poking at this.
>
> Those are my first impressions about the patch after looking at the
> code so I have not done any code diving.  The more pieces we are able to
> break this stuff into, the more likely it is easy to review and to get
> committed.  From what I can see the main idea can be split into more
> sub-concepts.

I’ve split the patch into two logical parts, the signalling functionality and
the userfacing terminate/cancel part.  For extra clarity I’ve also included the
full v19 patch, in case you prefer that instead when seeing the two.

cheers ./daniel

Attachment

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Tom Lane

Date:

12 November 2018, 19:35:47

Daniel Gustafsson <daniel@yesql.se> writes:
> I’ve split the patch into two logical parts, the signalling functionality and
> the userfacing terminate/cancel part.  For extra clarity I’ve also included the
> full v19 patch, in case you prefer that instead when seeing the two.

I'm a bit befuddled by this patch, or at least the proposed test cases.
Those show no proof that the feature actually works, ie, delivers a
message to the target backend.  Also, what am I to make of the test cases
involving NULL arguments?

Personally, rather than messing around with defaulted arguments and
detecting nulls at runtime, I'd make two C functions that are both strict.

The signal_message stuff seems both overly complicated and overly fragile.
You have, for example, largely ignored the coding rule that says to have
only straight-line code within a spinlock hold.  I am also not very
enamored of setting up Yet Another per-backend data structure in shared
memory, especially one that can so easily get out of sync with the
ProcArray.  If we're going to expend dedicated shared memory on this
feature, let's just add the necessary fields to the PGPROC structs.
That would also provide some opportunity to interlock things in a way that
would fix the race conditions, ie, once we've found the relevant PGPROC
entry, hold a lock on it till we've inserted the message and sent the
signal.

I don't especially like orig_length: that information is of no use
to the recipient backend, and what the field is actually being used
as, ie a boolean "slot is full" indicator, is nowhere to be guessed
from the field's documentation.

The current patch fails to apply against parallel_schedule, which
reminds me that you forgot to include an update to serial_schedule.

            regards, tom lane

Re: [HACKERS] Optional message to user when terminating/cancelling backend

From

Dmitry Dolgov

Date:

30 November 2018, 17:02:15

On Mon, Nov 12, 2018 at 8:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
> > I’ve split the patch into two logical parts, the signalling functionality and
> > the userfacing terminate/cancel part.  For extra clarity I’ve also included the
> > full v19 patch, in case you prefer that instead when seeing the two.

Just for the records, cfbot complains during the compilation:

option.c: In function ‘parseCommandLine’:
option.c:265:8: error: ignoring return value of ‘getcwd’, declared
with attribute warn_unused_result [-Werror=unused-result]
  getcwd(default_sockdir, MAXPGPATH);
        ^

Re: [HACKERS] Optional message to user when terminating/cancellingbackend

From

Michael Paquier

Date:

03 February 2019, 09:25:59

On Fri, Nov 30, 2018 at 06:02:15PM +0100, Dmitry Dolgov wrote:
> Just for the records, cfbot complains during the compilation:
>
> option.c: In function ‘parseCommandLine’:
> option.c:265:8: error: ignoring return value of ‘getcwd’, declared
> with attribute warn_unused_result [-Werror=unused-result]
>   getcwd(default_sockdir, MAXPGPATH);
>         ^

The patch has been marked as returned with feedback.
--
Michael

Attachment

signature.asc