Thread: [HACKERS] Optional message to user when terminating/cancelling backend
When terminating, or cancelling, a backend it’s currently not possible to let the signalled session know *why* it was dropped. This has nagged me in the past and now it happened to come up again, so I took a stab at this. The attached patch implements the ability to pass an optional text message to the signalled session which is included in the error message: SELECT pg_terminate_backend(<pid> [, message]); SELECT pg_cancel_backend(<pid> [, message]); Right now the message is simply appended on the error message, not sure if errdetail or errhint would be better? Calling: select pg_terminate_backend(<pid>, 'server rebooting'); ..leads to: FATAL: terminating connection due to administrator command: "server rebooting" Omitting the message invokes the command just like today. The message is stored in a new shmem area which is checked when the session is aborted. To keep things simple a small buffer is kept per backend for the message. If deemed too costly, keeping a central buffer from which slabs are allocated can be done (but seemed rather complicated for little gain compared to the quite moderate memory spend.) cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
When terminating, or cancelling, a backend it’s currently not possible to let
the signalled session know *why* it was dropped. This has nagged me in the
past and now it happened to come up again, so I took a stab at this. The
attached patch implements the ability to pass an optional text message to the
signalled session which is included in the error message:
SELECT pg_terminate_backend(<pid> [, message]);
SELECT pg_cancel_backend(<pid> [, message]);
Right now the message is simply appended on the error message, not sure if
errdetail or errhint would be better? Calling:
select pg_terminate_backend(<pid>, 'server rebooting');
..leads to:
FATAL: terminating connection due to administrator command: "server rebooting"
Omitting the message invokes the command just like today.
The message is stored in a new shmem area which is checked when the session is
aborted. To keep things simple a small buffer is kept per backend for the
message. If deemed too costly, keeping a central buffer from which slabs are
allocated can be done (but seemed rather complicated for little gain compared
to the quite moderate memory spend.)
cheers ./daniel
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
+1.
This really helps PostgreSQL Azure service as well. When we are doing the upgrades to the service, instead of abruptly terminating the sessions we can provide this message.
Thanks,
Satya
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Pavel Stehule
Sent: Monday, June 19, 2017 11:41 AM
To: Daniel Gustafsson <daniel@yesql.se>
Cc: PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Optional message to user when terminating/cancelling backend
2017-06-19 20:24 GMT+02:00 Daniel Gustafsson <daniel@yesql.se>:
When terminating, or cancelling, a backend it’s currently not possible to let
the signalled session know *why* it was dropped. This has nagged me in the
past and now it happened to come up again, so I took a stab at this. The
attached patch implements the ability to pass an optional text message to the
signalled session which is included in the error message:
SELECT pg_terminate_backend(<pid> [, message]);
SELECT pg_cancel_backend(<pid> [, message]);
Right now the message is simply appended on the error message, not sure if
errdetail or errhint would be better? Calling:
select pg_terminate_backend(<pid>, 'server rebooting');
..leads to:
FATAL: terminating connection due to administrator command: "server rebooting"
Omitting the message invokes the command just like today.
The message is stored in a new shmem area which is checked when the session is
aborted. To keep things simple a small buffer is kept per backend for the
message. If deemed too costly, keeping a central buffer from which slabs are
allocated can be done (but seemed rather complicated for little gain compared
to the quite moderate memory spend.)
cheers ./daniel
+1
very good idea
Pavel
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Satyanarayana Narlapuram wrote: > +1. > > This really helps PostgreSQL Azure service as well. When we are doing > the upgrades to the service, instead of abruptly terminating the > sessions we can provide this message. I think you mean "in addition to" rather than "instead of". Unless you have a lot of users running psql manually, I don't see how this is actually very useful or actionable. What would the user do with the information? Hopefully your users already trust that you'd keep the downtime to the minimum possible. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Optional message to user when terminating/cancelling backend
On Tue, Jun 20, 2017 at 11:54 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Satyanarayana Narlapuram wrote: > Unless you have a lot of users running psql manually, I don't see how > this is actually very useful or actionable. What would the user do with > the information? Hopefully your users already trust that you'd keep the > downtime to the minimum possible. > Why wouldn't this be useful in application logs? Spurious dropped connections during application execution would be alarming. Seeing a message from the DBA when looking into those would be a painless and quick way to alleviate stress. pg_cancel_backend(<pid>, 'please try not to leave sessions in an "idle in transaction" state...') would also seem like a useful message to communicate; to user or application. Sure, some of this can, and maybe would also need to, be done out-of-band but this communication channel seems worthy enough to at least evaluate the provided implementation. David J.
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
Agree with David on the general usefulness of this channel. Not that Azure has this implementation or proposal today, butas a managed service this channel of communication is worth. For example, DBA / service can set a policy that if certainsession exceeds the resource usage DBA can kill it and communicate the same. For example, too many locks, lot of IOactivity, large open transactions etc. The messages will help application developer to tune their workloads appropriately. Thanks, Satya -----Original Message----- From: David G. Johnston [mailto:david.g.johnston@gmail.com] Sent: Tuesday, June 20, 2017 12:44 PM To: Alvaro Herrera <alvherre@2ndquadrant.com> Cc: Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com>; Pavel Stehule <pavel.stehule@gmail.com>; Daniel Gustafsson<daniel@yesql.se>; PostgreSQL mailing lists <pgsql-hackers@postgresql.org> Subject: Re: [HACKERS] Optional message to user when terminating/cancelling backend On Tue, Jun 20, 2017 at 11:54 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Satyanarayana Narlapuram wrote: > Unless you have a lot of users running psql manually, I don't see how > this is actually very useful or actionable. What would the user do > with the information? Hopefully your users already trust that you'd > keep the downtime to the minimum possible. > Why wouldn't this be useful in application logs? Spurious dropped connections during application execution would be alarming. Seeing a message from the DBA when looking into those would be a painless and quick way to alleviate stress. pg_cancel_backend(<pid>, 'please try not to leave sessions in an "idle in transaction" state...') would also seem like auseful message to communicate; to user or application. Sure, some of this can, and maybe would also need to, be done out-of-bandbut this communication channel seems worthy enough to at least evaluate the provided implementation. David J.
Hi, On 2017-06-19 20:24:43 +0200, Daniel Gustafsson wrote: > When terminating, or cancelling, a backend it’s currently not possible to let > the signalled session know *why* it was dropped. This has nagged me in the > past and now it happened to come up again, so I took a stab at this. The > attached patch implements the ability to pass an optional text message to the > signalled session which is included in the error message: > > SELECT pg_terminate_backend(<pid> [, message]); > SELECT pg_cancel_backend(<pid> [, message]); > > Right now the message is simply appended on the error message, not sure if > errdetail or errhint would be better? Calling: > > select pg_terminate_backend(<pid>, 'server rebooting'); > > ..leads to: > > FATAL: terminating connection due to administrator command: "server rebooting" > > Omitting the message invokes the command just like today. For extensions it'd also be useful if it'd be possible to overwrite the error code. E.g. for citus there's a distributed deadlock detector, running out of process because there's no way to interrupt lock waits locally, and we've to do some ugly hacking to generate proper error messages and code from another session. - Andres
Re: [HACKERS] Optional message to user when terminating/cancelling backend
On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > The message is stored in a new shmem area which is checked when the session is > aborted. To keep things simple a small buffer is kept per backend for the > message. If deemed too costly, keeping a central buffer from which slabs are > allocated can be done (but seemed rather complicated for little gain compared > to the quite moderate memory spend.) I think that you are right to take the approach with a per-backend slot. This will avoid complications related to entry removals and locking issues. There would be scaling issues as well if things get very signaled for a lot of backends. +#define MAX_CANCEL_MSG 128 That looks enough. + LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE); + + strlcpy(slot->message, message, sizeof(slot->message)); + slot->len = strlen(message); Why not using one spin lock per slot and save it in BackendCancelShmemStruct? + pid_t pid = PG_GETARG_INT32(0); + char *msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); + + PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg)); It would be more solid to add some error handling for messages that are too long, or at least truncate the message if too long. -- Michael
Hi, Here are some comments for the patch. +Datum +pg_cancel_backend(PG_FUNCTION_ARGS) +{ + PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL)); +} +Datum +pg_cancel_backend_msg(PG_FUNCTION_ARGS) +{ + pid_t pid = PG_GETARG_INT32(0); + char *msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); + + PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg)); +} It would be better to insert a blank line between these functions. +/* + * Sets a cancellation message for the backend with the specified pid and + * returns the length of message actually created. If the returned length + * is equal to the length of the message parameter, truncation may have + * occurred. If the backend wasn't found and no message was set, -1 is + * returned. + */ It seems to me that this comment is incorrect. "If the returned length is not equal to the length of the message parameter," is right, isn't it? In addition, the last statement would be "If the backend wasn't found, -1 is returned. Otherwize, if no message was set, 0 is returned." + strlcpy(slot->message, message, sizeof(slot->message)); + slot->len = strlen(message); + + LWLockRelease(BackendCancelLock); + return slot->len; If SetBackendCancelMessage() has to return the length of message actually created, slot->len should be strlen(slot->message) instead of strlen(message). In the current code, when the return value and slot->len is always set to the length of the passed message parameter. Regards, On Mon, 19 Jun 2017 20:24:43 +0200 Daniel Gustafsson <daniel@yesql.se> wrote: > When terminating, or cancelling, a backend it’s currently not possible to let > the signalled session know *why* it was dropped. This has nagged me in the > past and now it happened to come up again, so I took a stab at this. The > attached patch implements the ability to pass an optional text message to the > signalled session which is included in the error message: > > SELECT pg_terminate_backend(<pid> [, message]); > SELECT pg_cancel_backend(<pid> [, message]); > > Right now the message is simply appended on the error message, not sure if > errdetail or errhint would be better? Calling: > > select pg_terminate_backend(<pid>, 'server rebooting'); > > ..leads to: > > FATAL: terminating connection due to administrator command: "server rebooting" > > Omitting the message invokes the command just like today. > > The message is stored in a new shmem area which is checked when the session is > aborted. To keep things simple a small buffer is kept per backend for the > message. If deemed too costly, keeping a central buffer from which slabs are > allocated can be done (but seemed rather complicated for little gain compared > to the quite moderate memory spend.) > > cheers ./daniel > -- Yugo Nagata <nagata@sraoss.co.jp>
On Wed, 21 Jun 2017 12:06:33 +0900 Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > > The message is stored in a new shmem area which is checked when the session is > > aborted. To keep things simple a small buffer is kept per backend for the > > message. If deemed too costly, keeping a central buffer from which slabs are > > allocated can be done (but seemed rather complicated for little gain compared > > to the quite moderate memory spend.) > > I think that you are right to take the approach with a per-backend > slot. This will avoid complications related to entry removals and > locking issues. There would be scaling issues as well if things get > very signaled for a lot of backends. > > +#define MAX_CANCEL_MSG 128 > That looks enough. > > + LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE); > + > + strlcpy(slot->message, message, sizeof(slot->message)); > + slot->len = strlen(message); > Why not using one spin lock per slot and save it in BackendCancelShmemStruct? +1 I found an example that a spin lock is used during strlcpy in WalReceiverMain(). > > + pid_t pid = PG_GETARG_INT32(0); > + char *msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); > + > + PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg)); > It would be more solid to add some error handling for messages that > are too long, or at least truncate the message if too long. I agree that error handling for too long messages is needed. Although long messages are truncated in SetBackendCancelMessage(), it is better to inform users that the message they can read was truncated one. Or, maybe we should prohibit too long message is passed in pg_teminate_backend() > -- > Michael > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Yugo Nagata <nagata@sraoss.co.jp>
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 21 Jun 2017, at 16:30, Yugo Nagata <nagata@sraoss.co.jp> wrote: > > On Wed, 21 Jun 2017 12:06:33 +0900 > Michael Paquier <michael.paquier@gmail.com> wrote: > >> On Tue, Jun 20, 2017 at 3:24 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> The message is stored in a new shmem area which is checked when the session is >>> aborted. To keep things simple a small buffer is kept per backend for the >>> message. If deemed too costly, keeping a central buffer from which slabs are >>> allocated can be done (but seemed rather complicated for little gain compared >>> to the quite moderate memory spend.) >> >> I think that you are right to take the approach with a per-backend >> slot. This will avoid complications related to entry removals and >> locking issues. There would be scaling issues as well if things get >> very signaled for a lot of backends. >> >> +#define MAX_CANCEL_MSG 128 >> That looks enough. >> >> + LWLockAcquire(BackendCancelLock, LW_EXCLUSIVE); >> + >> + strlcpy(slot->message, message, sizeof(slot->message)); >> + slot->len = strlen(message); >> Why not using one spin lock per slot and save it in BackendCancelShmemStruct? > > +1 > > I found an example that a spin lock is used during strlcpy in WalReceiverMain(). Yeah I agree as well, will fix. >> + pid_t pid = PG_GETARG_INT32(0); >> + char *msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); >> + >> + PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg)); >> It would be more solid to add some error handling for messages that >> are too long, or at least truncate the message if too long. > > I agree that error handling for too long messages is needed. > Although long messages are truncated in SetBackendCancelMessage(), > it is better to inform users that the message they can read was > truncated one. Or, maybe we should prohibit too long message > is passed in pg_teminate_backend() The message is truncated in SetBackendCancelMessage() for safety, but pg_{cancel|terminate}_backend() could throw an error on too long message, or warning truncation, to the caller as well. Personally I think a warning is the appropriate response, but I don’t really have a strong opinion. cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancelling backend
On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote: > The message is truncated in SetBackendCancelMessage() for safety, but > pg_{cancel|terminate}_backend() could throw an error on too long message, or > warning truncation, to the caller as well. Personally I think a warning is the > appropriate response, but I don’t really have a strong opinion. And a NOTICE? That's what happens for relation name truncation. You are right that having a check in SetBackendCancelMessage() makes the most sense as bgworkers could just call the low level API. Isn't the concept actually closer to just a backend message? This slot could be used for other purposes than cancellation. -- Michael
On Thu, 22 Jun 2017 09:24:54 +0900 Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote: > > The message is truncated in SetBackendCancelMessage() for safety, but > > pg_{cancel|terminate}_backend() could throw an error on too long message, or > > warning truncation, to the caller as well. Personally I think a warning is the > > appropriate response, but I don’t really have a strong opinion. > > And a NOTICE? That's what happens for relation name truncation. You > are right that having a check in SetBackendCancelMessage() makes the > most sense as bgworkers could just call the low level API. Isn't the > concept actually closer to just a backend message? This slot could be > used for other purposes than cancellation. +1 for NOTICE. The message truncation seems to be a kind of helpful information rather than a likely problem as long as pg_terminated_backend exits successfully. https://www.postgresql.org/docs/10/static/runtime-config-logging.html#runtime-config-severity-levels > -- > Michael -- Yugo Nagata <nagata@sraoss.co.jp>
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 22 Jun 2017, at 10:24, Yugo Nagata <nagata@sraoss.co.jp> wrote: > > On Thu, 22 Jun 2017 09:24:54 +0900 > Michael Paquier <michael.paquier@gmail.com> wrote: > >> On Wed, Jun 21, 2017 at 11:42 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> The message is truncated in SetBackendCancelMessage() for safety, but >>> pg_{cancel|terminate}_backend() could throw an error on too long message, or >>> warning truncation, to the caller as well. Personally I think a warning is the >>> appropriate response, but I don’t really have a strong opinion. >> >> And a NOTICE? That's what happens for relation name truncation. You >> are right that having a check in SetBackendCancelMessage() makes the >> most sense as bgworkers could just call the low level API. Isn't the >> concept actually closer to just a backend message? This slot could be >> used for other purposes than cancellation. > > +1 for NOTICE. The message truncation seems to be a kind of helpful > information rather than a likely problem as long as pg_terminated_backend > exits successfully. > > https://www.postgresql.org/docs/10/static/runtime-config-logging.html#runtime-config-severity-levels Good point. I’ve attached a new version which issues a NOTICE on truncation and also addresses all other comments so far in this thread. Thanks a lot for the early patch reviews! cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Thu, Jun 22, 2017 at 7:18 PM, Daniel Gustafsson <daniel@yesql.se> wrote: > Good point. I’ve attached a new version which issues a NOTICE on truncation > and also addresses all other comments so far in this thread. Thanks a lot for > the early patch reviews! > > cheers ./daniel I have done an initial review of the patch. I have some comments/suggestions. +int +GetCancelMessage(char **buffer, size_t buf_len) +{ + volatile BackendCancelShmemStruct *slot = MyCancelSlot; + int msg_length = 0; + Returned value of this function is never used, better to convert it to just void. ------- +bool +HasCancelMessage(void) +{ + volatile BackendCancelShmemStruct *slot = MyCancelSlot; +/* + * Return the configured cancellation message and its length. If the returned + * length is greater than the size of the passed buffer, truncation has been + * performed. The message is cleared on reading. + */ +int +GetCancelMessage(char **buffer, size_t buf_len) I think it will be good to merge these two function where GetCancelMessage will first check whether it has the message or not if it does then allocate the buffer of size slot->len and copy the slot message to allocated buffer otherwise just return NULL. So it will look like "char *GetCancelMessage()" --------- + SpinLockAcquire(&slot->mutex); + strlcpy(*buffer, (const char *) slot->message, buf_len); strlcpy(*buffer, (const char *) slot->message, slot->len); Isn't it better to copy only upto slot->len, seems like it will always be <= buf_len, and if not then we can do min(buf_len, slot->len) ---- -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 22 Jun 2017, at 17:52, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Jun 22, 2017 at 7:18 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >> Good point. I’ve attached a new version which issues a NOTICE on truncation >> and also addresses all other comments so far in this thread. Thanks a lot for >> the early patch reviews! >> >> cheers ./daniel > > I have done an initial review of the patch. I have some comments/suggestions. Thanks! > +int > +GetCancelMessage(char **buffer, size_t buf_len) > +{ > + volatile BackendCancelShmemStruct *slot = MyCancelSlot; > + int msg_length = 0; > + > > Returned value of this function is never used, better to convert it to > just void. You’re probably right, I was thinking that someone might be interested in knowing about truncation when extracting the message but I can’t really think of a callsite which wouldn’t just pass in a large enough buffer in the first place. > +bool > +HasCancelMessage(void) > +{ > + volatile BackendCancelShmemStruct *slot = MyCancelSlot; > > +/* > + * Return the configured cancellation message and its length. If the returned > + * length is greater than the size of the passed buffer, truncation has been > + * performed. The message is cleared on reading. > + */ > +int > +GetCancelMessage(char **buffer, size_t buf_len) > > I think it will be good to merge these two function where > GetCancelMessage will first check whether it has the message or not > if it does then allocate the buffer of size slot->len and copy the > slot message to allocated buffer otherwise just return NULL. > > So it will look like "char *GetCancelMessage()” It doesn’t seem like a good idea to perform memory allocation inside a spinlock in a signalled backend, that would probably at least require an LWLock wouldn’t it? It seems safer to leave memory management to the signalled backend but it may be paranoia on my part. > + SpinLockAcquire(&slot->mutex); > + strlcpy(*buffer, (const char *) slot->message, buf_len); > > strlcpy(*buffer, (const char *) slot->message, slot->len); > > Isn't it better to copy only upto slot->len, seems like it will always > be <= buf_len, and if not then > we can do min(buf_len, slot->len) strlcpy(3) is defined as taking the size of the passed buffer and not the copied string. Since we guarantee that slot->message is NUL terminated it seems wise to stick to the API. Or did I misunderstand your comment? cheers ./daniel
+1 On Tue, Jun 20, 2017 at 8:54 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Unless you have a lot of users running psql manually, I don't see how > this is actually very useful or actionable. What would the user do with > the information? Hopefully your users already trust that you'd keep the > downtime to the minimum possible. I think this feature would be useful for PgTerminator (https://github.com/trustly/pgterminator) a tool which automatically kills unprotected processes that could potentially be the reason why >X number of protected important processes have been waiting for >Y seconds. When I'm guilty of locking this in the production DB and get killed by PgTerminator, it would be nice to know the reason, e.g. that it was PgTerminator that killed me and what process I was blocking.
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On 06/26/2017 07:15 AM, Joel Jacobson wrote: > +1 > > On Tue, Jun 20, 2017 at 8:54 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Unless you have a lot of users running psql manually, I don't see how >> this is actually very useful or actionable. What would the user do with >> the information? Hopefully your users already trust that you'd keep the >> downtime to the minimum possible. > > I think this feature would be useful for PgTerminator > (https://github.com/trustly/pgterminator) > a tool which automatically kills unprotected processes that could > potentially be the reason why >> X number of protected important processes have been waiting for >Y seconds. > > When I'm guilty of locking this in the production DB and get killed by > PgTerminator, > it would be nice to know the reason, e.g. that it was PgTerminator > that killed me > and what process I was blocking. And not just the pid but literally "what". jD -- Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc PostgreSQL Centered full stack support, consulting and development. Advocate: @amplifypostgres || Learn: https://pgconf.us ***** Unless otherwise stated, opinions are my own. *****
On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > Good point. I’ve attached a new version which issues a NOTICE on truncation > and also addresses all other comments so far in this thread. Thanks a lot for > the early patch reviews! FYI this no longer builds because commit 81c5e46c490e just stole your OIDs: make[3]: Entering directory `/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog' cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids 772 972 make[3]: *** [postgres.bki] Error 1 -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 02 Sep 2017, at 02:21, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > > On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >> Good point. I’ve attached a new version which issues a NOTICE on truncation >> and also addresses all other comments so far in this thread. Thanks a lot for >> the early patch reviews! > > FYI this no longer builds because commit 81c5e46c490e just stole your OIDs: > > make[3]: Entering directory > `/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog' > cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids > 772 > 972 > make[3]: *** [postgres.bki] Error 1 Thanks, I hadn’t spotted that yet. Attached is an updated patch using new OIDs to make it compile again. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Sun, 3 Sep 2017 22:47:10 +0200 Daniel Gustafsson <daniel@yesql.se> wrote: I have reviewed your latest patch. I can apply this to the master branch and build this successfully, and the behavior is as expected. However, here are some comments and suggestions. > 135 + char *buffer = palloc0(MAX_CANCEL_MSG); > 136 + > 137 + GetCancelMessage(&buffer, MAX_CANCEL_MSG); > 138 + ereport(ERROR, > 139 + (errcode(ERRCODE_QUERY_CANCELED), > 140 + errmsg("canceling statement due to user request: \"%s\"", > 141 + buffer))); The memory for buffer is allocated here, but I think we can do this in GetCancelMessage. Since the size of allocated memory is fixed to MAX_CANCEL_MSG, it isn't neccesary to pass this to the function. In addition, how about returning the message as the return value? That is, we can define GetCancelMessage as following; char * GetCancelMessage(void) > 180 + r = SetBackendCancelMessage(pid, msg); > 181 + > 182 + if (r != -1 && r != strlen(msg)) > 183 + ereport(NOTICE, > 184 + (errmsg("message is too long and has been truncated"))); > 185 + } We can this error handling in SetBackendCancelMessage. I think this is better because the truncation of message is done in this function. In addition, we should use pg_mbcliplen for this truncation as done in truncate_identifier. Else, multibyte character boundary is broken, and noises can slip in log messages. > 235 - int r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM); > 236 + int r = pg_signal_backend(pid, SIGTERM, msg); This line includes an unnecessary indentation change. > 411 + * returns the length of message actually created. If the returned length "the length of message" might be "the length of the message" > 413 + * If the backend wasn't found and no message was set, -1 is returned. If two This comment is incorrect since this function returns 0 when no message was set. > 440 + strlcpy(slot->message, message, sizeof(slot->message)); > 441 + slot->len = strlen(slot->message); > 442 + message_len = slot->len; > 443 + SpinLockRelease(&slot->mutex); > 444 + > 445 + return message_len; This can return slot->len directly and the variable message_len is unnecessary. However, if we handle the "too long message" error in this function as suggested above, this does not have to return anything. Regards, -- Yugo Nagata <nagata@sraoss.co.jp> > > On 02 Sep 2017, at 02:21, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > > > > On Fri, Jun 23, 2017 at 1:48 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > >> Good point. I’ve attached a new version which issues a NOTICE on truncation > >> and also addresses all other comments so far in this thread. Thanks a lot for > >> the early patch reviews!" > > > > FYI this no longer builds because commit 81c5e46c490e just stole your OIDs: > > > > make[3]: Entering directory > > `/home/travis/build/postgresql-cfbot/postgresql/src/backend/catalog' > > cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids > > 772 > > 972 > > make[3]: *** [postgres.bki] Error 1 > > Thanks, I hadn’t spotted that yet. Attached is an updated patch using new OIDs > to make it compile again. > > cheers ./daniel > -- Yugo Nagata <nagata@sraoss.co.jp> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 28 Sep 2017, at 14:55, Yugo Nagata <nagata@sraoss.co.jp> wrote: > > On Sun, 3 Sep 2017 22:47:10 +0200 > Daniel Gustafsson <daniel@yesql.se> wrote: > > I have reviewed your latest patch. > > I can apply this to the master branch and build this successfully, > and the behavior is as expected. > > However, here are some comments and suggestions. Thanks for the review! As I haven’t had time to address the comments for a new versin of this patch, I’m closing this as Returned with Feedback and will re-submit for the next commitfest. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 28 Sep 2017, at 14:55, Yugo Nagata <nagata@sraoss.co.jp> wrote: > > On Sun, 3 Sep 2017 22:47:10 +0200 > Daniel Gustafsson <daniel@yesql.se> wrote: > > I have reviewed your latest patch. Thanks! > I can apply this to the master branch and build this successfully, > and the behavior is as expected. > > However, here are some comments and suggestions. > >> 135 + char *buffer = palloc0(MAX_CANCEL_MSG); >> 136 + >> 137 + GetCancelMessage(&buffer, MAX_CANCEL_MSG); >> 138 + ereport(ERROR, >> 139 + (errcode(ERRCODE_QUERY_CANCELED), >> 140 + errmsg("canceling statement due to user request: \"%s\"", >> 141 + buffer))); > > The memory for buffer is allocated here, but I think we can do this > in GetCancelMessage. Since the size of allocated memory is fixed > to MAX_CANCEL_MSG, it isn't neccesary to pass this to the function. > In addition, how about returning the message as the return value? > That is, we can define GetCancelMessage as following; > > char * GetCancelMessage(void) I agree that it would be a much neater API, but it would mean pallocing inside the spinlock wouldn’t it? That would be a no-no. >> 180 + r = SetBackendCancelMessage(pid, msg); >> 181 + >> 182 + if (r != -1 && r != strlen(msg)) >> 183 + ereport(NOTICE, >> 184 + (errmsg("message is too long and has been truncated"))); >> 185 + } > > We can this error handling in SetBackendCancelMessage. I think this is better > because the truncation of message is done in this function. In addition, > we should use pg_mbcliplen for this truncation as done in truncate_identifier. > Else, multibyte character boundary is broken, and noises can slip in log > messages. Agreed on both points. I did however leave the returnvalue as before even though it’s not read in this coding. >> 235 - int r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM); >> 236 + int r = pg_signal_backend(pid, SIGTERM, msg); > > This line includes an unnecessary indentation change. That’s embarrasing, fixed. >> 411 + * returns the length of message actually created. If the returned length > > "the length of message" might be "the length of the message” Fixed. >> 413 + * If the backend wasn't found and no message was set, -1 is returned. If two > > This comment is incorrect since this function returns 0 when no message was set. Fixed. >> 440 + strlcpy(slot->message, message, sizeof(slot->message)); >> 441 + slot->len = strlen(slot->message); >> 442 + message_len = slot->len; >> 443 + SpinLockRelease(&slot->mutex); >> 444 + >> 445 + return message_len; > > This can return slot->len directly and the variable message_len is > unnecessary. However, if we handle the "too long message" error > in this function as suggested above, this does not have to > return anything. That would mean returning a variable which was set while holding the lock, when the lock has been released. That doesn’t seem like a good idea, even though it may be of more theoretical importance here. Attached patch is rebased over HEAD and addresses the above review comments. Adding to the next commitfest as it was returned in a previous one. cheers ./daniel
Attachment
A quick suggestion from a passer-by -- would you put the new code in src/backend/storage/ipc/backend_signal.c instead? Sounds like a better place than utils/misc/; and "signal" is more general in nature than just "cancel". A bunch of stuff from utils/adt/misc.c (???) can migrate to the new file -- everything from line 201 to 362 at least, that is: pg_signal_backend pg_cancel_backend pg_terminate_backend pg_reload_conf pg_rotate_logfile Maybe have two patches, 0001 creates the files moving the contents over, then 0002 adds your new stuff on top. /me wonders if there's anything that would suggest to make this extensive to processes other than backends ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Wed, Jan 24, 2018 at 12:45:48PM -0300, Alvaro Herrera wrote: > /me wonders if there's anything that would suggest to make this > extensive to processes other than backends ... That's a good thought. Now ProcessInterrupts() is not used by non-backend processes. For example the WAL receiver has its own logic to handle interrupts but I guess that we could have an interface which can be plugged in for any processes, which is by default enabled for bgworkers. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > A quick suggestion from a passer-by -- would you put the new code in > src/backend/storage/ipc/backend_signal.c instead? Sounds like a better > place than utils/misc/; and "signal" is more general in nature than just > "cancel". A bunch of stuff from utils/adt/misc.c (???) can migrate to > the new file -- everything from line 201 to 362 at least, that is: > > pg_signal_backend > pg_cancel_backend > pg_terminate_backend > pg_reload_conf > pg_rotate_logfile +1, this makes a lot of sense. When writing this I didn’t find a good fit anywhere so I modelled it after utils/misc/backend_random.c, not because it was a good fit but it was the least bad one I could come up with. This seems a lot cleaner. > Maybe have two patches, 0001 creates the files moving the contents over, > then 0002 adds your new stuff on top. The two attached patches implements this. > /me wonders if there's anything that would suggest to make this > extensive to processes other than backends ... Perhaps, do you have an API in mind? cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 26 Jan 2018, at 00:05, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> Maybe have two patches, 0001 creates the files moving the contents over, >> then 0002 adds your new stuff on top. > > The two attached patches implements this. Attached are rebased patches to cope with the recent pgproc changes. I also made the function cope with NULL messages, not because it’s a sensible value but I can see this function being fed from a sub-SELECT which could return NULL. As per the previous mail, 0001 refactors the signal code according to Alvaros suggestion and 0002 implements $subject on top of the refactoring. cheers ./daniel
Attachment
> On 26 Jan 2018, at 00:05, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 24 Jan 2018, at 16:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> Maybe have two patches, 0001 creates the files moving the contents over,
>> then 0002 adds your new stuff on top.
>
> The two attached patches implements this.
Attached are rebased patches to cope with the recent pgproc changes. I also
made the function cope with NULL messages, not because it’s a sensible value
but I can see this function being fed from a sub-SELECT which could return
NULL.
As per the previous mail, 0001 refactors the signal code according to Alvaros
suggestion and 0002 implements $subject on top of the refactoring.
cheers ./daniel
Re: Eren Başak 2018-03-20 <CAFNTstPcstV8Brqkg00a84V72b_FfnLinhu2C2Top+QssmwFhg@mail.gmail.com> > Another thing is that, in a similar manner, we could allow changing the > error code which might be useful for extensions. For example, Citus could > use it to cancel remote backends when it detects a distributed deadlock and > changes the error code to something retryable while doing so. Another useful thing to do on top of this patch would be to include messages when the termination comes from postgres itself, e.g. on a server shutdown. Possibly, the message for pg_terminate_backend() itself could say that someone invoke that, unless overridden. FATAL: 57P01: terminating connection due to administrator command: server shutting down FATAL: 57P01: terminating connection due to administrator command: restarting because of a crash of another server process FATAL: 57P01: terminating connection due to administrator command: terminated by pg_terminate_backend() Christoph
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 20 Mar 2018, at 12:12, Eren Başak <eren@citusdata.com> wrote: > I reviewed the patch. Here are my notes: Thanks! > The patch does not add new tests (maybe isolation tests) for the new feature. Are there any opportunities to have testsfor confirming the new behavior? Good point, not sure why I’ve forgotten to add this. No existing suite seemed to match well, so I added a new one for system administration functions like this one. > Not introduced with this patch, but the spacing between the functions is not consistent in src/backend/storage/ipc/backend_signal.c.There are 2 newlines after some functions (like pg_cancel_backend_msg) and 1 newlineafter others (like pg_terminate_backend_msg). It would be nice to fix these while refactoring. Fixed. > Another thing is that, in a similar manner, we could allow changing the error code which might be useful for extensions.For example, Citus could use it to cancel remote backends when it detects a distributed deadlock and changes theerror code to something retryable while doing so. For reference, see the hack that Citus is currently using: > https://github.com/citusdata/citus/blob/81cbb7c223f2eb9cfa8903f1d28869b6f972ded1/src/backend/distributed/shared_library_init.c#L237 In 20170620200134.ubnv4sked5seolyk@alap3.anarazel.de, Andres suggested the same thing. I don’t disagree, but I’m also not sure what the API would look like so I’d prefer to address that in a follow-up patch should this one get accepted. > + If the optional <literal>message</literal> parameter is set, the text > + will be appended to the error message returned to the signalled backend. > > I think providing more detail would be useful. For example we can provide an example about how the error message lookslike in its final form or what will happen if the message is too long. Expanded the documentation a little and added a (contrived) example. > -pg_signal_backend(int pid, int sig) > +pg_signal_backend(int pid, int sig, char *msg) > > The new parameter (msg) is not mentioned in the function header comment. This applies to pg_signal_backend, pg_cancel_backend_internal,pg_terminate_backend_internal functions. Fixed. > +Datum > +pg_cancel_backend(PG_FUNCTION_ARGS) > +{ > + PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL)); > +} > + > +Datum > +pg_cancel_backend_msg(PG_FUNCTION_ARGS) > +{ > + pid_t pid; > + char *msg = NULL; > + > + if (PG_ARGISNULL(0)) > + ereport(ERROR, > + (errmsg("pid cannot be NULL"))); > + > + pid = PG_GETARG_INT32(0); > + > + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) > + msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); > + > + PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg)); > +} > > The first function seem redundant here because the second one covers all the cases. pg_cancel_backend() is defined proisstrict, while pg_cancel_backend_msg() is not. I think we must be able to perform the cancellation if the message is NULL, so made it two functions. Thinking more about this, I think pg_cancel_backend_msg() should handle a NULL pid in the same way as pg_cancel_backend() so fixed that as well. > + memset(slot->message, '\0', sizeof(slot->message)); > > SetBackendCancelMessage uses memset while BackendCancelShmemInit uses MemSet. Not a big deal but it would be nice to beconsistent and use postgres macro versions for such calls. Good point, moved to MemSet() for both. > +int > +SetBackendCancelMessage(pid_t backend, char *message) > +{ > + BackendCancelShmemStruct *slot; > > The variable "slot" is declared outside of the foor loop but only used in the inside, therefore it would be nicer to declareit inside the loop. Moved to inside the loop. > + BackendCancelInit(MyBackendId); > > Is the "normal multiuser case" the best place to initialize cancellation? For example, can't we cancel background workers?If this is the right place, maybe we should justify why this is the best place to initialize backend cancellationmemory part with a comment. I didn’t envision this being in another setting than the multiuser case, but it’s been clear throughout the thread that others have had good ideas around extended uses of this. Background workers can still be terminated/canceled, this only affects setting the message, and that is to me a multiuser feature. Renamed the functions BackendCancelMessage*() for clarity. > + char *buffer = palloc0(MAX_CANCEL_MSG); > > Why not use a static array like char[MAX_CANCEL_MSG], instead of pallocing? No specific reason, changed. > +/* > + * Return the configured cancellation message and its length. If the returned > + * length is greater than the size of the passed buffer, truncation has been > + * performed. The message is cleared on reading. > + */ > +int > +GetCancelMessage(char **buffer, size_t buf_len) ... > We never change what the buffer points to, therefore is it necessary to declare `buffer` as char** instead of char*? Fixed > +extern int SetBackendCancelMessage(pid_t backend, char *message); > > Maybe rename backend to something like backend_pid. Makes sense, changed. > I think a function named GetStuff should not clear after getting the stuff. Therefore either the function should be namedsomething like ConsumeCancelMessage or we should separate the "get" and "clear" concerns to different functions. Good point, changed to ConsumeCancelMessage() > I don't think providing a single header comment for the functions below this (CancelBackendMsgShmemSize, BackendCancelShmemInit,BackendCancelInit, CleanupCancelBackend) is enough. More detailed comments about what each functiondoes would be useful. Comments expanded. > It took me some time to get used to the function names. The confusion was about whether the function name tells "cancelthe initialization process" or "initialize backend cancellation memory”. Naming is hard. I’m not wed to any name used, and am open to any suggestions that can improve code clarity. Attached patches are rebased over HEAD with all of the above addressed. cheers ./daniel
Attachment
> thing. I don’t disagree, but I’m also not sure what the API would look like so
> I’d prefer to address that in a follow-up patch should this one get accepted.
> not. I think we must be able to perform the cancellation if the message is
> NULL, so made it two functions.
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 27 Mar 2018, at 13:50, Eren Başak <eren@citusdata.com> wrote: > > pg_cancel_backend() is defined proisstrict, while pg_cancel_backend_msg() is > > not. I think we must be able to perform the cancellation if the message is > > NULL, so made it two functions. > > I agree that we should preserve the old usage as well. What I don't understand is that, can't we remove proisstrict frompg_cancel_backend and copy the body of pg_cancel_backend_msg into pg_cancel_backend. This way, I think we can accomplishthe same thing without introducing two new functions. If we do that, wouldn’t SELECT pg_cancel_backend(NULL) fail unless NULL is explicitly casted to integer? Also, I think that would cause make check to fail the opr_sanity suite on the “multiple uses of the same internal function” test. Maybe I’m misunderstanding your point here? > pg_terminate_backend_msg errors out if the pid is null but pg_cancel_backend_msg returns null and I think we should haveconsistency between these two in this regard. Absolutely, that was an oversight in the v8 patch, they should both handle NULL in the same way. Whitespace and null handling fixed, as well as rebased over HEAD. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
The documentation change suggests the error will be ERROR: canceling statement due to user request: "Cancellation message text" Why the quotes? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 05 Apr 2018, at 23:16, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > The documentation change suggests the error will be > > ERROR: canceling statement due to user request: "Cancellation message text" > > Why the quotes? It seemed like a good idea at the time to indicate which part was submitted by the user, but looking at it now the colon sign is a pretty clear indicator already. cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Thu, Apr 05, 2018 at 11:33:57PM +0200, Daniel Gustafsson wrote: > It seemed like a good idea at the time to indicate which part was submitted by > the user, but looking at it now the colon sign is a pretty clear indicator > already. Dropping the quotes is more consistent with other error messages. We don't bother about quoting things for example with system-related errors generated by strerror(). -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 06 Apr 2018, at 04:49, Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Apr 05, 2018 at 11:33:57PM +0200, Daniel Gustafsson wrote: >> It seemed like a good idea at the time to indicate which part was submitted by >> the user, but looking at it now the colon sign is a pretty clear indicator >> already. > > Dropping the quotes is more consistent with other error messages. We > don't bother about quoting things for example with system-related errors > generated by strerror(). Yep, I completely agree. Attached are patches with the quotes removed and rebased since Oids were taken etc. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote: > Yep, I completely agree. Attached are patches with the quotes removed and > rebased since Oids were taken etc. I still find this idea interesting for plugin authors. However, as feature freeze for v11 is in effect, I am marking the patch as returned with feedback. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancelling backend
> On 09 Apr 2018, at 02:47, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote: >> Yep, I completely agree. Attached are patches with the quotes removed and >> rebased since Oids were taken etc. > > I still find this idea interesting for plugin authors. However, as > feature freeze for v11 is in effect, I am marking the patch as returned > with feedback. Not sure what the feedback is though? The patch has been through thorough review with all review comments addressed. Will resubmit this for a v12 CF. cheers ./daniel
Hi, On 2017-06-20 13:01:35 -0700, Andres Freund wrote: > For extensions it'd also be useful if it'd be possible to overwrite the > error code. E.g. for citus there's a distributed deadlock detector, > running out of process because there's no way to interrupt lock waits > locally, and we've to do some ugly hacking to generate proper error > messages and code from another session. What happened to this request? Seems we're out of the crunch mode and could round the feature out a littlebit more... Greetings, Andres Freund
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 9 Apr 2018, at 23:55, Andres Freund <andres@anarazel.de> wrote: > On 2017-06-20 13:01:35 -0700, Andres Freund wrote: >> For extensions it'd also be useful if it'd be possible to overwrite the >> error code. E.g. for citus there's a distributed deadlock detector, >> running out of process because there's no way to interrupt lock waits >> locally, and we've to do some ugly hacking to generate proper error >> messages and code from another session. > > What happened to this request? Seems we're out of the crunch mode and > could round the feature out a littlebit more… Revisiting old patches, I took a stab at this request. Since I don’t really have a use case for altering the sqlerrcode other than the on that Citus.. cited, I modelled the API around that. The slot now holds a sqlerrcode as well as a message, with functions to just set the message keeping the default sqlerrcode for when that is all one wants to do. There is no function for just altering the sqlerrcode without a message as that seems not useful to me. The combination of sqlerrcode and message was dubbed SignalFeedback for lack of a better term. With this I also addressed something that annoyed me; I had called all the functions Cancel* which technically isn’t true since we also support termination. There are no new user facing changes in patch compared to the previous version. This patchset still has the refactoring that Alvaro brought up upthread. Parking this again the commitfest as it was returned with feedback from the last one it was in (all review comments addressed, see upthread). cheers ./daniel
Attachment
Hi, Onder, I CCed you because it seems worthwhile to ensure that the relevant Citus code could use this instead of the gross hack you and I committed... On 2018-06-13 20:54:03 +0200, Daniel Gustafsson wrote: > > On 9 Apr 2018, at 23:55, Andres Freund <andres@anarazel.de> wrote: > > On 2017-06-20 13:01:35 -0700, Andres Freund wrote: > >> For extensions it'd also be useful if it'd be possible to overwrite the > >> error code. E.g. for citus there's a distributed deadlock detector, > >> running out of process because there's no way to interrupt lock waits > >> locally, and we've to do some ugly hacking to generate proper error > >> messages and code from another session. > > > > What happened to this request? Seems we're out of the crunch mode and > > could round the feature out a littlebit more… > > Revisiting old patches, I took a stab at this request. > > Since I don’t really have a use case for altering the sqlerrcode other than the > on that Citus.. cited, I modelled the API around that. Cool. Onder? > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml > index b851fe023a..ad9763698f 100644 > --- a/doc/src/sgml/func.sgml > +++ b/doc/src/sgml/func.sgml > @@ -18540,7 +18540,7 @@ SELECT set_config('log_statement_stats', 'off', false); > <tbody> > <row> > <entry> > - <literal><function>pg_cancel_backend(<parameter>pid</parameter> <type>int</type>)</function></literal> > + <literal><function>pg_cancel_backend(<parameter>pid</parameter> <type>int</type> [, <parameter>message</parameter><type>text</type>])</function></literal> > </entry> > <entry><type>boolean</type></entry> > <entry>Cancel a backend's current query. This is also allowed if the > @@ -18565,7 +18565,7 @@ SELECT set_config('log_statement_stats', 'off', false); > </row> > <row> > <entry> > - <literal><function>pg_terminate_backend(<parameter>pid</parameter> <type>int</type>)</function></literal> > + <literal><function>pg_terminate_backend(<parameter>pid</parameter> <type>int</type> [, <parameter>message</parameter><type>text</type>])</function></literal> > </entry> > <entry><type>boolean</type></entry> > <entry>Terminate a backend. This is also allowed if the calling role > @@ -18596,6 +18596,14 @@ SELECT set_config('log_statement_stats', 'off', false); > The role of an active backend can be found from the > <structfield>usename</structfield> column of the > <structname>pg_stat_activity</structname> view. > + If the optional <literal>message</literal> parameter is set, the text > + will be appended to the error message returned to the signalled backend. > + <literal>message</literal> is limited to 128 bytes, any longer text > + will be truncated. An example where we cancel our own backend: > +<programlisting> > +postgres=# SELECT pg_cancel_backend(pg_backend_pid(), 'Cancellation message text'); > +ERROR: canceling statement due to user request: Cancellation message text > +</programlisting> > </para> I'm not sure I really like the appending bit. There's a security argument to be made about doing so, but from a user POV that mostly seems restrictive. I wonder if that aspect would be better handled by adding an error context that contains information about which process did the cancellation (or DETAIL?)? > +/* > + * Structure for registering a feedback payload to be sent to a cancelled, or > + * terminated backend. Each backend is registered per pid in the array which is > + * indexed by Backend ID. Reading and writing the message is protected by a > + * per-slot spinlock. > + */ > +typedef struct > +{ > + pid_t pid; This is the pid of the receiving process? If so, why do we need this in here? That's just duplicated data, no? > + slock_t mutex; > + char message[MAX_CANCEL_MSG]; > + int len; > + int sqlerrcode; I'd vote for including the pid of the process that did the cancelling in here. > +Datum > +pg_cancel_backend(PG_FUNCTION_ARGS) > +{ > + PG_RETURN_BOOL(pg_cancel_backend_internal(PG_GETARG_INT32(0), NULL)); > +} > + > +Datum > +pg_cancel_backend_msg(PG_FUNCTION_ARGS) > +{ > + pid_t pid; > + char *msg = NULL; > + > + if (PG_ARGISNULL(0)) > + PG_RETURN_NULL(); > + > + pid = PG_GETARG_INT32(0); > + > + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) > + msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); > + > + PG_RETURN_BOOL(pg_cancel_backend_internal(pid, msg)); > +} Why isn't this just one function? Given that you already have a PG_NARGS check, I don't quite see the point? > /* > * Signal to terminate a backend process. This is allowed if you are a member > * of the role whose process is being terminated. > * > * Note that only superusers can signal superuser-owned processes. > */ > -Datum > -pg_terminate_backend(PG_FUNCTION_ARGS) > +static bool > +pg_terminate_backend_internal(pid_t pid, char *msg) > { > - int r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM); > + int r = pg_signal_backend(pid, SIGTERM, msg); > > if (r == SIGNAL_BACKEND_NOSUPERUSER) > ereport(ERROR, > @@ -146,7 +203,30 @@ pg_terminate_backend(PG_FUNCTION_ARGS) > (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > (errmsg("must be a member of the role whose process is being terminated or member of pg_signal_backend")))); > > - PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS); > + return (r == SIGNAL_BACKEND_SUCCESS); > +} > + > +Datum > +pg_terminate_backend(PG_FUNCTION_ARGS) > +{ > + PG_RETURN_BOOL(pg_terminate_backend_internal(PG_GETARG_INT32(0), NULL)); > +} > + > +Datum > +pg_terminate_backend_msg(PG_FUNCTION_ARGS) > +{ > + pid_t pid; > + char *msg = NULL; > + > + if (PG_ARGISNULL(0)) > + PG_RETURN_NULL(); > + > + pid = PG_GETARG_INT32(0); > + > + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) > + msg = text_to_cstring(PG_GETARG_TEXT_PP(1)); > + > + PG_RETURN_BOOL(pg_terminate_backend_internal(pid, msg)); > } > > /* > @@ -168,7 +248,6 @@ pg_reload_conf(PG_FUNCTION_ARGS) > PG_RETURN_BOOL(true); > } Same. > +/* > + * Set a message for the cancellation of the backend with the specified pid, > + * using the default sqlerrcode. > + */ > +int > +SetBackendCancelMessage(pid_t backend_pid, char *message) > +{ > + return backend_feedback(backend_pid, message, ERRCODE_QUERY_CANCELED); > +} > + > +/* > + * Set a message for the termination of the backend with the specified pid, > + * using the default sqlerrcode. > + */ > +int > +SetBackendTerminationMessage(pid_t backend_pid, char *message) > +{ > + return backend_feedback(backend_pid, message, ERRCODE_ADMIN_SHUTDOWN); > +} > + > +/* > + * Set both a message and a sqlerrcode for use when signalling the backend > + * with the specified pid. > + */ > +int > +SetBackendSignalFeedback(pid_t backend_pid, char *message, int sqlerrcode) > +{ > + return backend_feedback(backend_pid, message, sqlerrcode); > +} I'm not quite seeing the point of these variants. What advantage are they over just doing the same on the caller level? > +/* > + * Sets a cancellation message for the backend with the specified pid, and > + * returns the length of the message actually created. If the returned length > + * is less than the length of the message parameter, truncation has occurred. > + * If the backend isn't found, -1 is returned. If no message is passed, zero is > + * returned. If two backends collide in setting a message, the existing message > + * will be overwritten by the last one in. > + */ > +static int > +backend_feedback(pid_t backend_pid, char *message, int sqlerrcode) > +{ > + int i; > + int len; > + > + if (!message) > + return 1; > + > + len = pg_mbcliplen(message, strlen(message), MAX_CANCEL_MSG - 1); > + > + for (i = 0; i < MaxBackends; i++) > + { > + BackendSignalFeedbackShmemStruct *slot = &BackendSignalFeedbackSlots[i]; > + > + if (slot->pid != 0 && slot->pid == backend_pid) > + { > + SpinLockAcquire(&slot->mutex); > + if (slot->pid != backend_pid) > + { > + SpinLockRelease(&slot->mutex); > + goto error; > + } > + > + MemSet(slot->message, '\0', sizeof(slot->message)); > + memcpy(slot->message, message, len); This seems unnecessarily expensive. > + slot->len = len; > + slot->sqlerrcode = sqlerrcode; > + SpinLockRelease(&slot->mutex); > + > + if (len != strlen(message)) > + ereport(NOTICE, > + (errmsg("message is too long and has been truncated"))); > + return 0; > + } > + } So we made cancellation a O(N) proposition :/. Probably not too bad, but still... > +error: > + > + elog(LOG, "Cancellation message requested for missing backend %d by %d", > + (int) backend_pid, MyProcPid); > + > + return 1; > +} So we now do log spam if processes exit in the wrong moment? > +/* > + * Test whether there is feedback registered for the current backend that can > + * be consumed and presented to the user. > + */ > +bool > +HasBackendSignalFeedback(void) > +{ > + volatile BackendSignalFeedbackShmemStruct *slot = MyCancelSlot; > + bool has_message = false; > + > + if (slot != NULL) > + { > + SpinLockAcquire(&slot->mutex); > + has_message = ((slot->len > 0) && (slot->sqlerrcode != 0)); > + SpinLockRelease(&slot->mutex); > + } > + > + return has_message; > +} I'm somewhat uncomfortable with acquiring a mutex in an error path. We used to prcoess at least some interrupts in signal handlers in some cases - I think I removed all of them, but if not, we'd be in deep trouble here. Wonder if we shouldn't try to get around needing that... Greetings, Andres Freund
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 13 Jun 2018, at 21:16, Andres Freund <andres@anarazel.de> wrote: Thanks for the review! > I'm not sure I really like the appending bit. There's a security > argument to be made about doing so, but from a user POV that mostly > seems restrictive. I wonder if that aspect would be better handled by > adding an error context that contains information about which process > did the cancellation (or DETAIL?)? That doesn’t sound like a bad idea at all, I took a stab at that in the attached patch to see what it would look like. Is that along the lines of what you had in mind? It does however make testing harder as the .out file can’t have the matching pid (commented out the test in the attached hoping that there is a way to test it that I fail to see). >> +/* >> + * Structure for registering a feedback payload to be sent to a cancelled, or >> + * terminated backend. Each backend is registered per pid in the array which is >> + * indexed by Backend ID. Reading and writing the message is protected by a >> + * per-slot spinlock. >> + */ >> +typedef struct >> +{ >> + pid_t pid; > > This is the pid of the receiving process? If so, why do we need this in > here? That's just duplicated data, no? Correct, we need that for finding the right slot in backend_feedback() unless I’m missing something? >> + slock_t mutex; >> + char message[MAX_CANCEL_MSG]; >> + int len; >> + int sqlerrcode; > > I'd vote for including the pid of the process that did the cancelling in > here. Did that for the above discussed log message change. >> [ .. ] > > Why isn't this just one function? Given that you already have a PG_NARGS > check, I don't quite see the point? I was trying to retain STRICT on pg_cancel_backend(int4) for no food reason, and I’m not entirely sure what the reason was anymore. Fixed in the attached version. >> [ .. ] > > I'm not quite seeing the point of these variants. What advantage are > they over just doing the same on the caller level? It seemed a cleaner API to provide these for when the caller just want to provide messaging (which is where I started this a long time ago). I’m not wed to the idea at all though, they are clearly just for convenience. >> + MemSet(slot->message, '\0', sizeof(slot->message)); >> + memcpy(slot->message, message, len); > > This seems unnecessarily expensive. My goal was to safeguard against the risk of fragments from an undelivered message being delivered to another backend, which might be overly cautious. Using strlcpy() instead to guarantee termination is perhaps enough? >> + slot->len = len; >> + slot->sqlerrcode = sqlerrcode; >> + SpinLockRelease(&slot->mutex); >> + >> + if (len != strlen(message)) >> + ereport(NOTICE, >> + (errmsg("message is too long and has been truncated"))); >> + return 0; >> + } >> + } > > So we made cancellation a O(N) proposition :/. Probably not too bad, but > still… When setting a message to be passed to the backend yes, with an upper bound of MaxBackends. >> [ .. ] > > So we now do log spam if processes exit in the wrong moment? Yeah, that was rather pointless. Removed in the attached version. > I'm somewhat uncomfortable with acquiring a mutex in an error path. We > used to prcoess at least some interrupts in signal handlers in some > cases - I think I removed all of them, but if not, we'd be in deep > trouble here. Wonder if we shouldn't try to get around needing that… I’m all ears if you have a better idea, this was the only option I could come up with. cheers ./daniel
Attachment
On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote: > attached Hi Daniel, 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); 6119 select pg_cancel_backend(pg_backend_pid(), NULL); 6120! ERROR: canceling statement due to user request 6121--- 25,32 ---- 6122 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); 6124 select pg_cancel_backend(pg_backend_pid(), NULL); 6125! pg_cancel_backend 6126! ------------------- 6127! t Apparently Windows can take or leave it as it pleases. https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Fri, Jul 06, 2018 at 12:18:02PM +1200, Thomas Munro wrote: > 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); > 6119 select pg_cancel_backend(pg_backend_pid(), NULL); > 6120! ERROR: canceling statement due to user request > 6121--- 25,32 ---- > 6122 > 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); > 6124 select pg_cancel_backend(pg_backend_pid(), NULL); > 6125! pg_cancel_backend > 6126! ------------------- > 6127! t > > Apparently Windows can take or leave it as it pleases. > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 The test coverage looks adapted if it is possible to catch such failures, so that's nice. +select pg_cancel_backend(); +ERROR: function pg_cancel_backend() does not exist +LINE 1: select pg_cancel_backend(); This negative test is not really necessary. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 6 Jul 2018, at 14:12, Pavel Stehule <pavel.stehule@gmail.com> wrote: > I am testing last patch and looks so truncating message and appending "..." doesn't work. Thanks for testing, and sorry about the late response. > The slot->len hold trimmed length, not original length. Fixed in the attached rebased patchset together with a few small touch-ups such as fixed documentation. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 6 Jul 2018, at 03:47, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jul 06, 2018 at 12:18:02PM +1200, Thomas Munro wrote: >> 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >> 6119 select pg_cancel_backend(pg_backend_pid(), NULL); >> 6120! ERROR: canceling statement due to user request >> 6121--- 25,32 ---- >> 6122 >> 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >> 6124 select pg_cancel_backend(pg_backend_pid(), NULL); >> 6125! pg_cancel_backend >> 6126! ------------------- >> 6127! t >> >> Apparently Windows can take or leave it as it pleases. >> >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 > > The test coverage looks adapted if it is possible to catch such > failures, so that's nice. > > +select pg_cancel_backend(); > +ERROR: function pg_cancel_backend() does not exist > +LINE 1: select pg_cancel_backend(); > This negative test is not really necessary. I guess you’re right, it’s a bit superfluous. Removed in the last sent patchset. cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > > On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >> attached > > Hi Daniel, > > 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); > 6119 select pg_cancel_backend(pg_backend_pid(), NULL); > 6120! ERROR: canceling statement due to user request > 6121--- 25,32 ---- > 6122 > 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); > 6124 select pg_cancel_backend(pg_backend_pid(), NULL); > 6125! pg_cancel_backend > 6126! ------------------- > 6127! t > > Apparently Windows can take or leave it as it pleases. Well played =) > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 That reads to me like it’s cancelling another backend than the current one, which clearly isn’t right as we’re passing pg_backend_pid(). I can’t really see what Windows specific bug was introduced by this patch though (or well, the bug exhibits itself on Windows but it may well be generic of course). Will continue to hunt. cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 24 Jul 2018, at 22:57, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >> >> On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> attached >> >> Hi Daniel, >> >> 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >> 6119 select pg_cancel_backend(pg_backend_pid(), NULL); >> 6120! ERROR: canceling statement due to user request >> 6121--- 25,32 ---- >> 6122 >> 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >> 6124 select pg_cancel_backend(pg_backend_pid(), NULL); >> 6125! pg_cancel_backend >> 6126! ------------------- >> 6127! t >> >> Apparently Windows can take or leave it as it pleases. > > Well played =) > >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 > > That reads to me like it’s cancelling another backend than the current one, > which clearly isn’t right as we’re passing pg_backend_pid(). I can’t really > see what Windows specific bug was introduced by this patch though (or well, the > bug exhibits itself on Windows but it may well be generic of course). > > Will continue to hunt. Seems the build of the updated patch built and tested Ok. Still have no idea why the previous one didn’t. https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.6695 cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
Has there been any consideration to encodings? What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses a different encoding than the backend being signaled? - Heikki
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > Has there been any consideration to encodings? Thats a good point, no =/ > What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses adifferent encoding than the backend being signaled? In the current state of the patch, instead of the message you get: FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has no equivalent in encoding “ISO_8859_5" Thats clearly not good enough, but I’m not entirely sure what would be the best way forward. Restrict messages to only be in SQL_ASCII? Store the encoding of the message and check the encoding of the receiving backend before issuing it for a valid conversion, falling back to no message in case there is none? Neither seems terribly appealing, do you have any better suggestions? cheers ./daniel
> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> Has there been any consideration to encodings?
Thats a good point, no =/
> What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses a different encoding than the backend being signaled?
In the current state of the patch, instead of the message you get:
FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
no equivalent in encoding “ISO_8859_5"
Thats clearly not good enough, but I’m not entirely sure what would be the best
way forward. Restrict messages to only be in SQL_ASCII? Store the encoding of
the message and check the encoding of the receiving backend before issuing it
for a valid conversion, falling back to no message in case there is none?
Neither seems terribly appealing, do you have any better suggestions?
cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 12 Aug 2018, at 07:42, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2018-08-12 0:17 GMT+02:00 Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>>: > > On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> wrote: > > What happens if the message contains non-ASCII characters, and the sending backend is connected to database that usesa different encoding than the backend being signaled? > > In the current state of the patch, instead of the message you get: > > FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has > no equivalent in encoding “ISO_8859_5" > > Where this code fails? Isn't somewhere upper where string literals are translated? Then this message is ok. This happens for example when a UTF-8 backend sends a message with japanese characters to a backend using ISO_8859_5. So the code works as expected, but it’s not a very good user experience. cheers ./daniel
> On 12 Aug 2018, at 07:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2018-08-12 0:17 GMT+02:00 Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>>:
> > On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> wrote:
> > What happens if the message contains non-ASCII characters, and the sending backend is connected to database that uses a different encoding than the backend being signaled?
>
> In the current state of the patch, instead of the message you get:
>
> FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" has
> no equivalent in encoding “ISO_8859_5"
>
> Where this code fails? Isn't somewhere upper where string literals are translated? Then this message is ok.
This happens for example when a UTF-8 backend sends a message with japanese
characters to a backend using ISO_8859_5. So the code works as expected, but
it’s not a very good user experience.
cheers ./daniel
On August 12, 2018 12:17:59 AM GMT+02:00, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> >> Has there been any consideration to encodings? > >Thats a good point, no =/ > >> What happens if the message contains non-ASCII characters, and the >sending backend is connected to database that uses a different encoding >than the backend being signaled? > >In the current state of the patch, instead of the message you get: > >FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" >has > no equivalent in encoding “ISO_8859_5" > >Thats clearly not good enough, but I’m not entirely sure what would be >the best >way forward. Restrict messages to only be in SQL_ASCII? Store the >encoding of >the message and check the encoding of the receiving backend before >issuing it >for a valid conversion, falling back to no message in case there is >none? >Neither seems terribly appealing, do you have any better suggestions? Restricting to ASCII seems reasonable. But note that sqlascii isn't that (it's essentially just arbitrary null terminateddata). Easier to relax later. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 12 Aug 2018, at 11:01, Andres Freund <andres@anarazel.de> wrote: > > On August 12, 2018 12:17:59 AM GMT+02:00, Daniel Gustafsson <daniel@yesql.se> wrote: >>> On 6 Aug 2018, at 09:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >>> >>> Has there been any consideration to encodings? >> >> Thats a good point, no =/ >> >>> What happens if the message contains non-ASCII characters, and the >> sending backend is connected to database that uses a different encoding >> than the backend being signaled? >> >> In the current state of the patch, instead of the message you get: >> >> FATAL: character with byte sequence 0xe3 0x82 0xbd in encoding "UTF8" >> has >> no equivalent in encoding “ISO_8859_5" >> >> Thats clearly not good enough, but I’m not entirely sure what would be >> the best >> way forward. Restrict messages to only be in SQL_ASCII? Store the >> encoding of >> the message and check the encoding of the receiving backend before >> issuing it >> for a valid conversion, falling back to no message in case there is >> none? >> Neither seems terribly appealing, do you have any better suggestions? > > Restricting to ASCII seems reasonable. It’s quite restrictive, but it’s the safe option. I’ve hacked this into the updated patch, but kept the backend_feedback() function using pg_mbstrlen() at least for now since it seems the safe option should this be relaxed at some point. Also added a small test by copying text from a ja.po file in the tree. > But note that sqlascii isn't that (it's essentially just arbitrary null terminated data). Easier to relax later. Yeah, my fingers and brain were not in sync during typing, I meant to say ASCII there. I blame a lack of coffee. cheers ./daniel
Attachment
On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 24 Jul 2018, at 22:57, Daniel Gustafsson <daniel@yesql.se> wrote: >> >>> On 6 Jul 2018, at 02:18, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >>> >>> On Fri, Jun 15, 2018 at 9:56 AM, Daniel Gustafsson <daniel@yesql.se> wrote: >>>> attached >>> >>> Hi Daniel, >>> >>> 6118 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >>> 6119 select pg_cancel_backend(pg_backend_pid(), NULL); >>> 6120! ERROR: canceling statement due to user request >>> 6121--- 25,32 ---- >>> 6122 >>> 6123 --select pg_cancel_backend(pg_backend_pid(), 'it brings on many changes'); >>> 6124 select pg_cancel_backend(pg_backend_pid(), NULL); >>> 6125! pg_cancel_backend >>> 6126! ------------------- >>> 6127! t >>> >>> Apparently Windows can take or leave it as it pleases. >> >> Well played =) >> >>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.4488 >> >> That reads to me like it’s cancelling another backend than the current one, >> which clearly isn’t right as we’re passing pg_backend_pid(). I can’t really >> see what Windows specific bug was introduced by this patch though (or well, the >> bug exhibits itself on Windows but it may well be generic of course). >> >> Will continue to hunt. > > Seems the build of the updated patch built and tested Ok. Still have no idea > why the previous one didn’t. That problem apparently didn't go away. cfbot tested it 7 times in the past week, and it passed only once on Windows: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691 The other times all failed like this: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833 -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >> Seems the build of the updated patch built and tested Ok. Still have no idea >> why the previous one didn’t. > That problem apparently didn't go away. cfbot tested it 7 times in > the past week, and it passed only once on Windows: > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691 > The other times all failed like this: > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833 I think this is just a timing problem: the signal gets sent, but it might or might not get received before the current statement ends. It's possible that a signal-sent-to-self can be expected to be received synchronously on all Unix platforms, but I wouldn't entirely bet on that (in particular, the POSIX text for kill(2) does NOT guarantee it); and our Windows signal implementation certainly doesn't guarantee anything of the sort. I don't think there's necessarily anything wrong with the code, but you can't test it with such a simplistic test as this and expect stable results. If you feel an urgent need to have a test case, try building an isolationtester script. regards, tom lane
On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> On Wed, Jul 25, 2018 at 7:27 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >>> Seems the build of the updated patch built and tested Ok. Still have no idea >>> why the previous one didn’t. > >> That problem apparently didn't go away. cfbot tested it 7 times in >> the past week, and it passed only once on Windows: >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9691 >> The other times all failed like this: >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.9833 > > I think this is just a timing problem: the signal gets sent, > but it might or might not get received before the current statement ends. > It's possible that a signal-sent-to-self can be expected to be received > synchronously on all Unix platforms, but I wouldn't entirely bet on that > (in particular, the POSIX text for kill(2) does NOT guarantee it); and > our Windows signal implementation certainly doesn't guarantee anything > of the sort. How about we just wait forever if the function manages to return? select case when pg_cancel_backend(pg_backend_pid(), '...') then pg_sleep('infinity') end; -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think this is just a timing problem: the signal gets sent, >> but it might or might not get received before the current statement ends. > How about we just wait forever if the function manages to return? > select case when pg_cancel_backend(pg_backend_pid(), '...') then > pg_sleep('infinity') end; Hm, that might work. I'd pick a long but not infinite timeout --- maybe 60 sec would be good. regards, tom lane
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 24 Aug 2018, at 03:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> On Fri, Aug 24, 2018 at 6:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think this is just a timing problem: the signal gets sent, >>> but it might or might not get received before the current statement ends. > >> How about we just wait forever if the function manages to return? >> select case when pg_cancel_backend(pg_backend_pid(), '...') then >> pg_sleep('infinity') end; > > Hm, that might work. I'd pick a long but not infinite timeout --- maybe > 60 sec would be good. I like that idea, so I’ve updated the patch to see what the cfbot thinks of it. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 23 Aug 2018, at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think this is just a timing problem: the signal gets sent, > but it might or might not get received before the current statement ends. > It's possible that a signal-sent-to-self can be expected to be received > synchronously on all Unix platforms, but I wouldn't entirely bet on that > (in particular, the POSIX text for kill(2) does NOT guarantee it); and > our Windows signal implementation certainly doesn't guarantee anything > of the sort. That makes a lot of sense, I was thinking about it in too simplistic terms. Thanks for the clarification. > I don't think there's necessarily anything wrong with the code, but > you can't test it with such a simplistic test as this and expect > stable results. If you feel an urgent need to have a test case, > try building an isolationtester script. During hacking on it I preferred to have tests for it. Iff this makes it in, the tests can be omitted per the judgement of the committers of course. Since the printed pid will vary between runs it’s hard to test more interesting scenarios so they are of questionable worth. cheers ./daniel
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed I tested this patch, and all looks well and functional. I reread a discussion and I don't see any unresolved objection againstthis patch. There are not warning, crashes, all tests are passed. New behave is documented well. I'll mark this patch as ready for commiters The new status of this patch is: Ready for Committer
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Mon, Aug 27, 2018 at 12:06:18PM +0200, Daniel Gustafsson wrote: > I like that idea, so I’ve updated the patch to see what the cfbot > thinks of it. + when pg_cancel_backend(pg_backend_pid(), 'ソケットをブロッキングモードに設定できませんでした') You could have chosen something less complicated, like "ホゲ", which is the equivalent of "foo" in English. Anyway, non-ASCII characters should not be included in the final patch. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote: > You could have chosen something less complicated, like "ホゲ", which is > the equivalent of "foo" in English. Anyway, non-ASCII characters should > not be included in the final patch. Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a better name for the new file? There are already multiple examples of this type, like logicalfuncs.c, slotfuncs.c, etc. I have moved this patch set to the next commit fest for now. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote: >> You could have chosen something less complicated, like "ホゲ", which is >> the equivalent of "foo" in English. Anyway, non-ASCII characters should >> not be included in the final patch. Fixed in the attached v16 revision. > Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a > better name for the new file? There are already multiple examples of > this type, like logicalfuncs.c, slotfuncs.c, etc. I have no strong feelings on this, I was merely using the name that Alvaro suggested when he brought up the refactoring as an extension of this patch. signalfuncs.c is fine by me, so I did this rename in the attached revision. > I have moved this patch set to the next commit fest for now. Thanks. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Mon, Oct 01, 2018 at 02:37:42PM +0200, Daniel Gustafsson wrote: >> On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote: >> Looking at the refactoring patch 0001, wouldn't signalfuncs.c make a >> better name for the new file? There are already multiple examples of >> this type, like logicalfuncs.c, slotfuncs.c, etc. > > I have no strong feelings on this, I was merely using the name that Alvaro > suggested when he brought up the refactoring as an extension of this patch. > signalfuncs.c is fine by me, so I did this rename in the attached revision. Indeed, I missed the previous part posted here: https://www.postgresql.org/message-id/20180124154548.gmpyvkzlsijren7u@alvherre.pgsql Alvaro, do you have a strong preference over one or the other? -- Michael
Attachment
On Tue, Oct 2, 2018 at 1:37 AM Daniel Gustafsson <daniel@yesql.se> wrote: > > On 1 Oct 2018, at 01:19, Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Sep 30, 2018 at 10:51:44PM +0900, Michael Paquier wrote: > >> You could have chosen something less complicated, like "ホゲ", which is > >> the equivalent of "foo" in English. Anyway, non-ASCII characters should > >> not be included in the final patch. > > Fixed in the attached v16 revision. Hi Daniel, It looks like you missed another case that needs tolerance for late signal delivery on Windows: +select pg_cancel_backend(pg_backend_pid()); https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263 -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Wed, Oct 03, 2018 at 12:09:54PM +1300, Thomas Munro wrote: > It looks like you missed another case that needs tolerance for late > signal delivery on Windows: > > +select pg_cancel_backend(pg_backend_pid()); > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263 Looking at 0001, why are the declarations needed in patch 0002 part of 0001 (see signalfuncs.h)? I think that something like instead the attached is enough for this part. Daniel, could you confirm? -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 4 Oct 2018, at 09:59, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Oct 03, 2018 at 12:09:54PM +1300, Thomas Munro wrote: >> It looks like you missed another case that needs tolerance for late >> signal delivery on Windows: >> >> +select pg_cancel_backend(pg_backend_pid()); >> >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15263 > > Looking at 0001, why are the declarations needed in patch 0002 part of > 0001 (see signalfuncs.h)? I think that something like instead the > attached is enough for this part. Daniel, could you confirm? Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error from when I renamed the file. They are not present in the v15 patch but got introduced in v16 when I clearly wasn’t caffeinated enough to rebase. cheers ./daniel
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Thu, Oct 04, 2018 at 10:06:06AM +0200, Daniel Gustafsson wrote: > Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error > from when I renamed the file. They are not present in the v15 patch but got > introduced in v16 when I clearly wasn’t caffeinated enough to rebase. No problem, thanks for confirming. I have worked a bit more on the thing, reducing the headers of the new file to a bare minimum, and I have committed 0001. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 4 Oct 2018, at 13:00, Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Oct 04, 2018 at 10:06:06AM +0200, Daniel Gustafsson wrote: >> Yes, you are correct, the signalfuncs.h includes in 0001 are a rebase error >> from when I renamed the file. They are not present in the v15 patch but got >> introduced in v16 when I clearly wasn’t caffeinated enough to rebase. > > No problem, thanks for confirming. I have worked a bit more on the > thing, reducing the headers of the new file to a bare minimum, and I > have committed 0001. Thanks! Attached is a v17 which rebases the former 0002 patch on top of current master, along with the test fix for Windows that Thomas reported upthread (no other changes introduced over earlier versions). cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote: > Thanks! Attached is a v17 which rebases the former 0002 patch on top of > current master, along with the test fix for Windows that Thomas reported > upthread (no other changes introduced over earlier versions). Thanks for the new version. In order to make a test with non-ASCII characters in the error message, we could use a trick similar to xml.sql: use a function which ignores the test case if server_encoding is not UTF-8 and use something like chr() to pass down a messages with non-ASCII characters. Then if the function catches the error we are good. + *pid = slot->src_pid; + slot->orig_length = 0; + slot->message[0] = '\0'; Message consumption should be the part using memset(0), no? system calls are avoided within a spinlock section, but memset and strlcpy should be fast enough. Those are used in WalReceiverMain() for example. The facility to store messages should be in its own file, and signalfuncs.c should depend on it, something like signal_message.c? +typedef struct +{ + pid_t dest_pid; /* The pid of the process being signalled */ + pid_t src_pid; /* The pid of the processing signalling */ + slock_t mutex; /* Per-slot protection */ + char message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */ + int orig_length; /* Length of the message as passed by the user, + * if this length exceeds MAX_CANCEL_MSG it will + * be truncated but we store the original length + * in order to be able to convey truncation */ + int sqlerrcode; /* errcode to use when signalling backend */ +} BackendSignalFeedbackShmemStruct A couple of thoughts here: - More information could make this facility more generic: an elevel to be able to fully manipulate the custom error message, and a breakpoint definition. As far as I can see you have two of them in the patch which are the callers of ConsumeBackendSignalFeedback(). One event cancels the query, and another terminates it. In both cases, the breakpoint implies the elevel. So could we have more possible breakpoints possible and should we try to make this API more pluggable from the start? - Is orig_length really useful in the data stored? Why not just warning/noticing the caller defining the message and just store the message. So, looking at your patch, I am wondering also about splitting it into a couple of pieces for clarity: - The base facility to be able to register and consume messages which can be attached to a backend slot, and then be accessed by other things. Let's think about it as a base facility which is useful for extensions and module developers. If coupled with a hook, it would be actually possible to scan a backend's slot for some information which could be consumed afterwards for custom error messages... - The set of breakpoints which can then be used to define if a given code path should show up a custom error message. I can think of three of them: cancel, termination and extension, which is a custom path that extensions can use. The extension breakpoint would make sense as something included from the start, could be stored as an integer and I guess should not be an enum but a set of defined tables (see pgstat.h for wait event categories for example). - The set of SQL functions making use of it in-core, in your case these are the SQL functions for cancellation and termination. -- Michael
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote: > Thanks! Attached is a v17 which rebases the former 0002 patch on top of > current master, along with the test fix for Windows that Thomas reported > upthread (no other changes introduced over earlier versions). Thanks for the new version. In order to make a test with non-ASCII characters in the error message, we could use a trick similar to xml.sql: use a function which ignores the test case if server_encoding is not UTF-8 and use something like chr() to pass down a messages with non-ASCII characters. Then if the function catches the error we are good. + *pid = slot->src_pid; + slot->orig_length = 0; + slot->message[0] = '\0'; Message consumption should be the part using memset(0), no? system calls are avoided within a spinlock section, but memset and strlcpy should be fast enough. Those are used in WalReceiverMain() for example. The facility to store messages should be in its own file, and signalfuncs.c should depend on it, something like signal_message.c? +typedef struct +{ + pid_t dest_pid; /* The pid of the process being signalled */ + pid_t src_pid; /* The pid of the processing signalling */ + slock_t mutex; /* Per-slot protection */ + char message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */ + int orig_length; /* Length of the message as passed by the user, + * if this length exceeds MAX_CANCEL_MSG it will + * be truncated but we store the original length + * in order to be able to convey truncation */ + int sqlerrcode; /* errcode to use when signalling backend */ +} BackendSignalFeedbackShmemStruct A couple of thoughts here: - More information could make this facility more generic: an elevel to be able to fully manipulate the custom error message, and a breakpoint definition. As far as I can see you have two of them in the patch which are the callers of ConsumeBackendSignalFeedback(). One event cancels the query, and another terminates it. In both cases, the breakpoint implies the elevel. So could we have more possible breakpoints possible and should we try to make this API more pluggable from the start? - Is orig_length really useful in the data stored? Why not just warning/noticing the caller defining the message and just store the message. So, looking at your patch, I am wondering also about splitting it into a couple of pieces for clarity: - The base facility to be able to register and consume messages which can be attached to a backend slot, and then be accessed by other things. Let's think about it as a base facility which is useful for extensions and module developers. If coupled with a hook, it would be actually possible to scan a backend's slot for some information which could be consumed afterwards for custom error messages... - The set of breakpoints which can then be used to define if a given code path should show up a custom error message. I can think of three of them: cancel, termination and extension, which is a custom path that extensions can use. The extension breakpoint would make sense as something included from the start, could be stored as an integer and I guess should not be an enum but a set of defined tables (see pgstat.h for wait event categories for example). - The set of SQL functions making use of it in-core, in your case these are the SQL functions for cancellation and termination. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote: >> Thanks! Attached is a v17 which rebases the former 0002 patch on top of >> current master, along with the test fix for Windows that Thomas reported >> upthread (no other changes introduced over earlier versions). > > Thanks for the new version. Thanks for reviewing! > In order to make a test with non-ASCII characters in the error message, > we could use a trick similar to xml.sql: use a function which ignores > the test case if server_encoding is not UTF-8 and use something like > chr() to pass down a messages with non-ASCII characters. Then if the > function catches the error we are good. Thats one approach, do you think it’s worth adding to ensure clipping during truncation? > + *pid = slot->src_pid; > + slot->orig_length = 0; > + slot->message[0] = '\0'; > Message consumption should be the part using memset(0), no? system > calls are avoided within a spinlock section, but memset and strlcpy > should be fast enough. Those are used in WalReceiverMain() for > example. Good point. IIRC I added it in the setting codepath to avoid memsetting in case there were no more messages set. That’s an incorrect optimization, so fixed. > The facility to store messages should be in its own file, and > signalfuncs.c should depend on it, something like signal_message.c? It was originally in backend_signal.c, and was moved into (what is now) signalfuncs.c in the refactoring that Alvaro suggested. Re-reading his email I’m not sure he actually proposed merging this code with the signal code. Moved the new functionality to signal_message.{ch}. > +typedef struct > +{ > + pid_t dest_pid; /* The pid of the process being signalled */ > + pid_t src_pid; /* The pid of the processing signalling */ > + slock_t mutex; /* Per-slot protection */ > + char message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */ > + int orig_length; /* Length of the message as passed by the user, > + * if this length exceeds MAX_CANCEL_MSG it will > + * be truncated but we store the original length > + * in order to be able to convey truncation > */ > + int sqlerrcode; /* errcode to use when signalling backend */ > +} BackendSignalFeedbackShmemStruct > > A couple of thoughts here: > - More information could make this facility more generic: an elevel to > be able to fully manipulate the custom error message, and a breakpoint > definition. As far as I can see you have two of them in the patch which > are the callers of ConsumeBackendSignalFeedback(). One event cancels > the query, and another terminates it. In both cases, the breakpoint > implies the elevel. So could we have more possible breakpoints possible > and should we try to make this API more pluggable from the start? I’m not sure I follow, can you elaborate on this? > - Is orig_length really useful in the data stored? Why not just > warning/noticing the caller defining the message and just store the > message. The caller is being notified, but the original length is returned such that the consumer can format the message with the truncation in mind. In postgres.c we currently do: ereport(FATAL, (errcode(sqlerrcode), errmsg("%s%s", buffer, (len > sizeof(buffer) ? "..." : "")), errdetail("terminating connection due to administrator command by process %d", pid))); If that’s not deemed of value to keep, then orig_length can be dropped. > So, looking at your patch, I am wondering also about splitting it into > a couple of pieces for clarity: > - The base facility to be able to register and consume messages which > can be attached to a backend slot, and then be accessed by other things. > Let's think about it as a base facility which is useful for extensions > and module developers. If coupled with a hook, it would be actually > possible to scan a backend's slot for some information which could be > consumed afterwards for custom error messages... > - The set of breakpoints which can then be used to define if a given > code path should show up a custom error message. I can think of three > of them: cancel, termination and extension, which is a custom path that > extensions can use. The extension breakpoint would make sense as > something included from the start, could be stored as an integer and I > guess should not be an enum but a set of defined tables (see pgstat.h > for wait event categories for example). > - The set of SQL functions making use of it in-core, in your case these > are the SQL functions for cancellation and termination. This does not sound like a bad idea to me. Is that something you are planning to do or do you want me to cut the patch up into pieces? Just want to avoid stepping on each others toes if you are already thinking/poking at this. Attached is a v18 with the above changes. cheers ./daniel
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Wed, Oct 10, 2018 at 02:20:53PM +0200, Daniel Gustafsson wrote: >> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote: >> In order to make a test with non-ASCII characters in the error message, >> we could use a trick similar to xml.sql: use a function which ignores >> the test case if server_encoding is not UTF-8 and use something like >> chr() to pass down a messages with non-ASCII characters. Then if the >> function catches the error we are good. > > Thats one approach, do you think it’s worth adding to ensure clipping during > truncation? I am not exactly sure what you mean here. > > The facility to store messages should be in its own file, and > > signalfuncs.c should depend on it, something like signal_message.c? > > It was originally in backend_signal.c, and was moved into (what is now) > signalfuncs.c in the refactoring that Alvaro suggested. Re-reading > his email I’m not sure he actually proposed merging this code with the > signal code. Indeed, you could understand the past suggestion both ways. From what I can see, there is merit in keeping the SQL-callable functions working on signals into their own file. The message capacity that you are proposing here should on their own, and... > Moved the new functionality to signal_message.{ch}. That's actually a much better name than anything I could think of. >> - More information could make this facility more generic: an elevel to >> be able to fully manipulate the custom error message, and a breakpoint >> definition. As far as I can see you have two of them in the patch which >> are the callers of ConsumeBackendSignalFeedback(). One event cancels >> the query, and another terminates it. In both cases, the breakpoint >> implies the elevel. So could we have more possible breakpoints possible >> and should we try to make this API more pluggable from the start? > > I’m not sure I follow, can you elaborate on this? Sure. From what I can see in your patch is that you want to generate in some code paths an ereport() call with a given message string. As far as I can see, you can decide three things with what's available: - errcode - errstring - the moment you want the ereport() to be generated. What I am suggesting here is to let callers extend the possibilities a bit more with: - elevel - errdetail This way, you can override code paths with a more custom experience. The thing could break easily Postgres, as you should not really use a non-FATAL elevel when terminating a session, but having a flexible API will allow to not come back to it in the future. We may also want to have ConsumeBackendSignalFeedback() call ereport() by itself with all the details available in the data registered. The main take on your feature is that it is more flexible than the elog hook, as you can enforce custom strings at function call, and don't need to do some weird hacks in plugins in need of manipulating the error. >> - Is orig_length really useful in the data stored? Why not just >> warning/noticing the caller defining the message and just store the >> message. > > The caller is being notified, but the original length is returned such that the > consumer can format the message with the truncation in mind. In postgres.c we > currently do: > > ereport(FATAL, > (errcode(sqlerrcode), > errmsg("%s%s", > buffer, (len > sizeof(buffer) ? "..." : "")), > errdetail("terminating connection due to administrator command by process %d", > pid))); The goal is to generate an elog() at the end, so I can see your point, not sure if that's worth the complication though.. >> So, looking at your patch, I am wondering also about splitting it into >> a couple of pieces for clarity: >> - The base facility to be able to register and consume messages which >> can be attached to a backend slot, and then be accessed by other things. >> Let's think about it as a base facility which is useful for extensions >> and module developers. If coupled with a hook, it would be actually >> possible to scan a backend's slot for some information which could be >> consumed afterwards for custom error messages... >> - The set of breakpoints which can then be used to define if a given >> code path should show up a custom error message. I can think of three >> of them: cancel, termination and extension, which is a custom path that >> extensions can use. The extension breakpoint would make sense as >> something included from the start, could be stored as an integer and I >> guess should not be an enum but a set of defined tables (see pgstat.h >> for wait event categories for example). >> - The set of SQL functions making use of it in-core, in your case these >> are the SQL functions for cancellation and termination. > > This does not sound like a bad idea to me. Is that something you are > planning to do or do you want me to cut the patch up into pieces? > Just want to avoid stepping on each others toes if you are already > thinking/poking at this. Those are my first impressions about the patch after looking at the code so I have not done any code diving. The more pieces we are able to break this stuff into, the more likely it is easy to review and to get committed. From what I can see the main idea can be split into more sub-concepts. -- Michael
Attachment
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
> On 11 Oct 2018, at 03:29, Michael Paquier <michael@paquier.xyz> wrote: Hi!, Thanks for reviewing this patch, and sorry for having been slow lately. > On Wed, Oct 10, 2018 at 02:20:53PM +0200, Daniel Gustafsson wrote: >>> On 9 Oct 2018, at 07:38, Michael Paquier <michael@paquier.xyz> wrote: >>> In order to make a test with non-ASCII characters in the error message, >>> we could use a trick similar to xml.sql: use a function which ignores >>> the test case if server_encoding is not UTF-8 and use something like >>> chr() to pass down a messages with non-ASCII characters. Then if the >>> function catches the error we are good. >> >> Thats one approach, do you think it’s worth adding to ensure clipping during >> truncation? > > I am not exactly sure what you mean here. The test was added to the patch as a reviewer found a bug in an early version where it didn’t treat mb characters properly, it was using strlen() and prone to clipping an mb char in half. The test was added to the patch to show the fix to the patch review process, but that doesn’t necessarily mean it’s deemed an interesting enough case to have a test in check/installcheck for. I don’t have strong opinions, other than trying to not add uninteresting tests as they do carry a cost. >>> The facility to store messages should be in its own file, and >>> signalfuncs.c should depend on it, something like signal_message.c? >> >> It was originally in backend_signal.c, and was moved into (what is now) >> signalfuncs.c in the refactoring that Alvaro suggested. Re-reading >> his email I’m not sure he actually proposed merging this code with the >> signal code. > > Indeed, you could understand the past suggestion both ways. From what I > can see, there is merit in keeping the SQL-callable functions working on > signals into their own file. The message capacity that you are > proposing here should on their own, and... > >> Moved the new functionality to signal_message.{ch}. > > That's actually a much better name than anything I could think of. Actually it isn’t, it was your suggestion =) >>> - More information could make this facility more generic: an elevel to >>> be able to fully manipulate the custom error message, and a breakpoint >>> definition. As far as I can see you have two of them in the patch which >>> are the callers of ConsumeBackendSignalFeedback(). One event cancels >>> the query, and another terminates it. In both cases, the breakpoint >>> implies the elevel. So could we have more possible breakpoints possible >>> and should we try to make this API more pluggable from the start? >> >> I’m not sure I follow, can you elaborate on this? > > Sure. From what I can see in your patch is that you want to generate in > some code paths an ereport() call with a given message string. As far > as I can see, you can decide three things with what's available: > - errcode > - errstring > - the moment you want the ereport() to be generated. > > What I am suggesting here is to let callers extend the possibilities a > bit more with: > - elevel > - errdetail > This way, you can override code paths with a more custom experience. > The thing could break easily Postgres, as you should not really use a > non-FATAL elevel when terminating a session, but having a flexible API > will allow to not come back to it in the future. We may also want to > have ConsumeBackendSignalFeedback() call ereport() by itself with all > the details available in the data registered. I’ve added elevel to the API, but I’m not sure if keeping an additional buffer for errdetail is worth it since it probably wouldn’t be used that often? > The main take on your feature is that it is more flexible than the elog > hook, as you can enforce custom strings at function call, and don't need > to do some weird hacks in plugins in need of manipulating the error. > >>> - Is orig_length really useful in the data stored? Why not just >>> warning/noticing the caller defining the message and just store the >>> message. >> >> The caller is being notified, but the original length is returned such that the >> consumer can format the message with the truncation in mind. In postgres.c we >> currently do: >> >> ereport(FATAL, >> (errcode(sqlerrcode), >> errmsg("%s%s", >> buffer, (len > sizeof(buffer) ? "..." : "")), >> errdetail("terminating connection due to administrator command by process %d", >> pid))); > > The goal is to generate an elog() at the end, so I can see your point, > not sure if that's worth the complication though.. Since I wrote the code, I think it’s worth it as a nice-to-have rather than as a pure necessity. If noone else sees the value then we’ll rip it out of the patch. >>> So, looking at your patch, I am wondering also about splitting it into >>> a couple of pieces for clarity: >>> - The base facility to be able to register and consume messages which >>> can be attached to a backend slot, and then be accessed by other things. >>> Let's think about it as a base facility which is useful for extensions >>> and module developers. If coupled with a hook, it would be actually >>> possible to scan a backend's slot for some information which could be >>> consumed afterwards for custom error messages... >>> - The set of breakpoints which can then be used to define if a given >>> code path should show up a custom error message. I can think of three >>> of them: cancel, termination and extension, which is a custom path that >>> extensions can use. The extension breakpoint would make sense as >>> something included from the start, could be stored as an integer and I >>> guess should not be an enum but a set of defined tables (see pgstat.h >>> for wait event categories for example). >>> - The set of SQL functions making use of it in-core, in your case these >>> are the SQL functions for cancellation and termination. Is a hook for extensions needed, as they too can drain the message via the Consume function? Where if so do you reckon it should go? >> This does not sound like a bad idea to me. Is that something you are >> planning to do or do you want me to cut the patch up into pieces? >> Just want to avoid stepping on each others toes if you are already >> thinking/poking at this. > > Those are my first impressions about the patch after looking at the > code so I have not done any code diving. The more pieces we are able to > break this stuff into, the more likely it is easy to review and to get > committed. From what I can see the main idea can be split into more > sub-concepts. I’ve split the patch into two logical parts, the signalling functionality and the userfacing terminate/cancel part. For extra clarity I’ve also included the full v19 patch, in case you prefer that instead when seeing the two. cheers ./daniel
Attachment
Daniel Gustafsson <daniel@yesql.se> writes: > I’ve split the patch into two logical parts, the signalling functionality and > the userfacing terminate/cancel part. For extra clarity I’ve also included the > full v19 patch, in case you prefer that instead when seeing the two. I'm a bit befuddled by this patch, or at least the proposed test cases. Those show no proof that the feature actually works, ie, delivers a message to the target backend. Also, what am I to make of the test cases involving NULL arguments? Personally, rather than messing around with defaulted arguments and detecting nulls at runtime, I'd make two C functions that are both strict. The signal_message stuff seems both overly complicated and overly fragile. You have, for example, largely ignored the coding rule that says to have only straight-line code within a spinlock hold. I am also not very enamored of setting up Yet Another per-backend data structure in shared memory, especially one that can so easily get out of sync with the ProcArray. If we're going to expend dedicated shared memory on this feature, let's just add the necessary fields to the PGPROC structs. That would also provide some opportunity to interlock things in a way that would fix the race conditions, ie, once we've found the relevant PGPROC entry, hold a lock on it till we've inserted the message and sent the signal. I don't especially like orig_length: that information is of no use to the recipient backend, and what the field is actually being used as, ie a boolean "slot is full" indicator, is nowhere to be guessed from the field's documentation. The current patch fails to apply against parallel_schedule, which reminds me that you forgot to include an update to serial_schedule. regards, tom lane
On Mon, Nov 12, 2018 at 8:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: > > I’ve split the patch into two logical parts, the signalling functionality and > > the userfacing terminate/cancel part. For extra clarity I’ve also included the > > full v19 patch, in case you prefer that instead when seeing the two. Just for the records, cfbot complains during the compilation: option.c: In function ‘parseCommandLine’: option.c:265:8: error: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result [-Werror=unused-result] getcwd(default_sockdir, MAXPGPATH); ^
Re: [HACKERS] Optional message to user when terminating/cancellingbackend
On Fri, Nov 30, 2018 at 06:02:15PM +0100, Dmitry Dolgov wrote: > Just for the records, cfbot complains during the compilation: > > option.c: In function ‘parseCommandLine’: > option.c:265:8: error: ignoring return value of ‘getcwd’, declared > with attribute warn_unused_result [-Werror=unused-result] > getcwd(default_sockdir, MAXPGPATH); > ^ The patch has been marked as returned with feedback. -- Michael