Thread: Augmenting the deadlock message with application_name
As we know, the deadlock error message isn't the most friendly one. All the client gets back is process PIDs, transaction IDs, and lock types. You have to check the server log to retrieve lock details. This is tedious.
In one of my apps I even added a deadlock exception handler on the app side to query pg_stat_activity for processes involved in the deadlock and include their application names and queries in the exception message. It is a little racy but works well enough.
Ideally I'd like to see that data coming from Postgres upon detecting the deadlock. That's why I made this small change.
The change makes the deadlock error look as follows - the new element is the application name or "<insufficient privilege>" in its place if the activity user doesn't match the current user (and the current use isn't a superuser):
postgres=*> SELECT * FROM q WHERE id = 2 FOR UPDATE;
ERROR: deadlock detected
DETAIL: Process 194520 (application_name: <insufficient privilege>) waits for ShareLock on transaction 776; blocked by process 194521.
Process 194521 (application_name: woof) waits for ShareLock on transaction 775; blocked by process 194520.
HINT: See server log for query details.
CONTEXT: while locking tuple (0,2) in relation "q"
In one of my apps I even added a deadlock exception handler on the app side to query pg_stat_activity for processes involved in the deadlock and include their application names and queries in the exception message. It is a little racy but works well enough.
Ideally I'd like to see that data coming from Postgres upon detecting the deadlock. That's why I made this small change.
The change makes the deadlock error look as follows - the new element is the application name or "<insufficient privilege>" in its place if the activity user doesn't match the current user (and the current use isn't a superuser):
postgres=*> SELECT * FROM q WHERE id = 2 FOR UPDATE;
ERROR: deadlock detected
DETAIL: Process 194520 (application_name: <insufficient privilege>) waits for ShareLock on transaction 776; blocked by process 194521.
Process 194521 (application_name: woof) waits for ShareLock on transaction 775; blocked by process 194520.
HINT: See server log for query details.
CONTEXT: while locking tuple (0,2) in relation "q"
I added a new LocalPgBackendCurrentActivity struct combining application name and query string pointers and a sameProcess boolean. It is returned by value, since it's small. Performance-wise, this is a a part of the deadlock handler, if the DB hits it frequently, there are much more serious problems going on.
I could extend it by sending the queries back to the client, with an identical security check, but this is a potential information exposure of whatever's in the query plaintext. Another extension is to replace "(application_name: <insufficient privilege>)" with something better like "(unknown application_name)", or even nothing.
Attached patch is for master, 2fb7560c. It doesn't contain any tests.
Let me know if you approve of the patch and if it makes sense to continue working on it.
Best,
Karoline
Attachment
On Thu, May 9, 2024 at 11:44:03PM +0000, Karoline Pauls wrote: > As we know, the deadlock error message isn't the most friendly one. All the > client gets back is process PIDs, transaction IDs, and lock types. You have to > check the server log to retrieve lock details. This is tedious. > > In one of my apps I even added a deadlock exception handler on the app side to > query pg_stat_activity for processes involved in the deadlock and include their > application names and queries in the exception message. It is a little racy but > works well enough. > > Ideally I'd like to see that data coming from Postgres upon detecting the > deadlock. That's why I made this small change. > > The change makes the deadlock error look as follows - the new element is the > application name or "<insufficient privilege>" in its place if the activity > user doesn't match the current user (and the current use isn't a superuser): > > postgres=*> SELECT * FROM q WHERE id = 2 FOR UPDATE; > ERROR: deadlock detected > DETAIL: Process 194520 (application_name: <insufficient privilege>) waits for > ShareLock on transaction 776; blocked by process 194521. > Process 194521 (application_name: woof) waits for ShareLock on transaction 775; > blocked by process 194520. > HINT: See server log for query details. > CONTEXT: while locking tuple (0,2) in relation "q" log_line_prefix supports application name --- why would you not use that? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
On Friday, 10 May 2024 at 20:17, Bruce Momjian <bruce@momjian.us> wrote: > > log_line_prefix supports application name --- why would you not use > that? > log_line_prefix is effective in the server log. This change is mostly about improving the message sent back to the client.While the server log is also changed to reflect the client message, it doesn't need to be. Additionally, with `%a` added to log_line_prefix, the server log would only contain the application name of the client affectedby the deadlock, not the application names of all other clients involved in it. Example server log with application names (here: a and b) added to the log prefix: 2024-05-10 20:39:58.459 BST [197591] (a)ERROR: deadlock detected 2024-05-10 20:39:58.459 BST [197591] (a)DETAIL: Process 197591 (application_name: a) waits for ShareLock on transaction782; blocked by process 197586. Process 197586 (application_name: b) waits for ShareLock on transaction 781; blocked by process 197591. Process 197591, (application_name: a): SELECT * FROM q WHERE id = 2 FOR UPDATE; Process 197586, (application_name: b): SELECT * FROM q WHERE id = 1 FOR UPDATE; 2024-05-10 20:39:58.459 BST [197591] (a)HINT: See server log for query details. 2024-05-10 20:39:58.459 BST [197591] (a)CONTEXT: while locking tuple (0,2) in relation "q" 2024-05-10 20:39:58.459 BST [197591] (a)STATEMENT: SELECT * FROM q WHERE id = 2 FOR UPDATE; All log line prefixes refer to the application a. The message has both a and b. Anyway, the server log is not the important part here. The crucial UX feature is the client getting application names back,so browsing through server logs can be avoided. Best, Karoline
Karoline Pauls <code@karolinepauls.com> writes: > On Friday, 10 May 2024 at 20:17, Bruce Momjian <bruce@momjian.us> wrote: >> log_line_prefix supports application name --- why would you not use >> that? > log_line_prefix is effective in the server log. This change is mostly about improving the message sent back to the client.While the server log is also changed to reflect the client message, it doesn't need to be. It's normally necessary to look at the server log anyway if you want to figure out what caused the deadlock, since the client message intentionally doesn't provide query texts. I think this proposal doesn't move the goalposts noticeably: it seems likely to me that in many installations the sessions would mostly all have the same application_name, or at best not-too-informative names like "psql" versus "PostgreSQL JDBC Driver". (If we thought these names *were* really informative about what other sessions are doing, we'd probably have to hide them from unprivileged users in pg_stat_activity, and then there'd also be a security concern here.) On the whole I'd reject this proposal as causing churn in application-visible behavior for very little gain. regards, tom lane
On Fri, May 10, 2024 at 08:10:58PM +0000, Karoline Pauls wrote: > On Friday, 10 May 2024 at 20:17, Bruce Momjian <bruce@momjian.us> > wrote: > > > > log_line_prefix supports application name --- why would you not use > > that? > > > > log_line_prefix is effective in the server log. This change is mostly > about improving the message sent back to the client. While the server > log is also changed to reflect the client message, it doesn't need to > be. I was hoping client_min_messages would show the application name, but it doesn't but your bigger point is below. > Additionally, with `%a` added to log_line_prefix, the server log > would only contain the application name of the client affected by the > deadlock, not the application names of all other clients involved in > it. Yeah, getting the application names of the pids reported in the log requires looking backward in the logs to find out what the most recent log line was for the pids involved. Frankly, I think it would be more useful to show the session_id in the deadlock so you could then use that to look back to any details you want in the logs, not only the application name. > Anyway, the server log is not the important part here. The crucial > UX feature is the client getting application names back, so browsing > through server logs can be avoided. Well, we don't want to report too much information because it gets confusing. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.