LISTEN denial of service with aborted transaction - Mailing list pgsql-hackers
From | Matt Newell |
---|---|
Subject | LISTEN denial of service with aborted transaction |
Date | |
Msg-id | 1740208.GbWj81YA5k@obsidian Whole thread Raw |
Responses |
Re: LISTEN denial of service with aborted transaction
|
List | pgsql-hackers |
This morning with our production database I began receiving reports of the database being "down". I checked the log and was surprised to see extremely long durations for a LISTEN that happens after each connection is made by our database library. This coincided with many(approx 600) new connections happening in a short time window due to render nodes automatically being turned on when the first job of the morning was submitted(idle nodes are turned off to save power). My initial hunch was that there was some code in postgres that resulted exponential execution time if enough listens on new connections happened at the same time. As I was trying to gather more information the listen times began to decrease and after about 20 minutes things were back to normal. A few hours later the symptoms returned but this time the listen was taking upwards of 15 minutes. I did some more reading and checked the pg_notify directory and found that there were 49 files. Having never checked that directory before I wasn't sure if that was normal. A short time later I noticed there was a process sitting idle with an aborted transaction. After killing the process things quickly returned to normal. After doing a search for "idle in transaction (aborted)" I came upon http://stackoverflow.com/questions/15036438/postgres-connection-leaks-idle-in-transaction-aborted While this is a potential solution for dealing with the problem it seems that the postgresql developers have decided to let connections stay in the "idle in transaction (aborted)" state for a reason, most likely under the assumption that it's relatively safe and only eats up the resources of a single connection. However it's easy to demonstrate that doing: listen "abc"; begin; select bad_column from non_existant_table; ...will eventually cause a denial of service situation if the DBA hasn't setup guards against connection sitting idle in an aborted transaction. Looking at the code in src/backend/commands/async.c I think there are a couple ways to eliminate this problem. 1. When a connection issues it's first LISTEN command, in Exec_ListenPreCommit QUEUE_BACKEND_POS(MyBackendId) = QUEUE_TAIL; this causes the connection to iterate through every notify queued in the slru, even though at that point I believe the connection can safely ignore any notifications from transactions that are already committed, and if I understand correctly notifications aren't put into the slru until precommit, so wouldn't it be safe to do: QUEUE_BACKEND_POS(MyBackendId) = QUEUE_HEAD; inside Async_Listen? If that's not safe, then could a new member be added to AsyncQueueControl that points to the first uncommitted QueuePosition (wouldn't have to be kept completely up to date). This would solve the problem of slow initial LISTEN in the face of a growing pg_notify queue. 2. Would it be possible when a backend is signaled to check if it is idle inside an aborted transaction, and if so process the notifications and put any that match what the backend is listening on into a local list. This would allow the slru to be cleaned up. In practice I think most notifications would either be disregarded or combined with a duplicate, so the list would most likely end up staying very small. If the backend local list does grow too large then the connection could be killed or handled in some other appropriate way. I am happy to attempt coming up with a patch if the ideas are deemed worthwhile. Thanks, Matt Newell
pgsql-hackers by date: