It has come to my attention that when a user has a CONNECTION LIMIT
set and they make use of parallel query, their queries can fail due to
the connection limit being exceeded.
Simple test case:
postgres=# CREATE USER user1 LOGIN CONNECTION LIMIT 2;
CREATE ROLE
postgres=# \c postgres user1
You are now connected to database "postgres" as user "user1".
postgres=> CREATE TABLE t1 AS (SELECT i FROM GENERATE_SERIES(1,6000000) s(i));
SELECT 6000000
postgres=> SET max_parallel_workers_per_gather = 2;
SET
postgres=> SELECT COUNT(*) FROM t1;
ERROR:  too many connections for role "user1"
CONTEXT:  parallel worker
Now, as I understand it, parallel query was designed in such a way
that nodeGather can perform all of the work in the main process in the
event that no workers are available, and the only user-visible
evidence of this is that the query runs more slowly than it otherwise
would. The sketch below shows the shape of that fallback.
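Here's a condensed sketch of the relevant logic, paraphrased from
ExecGather() in src/backend/executor/nodeGather.c rather than quoted
exactly; LaunchParallelWorkers() and need_to_scan_locally are the real
pieces, the surrounding code is abbreviated:

/* Abbreviated from ExecGather() in nodeGather.c; not the exact source. */
if (gather->num_workers > 0 && IsInParallelMode())
{
    /*
     * Ask for the planned number of workers.  This can launch fewer
     * than requested -- including zero -- without raising an error.
     */
    LaunchParallelWorkers(pcxt);
    node->nworkers_launched = pcxt->nworkers_launched;
}

/*
 * With no workers launched there are no tuple queues to read from, so
 * the leader runs the subplan itself.  The query still succeeds; it's
 * just slower than the parallel plan would have been.
 */
if (node->nworkers_launched == 0)
    node->need_to_scan_locally = true;

The trouble in the test case above is that the workers do launch, and
then fail the connection-limit check during their own startup; that
error propagates to the leader rather than triggering the graceful
fallback.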
After a little bit of looking around I see that CountUserBackends()
does not ignore the parallel workers, and counts these as
"connections". It's probably debatable whether these are connections
or not, but I do see that max_connections is separate from
max_worker_processes, per:
/* the extra unit accounts for the autovacuum launcher */
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes;
so the two don't stomp on each other's feet, which makes me think that
a parallel worker should not consume a user connection, since it's not
eating into max_connections. Also, a convenient fix for this would be
to have CountUserBackends() ignore parallel workers completely; a
minimal sketch of that follows.
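To illustrate, here's roughly what I mean, assuming the PGPROC
isBackgroundWorker flag is an acceptable way to detect them (it covers
all background workers, not just parallel ones):

/*
 * Sketch of CountUserBackends() in procarray.c with the proposed
 * change: skip background workers (which includes parallel workers)
 * when counting the backends belonging to a role.
 */
int
CountUserBackends(Oid roleid)
{
    ProcArrayStruct *arrayP = procArray;
    int         count = 0;
    int         index;

    LWLockAcquire(ProcArrayLock, LW_SHARED);

    for (index = 0; index < arrayP->numProcs; index++)
    {
        int         pgprocno = arrayP->pgprocnos[index];
        PGPROC     *proc = &allProcs[pgprocno];

        if (proc->isBackgroundWorker)   /* proposed: don't count these */
            continue;
        if (proc->roleId == roleid)
            count++;
    }

    LWLockRelease(ProcArrayLock);

    return count;
}

Since the "too many connections for role" check in
InitializeSessionUserId() goes through CountUserBackends(), that one
change should be enough to let the workers start.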
The alternative I've thought of would be to make some additional
checks in RegisterDynamicBackgroundWorker() to make sure we don't
start more workers than the user's connection limit allows, but that
would add more code under the lock and increase contention, and we'd
also somehow need to find a way to reserve the connections until the
parallel workers had started, so they were not taken by another
concurrent connection in the meantime. This all sounds pretty horrid;
the sketch below shows roughly what it would take.
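Roughly, the hypothetical shape of it would be the following, where
PendingWorkersForRole(), RoleConnectionLimit() and
ReserveUserConnection() don't exist and are exactly the machinery we'd
have to invent:

/*
 * Hypothetical sketch only -- none of the helpers below exist.  This
 * is the extra work RegisterDynamicBackgroundWorker() would have to
 * do while holding BackgroundWorkerLock.
 */
LWLockAcquire(BackgroundWorkerLock, LW_EXCLUSIVE);

/*
 * Count the role's live backends plus its registered-but-not-yet-
 * started workers, then reserve a connection so that a concurrent
 * login can't take it before the worker actually starts.
 */
if (CountUserBackends(roleid) + PendingWorkersForRole(roleid) >=
    RoleConnectionLimit(roleid))
{
    LWLockRelease(BackgroundWorkerLock);
    return false;       /* caller falls back to fewer workers */
}
ReserveUserConnection(roleid);  /* must be undone if startup fails */

/* ... existing free-slot search continues here ... */

LWLockRelease(BackgroundWorkerLock);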
Perhaps a future release can provide greater per-user control over
parallel workers, for admins who are concerned about users hogging all
of the parallel workers. Yet that's likely premature, as we don't even
have a per-query knob for that yet.
Thoughts?
--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services