Thread: [HACKERS] CONNECTION LIMIT and Parallel Query don't play well together
It has come to my attention that when a user has a CONNECTION LIMIT set, and they make use of parallel query, that their queries can fail due to the connection limit being exceeded. Simple test case: postgres=# CREATE USER user1 LOGIN CONNECTION LIMIT 2; CREATE ROLE postgres=# \c postgres user1 You are now connected TO DATABASE "postgres" AS USER "user1". postgres=> CREATE TABLE t1 AS (SELECT i FROM GENERATE_SERIES(1,6000000) s(i)); SELECT 6000000 postgres=> SET max_parallel_workers_per_gather = 2; SET postgres=> SELECT COUNT(*) FROM t1; ERROR: too many connections FOR ROLE "user1" CONTEXT: parallel worker Now, as I understand it, during the design of parallel query, it was designed in such a way that nodeGather could perform all of the work in the main process in the event that no workers were available, and that the only user visible evidence of this would be the query would be slower than it would otherwise be. After a little bit of looking around I see that CountUserBackends() does not ignore the parallel workers, and counts these as "CONNECTIONS". It's probably debatable to weather these are connections or not, but I do see that max_connections is separate from max_worker_processes, per: /* the extra unit accounts for the autovacuum launcher */ MaxBackends = MaxConnections + autovacuum_max_workers + 1 + max_worker_processes; so the two don't stomp on each other's feet, which makes me think that a parallel worker should not consume a user connection, since it's not eating into max_connections. Also this is convenient fix for this would be to have CountUserBackends() ignore parallel workers completely. The alternatives I've thought of are would be to make some additional checks in RegisterDynamicBackgroundWorker() to make sure we don't get more workers than the user would be allowed, but that would add more code between the lock and increase contention, and we'd also somehow need to find a way to reserve the connections until the parallel workers started, so they were not taken by another concurrent connection in the meantime. This all sounds pretty horrid. Perhaps we can provide greater control of parallel workers per user in a future release to allow admins who are concerned about users hogging all of the parallel workers. Yet that's likely premature, as we don't have a per query nob for that yet. Thoughts? -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jan 11, 2017 at 2:44 AM, David Rowley <david.rowley@2ndquadrant.com> wrote: > It has come to my attention that when a user has a CONNECTION LIMIT > set, and they make use of parallel query, that their queries can fail > due to the connection limit being exceeded. > > Simple test case: > > postgres=# CREATE USER user1 LOGIN CONNECTION LIMIT 2; > CREATE ROLE > postgres=# \c postgres user1 > You are now connected TO DATABASE "postgres" AS USER "user1". > postgres=> CREATE TABLE t1 AS (SELECT i FROM GENERATE_SERIES(1,6000000) s(i)); > SELECT 6000000 > postgres=> SET max_parallel_workers_per_gather = 2; > SET > postgres=> SELECT COUNT(*) FROM t1; > ERROR: too many connections FOR ROLE "user1" > CONTEXT: parallel worker > > Now, as I understand it, during the design of parallel query, it was > designed in such a way that nodeGather could perform all of the work > in the main process in the event that no workers were available, and > that the only user visible evidence of this would be the query would > be slower than it would otherwise be. > This has been reported previously [1] and I have explained the reason why such a behaviour is possible and why this can't be handled in Gather node. > After a little bit of looking around I see that CountUserBackends() > does not ignore the parallel workers, and counts these as > "CONNECTIONS". It's probably debatable to weather these are > connections or not, I think this is not only for parallel workers, rather any background worker that uses database connection (BGWORKER_BACKEND_DATABASE_CONNECTION) will be counted in a similar way. I am not sure if it is worth inventing something to consider such background worker connections different from backend connections. However, I think we should document it either in parallel query or in background worker or in Create User .. Connection section. [1] - https://www.postgresql.org/message-id/20161222111345.25620.8603%40wrigleys.postgresql.org -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila wrote: > On Wed, Jan 11, 2017 at 2:44 AM, David Rowley > <david.rowley@2ndquadrant.com> wrote: >> It has come to my attention that when a user has a CONNECTION LIMIT >> set, and they make use of parallel query, that their queries can fail >> due to the connection limit being exceeded. >> >> Simple test case: >> >> postgres=# CREATE USER user1 LOGIN CONNECTION LIMIT 2; >> [...] >> postgres=> SELECT COUNT(*) FROM t1; >> ERROR: too many connections FOR ROLE "user1" >> CONTEXT: parallel worker >> >> Now, as I understand it, during the design of parallel query, it was >> designed in such a way that nodeGather could perform all of the work >> in the main process in the event that no workers were available, and >> that the only user visible evidence of this would be the query would >> be slower than it would otherwise be. >> >> After a little bit of looking around I see that CountUserBackends() >> does not ignore the parallel workers, and counts these as >> "CONNECTIONS". It's probably debatable to weather these are >> connections or not, > > I think this is not only for parallel workers, rather any background > worker that uses database connection > (BGWORKER_BACKEND_DATABASE_CONNECTION) will be counted in a similar > way. I am not sure if it is worth inventing something to consider > such background worker connections different from backend connections. > However, I think we should document it either in parallel query or in > background worker or in Create User .. Connection section. I think that this should be fixed rather than documented. Users will not take it well if their queries error out in this fashion. Background processes should not be counted as active connections. Their limit should be determined by max_worker_processes, and neither max_connections nor the connection limit per user or database should take them into account. Yours, Laurenz Albe
On Tue, Jan 10, 2017 at 4:14 PM, David Rowley <david.rowley@2ndquadrant.com> wrote: > It has come to my attention that when a user has a CONNECTION LIMIT > set, and they make use of parallel query, that their queries can fail > due to the connection limit being exceeded. That's bad. > Now, as I understand it, during the design of parallel query, it was > designed in such a way that nodeGather could perform all of the work > in the main process in the event that no workers were available, and > that the only user visible evidence of this would be the query would > be slower than it would otherwise be. That was the intent. > After a little bit of looking around I see that CountUserBackends() > does not ignore the parallel workers, and counts these as > "CONNECTIONS". It's probably debatable to weather these are > connections or not, ... Yeah. I think that I looked at the connection limit stuff in the 9.4 time frame and said, well, we shouldn't let people use background workers as a way of evading the connection limit, so let's continue to count them against that limit. Then, later on, I did the work to try to make it transparent when sufficient parallel workers cannot be obtained, but forgot about this case or somehow convinced myself that it didn't matter. One option is certainly to decide categorically that background workers are not connections, and therefore CountUserBackends() should ignore them and InitializeSessionUserId() shouldn't call it when the session being started is a background worker. That means that background workers don't count against the user connection limit, full stop. Another option, probably slightly less easy to implement, is to decide that background workers in general count against the limit but parallel workers do not. The third option is to count both background workers and parallel workers against the limit but somehow recover gracefully when this error trips. But I have no idea how we could implement that third option in a reasonable way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12 January 2017 at 09:36, Robert Haas <robertmhaas@gmail.com> wrote: > One option is certainly to decide categorically that background > workers are not connections, and therefore CountUserBackends() should > ignore them and InitializeSessionUserId() shouldn't call it when the > session being started is a background worker. That means that > background workers don't count against the user connection limit, full > stop. Another option, probably slightly less easy to implement, is to > decide that background workers in general count against the limit but > parallel workers do not. I think the root question here which we need to ask ourselves is, what is "CONNECTION LIMIT" for. I seem to have come around to assuming it's meant to be to protect the server to give everyone a fairer chance of getting a connection to the database. Now, since background workers don't consume anything from max_connections, then I don't really feel that a background worker should count towards "CONNECTION LIMIT". I'd assume any CONNECTION LIMITs that are set for a user would be calculated based on what max_connections is set to. If we want to limit background workers in the same manner, then perhaps we'd want to invent something like "WORKER LIMIT N" in CREATE USER. > The third option is to count both background > workers and parallel workers against the limit but somehow recover > gracefully when this error trips. But I have no idea how we could > implement that third option in a reasonable way. I agree with your view on the third option. I looked at this too and it seems pretty horrible to try and do anything in that direction. It seems that even if we suppressed the ERROR message, and had the worker exit, that we'd still briefly consume a background worker slot which would reduce the chances of some entitle user connection obtaining them, in fact, this is the case as it stands today, even if that moment is brief. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> On 12 January 2017 at 09:36, Robert Haas <robertmhaas@gmail.com> wrote: >> One option is certainly to decide categorically that background >> workers are not connections, and therefore CountUserBackends() should >> ignore them and InitializeSessionUserId() shouldn't call it when the >> session being started is a background worker. That means that >> background workers don't count against the user connection limit, full >> stop. I've attached a patch which intended to assist discussions on this topic. The patch adds some notes to the docs to mention that background workers and prepared xacts are not counted in CONNECTION LIMIT, it then goes on and makes CountUserBackends() ignore bgworkers. It was already ignoring prepared xacts. There's a bit of plumbing work to make the proc array aware of the background worker status. Hopefully this is suitable. I'm not all that close to that particular area of the code. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 12 January 2017 at 15:24, David Rowley <david.rowley@2ndquadrant.com> wrote: > I've attached a patch which intended to assist discussions on this topic. > > The patch adds some notes to the docs to mention that background > workers and prepared xacts are not counted in CONNECTION LIMIT, it > then goes on and makes CountUserBackends() ignore bgworkers. It was > already ignoring prepared xacts. There's a bit of plumbing work to > make the proc array aware of the background worker status. Hopefully > this is suitable. I'm not all that close to that particular area of > the code. Hi Robert, Wondering you've had any time to glance over this? If you think the patch needs more work, or goes about things the wrong way, let me know, and I'll make the changes. Thanks -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jan 26, 2017 at 7:59 AM, David Rowley <david.rowley@2ndquadrant.com> wrote: > On 12 January 2017 at 15:24, David Rowley <david.rowley@2ndquadrant.com> wrote: >> I've attached a patch which intended to assist discussions on this topic. >> >> The patch adds some notes to the docs to mention that background >> workers and prepared xacts are not counted in CONNECTION LIMIT, it >> then goes on and makes CountUserBackends() ignore bgworkers. It was >> already ignoring prepared xacts. There's a bit of plumbing work to >> make the proc array aware of the background worker status. Hopefully >> this is suitable. I'm not all that close to that particular area of >> the code. > > Wondering you've had any time to glance over this? > > If you think the patch needs more work, or goes about things the wrong > way, let me know, and I'll make the changes. Sorry, this had slipped through the cracks -- I'm having a very hard time keeping up with the flow of patches and emails. But it looks good to me, except that it seems like CountDBBackends() needs the same fix (and probably a corresponding documentation update). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 27 January 2017 at 03:53, Robert Haas <robertmhaas@gmail.com> wrote: > Sorry, this had slipped through the cracks -- I'm having a very hard > time keeping up with the flow of patches and emails. But it looks > good to me, except that it seems like CountDBBackends() needs the same > fix (and probably a corresponding documentation update). Thanks for looking at this. Looks like there's a few other usages of CountDBBackends() which require background workers to be counted too, so I ended up creating CountDBConnections() as I didn't really think adding a bool flag to CountDBBackends was so nice. I thought about renaming CountUserBackends() to become CountUserConnections(), but I've not. Although, perhaps its better to break any third party stuff that uses that so that authors can review which behaviour they need rather than have their extension silently break? David -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 01/29/2017 04:07 PM, David Rowley wrote: > On 27 January 2017 at 03:53, Robert Haas <robertmhaas@gmail.com> wrote: >> Sorry, this had slipped through the cracks -- I'm having a very hard >> time keeping up with the flow of patches and emails. But it looks >> good to me, except that it seems like CountDBBackends() needs the same >> fix (and probably a corresponding documentation update). > Thanks for looking at this. > > Looks like there's a few other usages of CountDBBackends() which > require background workers to be counted too, so I ended up creating > CountDBConnections() as I didn't really think adding a bool flag to > CountDBBackends was so nice. > > I thought about renaming CountUserBackends() to become > CountUserConnections(), but I've not. Although, perhaps its better to > break any third party stuff that uses that so that authors can review > which behaviour they need rather than have their extension silently > break? > > I'm inclined to keep this as is - I don't think we should change the names at least in the stable releases. I'm not sure how far back it should be patched. The real effect is going to be felt from 9.6, I think, but arguably for consistency we should change it back to 9.3 or 9.4. Thoughts? Other things being equal I intend to commit this later today. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan wrote: > On 01/29/2017 04:07 PM, David Rowley wrote: >> Looks like there's a few other usages of CountDBBackends() which >> require background workers to be counted too, so I ended up creating >> CountDBConnections() as I didn't really think adding a bool flag to >> CountDBBackends was so nice. >> >> I thought about renaming CountUserBackends() to become >> CountUserConnections(), but I've not. Although, perhaps its better to >> break any third party stuff that uses that so that authors can review >> which behaviour they need rather than have their extension silently >> break? > > I'm inclined to keep this as is - I don't think we should change the > names at least in the stable releases. I'm not sure how far back it > should be patched. The real effect is going to be felt from 9.6, I > think, but arguably for consistency we should change it back to 9.3 or > 9.4. Thoughts? > > Other things being equal I intend to commit this later today. +1 Maybe it is better not to backpatch farther than 9.6 - I think it is good to be conservative about backpatching, and, as you say, the effect won't be noticable much before 9.6. Yours, Laurenz Albe
Re: [HACKERS] CONNECTION LIMIT and Parallel Query don't play welltogether
From
Peter Eisentraut
Date:
On 1/11/17 5:51 PM, David Rowley wrote: > Now, since background workers > don't consume anything from max_connections, then I don't really feel > that a background worker should count towards "CONNECTION LIMIT". I'd > assume any CONNECTION LIMITs that are set for a user would be > calculated based on what max_connections is set to. If we want to > limit background workers in the same manner, then perhaps we'd want to > invent something like "WORKER LIMIT N" in CREATE USER. This explanation makes sense, but it kind of upset my background sessions patch, which would previously have been limited by per-user connection settings. So I would like to have a background worker limit per user, as you allude to. Attached is a patch that implements a GUC setting max_worker_processes_per_user. Besides the uses for background sessions, but it can also be useful for parallel workers, logical replication apply workers, or things like third-party partitioning extensions. Thoughts? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Wed, Feb 15, 2017 at 11:19 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 1/11/17 5:51 PM, David Rowley wrote: >> Now, since background workers >> don't consume anything from max_connections, then I don't really feel >> that a background worker should count towards "CONNECTION LIMIT". I'd >> assume any CONNECTION LIMITs that are set for a user would be >> calculated based on what max_connections is set to. If we want to >> limit background workers in the same manner, then perhaps we'd want to >> invent something like "WORKER LIMIT N" in CREATE USER. > > This explanation makes sense, but it kind of upset my background > sessions patch, which would previously have been limited by per-user > connection settings. > > So I would like to have a background worker limit per user, as you > allude to. Attached is a patch that implements a GUC setting > max_worker_processes_per_user. > > Besides the uses for background sessions, but it can also be useful for > parallel workers, logical replication apply workers, or things like > third-party partitioning extensions. > > Thoughts? This isn't going to deliver consistent results if it's set differently in different sessions, although maybe you could weasel around that by wording the documentation in just the right way. It seems OK otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] CONNECTION LIMIT and Parallel Query don't play welltogether
From
Peter Eisentraut
Date:
On 2/15/17 11:19, Peter Eisentraut wrote: > So I would like to have a background worker limit per user, as you > allude to. Attached is a patch that implements a GUC setting > max_worker_processes_per_user. > > Besides the uses for background sessions, but it can also be useful for > parallel workers, logical replication apply workers, or things like > third-party partitioning extensions. Given that background sessions have been postponed, is there still interest in this separate from that? It would be useful for per-user parallel worker limits, for example. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] CONNECTION LIMIT and Parallel Query don't play welltogether
From
Peter Eisentraut
Date:
On 4/6/17 15:01, Peter Eisentraut wrote: > On 2/15/17 11:19, Peter Eisentraut wrote: >> So I would like to have a background worker limit per user, as you >> allude to. Attached is a patch that implements a GUC setting >> max_worker_processes_per_user. >> >> Besides the uses for background sessions, but it can also be useful for >> parallel workers, logical replication apply workers, or things like >> third-party partitioning extensions. > > Given that background sessions have been postponed, is there still > interest in this separate from that? It would be useful for per-user > parallel worker limits, for example. Here is a slightly updated patch for consideration in the upcoming commit fest. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 24 August 2017 at 11:15, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > Here is a slightly updated patch for consideration in the upcoming > commit fest. Hi Peter, I just had a quick glance over this and wondered about 2 things. 1. Why a GUC and not a new per user option so it can be configured differently for different users? Something like ALTER USER ... WORKER LIMIT <n>; perhaps. I mentioned about this up-thread a bit. 2. + if (count > max_worker_processes_per_user) + { + ereport(LOG, + (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED), + errmsg("too many worker processes for role \"%s\"", + GetUserNameFromId(GetUserId(), false)))); + LWLockRelease(BackgroundWorkerLock); + return false; Unless I've misunderstood something, it seems that this is going to give random errors to users which might only occur when they run queries against larger tables. Part of why it made sense not to count workers towards the CONNECTION LIMIT was the fact that we didn't want to throw these random errors when workers could not be obtained when we take precautions in other places to just silently have fewer workers. There's lots of discussions earlier in this thread about this and I don't think anyone was in favour of queries randomly working sometimes. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CONNECTION LIMIT and Parallel Query don't play well together
From
Michael Paquier
Date:
On Fri, Aug 25, 2017 at 6:25 PM, David Rowley <david.rowley@2ndquadrant.com> wrote: > I just had a quick glance over this and wondered about 2 things. > > 1. Why a GUC and not a new per user option so it can be configured > differently for different users? Something like ALTER USER ... WORKER > LIMIT <n>; perhaps. I mentioned about this up-thread a bit. > > 2. > > + if (count > max_worker_processes_per_user) > + { > + ereport(LOG, > + (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED), > + errmsg("too many worker processes for role \"%s\"", > + GetUserNameFromId(GetUserId(), false)))); > + LWLockRelease(BackgroundWorkerLock); > + return false; > > Unless I've misunderstood something, it seems that this is going to > give random errors to users which might only occur when they run > queries against larger tables. Part of why it made sense not to count > workers towards the CONNECTION LIMIT was the fact that we didn't want > to throw these random errors when workers could not be obtained when > we take precautions in other places to just silently have fewer > workers. There's lots of discussions earlier in this thread about this > and I don't think anyone was in favour of queries randomly working > sometimes. The status of the patch is incorrect I think. This was marked as needs review but I can see some input here which has remained unanswered for three months. I am marking this patch as returned with feedback. -- Michael