Thread: [BUGS] BUG #14473: Parallel query aborts with too many connections

[BUGS] BUG #14473: Parallel query aborts with too many connections

From
steven.winfield@cantabcapital.com
Date:
The following bug has been logged on the website:

Bug reference:      14473
Logged by:          Steven Winfield
Email address:      steven.winfield@cantabcapital.com
PostgreSQL version: 9.6.1
Operating system:   RHEL 7.3
Description:

Let's say a user is using all but two or three of their allowed connections
(set with ALTER USER name CONNECTION LIMIT n).

Now if they attempt to execute a query that would be executed in parallel by
many (say 8) workers, such as a count(*) of a large table, then the user's
connection limit can be reached and the query is aborted:

mydatabase=> select count(*) from large_table;
ERROR:  too many connections for role "myname"
CONTEXT:  parallel worker

...even though the query could have been successfully executed with fewer
workers (as I checked by locally setting max_parallel_workers_per_gather to
0).

This is surprising, because in other circumstances the query _can_ proceed
with fewer workers than planned - e.g. when the max_worker_processes limit
has been hit.
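For reference, here is a minimal sketch of the scenario (role name, limit, and table are illustrative, not the exact setup):

```sql
-- Hypothetical setup: cap the role at 4 connections.
ALTER ROLE myname CONNECTION LIMIT 4;

-- In sessions owned by "myname", use up all but one of the slots, then:
SET max_parallel_workers_per_gather = 8;
SELECT count(*) FROM large_table;
-- May fail with: ERROR:  too many connections for role "myname"
--                CONTEXT:  parallel worker

-- The same query succeeds when forced to run serially:
SET max_parallel_workers_per_gather = 0;
SELECT count(*) FROM large_table;
```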

From the docs:
(https://www.postgresql.org/docs/9.6/static/how-parallel-query-works.html)

"Every background worker process which is successfully started for a given
parallel query will execute the portion of the plan which is a descendent of
the Gather node"

...implying (at least to me) that workers may not start successfully, but
this is handled gracefully.

I couldn't find anything in the docs about the relationship between
max_parallel_workers_per_gather and a user's connection limit, nor any
warning that the limit could be exceeded and queries aborted as a result.

Thanks,
Steve.


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14473: Parallel query aborts with too many connections

From
Amit Kapila
Date:
On Thu, Dec 22, 2016 at 4:43 PM,  <steven.winfield@cantabcapital.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      14473
> Logged by:          Steven Winfield
> Email address:      steven.winfield@cantabcapital.com
> PostgreSQL version: 9.6.1
> Operating system:   RHEL 7.3
> Description:
>
> Let's say a user is using all but two or three of their allowed connections
> (set with ALTER USER name CONNECTION LIMIT n).
>
> Now if they attempt to execute a query that would be executed in parallel by
> many (say 8) workers, such as a count(*) of a large table, then the user's
> connection limit can be reached and the query is aborted:
>
> mydatabase=> select count(*) from large_table;
> ERROR:  too many connections for role "myname"
> CONTEXT:  parallel worker
>

This happens because parallel background workers use a database
connection (BGWORKER_BACKEND_DATABASE_CONNECTION).

> ...even though the query could have been successfully executed with fewer
> workers (as I checked by locally setting max_parallel_workers_per_gather to
> 0).
>
> This is surprising, because in other circumstances the query _can_ proceed
> with fewer workers than planned - e.g. when the max_worker_processes limit
> has been hit.
>

A query can proceed with fewer workers only when we launch fewer
workers to process it.  As this error happens after the workers are
launched, it is treated like any other normal error and aborts the
query execution.  We might avoid this error by checking the user's
connection limit before launching the workers, but I am not sure
that is worth it.

> From the docs:
> (https://www.postgresql.org/docs/9.6/static/how-parallel-query-works.html)
>
> "Every background worker process which is successfully started for a given
> parallel query will execute the portion of the plan which is a descendent of
> the Gather node"
>
> ...implying (at least to me) that workers may not start successfully, but
> this is handled gracefully.
>

No, if there is any error while starting workers, the query will abort.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [BUGS] BUG #14473: Parallel query aborts with too many connections

From
Steven Winfield
Date:
Hi Amit,

Thanks for your reply.

> This happens because parallel background workers use a database
> connection (BGWORKER_BACKEND_DATABASE_CONNECTION).

Yes, that's what I suspected, though I didn't expect to have to delve into the pages about the C API to find it, and I still
can't see anything explicitly mentioning that it counts towards a user's connection limit.

> A query can proceed with fewer workers only when we launch fewer
> workers to process it.  As this error happens after the workers are
> launched, it is treated like any other normal error and aborts the
> query execution.  We might avoid this error by checking the user's
> connection limit before launching the workers, but I am not sure
> that is worth it.
...
> No, if there is any error while starting workers, the query will abort.

Doesn't this severely limit the usability of parallel queries?

Essentially, any perfectly well-formed query can now fail, and clients must be prepared to retry every single
transaction, as this error cannot be followed by anything but a rollback; otherwise:

"ERROR: current transaction is aborted, commands ignored until end of transaction block"

So the _user_ is penalised because of the way the _server_ chose to execute the query, which seems wrong to me.
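(As a partial client-side mitigation - my own sketch, not anything from the parallel query docs - psql can wrap each statement in an implicit savepoint so a failure does not poison the whole transaction:

```sql
-- psql only: issue an implicit SAVEPOINT before each statement and
-- roll back to it on error instead of aborting the transaction.
\set ON_ERROR_ROLLBACK on

BEGIN;
SELECT count(*) FROM large_table;  -- if this fails with "too many
                                   -- connections", the transaction
                                   -- remains usable...
SELECT count(*) FROM large_table;  -- ...and the query can be retried
COMMIT;
```

Other clients would need to issue explicit SAVEPOINTs to get the same effect, and the failed query still has to be retried.)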

It isn't enough for clients to set their own max_parallel_workers_per_gather value (to anything except zero) after
first discovering how many allowed connections they have remaining, since there is a race condition - between setting
it and executing the potentially parallel query, another connection or background worker could be needed by the same
user. Or the cluster-wide connection limit could be hit.
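To illustrate, the best a client could do is inspect its remaining headroom before choosing to allow parallelism, something like the query below (the catalogs are standard, but the check is inherently racy):

```sql
-- Compare the current role's connection limit against its
-- current number of backends.
SELECT r.rolconnlimit,
       count(a.pid)                  AS current_connections,
       r.rolconnlimit - count(a.pid) AS headroom
FROM pg_roles r
LEFT JOIN pg_stat_activity a ON a.usename = r.rolname
WHERE r.rolname = current_user
GROUP BY r.rolconnlimit;
-- Another session (or parallel worker) can consume the headroom
-- between this check and the next query, so it only narrows the
-- window rather than closing it.
```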

Am I missing something? Is there a recommended way of managing max_connections, max_worker_processes, max_parallel_workers_per_gather, and individual workers' connection limits?
(We also use pgBouncer to multiplex connections, which complicates things further.)

OTOH, if parallel queries could be made tolerant of workers failing to start then there wouldn't be as much of an issue (though doubtless this is easier said than done!)

Regards,
Steve.




This email is confidential. If you are not the intended recipient, please advise us immediately and delete this message. The registered name of Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See - http://www.gam.com/en/Legal/Email+disclosures+EU for further information on confidentiality, the risks of non-secure electronic communication, and certain disclosures which we are required to make in accordance with applicable legislation and regulations. If you cannot access this link, please notify us by reply message and we will send the contents to you.

Re: [BUGS] BUG #14473: Parallel query aborts with too many connections

From
Michael Paquier
Date:
On Fri, Dec 23, 2016 at 7:17 PM, Steven Winfield
<Steven.Winfield@cantabcapital.com> wrote:
> ________________________________
> This email is confidential. If you are not the intended recipient, please
> advise us immediately and delete this message. The registered name of
> Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See -
> http://www.gam.com/en/Legal/Email+disclosures+EU for further information on
> confidentiality, the risks of non-secure electronic communication, and
> certain disclosures which we are required to make in accordance with
> applicable legislation and regulations. If you cannot access this link,
> please notify us by reply message and we will send the contents to you.
> ________________________________

This is a public mailing list.
-- 
Michael



Re: [BUGS] BUG #14473: Parallel query aborts with too many connections

From
Amit Kapila
Date:
On Fri, Dec 23, 2016 at 3:47 PM, Steven Winfield
<Steven.Winfield@cantabcapital.com> wrote:
>
> OTOH, if parallel queries could be made tolerant of workers failing to start
> then there wouldn't be as much of an issue (though doubtless this is easier
> said than done!)
>

I think it is difficult to decide which errors parallel query should be
made tolerant of, and I don't think it is advisable to make parallel
query swallow some errors.  I think the DBA should set connection
limits carefully here.  It might be okay to add a note to the parallel
query docs indicating that parallel workers use database connections.
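One pragmatic reading of that advice is to leave headroom for parallel workers when setting per-role limits; a sketch (role name and numbers are illustrative):

```sql
-- If the application opens at most 10 sessions and
-- max_parallel_workers_per_gather is 8, allow for the extra
-- worker connections explicitly:
ALTER ROLE app_user CONNECTION LIMIT 18;  -- 10 sessions + 8 workers
-- (the worst case is higher still if several of those sessions can
-- run parallel queries at the same time)
```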

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [BUGS] BUG #14473: Parallel query aborts with too many connections

From
Steven Winfield
Date:

From: Michael Paquier [mailto:michael.paquier@gmail.com]
Sent: 23 December 2016 11:44
To: Steven Winfield
Cc: Amit Kapila; PostgreSQL Bugs
Subject: Re: [BUGS] BUG #14473: Parallel query aborts with too many connections


On Fri, Dec 23, 2016 at 7:17 PM, Steven Winfield
<Steven.Winfield@cantabcapital.com> wrote:
> ________________________________
> This email is confidential. If you are not the intended recipient, please
> advise us immediately and delete this message. The registered name of
> Cantab- part of GAM Systematic is Cantab Capital Partners LLP. See -
> http://www.gam.com/en/Legal/Email+disclosures+EU for further information on
> confidentiality, the risks of non-secure electronic communication, and
> certain disclosures which we are required to make in accordance with
> applicable legislation and regulations. If you cannot access this link,
> please notify us by reply message and we will send the contents to you.
> ________________________________

This is a public mailing list.
--
Michael

Apologies - the signature is added automatically by my company’s email server (not the client). I’ve asked our IT team to whitelist postgres mailing lists.

Regards,

Steve.