Thread: replication_reserved_connections

replication_reserved_connections

From
Marko Tiikkaja
Date:
Hi,

Yesterday an interesting scenario was diagnosed on IRC.  If you're 
running a synchronous slave and the connection to the slave is lost 
momentarily, your backends start naturally waiting for the slave to 
reconnect.  If then your application keeps trying to create new 
connections, it can use all non-reserved connections, thus locking out 
the synchronous slave when the connection problem has resolved itself. 
This brings the entire cluster into a state where manual intervention is 
necessary.

While you could limit the number of connections for non-replication 
roles, that's not always possible or desirable.  I would like to 
introduce a way to reserve connection slots for replication.  However, 
it's not clear how this would work.  I looked at how 
superuser_reserved_connections is implemented, and with small changes I 
could see how to implement two ideas:
  1) Reserve a portion of superuser_reserved_connections for replication
     connections.  For example, with max_connections=10,
     superuser_reserved_connections=2 and
     replication_reserved_connections=1, at 8 connections either a
     replication connection or a superuser connection can be created,
     and at 9 connections only a superuser one would be allowed.  This
     is a bit clumsy as there still aren't guaranteed slots for
     replication.
  2) A GUC which says "superuser_reserved_connections can be used up by
     replication connections", and then limiting the number of
     replication connections using per-role limits to make sure
     superusers aren't locked out.
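
To make the arithmetic in idea 1 concrete, here is a minimal sketch of the admission rule as described above.  It is only a model in Python, not PostgreSQL code: the function name may_connect and the three connection kinds are made up for illustration, and replication_reserved_connections is the proposed setting, which does not exist yet.

```python
def may_connect(in_use, kind, max_connections=10,
                superuser_reserved=2, replication_reserved=1):
    """Return True if a new connection of the given kind would be admitted.

    kind is "normal", "replication" or "superuser" (hypothetical labels).
    replication_reserved is carved out of superuser_reserved, as in idea 1.
    """
    free = max_connections - in_use
    if free <= 0:
        return False  # no slot left for anybody
    if kind == "superuser":
        return True   # superusers may take any remaining slot
    if kind == "replication":
        # Replication may use the shared part of the reserve, but must
        # leave the superuser-only part untouched.
        return free > superuser_reserved - replication_reserved
    # Normal connections must leave the whole reserve untouched.
    return free > superuser_reserved


# The example from the text: at 8 connections replication and superuser
# connections are admitted, at 9 only a superuser one.
for in_use in (7, 8, 9):
    print(in_use, {k: may_connect(in_use, k)
                   for k in ("normal", "replication", "superuser")})
```

The sketch also shows the clumsiness mentioned above: a superuser connection arriving at 8 takes the shared slot, and replication is then locked out at 9.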
 

Does anyone see a better way to do this?  I'm not too satisfied with 
either of these ideas.


Regards,
Marko Tiikkaja



Re: replication_reserved_connections

From
Atri Sharma
Date:

On 28-Jul-2013, at 5:53, Marko Tiikkaja <marko@joh.to> wrote:

> Hi,
>
> Yesterday an interesting scenario was diagnosed on IRC.  If you're
> running a synchronous slave and the connection to the slave is lost
> momentarily, your backends start naturally waiting for the slave to
> reconnect.  If then your application keeps trying to create new
> connections, it can use all non-reserved connections, thus locking out
> the synchronous slave when the connection problem has resolved itself.
> This brings the entire cluster into a state where manual intervention
> is necessary.
>
Solving that was fun!

> While you could limit the number of connections for non-replication
> roles, that's not always possible or desirable.  I would like to
> introduce a way to reserve connection slots for replication.  However,
> it's not clear how this would work.  I looked at how
> superuser_reserved_connections is implemented, and with small changes I
> could see how to implement two ideas:
>
>  1) Reserve a portion of superuser_reserved_connections for replication
>     connections.  For example, with max_connections=10,
>     superuser_reserved_connections=2 and
>     replication_reserved_connections=1, at 8 connections either a
>     replication connection or a superuser connection can be created,
>     and at 9 connections only a superuser one would be allowed.  This
>     is a bit clumsy as there still aren't guaranteed slots for
>     replication.
>

I would generally agree with sharing superuser reserved connections with
replication. One thing I would like to explore is whether we could add
some sort of priority system to avoid contention between superuser
threads and replication threads competing for the same connection.

We could potentially add a GUC for specifying which has the higher priority.

I am just musing here, though.

Thanks and Regards,

Atri


Re: replication_reserved_connections

From
Marko Tiikkaja
Date:
On 28/07/2013 08:51, Atri Sharma wrote:
> I would generally agree with sharing superuser reserved connections with
> replication. One thing I would like to explore is whether we could add
> some sort of priority system to avoid contention between superuser
> threads and replication threads competing for the same connection.
>
> We could potentially add a GUC for specifying which has the higher priority.

This sounds an awful lot like it would have to scan through the list of 
existing connections, which I wanted to avoid.

Or maybe we could maintain a separate list of "reserved" connections, 
i.e. ones that were created when we were at max_connections - 
ReservedBackends?  We could quickly look through that list to see how 
many of each kind we have allowed.  Not sure if that's practical, though.
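
A rough sketch of that bookkeeping, as a Python model rather than backend code.  The class name ReservedList and its methods are hypothetical; the only names taken from the discussion are max_connections and ReservedBackends.  The point is that the list never holds more than ReservedBackends entries, so looking through it stays cheap and the full connection list is never scanned.

```python
from collections import Counter


class ReservedList:
    """Remembers only connections admitted from the reserved zone."""

    def __init__(self, max_connections, reserved_backends):
        self.max_connections = max_connections
        self.reserved_backends = reserved_backends
        self.in_use = 0
        self.reserved = {}   # connection id -> kind, small by construction
        self._next_id = 0

    def connect(self, kind):
        if self.in_use >= self.max_connections:
            raise RuntimeError("too many connections")
        conn_id = self._next_id
        self._next_id += 1
        # Created at or above max_connections - ReservedBackends:
        # record it on the separate list.
        if self.in_use >= self.max_connections - self.reserved_backends:
            self.reserved[conn_id] = kind
        self.in_use += 1
        return conn_id

    def disconnect(self, conn_id):
        self.reserved.pop(conn_id, None)  # no-op for ordinary connections
        self.in_use -= 1

    def reserved_counts(self):
        """How many of each kind currently hold a reserved slot."""
        return Counter(self.reserved.values())
```

One simplification to be aware of: a connection stays on the list for its whole lifetime, even if ordinary connections close later and the server drops back below the reserved zone.  Whether that matters is part of the "is this practical" question.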



Regards,
Marko Tiikkaja



Re: replication_reserved_connections

From
Gibheer
Date:
On Sun, 28 Jul 2013 02:23:47 +0200
Marko Tiikkaja <marko@joh.to> wrote:

> Hi,
> 
> Yesterday an interesting scenario was diagnosed on IRC.  If you're 
> running a synchronous slave and the connection to the slave is lost 
> momentarily, your backends start naturally waiting for the slave to 
> reconnect.  If then your application keeps trying to create new 
> connections, it can use all non-reserved connections, thus locking
> out the synchronous slave when the connection problem has resolved
> itself. This brings the entire cluster into a state where manual
> intervention is necessary.
> 
> While you could limit the number of connections for non-replication 
> roles, that's not always possible or desirable.  I would like to 
> introduce a way to reserve connection slots for replication.
> However, it's not clear how this would work.  I looked at how 
> superuser_reserved_connections is implemented, and with small changes I 
> could see how to implement two ideas:
> 
>    1) Reserve a portion of superuser_reserved_connections for
>       replication connections.  For example, with max_connections=10,
>       superuser_reserved_connections=2 and
>       replication_reserved_connections=1, at 8 connections either a
>       replication connection or a superuser connection can be created,
>       and at 9 connections only a superuser one would be allowed.
>       This is a bit clumsy as there still aren't guaranteed slots for
>       replication.
>    2) A GUC which says "superuser_reserved_connections can be used up
>       by replication connections", and then limiting the number of
>       replication connections using per-role limits to make sure
>       superusers aren't locked out.
> 
> Does anyone see a better way to do this?  I'm not too satisfied with 
> either of these ideas.
> 
> 
> Regards,
> Marko Tiikkaja
> 
> 

Hi,

I had the same problem and I created a patch to introduce a GUC for
reserved_replication_connections as a separate flag.
You can find my patch here
https://commitfest.postgresql.org/action/patch_view?id=1180

I am still waiting for feedback though.

regards,

Stefan Radomski



Re: replication_reserved_connections

From
Marko Tiikkaja
Date:
On 2013-07-28 19:21, Gibheer wrote:
> I had the same problem and I created a patch to introduce a GUC for
> reserved_replication_connections as a separate flag.
> You can find my patch here
> https://commitfest.postgresql.org/action/patch_view?id=1180

Oops.  I guess I should've searched through the archives before my 
email.  I didn't remember seeing anything about this so I just assumed 
nobody was working on it.

I'll take a look at your patch.


Regards,
Marko Tiikkaja



Re: replication_reserved_connections

From
Andres Freund
Date:
On 2013-07-28 02:23:47 +0200, Marko Tiikkaja wrote:
> While you could limit the number of connections for non-replication roles,
> that's not always possible or desirable.  I would like to introduce a way to
> reserve connection slots for replication.  However, it's not clear how this
> would work.  I looked at how superuser_reserved_connections is implemented,
> and with small changes I could see how to implement two ideas:
> 
> Does anyone see a better way to do this?  I'm not too satisfied with either
> of these ideas.

Personally I think we just shouldn't allow normal connections for
the backend slots added by max_wal_senders. They are internally *added*
to max_connections, so limiting that seems perfectly fine to me since
the system provides max_connections connections externally.

Hm... I wonder how that's managed for 9.4's max_worker_processes.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: replication_reserved_connections

From
Robert Haas
Date:
On Sun, Jul 28, 2013 at 2:50 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-07-28 02:23:47 +0200, Marko Tiikkaja wrote:
>> While you could limit the number of connections for non-replication roles,
>> that's not always possible or desirable.  I would like to introduce a way to
>> reserve connection slots for replication.  However, it's not clear how this
>> would work.  I looked at how superuser_reserved_connections is implemented,
>> and with small changes I could see how to implement two ideas:
>>
>> Does anyone see a better way to do this?  I'm not too satisfied with either
>> of these ideas.
>
> Personally I think we just shouldn't allow normal connections for
> the backend slots added by max_wal_senders. They are internally *added*
> to max_connections, so limiting that seems perfectly fine to me since
> the system provides max_connections connections externally.
>
> Hm... I wonder how that's managed for 9.4's max_worker_processes.

See InitProcGlobal().  There are three lists of PGPROC objects.
PGPROCs for incoming connections are pulled off of
ProcGlobal->freeProcs, the autovacuum launcher and its workers pull from
ProcGlobal->autovacFreeProcs, and background workers pull from
ProcGlobal->bgworkerFreeProcs.  Auxiliary processes have a separate
pool of PGPROCs to pull from, but they use linear search rather than a
list, for reasons described in the comments in that function.
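
As a rough illustration of why separate free lists give each class of process its own guaranteed slots, here is a small Python model.  It is not the C code in InitProcGlobal(); the class ProcPools, the pool names and the slot numbering are invented for the sketch, and the auxiliary-process pool is left out.

```python
class ProcPools:
    """Toy model: one free list of slot numbers per class of process."""

    def __init__(self, connection_slots, autovacuum_slots, bgworker_slots):
        sizes = (("connection", connection_slots),
                 ("autovacuum", autovacuum_slots),
                 ("bgworker", bgworker_slots))
        self.free = {}
        next_slot = 0
        for name, size in sizes:
            # Each pool gets its own disjoint range of slots up front.
            self.free[name] = list(range(next_slot, next_slot + size))
            next_slot += size

    def acquire(self, kind):
        """Take a slot from the pool for this kind, or None if it is empty."""
        pool = self.free[kind]
        return pool.pop() if pool else None

    def release(self, kind, slot):
        """Return a slot to the pool it came from."""
        self.free[kind].append(slot)


pools = ProcPools(connection_slots=2, autovacuum_slots=1, bgworker_slots=1)
pools.acquire("connection")
pools.acquire("connection")
# The connection pool is now empty, but the other pools are unaffected.
print(pools.acquire("connection"), pools.acquire("autovacuum") is not None)
```

Because a request only ever looks at its own list, incoming connections cannot starve the other classes.  A dedicated pool for replication connections would presumably follow the same pattern, though nothing in this sketch says how that should be wired up.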

There may be other checks elsewhere that enforce these same limits; not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company