Re: Patch for reserved connections for replication users - Mailing list pgsql-hackers

From Gibheer
Subject Re: Patch for reserved connections for replication users
Date
Msg-id 20131016010017.3e5b3559@linse.fritz.box
Whole thread Raw
In response to Re: Patch for reserved connections for replication users  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Patch for reserved connections for replication users  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, 14 Oct 2013 11:52:57 +0530
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Sun, Oct 13, 2013 at 2:08 PM, Gibheer <gibheer@zero-knowledge.org>
> wrote:
> > On Sun, 13 Oct 2013 11:38:17 +0530
> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >> On Thu, Oct 10, 2013 at 3:17 AM, Gibheer
> >> <gibheer@zero-knowledge.org> wrote:
> >> > On Mon, 7 Oct 2013 11:39:55 +0530
> >> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> >> Robert Haas wrote:
> >> >> On Mon, Aug 5, 2013 at 2:04 AM, Andres Freund
> >> >> <andres(at)2ndquadrant(dot)com> wrote:
> >> >> >>> Hmm.  It seems like this match is making MaxConnections no
> >> >> >>> longer mean the maximum number of connections, but rather
> >> >> >>> the maximum number of non-replication connections.  I don't
> >> >> >>> think I support that definitional change, and I'm kinda
> >> >> >>> surprised if this is sufficient to implement it anyway
> >> >> >>> (e.g. see InitProcGlobal()).
> >> >> >
> >> >> >> I don't think the implementation is correct, but why don't
> >> >> >> you like the definitional change? The set of things you can
> >> >> >> do from replication connections are completely different
> >> >> >> from a normal connection. So using separate "pools" for them
> >> >> >> seems to make sense. That they end up allocating similar
> >> >> >> internal data seems to be an implementation detail to me.
> >> >>
> >> >> > Because replication connections are still "connections".  If I
> >> >> > tell the system I want to allow 100 connections to the server,
> >> >> > it should allow 100 connections, not 110 or 95 or any other
> >> >> > number.
> >> >>
> >> >> I think that to reserve connections for replication, mechanism
> >> >> similar to superuser_reserved_connections be used rather than
> >> >> auto vacuum workers or background workers.
> >> >> This won't change the definition of MaxConnections. Another
> >> >> thing is that rather than introducing new parameter for
> >> >> replication reserved connections, it is better to use
> >> >> max_wal_senders as it can serve the purpose.
> >> >>
> >> >> Review for replication_reserved_connections-v2.patch,
> >> >> considering we are going to use mechanism similar to
> >> >> superuser_reserved_connections and won't allow definition of
> >> >> MaxConnections to change.
> >> >>
> >>
> >> >
> >> > Hi,
> >> >
> >> > I took the time and reworked the patch with the feedback till
> >> > now. Thank you very much Amit!
> >> >
> >> > So this patch uses max_wal_senders together with the idea of the
> >> > first patch I sent. The error messages are also adjusted to make
> >> > it obvious, how it is supposed to be and all checks work, as far
> >> > as I could tell.
> >>
> >> If I understand correctly, now the patch has implementation such
> >> that a. if the number of connections left are (ReservedBackends +
> >> max_wal_senders), then only superusers or replication connection's
> >> will be allowed
> >> b. if the number of connections left are ReservedBackend, then only
> >> superuser connections will be allowed.
> >
> > That is correct.
> >
> >> So it will ensure that max_wal_senders is used for reserving
> >> connection slots from being used by non-super user connections. I
> >> find new usage of max_wal_senders acceptable, if anyone else thinks
> >> otherwise, please let us know.
> >>
> >>
> >> 1.
> >> +        <varname>superuser_reserved_connections</varname>
> >> +        <varname>max_wal_senders</varname> only superuser and WAL
> >> connections
> >> +        are allowed.
> >>
> >> Here minus seems to be missing before max_wal_senders and I think
> >> it will be better to use replication connections rather than WAL
> >> connections.
> >
> > This is fixed.
> >
> >> 2.
> >> -        new replication connections will be accepted.
> >> +        new WAL or other connections will be accepted.
> >>
> >> I think as per new implementation, we don't need to change this
> >> line.
> >
> > I reverted that change.
> >
> >> 3.
> >> + * reserved slots from max_connections for wal senders. If the
> >> number of free
> >> + * slots (max_connections - max_wal_senders) is depleted.
> >>
> >>  Above calculation (max_connections - max_wal_senders) needs to
> >> include super user reserved connections.
> >
> > My first thought was, that I would not add it here. When superuser
> > reserved connections are not set, then only max_wal_senders would
> > count.
> > But you are right, it has to be set, as 3 connections are reserved
> > by default for superusers.
> 
> + * slots (max_connections - superuser_reserved_connections -
> max_wal_senders) here it should be ReservedBackends rather than
> superuser_reserved_connections.

fixed

> >> 4.
> >> + /*
> >> + * Although replication connections currently require superuser
> >> privileges, we
> >> + * don't allow them to consume the superuser reserved slots, which
> >> are
> >> + * intended for interactive use.
> >>   */
> >>   if ((!am_superuser || am_walsender) &&
> >>   ReservedBackends > 0 &&
> >>   !HaveNFreeProcs(ReservedBackends))
> >>   ereport(FATAL,
> >>   (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
> >> - errmsg("remaining connection slots are reserved for
> >> non-replication superuser connections")));
> >> + errmsg("remaining connection slots are reserved for superuser
> >> connections")));
> >>
> >> Will there be any problem if we do the above check before the check
> >> for wal senders and reserved replication connections (+
> >> !HaveNFreeProcs(max_wal_senders + ReservedBackends))) and don't
> >> change the error message in this check. I think this will ensure
> >> that users who doesn't enable max_wal_senders will see the same
> >> error message as before and the purpose to reserve connections for
> >> replication can be served by your second check.
> >
> > I have attached two patches, one that checks only max_wal_senders
> > first and the other checks reserved_backends first. Both return the
> > original message, when max_wal_sender is not set and I think it is
> > only a matter of taste, which comes first.
> > Me, I would prefer max_wal_senders to get from more connections to
> > less.
> 
> I think there is no major problem even if we keep max_wal_senders
> check first, but there could be error message inconsistency. Please
> consider below scenario:
> max_connections=5
> superuser_reserved_connections=3
> max_wal_senders = 2
> 
> 1. Start primary database server M1
> 2. Start first standby S1
> 3. Start second standby S2
> 4. Now try to connect by non-super user to M1 -- here it will return
> error msg as "psql: FATAL:  remaining connection slots are reserved
> for replication
>     and superuser connections"
>     By above error message, it seems that replication connection is
> allowed, but actually it will give error for new replication
> connection, see next step
> 5. Start third standby S3 -- here an appropriate error message "FATAL:
>  could not connect to the primary server: FATAL:  remaining connection
>     slots are reserved for non-replication superuser connections"
> 
> I feel there is minor inconsistency in step-4 if we use
> max_wal_senders check before ReservedBackends check.

Yes, that is rather annoying. I fixed that with the other hint, that
there should be a check for ReservedBackends + max_wal_senders <
max_connections to have at least one free connection.

> >> 5.
> >> + gettext_noop("Sets the maximum number wal sender connections and
> >> reserves them."),
> >>
> >> Sets the maximum number 'of' wal sender connections and reserves
> >> them. a. 'of' is missing in above line.
> >> b. I think 'reserves them' is not completely right, because super
> >> user connections will be allowed. How about if we add something
> >> similar to 'and reserves them from being used by non-superuser
> >> connections' in above line.
> >
> > This is fixed.
> + gettext_noop("Sets the maximum number of wal sender connections and
> reserves "
> + "them from being used non-superuser connections."),
> 
> "them from being used 'by' non-superuser connections."
> 'by' is missing in above line.
> 
> 1.
> if (ReservedBackends >= MaxConnections)
> {
> write_stderr("%s: superuser_reserved_connections must be less than
> max_connections\n", progname);
> ExitPostmaster(1);
> }
> 
> I think we should have one check similar to above for (max_wal_senders
> + ReservedBackends >= MaxConnections) to avoid server starting
> with values, where not even 1 non-superuser connection will be
> allowed. I think this is a reasoning similar to why the check for
> ReservedBackends exists.

I have added this, see comment above.

> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
> 

I would be glad, if you could also test the patch again, as I'm nearly
code blind after testing it for 4 hours.
I had the problem, that I could not attach as many replication
connections as I wanted, as they were denied as normal connections. I
think I got it fixed, but I'm not 100% sure at the moment.
After some sleep, I will read the code again and test it again, to make
sure, it really does what it is supposed to do.

Thank you,

Stefan



pgsql-hackers by date:

Previous
From: Mark Kirkwood
Date:
Subject: Re: [PoC] pgstattuple2: block sampling to reduce physical read
Next
From: James Sewell
Date:
Subject: Re: PSQL return coder