Re: Function to promote standby servers - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Function to promote standby servers
Date
Msg-id 183e19f71ae38f7447c2f63317f0108add6c6259.camel@cybertec.at
Whole thread Raw
In response to Re: Function to promote standby servers  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Function to promote standby servers  (Ashwin Agrawal <aagrawal@pivotal.io>)
Re: Function to promote standby servers  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Michael Paquier wrote:
> > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> > index 7375a78ffc..3a1f49e83a 100644
> > --- a/src/backend/access/transam/xlog.c
> > +++ b/src/backend/access/transam/xlog.c
> > @@ -81,8 +81,6 @@ extern uint32 bootstrap_data_checksum_version;
> >  /* File path names (all relative to $PGDATA) */
> >  #define RECOVERY_COMMAND_FILE    "recovery.conf"
> >  #define RECOVERY_COMMAND_DONE    "recovery.done"
> > -#define PROMOTE_SIGNAL_FILE        "promote"
> > -#define FALLBACK_PROMOTE_SIGNAL_FILE "fallback_promote"
> 
> Perhaps we could just move the whole set to xlog.h.

Done.

> > +Datum
> > +pg_promote(PG_FUNCTION_ARGS)
> > +{
> > +    FILE *promote_file;
> > +
> > +    if (!superuser())
> > +        ereport(ERROR,
> > +                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                 errmsg("must be superuser to promote standby servers")));
> 
> Let's remove this restriction at code level, and instead use
> function-level ACLs so as it is possible to allow non-superusers to
> trigger a promotion as well, say a dedicated role.  We could add a
> system role for this purpose, but I am not sure that it is worth the
> addition yet.

Agreed, done.

> > +    while ($self->safe_psql('postgres', "SELECT pg_is_in_recovery()") != 'f')
> > +    {
> > +         sleep 1;
> > +    }
> > +    return;
> 
> This could go on infinitely, locking down buildfarm machines silently if
> a commit is not careful.  What I would suggest to do instead is to not
> touch PostgresNode.pm at all, and to just modify one test to trigger
> promotion with the SQL function.  Then from the test, you should add a
> comment that triggerring promotion from SQL is wanted instead of the
> internal routine, and then please add a call to poll_query_until() in
> the same way as what 6deb52b2 has removed.

I have modified recovery/t/004_timeline_switch.pl and recovery/t/009_twophase.pl
accordingly: one calls the function with "wait => true", the other
uses "wait => false" and waits as you suggested.

> As of now, this function returns true if promotion has been triggered,
> and false if not.  However it seems to me that we should have something
> more consistent with "pg_ctl promote", so there are actually three
> possible states:
> 1) Node is in recovery, with promotion triggered.  pg_promote() returns
> true in case of success in creating the trigger file.
> 2) Node is in recovery, with promotion triggered.  pg_promote() returns
> false in case of failure in creating the trigger file.
> 3) Node is not in recovery, where I think a call to pg_promote should
> trigger an error.  This is not handled in the patch.

So far I had returned "false" in the last case, but I am fine with
throwing an error in that case.  Done.

> An other thing which has value is to implement a "wait" mode for this
> feature, or actually a nowait mode.  You could simply do what pg_ctl
> does with its logic in get_control_dbstate() by looking at the control
> file state.  I think that waiting for the promotion to happen should be
> the default.

I have implemented that.

> At the end this patch needs a bit more work.

Yes, it did.  Thanks for the thorough review!

Yours,
Laurenz Albe

Attachment

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] pgbench - allow to store select results intovariables
Next
From: Peter Eisentraut
Date:
Subject: Re: Requesting advanced Group By support