Re: [PATCH] Make pg_basebackup configure and start standby [Review] - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject Re: [PATCH] Make pg_basebackup configure and start standby [Review]
Date
Msg-id 506C64BE.8000903@cybertec.at
In response to [PATCH] Make pg_basebackup configure and start standby [Review]  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: [PATCH] Make pg_basebackup configure and start standby [Review]  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Hi,

first, thanks for the review. Comments are below.

On 2012-09-20 12:30, Amit Kapila wrote:

On Sun, 01 Jul 2012 13:02:17 +0200 Boszormenyi Zoltan wrote:

>attached is a patch that does $SUBJECT.
>
>It's a usability enhancement, to take a backup, write
>a minimalistic recovery.conf and start the streaming
>standby in one go.
>
>Comments?

[Review of Patch]

 

Basic stuff:
----------------------
- Patch applies OK


This is no longer true with a newer Git checkout.
A hunk for pg_basebackup.c was rejected.

- Compiles cleanly with no warnings

What it does:
-------------------------
The pg_basebackup tool takes a backup of the cluster from the server to the specified location.
This new functionality also writes recovery.conf in the database directory and starts the standby server, based on the options passed to pg_basebackup.

Usability
----------------
Regarding usability, I am not sure how many users would want to start the standby server using pg_basebackup.


Also, Magnus raised the point that it wouldn't really work on MS Windows
where you *have to* start the service via OS facilities. This part of the patch
was killed.

In my opinion, it can be useful for users who have automated scripts that start the server after a backup.


Well, scripting is not portable across UNIXes and Windows,
you have to spell out starting the server differently.


Feature Testing:
-----------------------------

1. Test pg_basebackup with option -R to check that the recovery.conf file is written to data directory.
    --recovery.conf file is created in data directory.
   
2. Test pg_basebackup with option -R to check the case where the recovery.conf file cannot be created because the disk is full.
    --An error is reported, as the recovery.conf file cannot be created.

3. Test pg_basebackup with option -S to check the standby server start on the same/different machine.
    --Starting the standby server succeeds if the pg_basebackup is taken on a different machine.

4. Test pg_basebackup with both options -S and -R to check the standby server start on the same/different machine.
    --Starting the standby server succeeds if the pg_basebackup is taken on a different machine.

5. Test pg_basebackup with option -S including -h, -U, -p, -w and -W to check the standby server start
   and verify the recovery.conf which is created in the data directory.
    --Except for the password, the rest of the primary connection info parameters work fine.


The password part is now fixed.
 

6. Test pg_basebackup with conflict options (-x or -X and -R or -S).
    --Error is given when the conflict options are provided to pg_basebackup.
   
7. Test pg_basebackup with option -S where pg_ctl/postmaster binaries are not present in the path.
    --Error is given as not able to execute.
   
8. Test pg_basebackup with option -S by connecting to a standby server.
    --standby server is started successfully when pg_basebackup is made from a standby server also.

Code Review:
----------------------------
1. In function WriteRecoveryConf(), an uninitialized filename is used,
    so the line below can print junk:
   printf("add password to primary_conninfo in %s if needed\n", filename);


Fixed.

2.  In function WriteRecoveryConf(), in the code below, if fopen fails (due to a full disk or any other file-related error), it prints the error and exits.
     This can confuse the user: can he consider the backup successful and proceed? IMO, either the error message or the documentation
     should say that after such an error the user can keep the backup, write his own recovery.conf and start the standby.

+        cf = fopen(filename, "w");
+        if (cf == NULL)
+        {
+                fprintf(stderr, _("cannot create %s"), filename);
+                exit(1);
+        }


But BaseBackup() function did indicate already that it has
finished successfully with

        if (verbose)
                fprintf(stderr, "%s: base backup completed\n", progname);

Would it be an expected (as in: not confusing) behaviour to return 0
from pg_basebackup if the backup itself has finished, but failed to write
the recovery.conf or start the standby if those were requested?

I have modified my WriteRecoveryConf() to do exit(2) instead of exit(1)
to indicate a different error. exit(1) seems to be for reporting configuration
or connection errors. (I may be mistaken though.)
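
A minimal sketch of the distinct-exit-status idea, not the code from the patch. The status names (RC_OK, RC_BACKUP_FAILED, RC_RECOVERY_CONF_FAILED) and the helper write_recovery_conf() are hypothetical, and the helper returns the status instead of calling exit() so that the caller decides what to do with it:

```c
#include <stdio.h>

/* Hypothetical status codes; the names are illustrative only. */
enum
{
	RC_OK = 0,
	RC_BACKUP_FAILED = 1,			/* configuration or connection error */
	RC_RECOVERY_CONF_FAILED = 2		/* backup finished, recovery.conf did not */
};

/*
 * Write a minimal recovery.conf. Returns RC_OK on success, or
 * RC_RECOVERY_CONF_FAILED if the file cannot be created or written.
 * Assumes conninfo contains no single quotes (no escaping is done here).
 */
static int
write_recovery_conf(const char *filename, const char *conninfo)
{
	FILE	   *cf = fopen(filename, "w");

	if (cf == NULL)
	{
		/* Tell the user that the backup itself is still usable. */
		fprintf(stderr, "cannot create %s; the base backup itself completed\n",
				filename);
		return RC_RECOVERY_CONF_FAILED;
	}

	if (fprintf(cf, "standby_mode = 'on'\nprimary_conninfo = '%s'\n",
				conninfo) < 0)
	{
		fclose(cf);
		return RC_RECOVERY_CONF_FAILED;
	}

	/* A full disk may only show up when the buffer is flushed on close. */
	if (fclose(cf) != 0)
		return RC_RECOVERY_CONF_FAILED;

	return RC_OK;
}
```

With this shape, main() could call exit(write_recovery_conf(...)) after the backup, and a calling script could treat status 2 as "backup usable, write recovery.conf by hand".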

3. In function main,
    the following code can be replaced in either of two ways:
   
            if (startstandby)
                    writerecoveryconf = true;
   
    change1:
                case 'R':
                        writerecoveryconf = true;
                        break;
                case 'S':
                        startstandby = true;
                        writerecoveryconf = true;
                        break;

    change2:
                case 'S':
                        startstandby = true;
                case 'R':
                        writerecoveryconf = true;
                        break;


I went with your second variant at first but it's not needed anymore
as only "-R" exists.
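
For reference, the second variant relies on switch fall-through, so that -S sets both flags. A self-contained sketch of that behaviour follows. The struct backup_opts and the helper apply_option() are hypothetical names used only for illustration; the patch sets plain variables inside main():

```c
#include <stdbool.h>

/* Hypothetical container for the two flags discussed above. */
struct backup_opts
{
	bool		startstandby;
	bool		writerecoveryconf;
};

/* Apply one option character; 'S' implies 'R' via fall-through. */
static void
apply_option(struct backup_opts *opts, int c)
{
	switch (c)
	{
		case 'S':
			opts->startstandby = true;
			/* no break here: starting the standby needs recovery.conf */
			/* FALLTHROUGH */
		case 'R':
			opts->writerecoveryconf = true;
			break;
		default:
			/* other options are ignored in this sketch */
			break;
	}
}
```

The explicit fall-through comment matters mostly for readers and for compilers that warn about a missing break.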

4. The password is not written to primary_conninfo even if dbpassword is set. As a result,
   connecting to the primary fails with an authentication failure.


Fixed.

5. Write function header comments for the newly added functions.


Fixed.


6. The execvp function, which is used to start the pg_ctl process, is deprecated beginning with Visual C++ 2005.
    http://msdn.microsoft.com/en-us/library/ms235414.aspx


This issue is now irrelevant as the standby is not started, there is no "-S" option.

7. In StartStandby function, it is better to free the memory allocated for path (path = xstrdup(command);)


Same as above.



Defects:
-------------
1. If pg_basebackup is used on the same machine with the -S option, the standby server fails to start
   because the port is already in use, since the same postgresql.conf is used.


Well, running initdb twice on the same machine with different data directories
would also cause the second server to fail to start because of the same issue,
and that is not called a bug. I think this is irrelevant as is, and also because there
is no "-S" now.

2. If hot_standby=off in the master's conf file, the same setting is copied to the standby and the server is started with it.
   As a result, no client connections are allowed to the server.


Well, it simply copies the source server behaviour, which can also be a
replication standby. PostgreSQL has cascading replication, you know.


Documentation issues:
--------------------------------
1. For -R option,
Conflicts with <option>--xlog
I think it is better to explain the reason for the conflict.


Fixed.


2. For -S option,
    "Start the standby database server. Implies -R option."
    I think the above can be improved to
    "Writes recovery.conf and starts the standby database server. The user does not need to specify the -R option explicitly."
    or something similar.


Not relevant anymore.

Again, thanks for the review.

The second generation of this work is now attached and contains a new
feature as was discussed and suggested by Magnus Hagander, Fujii Masao
and Peter Eisentraut. So libpq has grown a new function:

+/* return the connection options used by a live connection */
+extern PQconninfoOption *PQconninfo(PGconn *conn);

This copies all the connection parameters back from the live PGconn itself
so everything that's needed to connect is already validated.

This is used by the second patch which makes the changes in pg_basebackup
simpler and not hardcoded.
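
A rough sketch of how the returned option array could be turned into a primary_conninfo string; this is not the code from the attached patch. To keep it compilable without libpq, it uses a stand-in type conninfo_opt with only the keyword and val fields, and build_conninfo() is a hypothetical helper. Real code would walk the PQconninfoOption array returned by PQconninfo(conn), release it with PQconninfoFree(), and would probably filter some options and quote values properly; this sketch assumes values contain no spaces, quotes or backslashes:

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for libpq's PQconninfoOption, reduced to the fields used here. */
typedef struct
{
	const char *keyword;		/* a NULL keyword terminates the array */
	const char *val;			/* current value, or NULL if not set */
} conninfo_opt;

/*
 * Build "keyword=value keyword=value ..." in buf from the options that
 * have a non-empty value. Returns 0 on success, -1 if buf is too small.
 * No quoting or escaping is done (see the assumption stated above).
 */
static int
build_conninfo(const conninfo_opt *opts, char *buf, size_t buflen)
{
	size_t		used = 0;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (; opts->keyword != NULL; opts++)
	{
		int			n;

		/* Skip options that were not set on the connection. */
		if (opts->val == NULL || opts->val[0] == '\0')
			continue;

		n = snprintf(buf + used, buflen - used, "%s%s=%s",
					 used > 0 ? " " : "", opts->keyword, opts->val);
		if (n < 0 || (size_t) n >= buflen - used)
			return -1;			/* output would have been truncated */
		used += (size_t) n;
	}
	return 0;
}
```

Because the values come from a connection that already succeeded, nothing has to be re-parsed from the command line, which is what makes the pg_basebackup side simpler.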

Please, review.

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de    http://www.postgresql.at/