Thread: Patch: Revised documentation on base backups

Patch: Revised documentation on base backups

From
Amir Rohan
Date:
On 09/28/2015 06:33 AM, Michael Paquier wrote:
> On Sun, Sep 27, 2015 at 11:27 AM, Amir Rohan wrote:
>> Further editing. See attached V3.
>
> I think that you should consider adding this patch to the next commit
> fest so as we do not lose track of it:
> https://commitfest.postgresql.org/7/
>


I've recently picked up the base backup section of the documentation:
http://www.postgresql.org/docs/9.4/static/continuous-archiving.html
(Section 24.3.2) and went through it like a tutorial. I'm a relative
newcomer to postgres, so this was the real deal.

The docs seem to be in need of some improvement, primarily because
they have the feel of having been monkey-patched repeatedly in the
past. The patch needs someone with more experience to read through it
and comment on technical accuracy and any knee-slapper errors I
might have introduced.


One pain point in rewriting is the definition of terms, or lack
thereof, in the manual. I didn't find a section defining terms
conclusively for consistent referencing throughout the documentation.
For example, how should one refer to the directory housing
"postgresql.conf"? Is it:

1. The data directory
2. The storage directory
3. The PGDATA directory
4. The cluster directory
5. The server root directory
6. The config file directory
7. etc.

This lack of clear guidelines and uniformly applied terms
makes for some awkward verbal manoeuvring and compromises the
quality of the documentation (but admittedly, not its copiousness).

Amir


Attachment

Re: Patch: Revised documentation on base backups

From
Noah Misch
Date:
On Mon, Sep 28, 2015 at 07:17:39AM +0300, Amir Rohan wrote:
> The docs seem to be in need of some improvement, primarily because
> they have the feel of having been monkey-patched repeatedly in the
> past. The patch needs someone with more experience to read through it
> and comment on technical accuracy and any knee-slapper errors I
> might have introduced.

That's mostly outside my area, but I can answer one of your questions:

> One pain point in rewriting is the definition of terms, or lack
> thereof, in the manual. I didn't find a section defining terms
> conclusively for consistent referencing throughout the documentation.
> For example, how should one refer to the directory housing
> "postgresql.conf"? Is it:
> 
> 1. The data directory
> 2. The storage directory
> 3. The PGDATA directory
> 4. The cluster directory
> 5. The server root directory
> 6. The config file directory
> 7. etc.

We have no standard term for "the directory housing 'postgresql.conf'".  Since
the config_file setting customizes that directory independently of all other
directories of interest, it is not a reusable concept.  Of the names you list,
"data directory" is the dominant term.  (The data directory does not
necessarily hold configuration files, though it does by default.)  "cluster
directory" appears a few times; those could just as easily say "data
directory".  The other terms see little or no use.
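
To make the distinction concrete, here is a minimal Python sketch.  It
assumes you have already read the two values from the server with
"SHOW data_directory;" and "SHOW config_file;".  The function name and
the example paths are made up for illustration and are not part of
PostgreSQL.

```python
from pathlib import PurePosixPath


def config_in_data_directory(data_directory: str, config_file: str) -> bool:
    """Return True if config_file sits directly inside data_directory.

    Both arguments are assumed to be absolute paths, as reported by
    SHOW data_directory and SHOW config_file.
    """
    return PurePosixPath(config_file).parent == PurePosixPath(data_directory)


# Default layout: postgresql.conf lives in the data directory.
print(config_in_data_directory(
    "/var/lib/pgsql/data", "/var/lib/pgsql/data/postgresql.conf"))

# Customized layout: config_file points somewhere else entirely.
print(config_in_data_directory(
    "/var/lib/pgsql/data", "/etc/postgresql/postgresql.conf"))
```

The first call reports True and the second False, which is why "the
directory housing postgresql.conf" cannot stand in for "data directory".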

> +    To aid you in doing this, the base backup process creates a
> +    a textual file called <filename>backup_label</> when you initiate the
> +    base backup, it in the same directory as your server's
> +    configuration files.

> +     This places the database in backup mode, creating the
> +     <filename>backup_label</>
> +     file described earlier, as well as the <firstterm>tablespace map</>
> +     file called
> +     <filename>tablespace_map</> which contains information about tablespace
> +     symbolic links in <filename>pg_tblspc/</>, if such links are present.
> +     Both files are created in your cluster directory, alongside
> +     your configuration files, and are critical to the integrity of
> +     the backup.

I would write "<...> created in the data directory and are critical to <...>".
backup_label and tablespace_map go there (the directory named by the
"data_directory" setting) regardless of configuration file location.
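
As a rough illustration of that rule, the sketch below (Python) derives
where the two files would be expected from the data directory alone.
The helper name and the sample path are hypothetical.  Note that the
configuration file location is deliberately not an input.

```python
from pathlib import PurePosixPath


def backup_control_files(data_directory: str) -> dict:
    """Map each backup control file name to its expected path.

    Only data_directory matters here; where postgresql.conf lives
    has no effect on the result.
    """
    base = PurePosixPath(data_directory)
    return {name: str(base / name)
            for name in ("backup_label", "tablespace_map")}


# Hypothetical data directory, e.g. the value of SHOW data_directory.
for name, path in backup_control_files("/srv/pg/data").items():
    print(name, "->", path)
```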

Thanks,
nm