Thread: Documentation for initdb option --waldir

Documentation for initdb option --waldir

From
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/app-initdb.html
Description:

Hi all,

During troubleshooting with a customer we found that it might help to know
that initdb will create a symbolic link from data/pg_wal to the custom
directory with the option --waldir.
https://www.postgresql.org/docs/current/app-initdb.html
For instance:
This option specifies the directory where the write-ahead log should be
stored. A symbolic link is created from pg_wal in the main data directory to
the specified location.
(Like in this documentation https://pgpedia.info/p/pg_wal.html, but there is
a typo or missing word in the initdb section: "A symbolic link from pg_wal
in the main data directory to the specified location.")

And maybe also here a hint, that moving the pg_wal directory to another
location and creating a symbolic is what initdb --waldir does.
https://www.postgresql.org/docs/current/wal-internals.html
For instance:
It is advantageous if the WAL is located on a different disk from the main
database files. This can be achieved by moving the pg_wal directory to
another location (while the server is shut down, of course) and creating a
symbolic link from the original location in the main data directory to the
new location. This is also what the --waldir initdb option does. 

Thank you very much and kind regards
Theo

Re: Documentation for initdb option --waldir

From
"David G. Johnston"
Date:
On Wed, Mar 26, 2025 at 2:35 AM PG Doc comments form <noreply@postgresql.org> wrote:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/app-initdb.html
Description:

Hi all,

During troubleshooting with a customer we found that it might help to know
that initdb will create a symbolic link from data/pg_wal to the custom
directory with the option --waldir.


Thank you for the observation.  This indeed seems like something we should cover in a bit more detail.  How does the proposed patch (attached) look?

The main initdb change, added to Notes, is:

The pg_wal subdirectory will always exist within the data directory. This is where PostgreSQL sends its write-ahead log (WAL) files. Specifying the --waldir option turns this subdirectory entry into a symbolic link. In general, this is only useful if the remote location is on a different physical device. An existing directory must be empty and should not be a mount point. The directory will be created (including missing parents) if necessary.

David J.

Attachment

Re: Documentation for initdb option --waldir

From
Laurenz Albe
Date:
On Wed, 2025-03-26 at 17:34 -0700, David G. Johnston wrote:
> +  <para>
> +   The <filename>pg_wal</filename> subdirectory will always exist within the
> +   data directory.  This is where <productname>PostgreSQL</productname>
> +   sends its write-ahead log (<acronym>WAL</acronym>) files.
> +   Specifying the <option>--waldir</option> option turns this subdirectory entry
> +   into a symbolic link. In general, this is only useful if the remote location
> +   is on a different physical device.  An existing directory must be empty and
> +   should not be a mount point.  The directory will be created
> +   (including missing parents) if necessary.
> +  </para>

I think that it is very valuable to have WAL on a different file system on the
same storage device.  The idea is that growing data files cannot exhaust the
space available for WAL.

How about this:

    There is always a <filename>pg_wal</filename> within the data directory.
    By default, it is a directory where <productname>PostgreSQL</productname>
    places its write-ahead log (<acronym>WAL</acronym>) segment files.  If
    you create the <acronym>WAL</acronym> location somewhere else using the option
    <option>--waldir</option>, <filename>pg_wal</filename> will be created as
    a symbolic link pointing to that <acronym>WAL</acronym> location.
    If the directory already exists, it must be empty and should not be a mount
    point.  The directory will be created (including missing parents) if necessary.

Yours,
Laurenz Albe



AW: Documentation for initdb option --waldir

From
Theodor Herlo
Date:
Both look great to me!  I thought the advantage of having different devices is also to easy I/O load. But I guess you
haveto decide for yourself what is the best strategy. For me it was important to know what initdb does with ---waldir.
Andthis is nicely explained here.  
One more little thing, I stumbled across " should not be a mount point", sounds thirst like it was not good to use the
mountpoint. Maybe: " should not be the mount point itself but a directory in the mount point". It might be only me, you
diced. 

Thank you very much!

Regards,
Theo

-----Ursprüngliche Nachricht-----
Von: Laurenz Albe <laurenz.albe@cybertec.at>
Gesendet: Donnerstag, 27. März 2025 11:32
An: David G. Johnston <david.g.johnston@gmail.com>; Theodor Herlo <t.herlo@proventa.de>;
pgsql-docs@lists.postgresql.org
Betreff: Re: Documentation for initdb option --waldir

[Sie erhalten nicht häufig E-Mails von laurenz.albe@cybertec.at. Weitere Informationen, warum dies wichtig ist, finden
Sieunter https://aka.ms/LearnAboutSenderIdentification ] 

On Wed, 2025-03-26 at 17:34 -0700, David G. Johnston wrote:
> +  <para>
> +   The <filename>pg_wal</filename> subdirectory will always exist within the
> +   data directory.  This is where <productname>PostgreSQL</productname>
> +   sends its write-ahead log (<acronym>WAL</acronym>) files.
> +   Specifying the <option>--waldir</option> option turns this subdirectory entry
> +   into a symbolic link. In general, this is only useful if the remote location
> +   is on a different physical device.  An existing directory must be empty and
> +   should not be a mount point.  The directory will be created
> +   (including missing parents) if necessary.
> +  </para>

I think that it is very valuable to have WAL on a different file system on the same storage device.  The idea is that
growingdata files cannot exhaust the space available for WAL. 

How about this:

    There is always a <filename>pg_wal</filename> within the data directory.
    By default, it is a directory where <productname>PostgreSQL</productname>
    places its write-ahead log (<acronym>WAL</acronym>) segment files.  If
    you create the <acronym>WAL</acronym> location somewhere else using the option
    <option>--waldir</option>, <filename>pg_wal</filename> will be created as
    a symbolic link pointing to that <acronym>WAL</acronym> location.
    If the directory already exists, it must be empty and should not be a mount
    point.  The directory will be created (including missing parents) if necessary.

Yours,
Laurenz Albe



Re: Documentation for initdb option --waldir

From
"Euler Taveira"
Date:
On Thu, Mar 27, 2025, at 8:40 AM, Theodor Herlo wrote:
Both look great to me!  I thought the advantage of having different devices is also to easy I/O load. But I guess you have to decide for yourself what is the best strategy. For me it was important to know what initdb does with ---waldir. And this is nicely explained here. 
One more little thing, I stumbled across " should not be a mount point", sounds thirst like it was not good to use the mount point. Maybe: " should not be the mount point itself but a directory in the mount point". It might be only me, you diced. 

Laurenz suggestion looks good to me with some suggestions. I agree with
Theodor's addition but I would use the same sentence from the source code that
says 'subdirectory under the mount point'. I would also consider and/or modify
pg_basebackup documentation that has a similar option. My suggestion is that
waldir option in both tools has similar sentences. The term 'WAL location' is
not appropriate because this terminology is already used to indicate the LSN. I
would replace it with WAL directory.


--
Euler Taveira

Re: Documentation for initdb option --waldir

From
"David G. Johnston"
Date:
Version 2 Attached

There being two places now (plus doing it manually) I decided to write this material in the WAL chapter as opposed to application-specific Notes.  A new section seemed warranted.

I expanded upon the material regarding using different file systems and disks.

I would like to add a similar "why" to the mount point recommendation but don't know what that would be.  Suggestions welcomed.

David J.

Attachment

Re: Documentation for initdb option --waldir

From
Robert Treat
Date:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> Version 2 Attached
>
> -        This option specifies the directory where the write-ahead log
> -        should be stored.
> +        This option specifies the directory in which to store write-ahead log files.
> +        See <xref linkend="wal-internals-relocation"/> for more information.

To be honest, I think the proposed text is a little verbose, and we
could cut it right here. As an example, we don't go into how the
directory must be empty or chnmod settings for --pgdata. I'd also edit
the pg_basebackup changes to match if it were me.

> +   the <option>--waldir</option> option.  On an existing, but not running,
> +   cluster use operating system commands to move the contents of the
> +   <filename>pg_wal</filename> directory to the new location, remove the
> +   empty directory, and create the symbolic link named <filename>pg_wal</filename>
> +   pointing to the new location.

I found this to be awkward to read, and not as significant a warning
as the existing text... perhaps "For an existing cluster, once the
cluster has been shut down, you can use operating system commands... "

> There being two places now (plus doing it manually) I decided to write this material in the WAL chapter as opposed to
application-specificNotes.  A new section seemed warranted. 
>

+1

> I expanded upon the material regarding using different file systems and disks.
>
> I would like to add a similar "why" to the mount point recommendation but don't know what that would be.  Suggestions
welcomed.
>

I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).


Robert Treat
https://xzilla.net



Re: Documentation for initdb option --waldir

From
"David G. Johnston"
Date:
On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> Version 2 Attached
>
> -        This option specifies the directory where the write-ahead log
> -        should be stored.
> +        This option specifies the directory in which to store write-ahead log files.
> +        See <xref linkend="wal-internals-relocation"/> for more information.

To be honest, I think the proposed text is a little verbose, and we
could cut it right here. As an example, we don't go into how the
directory must be empty or chnmod settings for --pgdata. I'd also edit
the pg_basebackup changes to match if it were me.

Don't really see how completeness hurts here, but also not married to it.

> +   the <option>--waldir</option> option.  On an existing, but not running,
> +   cluster use operating system commands to move the contents of the
> +   <filename>pg_wal</filename> directory to the new location, remove the
> +   empty directory, and create the symbolic link named <filename>pg_wal</filename>
> +   pointing to the new location.

I found this to be awkward to read, and not as significant a warning
as the existing text... perhaps "For an existing cluster, once the
cluster has been shut down, you can use operating system commands... "

I'll probably put it back into parentheses to get the desired emphasis.  That did nag at me too as I wrote it.


> There being two places now (plus doing it manually) I decided to write this material in the WAL chapter as opposed to application-specific Notes.  A new section seemed warranted.
>

+1

> I expanded upon the material regarding using different file systems and disks.
>
> I would like to add a similar "why" to the mount point recommendation but don't know what that would be.  Suggestions welcomed.
>

I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).


If I mount the filesystem on disk2 to:  /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?

David J.

Re: Documentation for initdb option --waldir

From
Robert Treat
Date:
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
>> On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
>> <david.g.johnston@gmail.com> wrote:
>> > I expanded upon the material regarding using different file systems and disks.
>> >
>> > I would like to add a similar "why" to the mount point recommendation but don't know what that would be.
Suggestionswelcomed. 
>> >
>>
>> I'm not sure I follow what you are asking for... but a non-performance
>> reason to use a seperate mount point for pg_wal, even if the
>> underlying storage is the same, would be for something like using
>> filesystem snapshots to grab contents of the data directory without
>> grabbing wal (which can be handled separately).
>>
>
> If I mount the filesystem on disk2 to:  /mnt/disk2
> Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?
>

My immediate response to this was "because Postgres won't let you"
which seemed unhelpful, and that I couldn't remember why was pretty
unsatisfying, so I dug around in the source which was unhelpful but
eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS

"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."

Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.


Robert Treat
https://xzilla.net



Re: Documentation for initdb option --waldir

From
"David G. Johnston"
Date:
On Saturday, March 29, 2025, Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
>> On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
>> <david.g.johnston@gmail.com> wrote:
>> > I expanded upon the material regarding using different file systems and disks.
>> >
>> > I would like to add a similar "why" to the mount point recommendation but don't know what that would be.  Suggestions welcomed.
>> >
>>
>
> If I mount the filesystem on disk2 to:  /mnt/disk2
> Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?
>

eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS

"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."

Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.


Thank you, I can definitely work that in and it makes sense.

On the topic of verbosity, I found the wording for —pgdata in pg_basebackup to support the more complete description.


I believe that at least documenting external side-effects should be required.  I’m less convinced that pre-conditions that will be checked by the application need to be listed.  But for now I’m going to copy pg_basebackup as my example and at some point might get to doing a survey and proposing a new standard wording for —waldir and —pgdata descriptions regarding creation.

The comment regarding absolute paths will remain unwritten.

David J.

Re: Documentation for initdb option --waldir

From
Robert Treat
Date:
On Sat, Mar 29, 2025 at 10:58 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Saturday, March 29, 2025, Robert Treat <rob@xzilla.net> wrote:
>>
>> On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
>> <david.g.johnston@gmail.com> wrote:
>> > On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
>> >> On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
>> >> <david.g.johnston@gmail.com> wrote:
>> >> > I expanded upon the material regarding using different file systems and disks.
>> >> >
>> >> > I would like to add a similar "why" to the mount point recommendation but don't know what that would be.
Suggestionswelcomed. 
>> >> >
>> >>
>> >
>> > If I mount the filesystem on disk2 to:  /mnt/disk2
>> > Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?
>> >
>>
>> eventually came across this from
>> https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS
>>
>> "Best practice is to create a directory within the mount-point
>> directory that is owned by the PostgreSQL user, and then create the
>> data directory within that. This avoids permissions problems,..."
>>
>> Which I do remember having tried to do it directly and the OS
>> complaining that my mount point wasn't owned by root and/or Postgres
>> complaining that the xlog dir wasn't owned by Postgres, so I think
>> this advice probably still holds.
>>
>
> Thank you, I can definitely work that in and it makes sense.
>
> On the topic of verbosity, I found the wording for —pgdata in pg_basebackup to support the more complete description.
>
> https://www.postgresql.org/docs/current/app-pgbasebackup.html
>
> I believe that at least documenting external side-effects should be required.  I’m less convinced that pre-conditions
thatwill be checked by the application need to be listed.  But for now I’m going to copy pg_basebackup as my example
andat some point might get to doing a survey and proposing a new standard wording for —waldir and —pgdata descriptions
regardingcreation. 
>
> The comment regarding absolute paths will remain unwritten.
>

Works for me.


Robert Treat
https://xzilla.net



Re: Documentation for initdb option --waldir

From
"Gurjeet Singh"
Date:
On Sat Mar 29, 2025 at 7:26 AM PDT, Robert Treat wrote:
> On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>> On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
>>> On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
>>> <david.g.johnston@gmail.com> wrote:
>>> > I expanded upon the material regarding using different file systems and disks.
>>> >
>>> > I would like to add a similar "why" to the mount point recommendation but don't know what that would be.
Suggestionswelcomed. 
>>> >
>>>
>>> I'm not sure I follow what you are asking for... but a non-performance
>>> reason to use a seperate mount point for pg_wal, even if the
>>> underlying storage is the same, would be for something like using
>>> filesystem snapshots to grab contents of the data directory without
>>> grabbing wal (which can be handled separately).
>>>
>>
>> If I mount the filesystem on disk2 to:  /mnt/disk2
>> Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?
>>
>
> My immediate response to this was "because Postgres won't let you"
> which seemed unhelpful, and that I couldn't remember why was pretty
> unsatisfying, so I dug around in the source which was unhelpful but
> eventually came across this from
> https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS
>
> "Best practice is to create a directory within the mount-point
> directory that is owned by the PostgreSQL user, and then create the
> data directory within that. This avoids permissions problems,..."
>
> Which I do remember having tried to do it directly and the OS
> complaining that my mount point wasn't owned by root and/or Postgres
> complaining that the xlog dir wasn't owned by Postgres, so I think
> this advice probably still holds.

I remembered an old advice related to lost+found being placed on mount
points, but couldn't remember the specific, so searched and found Tom's
advice for data directory. I feel that the advice would apply to WAL
directory, as well.

From https://www.postgresql.org/message-id/12168.1312921709%40sss.pgh.pa.us

A couple of reasons for [not placing data diretory at a mount point]
are:

1. Mount-point directories should be owned by root, never by an
unprivileged account such as postgres.  IIRC there are good security
reasons for this practice, though I don't recall all the details right
now.

2. Keeping the data directory one level down ensures a clean failure if
the disk is for some reason not mounted when Postgres starts, or goes
offline later.  Otherwise, particularly if you're using a start script
that will automatically try an initdb, you might end up with some data
files on the / volume underneath where the mount point should have been.
This is sure to lead to serious problems when the disk does come back
online.  There's at least one horror story in our archives from someone
who had an auto-initdb startup script and one day his NFS disk was a few
seconds slow to mount...

Best regards,
Gurjeet
http://Gurje.et




Re: Documentation for initdb option --waldir

From
Laurenz Albe
Date:
On Sat, 2025-03-29 at 10:26 -0400, Robert Treat wrote:
> "Best practice is to create a directory within the mount-point
> directory that is owned by the PostgreSQL user, and then create the
> data directory within that. This avoids permissions problems,..."
>
> Which I do remember having tried to do it directly and the OS
> complaining that my mount point wasn't owned by root and/or Postgres
> complaining that the xlog dir wasn't owned by Postgres, so I think
> this advice probably still holds.

The root directory of a file system, which will be mounted at the
mount point, should be owned by root.

As far as I know, the reason is that it contains a "lost+found"
directory, which is used by file system checks to put orphaned
files.  If a non-root user owned the mount point, the user could
remove that directory, which would be a bad idea.

On the other hand, PostgreSQL will protest if the directory isn't
empty...

Yours,
Laurenz Albe



Re: Documentation for initdb option --waldir

From
"Gurjeet Singh"
Date:
On Sat Mar 29, 2025 at 9:26 AM PDT, Laurenz Albe wrote:

> On the other hand, PostgreSQL will protest if the directory isn't
> empty...

Specifically, initdb will complain if the directory it's trying to
initialize is not empty.

Best regards,
Gurjeet
http://Gurje.et




Re: Documentation for initdb option --waldir

From
"David G. Johnston"
Date:
On Saturday, March 29, 2025, Gurjeet Singh <gurjeet@singh.im> wrote:
On Sat Mar 29, 2025 at 9:26 AM PDT, Laurenz Albe wrote:

> On the other hand, PostgreSQL will protest if the directory isn't
> empty...

Specifically, initdb will complain if the directory it's trying to
initialize is not empty.

For this patch I intend to just point the reader to the “secondary file systems” section mentioned above for more considerations.  We can improve that as desired separately.

David J.