Documentation for initdb option --waldir
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/17/app-initdb.html
Description:
Hi all,
During troubleshooting with a customer we found that it might help to know
that initdb will create a symbolic link from data/pg_wal to the custom
directory with the option --waldir.
https://www.postgresql.org/docs/current/app-initdb.html
For instance:
This option specifies the directory where the write-ahead log should be
stored. A symbolic link is created from pg_wal in the main data directory to
the specified location.
(Like in this documentation https://pgpedia.info/p/pg_wal.html, but there is
a typo or missing word in the initdb section: "A symbolic link from pg_wal
in the main data directory to the specified location.")
And maybe also here a hint, that moving the pg_wal directory to another
location and creating a symbolic is what initdb --waldir does.
https://www.postgresql.org/docs/current/wal-internals.html
For instance:
It is advantageous if the WAL is located on a different disk from the main
database files. This can be achieved by moving the pg_wal directory to
another location (while the server is shut down, of course) and creating a
symbolic link from the original location in the main data directory to the
new location. This is also what the --waldir initdb option does.
Thank you very much and kind regards
Theo
On Wed, Mar 26, 2025 at 2:35 AM PG Doc comments form <noreply@postgresql.org>
wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/17/app-initdb.html
Description:Hi all,
During troubleshooting with a customer we found that it might help to know
that initdb will create a symbolic link from data/pg_wal to the custom
directory with the option --waldir.
Thank you for the observation. This indeed seems like something we should
cover in a bit more detail. How does the proposed patch (attached) look?
The main initdb change, added to Notes, is:
The pg_wal subdirectory will always exist within the data directory. This
is where PostgreSQL sends its write-ahead log (WAL) files. Specifying the
--waldir option turns this subdirectory entry into a symbolic link. In
general, this is only useful if the remote location is on a different
physical device. An existing directory must be empty and should not be a
mount point. The directory will be created (including missing parents) if
necessary.
David J.
Attachments:
v1-0001-doc-Explain-use-of-symbolic-linking-for-pg_wal.patchtext/x-patch; charset=US-ASCII; name=v1-0001-doc-Explain-use-of-symbolic-linking-for-pg_wal.patchDownload+16-2
On Wed, 2025-03-26 at 17:34 -0700, David G. Johnston wrote:
+ <para> + The <filename>pg_wal</filename> subdirectory will always exist within the + data directory. This is where <productname>PostgreSQL</productname> + sends its write-ahead log (<acronym>WAL</acronym>) files. + Specifying the <option>--waldir</option> option turns this subdirectory entry + into a symbolic link. In general, this is only useful if the remote location + is on a different physical device. An existing directory must be empty and + should not be a mount point. The directory will be created + (including missing parents) if necessary. + </para>
I think that it is very valuable to have WAL on a different file system on the
same storage device. The idea is that growing data files cannot exhaust the
space available for WAL.
How about this:
There is always a <filename>pg_wal</filename> within the data directory.
By default, it is a directory where <productname>PostgreSQL</productname>
places its write-ahead log (<acronym>WAL</acronym>) segment files. If
you create the <acronym>WAL</acronym> location somewhere else using the option
<option>--waldir</option>, <filename>pg_wal</filename> will be created as
a symbolic link pointing to that <acronym>WAL</acronym> location.
If the directory already exists, it must be empty and should not be a mount
point. The directory will be created (including missing parents) if necessary.
Yours,
Laurenz Albe
Both look great to me! I thought the advantage of having different devices is also to easy I/O load. But I guess you have to decide for yourself what is the best strategy. For me it was important to know what initdb does with ---waldir. And this is nicely explained here.
One more little thing, I stumbled across " should not be a mount point", sounds thirst like it was not good to use the mount point. Maybe: " should not be the mount point itself but a directory in the mount point". It might be only me, you diced.
Thank you very much!
Regards,
Theo
-----Ursprüngliche Nachricht-----
Von: Laurenz Albe <laurenz.albe@cybertec.at>
Gesendet: Donnerstag, 27. März 2025 11:32
An: David G. Johnston <david.g.johnston@gmail.com>; Theodor Herlo <t.herlo@proventa.de>; pgsql-docs@lists.postgresql.org
Betreff: Re: Documentation for initdb option --waldir
[Sie erhalten nicht häufig E-Mails von laurenz.albe@cybertec.at. Weitere Informationen, warum dies wichtig ist, finden Sie unter https://aka.ms/LearnAboutSenderIdentification ]
On Wed, 2025-03-26 at 17:34 -0700, David G. Johnston wrote:
+ <para> + The <filename>pg_wal</filename> subdirectory will always exist within the + data directory. This is where <productname>PostgreSQL</productname> + sends its write-ahead log (<acronym>WAL</acronym>) files. + Specifying the <option>--waldir</option> option turns this subdirectory entry + into a symbolic link. In general, this is only useful if the remote location + is on a different physical device. An existing directory must be empty and + should not be a mount point. The directory will be created + (including missing parents) if necessary. + </para>
I think that it is very valuable to have WAL on a different file system on the same storage device. The idea is that growing data files cannot exhaust the space available for WAL.
How about this:
There is always a <filename>pg_wal</filename> within the data directory.
By default, it is a directory where <productname>PostgreSQL</productname>
places its write-ahead log (<acronym>WAL</acronym>) segment files. If
you create the <acronym>WAL</acronym> location somewhere else using the option
<option>--waldir</option>, <filename>pg_wal</filename> will be created as
a symbolic link pointing to that <acronym>WAL</acronym> location.
If the directory already exists, it must be empty and should not be a mount
point. The directory will be created (including missing parents) if necessary.
Yours,
Laurenz Albe
On Thu, Mar 27, 2025, at 8:40 AM, Theodor Herlo wrote:
Both look great to me! I thought the advantage of having different devices is also to easy I/O load. But I guess you have to decide for yourself what is the best strategy. For me it was important to know what initdb does with ---waldir. And this is nicely explained here.
One more little thing, I stumbled across " should not be a mount point", sounds thirst like it was not good to use the mount point. Maybe: " should not be the mount point itself but a directory in the mount point". It might be only me, you diced.
Laurenz suggestion looks good to me with some suggestions. I agree with
Theodor's addition but I would use the same sentence from the source code that
says 'subdirectory under the mount point'. I would also consider and/or modify
pg_basebackup documentation that has a similar option. My suggestion is that
waldir option in both tools has similar sentences. The term 'WAL location' is
not appropriate because this terminology is already used to indicate the LSN. I
would replace it with WAL directory.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Version 2 Attached
There being two places now (plus doing it manually) I decided to write this
material in the WAL chapter as opposed to application-specific Notes. A
new section seemed warranted.
I expanded upon the material regarding using different file systems and
disks.
I would like to add a similar "why" to the mount point recommendation but
don't know what that would be. Suggestions welcomed.
David J.
Attachments:
v2-0001-doc-Explain-use-of-symbolic-linking-for-pg_wal.patchtext/x-patch; charset=US-ASCII; name=v2-0001-doc-Explain-use-of-symbolic-linking-for-pg_wal.patchDownload+54-24
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
Version 2 Attached
- This option specifies the directory where the write-ahead log - should be stored. + This option specifies the directory in which to store write-ahead log files. + See <xref linkend="wal-internals-relocation"/> for more information.
To be honest, I think the proposed text is a little verbose, and we
could cut it right here. As an example, we don't go into how the
directory must be empty or chnmod settings for --pgdata. I'd also edit
the pg_basebackup changes to match if it were me.
+ the <option>--waldir</option> option. On an existing, but not running, + cluster use operating system commands to move the contents of the + <filename>pg_wal</filename> directory to the new location, remove the + empty directory, and create the symbolic link named <filename>pg_wal</filename> + pointing to the new location.
I found this to be awkward to read, and not as significant a warning
as the existing text... perhaps "For an existing cluster, once the
cluster has been shut down, you can use operating system commands... "
There being two places now (plus doing it manually) I decided to write this material in the WAL chapter as opposed to application-specific Notes. A new section seemed warranted.
+1
I expanded upon the material regarding using different file systems and disks.
I would like to add a similar "why" to the mount point recommendation but don't know what that would be. Suggestions welcomed.
I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).
Robert Treat
https://xzilla.net
On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:Version 2 Attached
- This option specifies the directory where the write-ahead log - should be stored. + This option specifies the directory in which to storewrite-ahead log files.
+ See <xref linkend="wal-internals-relocation"/> for more
information.
To be honest, I think the proposed text is a little verbose, and we
could cut it right here. As an example, we don't go into how the
directory must be empty or chnmod settings for --pgdata. I'd also edit
the pg_basebackup changes to match if it were me.
Don't really see how completeness hurts here, but also not married to it.
+ the <option>--waldir</option> option. On an existing, but not
running,
+ cluster use operating system commands to move the contents of the + <filename>pg_wal</filename> directory to the new location, remove the + empty directory, and create the symbolic link named<filename>pg_wal</filename>
+ pointing to the new location.
I found this to be awkward to read, and not as significant a warning
as the existing text... perhaps "For an existing cluster, once the
cluster has been shut down, you can use operating system commands... "
I'll probably put it back into parentheses to get the desired emphasis.
That did nag at me too as I wrote it.
There being two places now (plus doing it manually) I decided to write
this material in the WAL chapter as opposed to application-specific Notes.
A new section seemed warranted.+1
I expanded upon the material regarding using different file systems and
disks.
I would like to add a similar "why" to the mount point recommendation
but don't know what that would be. Suggestions welcomed.
I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).
If I mount the filesystem on disk2 to: /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there instead of:
"/mnt/disk2/"?
David J.
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:I expanded upon the material regarding using different file systems and disks.
I would like to add a similar "why" to the mount point recommendation but don't know what that would be. Suggestions welcomed.
I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).If I mount the filesystem on disk2 to: /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?
My immediate response to this was "because Postgres won't let you"
which seemed unhelpful, and that I couldn't remember why was pretty
unsatisfying, so I dug around in the source which was unhelpful but
eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS
"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."
Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.
Robert Treat
https://xzilla.net
On Saturday, March 29, 2025, Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:I expanded upon the material regarding using different file systems
and disks.
I would like to add a similar "why" to the mount point recommendation
but don't know what that would be. Suggestions welcomed.
If I mount the filesystem on disk2 to: /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there insteadof: "/mnt/disk2/"?
eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.
html#CREATING-CLUSTER-MOUNT-POINTS"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.
Thank you, I can definitely work that in and it makes sense.
On the topic of verbosity, I found the wording for —pgdata in pg_basebackup
to support the more complete description.
https://www.postgresql.org/docs/current/app-pgbasebackup.html
I believe that at least documenting external side-effects should be
required. I’m less convinced that pre-conditions that will be checked by
the application need to be listed. But for now I’m going to copy
pg_basebackup as my example and at some point might get to doing a survey
and proposing a new standard wording for —waldir and —pgdata descriptions
regarding creation.
The comment regarding absolute paths will remain unwritten.
David J.
On Sat, Mar 29, 2025 at 10:58 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Saturday, March 29, 2025, Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:I expanded upon the material regarding using different file systems and disks.
I would like to add a similar "why" to the mount point recommendation but don't know what that would be. Suggestions welcomed.
If I mount the filesystem on disk2 to: /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.Thank you, I can definitely work that in and it makes sense.
On the topic of verbosity, I found the wording for —pgdata in pg_basebackup to support the more complete description.
https://www.postgresql.org/docs/current/app-pgbasebackup.html
I believe that at least documenting external side-effects should be required. I’m less convinced that pre-conditions that will be checked by the application need to be listed. But for now I’m going to copy pg_basebackup as my example and at some point might get to doing a survey and proposing a new standard wording for —waldir and —pgdata descriptions regarding creation.
The comment regarding absolute paths will remain unwritten.
Works for me.
Robert Treat
https://xzilla.net
On Sat Mar 29, 2025 at 7:26 AM PDT, Robert Treat wrote:
On Thu, Mar 27, 2025 at 5:57 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:On Thu, Mar 27, 2025 at 2:28 PM Robert Treat <rob@xzilla.net> wrote:
On Thu, Mar 27, 2025 at 12:06 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:I expanded upon the material regarding using different file systems and disks.
I would like to add a similar "why" to the mount point recommendation but don't know what that would be. Suggestions welcomed.
I'm not sure I follow what you are asking for... but a non-performance
reason to use a seperate mount point for pg_wal, even if the
underlying storage is the same, would be for something like using
filesystem snapshots to grab contents of the data directory without
grabbing wal (which can be handled separately).If I mount the filesystem on disk2 to: /mnt/disk2
Why do I need to create "/mnt/disk2/wal_files/" and point there instead of: "/mnt/disk2/"?My immediate response to this was "because Postgres won't let you"
which seemed unhelpful, and that I couldn't remember why was pretty
unsatisfying, so I dug around in the source which was unhelpful but
eventually came across this from
https://www.postgresql.org/docs/devel/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.
I remembered an old advice related to lost+found being placed on mount
points, but couldn't remember the specific, so searched and found Tom's
advice for data directory. I feel that the advice would apply to WAL
directory, as well.
From /messages/by-id/12168.1312921709@sss.pgh.pa.us
A couple of reasons for [not placing data diretory at a mount point]
are:
1. Mount-point directories should be owned by root, never by an
unprivileged account such as postgres. IIRC there are good security
reasons for this practice, though I don't recall all the details right
now.
2. Keeping the data directory one level down ensures a clean failure if
the disk is for some reason not mounted when Postgres starts, or goes
offline later. Otherwise, particularly if you're using a start script
that will automatically try an initdb, you might end up with some data
files on the / volume underneath where the mount point should have been.
This is sure to lead to serious problems when the disk does come back
online. There's at least one horror story in our archives from someone
who had an auto-initdb startup script and one day his NFS disk was a few
seconds slow to mount...
Best regards,
Gurjeet
http://Gurje.et
On Sat, 2025-03-29 at 10:26 -0400, Robert Treat wrote:
"Best practice is to create a directory within the mount-point
directory that is owned by the PostgreSQL user, and then create the
data directory within that. This avoids permissions problems,..."Which I do remember having tried to do it directly and the OS
complaining that my mount point wasn't owned by root and/or Postgres
complaining that the xlog dir wasn't owned by Postgres, so I think
this advice probably still holds.
The root directory of a file system, which will be mounted at the
mount point, should be owned by root.
As far as I know, the reason is that it contains a "lost+found"
directory, which is used by file system checks to put orphaned
files. If a non-root user owned the mount point, the user could
remove that directory, which would be a bad idea.
On the other hand, PostgreSQL will protest if the directory isn't
empty...
Yours,
Laurenz Albe
On Sat Mar 29, 2025 at 9:26 AM PDT, Laurenz Albe wrote:
On the other hand, PostgreSQL will protest if the directory isn't
empty...
Specifically, initdb will complain if the directory it's trying to
initialize is not empty.
Best regards,
Gurjeet
http://Gurje.et
On Saturday, March 29, 2025, Gurjeet Singh <gurjeet@singh.im> wrote:
On Sat Mar 29, 2025 at 9:26 AM PDT, Laurenz Albe wrote:
On the other hand, PostgreSQL will protest if the directory isn't
empty...Specifically, initdb will complain if the directory it's trying to
initialize is not empty.
For this patch I intend to just point the reader to the “secondary file
systems” section mentioned above for more considerations. We can improve
that as desired separately.
David J.