Why pg_dump overwrites dump file?

Started by Chao Li, 3 months ago, 5 messages
#1 Chao Li
li.evan.chao@gmail.com

Hi Hackers,

I noticed this problem while testing the other patch.

When I do a custom-format dump and the target file already exists, pg_dump just goes ahead and overwrites the existing file; however, when I do a directory-format dump and the target directory exists, pg_dump fails with the error “directory xxx is not empty”.

Behaviors of the two types of pg_dump are inconsistent, I wonder if that’s by design?
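
To make the difference concrete, here is a minimal C sketch, not pg_dump's actual code. It assumes the custom-format output is opened with an ordinary write-mode fopen(), which truncates an existing file silently, and contrasts that with an exclusive create that refuses to touch an existing file. Both function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical: plain write mode, truncates an existing file without complaint. */
FILE *
open_output_clobber(const char *path)
{
    assert(path != NULL);
    return fopen(path, "wb");
}

/*
 * Hypothetical: refuse to overwrite.  O_EXCL makes open() fail with
 * errno == EEXIST when the path already exists.
 */
FILE *
open_output_noclobber(const char *path)
{
    int   fd;
    FILE *fp;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return NULL;            /* errno is left as set by open() */

    fp = fdopen(fd, "wb");
    if (fp == NULL)
        close(fd);              /* do not leak the descriptor */
    return fp;
}
```

With the second variant, a dump to an existing file would fail up front, which is closer to what the directory format does today when the directory is not empty.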

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#2 Daniel Gustafsson
daniel@yesql.se
In reply to: Chao Li (#1)
1 attachment(s)
Re: Why pg_dump overwrites dump file?

On 14 Oct 2025, at 07:42, Chao Li <li.evan.chao@gmail.com> wrote:

> Behaviors of the two types of pg_dump are inconsistent, I wonder if that’s by design?

It does admittedly seem odd that --file works differently for files and
directories, but at this point it might be behavior that users expect and
changing it might break current use cases? Not sure what the best option is
here.

Another inconsistency is that the documentation states this:

"In this case the directory is created by pg_dump and must not exist
before."

...which isn't true, since it will happily reuse an existing directory as long as
it's empty. The comment in the code makes the intention clear:

/*
 * create_or_open_dir
 *
 * This will create a new directory with the given dirname. If there is
 * already an empty directory with that name, then use it.
 */
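
As a rough illustration of what that comment describes, the logic could look like the POSIX sketch below. This is not the PostgreSQL implementation: the function name create_or_reuse_empty_dir, the 0700 mode, and the choice of ENOTEMPTY as the reported error are all assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 0 if dirname was created or already
 * existed as an empty directory, -1 otherwise with errno set
 * (ENOTEMPTY when the directory contains any entry).
 */
int
create_or_reuse_empty_dir(const char *dirname)
{
    DIR           *dir;
    struct dirent *entry;
    int            nonempty = 0;

    assert(dirname != NULL);

    if (mkdir(dirname, 0700) == 0)
        return 0;               /* freshly created */
    if (errno != EEXIST)
        return -1;              /* e.g. missing parent or no permission */

    /* The path exists: accept it only if it is an empty directory. */
    dir = opendir(dirname);
    if (dir == NULL)
        return -1;              /* not a directory, or not readable */

    while ((entry = readdir(dir)) != NULL)
    {
        if (strcmp(entry->d_name, ".") != 0 &&
            strcmp(entry->d_name, "..") != 0)
        {
            nonempty = 1;
            break;
        }
    }
    closedir(dir);

    if (nonempty)
    {
        errno = ENOTEMPTY;
        return -1;
    }
    return 0;
}
```

Calling it twice on the same fresh path succeeds both times, which is exactly the case the current documentation wording rules out.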

So regardless, it seems we should do something like the attached at least.

--
Daniel Gustafsson

Attachments:

pg_dump_emptydir.diff (application/octet-stream)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index fd4ecf01a0a..5ac3f3e8510 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -297,8 +297,8 @@ PostgreSQL documentation
         file based output formats, in which case the standard output is used.
         It must be given for the directory output format however, where it
         specifies the target directory instead of a file. In this case the
-        directory is created by <command>pg_dump</command> and must not exist
-        before.
+        directory is created by <command>pg_dump</command> unless the directory
+        exist and is empty.
        </para>
       </listitem>
      </varlistentry>

#3 Bruce Momjian
bruce@momjian.us
In reply to: Daniel Gustafsson (#2)
Re: Why pg_dump overwrites dump file?

On Tue, Oct 14, 2025 at 10:44:37AM +0200, Daniel Gustafsson wrote:

> Another inconsistency is that the documentation states this:
>
> "In this case the directory is created by pg_dump and must not exist
> before."
>
> ...which isn't true, since it will happily reuse an existing directory as long as
> it's empty. The comment in the code makes the intention clear:
>
> /*
>  * create_or_open_dir
>  *
>  * This will create a new directory with the given dirname. If there is
>  * already an empty directory with that name, then use it.
>  */
>
> So regardless, it seems we should do something like the attached at least.
>
> --
> Daniel Gustafsson
>
> diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
> index fd4ecf01a0a..5ac3f3e8510 100644
> --- a/doc/src/sgml/ref/pg_dump.sgml
> +++ b/doc/src/sgml/ref/pg_dump.sgml
> @@ -297,8 +297,8 @@ PostgreSQL documentation
>          file based output formats, in which case the standard output is used.
>          It must be given for the directory output format however, where it
>          specifies the target directory instead of a file. In this case the
> -        directory is created by <command>pg_dump</command> and must not exist
> -        before.
> +        directory is created by <command>pg_dump</command> unless the directory
> +        exist and is empty.
>         </para>
>        </listitem>
>       </varlistentry>

Uh, Daniel, are you going to make this change?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#4 Daniel Gustafsson
daniel@yesql.se
In reply to: Bruce Momjian (#3)
Re: Why pg_dump overwrites dump file?

On 29 Oct 2025, at 20:47, Bruce Momjian <bruce@momjian.us> wrote:

> On Tue, Oct 14, 2025 at 10:44:37AM +0200, Daniel Gustafsson wrote:
>
>> Another inconsistency is that the documentation states this:
>>
>> "In this case the directory is created by pg_dump and must not exist
>> before."
>>
>> ...which isn't true, since it will happily reuse an existing directory as long as
>> it's empty. The comment in the code makes the intention clear:
>>
>> /*
>>  * create_or_open_dir
>>  *
>>  * This will create a new directory with the given dirname. If there is
>>  * already an empty directory with that name, then use it.
>>  */
>>
>> So regardless, it seems we should do something like the attached at least.
>>
>> --
>> Daniel Gustafsson
>>
>> diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
>> index fd4ecf01a0a..5ac3f3e8510 100644
>> --- a/doc/src/sgml/ref/pg_dump.sgml
>> +++ b/doc/src/sgml/ref/pg_dump.sgml
>> @@ -297,8 +297,8 @@ PostgreSQL documentation
>>          file based output formats, in which case the standard output is used.
>>          It must be given for the directory output format however, where it
>>          specifies the target directory instead of a file. In this case the
>> -        directory is created by <command>pg_dump</command> and must not exist
>> -        before.
>> +        directory is created by <command>pg_dump</command> unless the directory
>> +        exist and is empty.
>>         </para>
>>        </listitem>
>>       </varlistentry>
>
> Uh, Daniel, are you going to make this change?

Yes, I had left it on my TODO list for after my vacation (i.e. next week) to leave time for the OP (or someone else) to propose something different.

./daniel

#5 Bruce Momjian
bruce@momjian.us
In reply to: Daniel Gustafsson (#4)
Re: Why pg_dump overwrites dump file?

On Wed, Oct 29, 2025 at 10:12:02PM +0200, Daniel Gustafsson wrote:

> On 29 Oct 2025, at 20:47, Bruce Momjian <bruce@momjian.us> wrote:
>
>> On Tue, Oct 14, 2025 at 10:44:37AM +0200, Daniel Gustafsson wrote:
>>
>>> Another inconsistency is that the documentation states this:
>>>
>>> "In this case the directory is created by pg_dump and must not exist
>>> before."
>>>
>>> ...which isn't true, since it will happily reuse an existing directory as long as
>>> it's empty. The comment in the code makes the intention clear:
>>>
>>> /*
>>>  * create_or_open_dir
>>>  *
>>>  * This will create a new directory with the given dirname. If there is
>>>  * already an empty directory with that name, then use it.
>>>  */
>>>
>>> So regardless, it seems we should do something like the attached at least.
>
> Yes, I had left it on my TODO list for after my vacation (i.e. next week) to leave time for the OP (or someone else) to propose something different.

Okay, just checking, thanks. My GUC random_page_cost doc patch is
in a similar state.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.