[PATCH] minor bug fix for pg_dump --clean
Hello,
When using pg_dump (or pg_restore) with option "--clean", there is some
SQL code to drop every objects at the beginning.
The DROP statement for a view involving circular dependencies is :
CREATE OR REPLACE VIEW [...]
(see commit message of d8c05aff for a much better explanation)
If the view is not in the "public" schema, and the target database is
empty, this statement fails, because the schema hasn't been created yet.
The attached patches are a TAP test which can be used to reproduce the
bug, and a proposed fix. They apply to the master branch.
Best regards,
Frédéric
Attachments:
0002-pg_dump-fix-up-d8c05aff.patchtext/x-patch; charset=UTF-8; name=0002-pg_dump-fix-up-d8c05aff.patchDownload
From 96a334e2411794bdc4648475be286f159daa2484 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 31 Aug 2022 15:57:39 +0200
Subject: [PATCH 2/2] pg_dump: fix up d8c05aff
---
src/bin/pg_dump/pg_backup_archiver.c | 3 ++-
src/bin/pg_dump/pg_dump.c | 4 ++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 233198afc0..bb5031105e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -585,7 +585,8 @@ RestoreArchive(Archive *AHX)
*/
if (strcmp(te->desc, "DEFAULT") == 0 ||
strcmp(te->desc, "DATABASE PROPERTIES") == 0 ||
- strncmp(dropStmt, "CREATE OR REPLACE VIEW", 22) == 0)
+ ( strncmp(dropStmt, "CREATE SCHEMA IF NOT EXISTS", 27) == 0 &&
+ strstr(dropStmt+29, "CREATE OR REPLACE VIEW") ))
appendPQExpBufferStr(ftStmt, dropStmt);
else
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d25709ad5f..8820984d9e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17274,6 +17274,7 @@ dumpRule(Archive *fout, const RuleInfo *rinfo)
char *qtabname;
PGresult *res;
char *tag;
+ char *qnspname;
/* Do nothing in data-only dump */
if (dopt->dataOnly)
@@ -17374,6 +17375,9 @@ dumpRule(Archive *fout, const RuleInfo *rinfo)
*/
PQExpBuffer result;
+ qnspname = pg_strdup(fmtId(tbinfo->dobj.namespace->dobj.name));
+ appendPQExpBuffer(delcmd, "CREATE SCHEMA IF NOT EXISTS %s;\n", qnspname);
+ free(qnspname);
appendPQExpBuffer(delcmd, "CREATE OR REPLACE VIEW %s",
fmtQualifiedDumpable(tbinfo));
result = createDummyViewAsClause(fout, tbinfo);
--
2.30.2
0001-Add-TAP-test-for-pg_restore.patchtext/x-patch; charset=UTF-8; name=0001-Add-TAP-test-for-pg_restore.patchDownload
From 463774feb3b968b577ac082a71fdc2de51c8bac4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 31 Aug 2022 16:03:34 +0200
Subject: [PATCH 1/2] Add TAP test for pg_restore
---
src/bin/pg_dump/t/002_pg_dump.pl | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2873b662fb..ff826e4688 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4042,6 +4042,28 @@ $node->command_ok(
'pg_dumpall: option --exclude-database handles database names with embedded dots'
);
+########################################
+# Test pg_restore with --clean --if-exists
+
+$node->psql('postgres', 'create database postgres2;');
+$node->psql('postgres', 'create database postgres3;');
+
+$node->psql('postgres3',
+ 'CREATE SCHEMA dump_test;'
+ . 'CREATE TABLE dump_test.foo (id INT primary key, payload TEXT);'
+ . 'CREATE VIEW dump_test.babar AS SELECT * FROM dump_test.foo GROUP BY id;',
+ );
+
+$node->run_log(
+ [ 'pg_dump', '-p', "$port", '-Fc', '--no-sync', "--file=$tempdir/clean_if_exists.dump", 'postgres3' ]
+);
+
+$node->command_like(
+ [ 'pg_restore', '-p', "$port", '--clean', '--if-exists', '-d', 'postgres2', "$tempdir/clean_if_exists.dump" ],
+ '/^\s*$/',
+ 'pg_restore should output no warnings on stderr'
+);
+
#########################################
# Test invalid multipart schema names
--
2.30.2
Hi,
Good catch! Here are several points for improvement:
1. pg_dump.c:17380
Maybe better to write simpler
appendPQExpBuffer(delcmd, "CREATE SCHEMA IF NOT EXISTS %s;\n",
tbinfo->dobj.namespace->dobj.name);
because there is a schema name inside the `tbinfo->dobj.namespace->dobj.name
`
2. pg_backup_archiver.c:588
Here are no necessary spaces before and after braces, and no spaces around
the '+' sign.
( strncmp(dropStmt, "CREATE SCHEMA IF NOT EXISTS", 27) == 0 &&
strstr(dropStmt+29, "CREATE OR REPLACE VIEW") ))
Best regards,
Viktoria Shepard
чт, 1 сент. 2022 г. в 12:13, Frédéric Yhuel <frederic.yhuel@dalibo.com>:
Show quoted text
Hello,
When using pg_dump (or pg_restore) with option "--clean", there is some
SQL code to drop every objects at the beginning.The DROP statement for a view involving circular dependencies is :
CREATE OR REPLACE VIEW [...]
(see commit message of d8c05aff for a much better explanation)
If the view is not in the "public" schema, and the target database is
empty, this statement fails, because the schema hasn't been created yet.The attached patches are a TAP test which can be used to reproduce the
bug, and a proposed fix. They apply to the master branch.Best regards,
Frédéric
=?UTF-8?Q?Fr=c3=a9d=c3=a9ric_Yhuel?= <frederic.yhuel@dalibo.com> writes:
When using pg_dump (or pg_restore) with option "--clean", there is some
SQL code to drop every objects at the beginning.
Yup ...
The DROP statement for a view involving circular dependencies is :
CREATE OR REPLACE VIEW [...]
(see commit message of d8c05aff for a much better explanation)
If the view is not in the "public" schema, and the target database is
empty, this statement fails, because the schema hasn't been created yet.
The attached patches are a TAP test which can be used to reproduce the
bug, and a proposed fix. They apply to the master branch.
I am disinclined to accept this as a valid bug, because there's never
been any guarantee that a --clean script would execute error-free in
a database that doesn't match what the source database contains.
(The pg_dump documentation used to say that in so many words.
I see that whoever added the --if-exists option was under the
fond illusion that that fixes all cases, which it surely does not.
We need to back off the promises a bit there.)
An example of a case that won't execute error-free is if the view
having a circular dependency includes a column of a non-built-in
data type. If you try to run that in an empty database, you'll
get an error from the CREATE OR REPLACE VIEW's reference to that
data type. For instance, if I adjust your test case to make
the "payload" column be of type hstore, I get something like
psql:dumpresult.sql:22: ERROR: type "public.hstore" does not exist
LINE 4: NULL::public.hstore AS payload;
^
The same type of failure occurs for user-defined functions and
operators that use a non-built-in type, and I'm sure there are
more cases in the same vein. But it gets *really* messy if
the target database isn't completely empty, but contains objects
with different properties than the dump script expects; for example,
if the view being discussed here exists with a different column set
than the script thinks, or if the dependency chains aren't all the
same.
If this fix were cleaner I might be inclined to accept it anyway,
but it's not very clean at all --- for example, it's far from
obvious to me what are the side-effects of changing the filter
in RestoreArchive like that. Nor am I sure that the schema
you want to create is guaranteed to get dropped again later in
every use-case.
So I think mainly what we ought to do here is to adjust the
documentation to make it clearer that --clean is not guaranteed
to work without errors unless the target database has the same
set of objects as the source. --if-exists can reduce the set
of error cases, but not eliminate it. Possibly we should be
more enthusiastic about recommending --create --clean (ie,
drop and recreate the whole database) instead.
regards, tom lane
On 10/24/22 03:01, Tom Lane wrote:
=?UTF-8?Q?Fr=c3=a9d=c3=a9ric_Yhuel?= <frederic.yhuel@dalibo.com> writes:
When using pg_dump (or pg_restore) with option "--clean", there is some
SQL code to drop every objects at the beginning.Yup ...
The DROP statement for a view involving circular dependencies is :
CREATE OR REPLACE VIEW [...]
(see commit message of d8c05aff for a much better explanation)
If the view is not in the "public" schema, and the target database is
empty, this statement fails, because the schema hasn't been created yet.
The attached patches are a TAP test which can be used to reproduce the
bug, and a proposed fix. They apply to the master branch.I am disinclined to accept this as a valid bug, because there's never
been any guarantee that a --clean script would execute error-free in
a database that doesn't match what the source database contains.(The pg_dump documentation used to say that in so many words.
I see that whoever added the --if-exists option was under the
fond illusion that that fixes all cases, which it surely does not.
We need to back off the promises a bit there.)An example of a case that won't execute error-free is if the view
having a circular dependency includes a column of a non-built-in
data type. If you try to run that in an empty database, you'll
get an error from the CREATE OR REPLACE VIEW's reference to that
data type. For instance, if I adjust your test case to make
the "payload" column be of type hstore, I get something likepsql:dumpresult.sql:22: ERROR: type "public.hstore" does not exist
LINE 4: NULL::public.hstore AS payload;
^The same type of failure occurs for user-defined functions and
operators that use a non-built-in type, and I'm sure there are
more cases in the same vein. But it gets *really* messy if
the target database isn't completely empty, but contains objects
with different properties than the dump script expects; for example,
if the view being discussed here exists with a different column set
than the script thinks, or if the dependency chains aren't all the
same.If this fix were cleaner I might be inclined to accept it anyway,
but it's not very clean at all --- for example, it's far from
obvious to me what are the side-effects of changing the filter
in RestoreArchive like that. Nor am I sure that the schema
you want to create is guaranteed to get dropped again later in
every use-case.
Hi Tom, Viktoria,
Thank you for your review Viktoria!
Thank you for this detailed explanation, Tom! I didn't have great hope
for this patch. I thought that the TAP test could be accepted, but now I
can see that it is clearly useless.
So I think mainly what we ought to do here is to adjust the
documentation to make it clearer that --clean is not guaranteed
to work without errors unless the target database has the same
set of objects as the source. --if-exists can reduce the set
of error cases, but not eliminate it. Possibly we should be
more enthusiastic about recommending --create --clean (ie,
drop and recreate the whole database) instead.
I beleive a documentation patch would be useful, indeed.
Best regards,
Frédéric
FYI, this was improved in a recent commit:
commit 75af0f401f
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Sep 29 13:13:54 2023 -0400
Doc: improve description of dump/restore's --clean and --if-exists.
Try to make these option descriptions a little clearer for novices.
Per gripe from Attila Gulyás.
Discussion: /messages/by-id/169590536647.3727336.11070254203649648453@wrigleys.postgresql.org
---------------------------------------------------------------------------
On Mon, Oct 24, 2022 at 09:02:46AM +0200, Frédéric Yhuel wrote:
On 10/24/22 03:01, Tom Lane wrote:
=?UTF-8?Q?Fr=c3=a9d=c3=a9ric_Yhuel?= <frederic.yhuel@dalibo.com> writes:
When using pg_dump (or pg_restore) with option "--clean", there is some
SQL code to drop every objects at the beginning.Yup ...
The DROP statement for a view involving circular dependencies is :
CREATE OR REPLACE VIEW [...]
(see commit message of d8c05aff for a much better explanation)
If the view is not in the "public" schema, and the target database is
empty, this statement fails, because the schema hasn't been created yet.
The attached patches are a TAP test which can be used to reproduce the
bug, and a proposed fix. They apply to the master branch.I am disinclined to accept this as a valid bug, because there's never
been any guarantee that a --clean script would execute error-free in
a database that doesn't match what the source database contains.(The pg_dump documentation used to say that in so many words.
I see that whoever added the --if-exists option was under the
fond illusion that that fixes all cases, which it surely does not.
We need to back off the promises a bit there.)An example of a case that won't execute error-free is if the view
having a circular dependency includes a column of a non-built-in
data type. If you try to run that in an empty database, you'll
get an error from the CREATE OR REPLACE VIEW's reference to that
data type. For instance, if I adjust your test case to make
the "payload" column be of type hstore, I get something likepsql:dumpresult.sql:22: ERROR: type "public.hstore" does not exist
LINE 4: NULL::public.hstore AS payload;
^The same type of failure occurs for user-defined functions and
operators that use a non-built-in type, and I'm sure there are
more cases in the same vein. But it gets *really* messy if
the target database isn't completely empty, but contains objects
with different properties than the dump script expects; for example,
if the view being discussed here exists with a different column set
than the script thinks, or if the dependency chains aren't all the
same.If this fix were cleaner I might be inclined to accept it anyway,
but it's not very clean at all --- for example, it's far from
obvious to me what are the side-effects of changing the filter
in RestoreArchive like that. Nor am I sure that the schema
you want to create is guaranteed to get dropped again later in
every use-case.Hi Tom, Viktoria,
Thank you for your review Viktoria!
Thank you for this detailed explanation, Tom! I didn't have great hope for
this patch. I thought that the TAP test could be accepted, but now I can see
that it is clearly useless.So I think mainly what we ought to do here is to adjust the
documentation to make it clearer that --clean is not guaranteed
to work without errors unless the target database has the same
set of objects as the source. --if-exists can reduce the set
of error cases, but not eliminate it. Possibly we should be
more enthusiastic about recommending --create --clean (ie,
drop and recreate the whole database) instead.I beleive a documentation patch would be useful, indeed.
Best regards,
Frédéric
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.