[Proposal] vacuumdb --schema only

Started by Gilles Daroldalmost 4 years ago33 messages

gilles@migops.com

almost 4 years ago

1 attachment(s)

Hi,

When we want to vacuum and/or analyze all tables in a dedicated schema,
let's say pg_catalog for example, there is no easy way to do that. The
VACUUM command doesn't allow it so we have to use \gexec or a SQL script
to do that. We have an external command vacuumdb that could be used to
simplify this task. For example the following command can be used to
clean all tables stored in the pg_catalog schema:

vacuumdb --schema pg_catalog -d foo

The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.

Common use cases are an application that creates lot of temporary
objects then drop them which can bloat a lot the catalog or which have
heavy work in some schemas only. Of course the good practice is to find
the bloated tables and execute VACUUM on each table but if most of the
tables in the schema are regularly bloated the use of the vacuumdb
--schema script can save time.

I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that and also
because I'm not really in favor of such change. But if there is interest
in improving these commands I will be pleased to do that, with the
syntax suggested.

Best regards,

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..e4f6d32ba9 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,24 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+      <arg choice="plain"><option>-n</option></arg>
+      <arg choice="plain"><option>--schema</option></arg>
+     </group>
+     <replaceable>schema</replaceable>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +262,17 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +648,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..4c4f47e32a 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..69b470598f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,10 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
-
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +94,7 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +126,7 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -140,7 +142,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -202,6 +204,9 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':                       /* include schema(s) */
+				simple_string_list_append(&schemas, optarg);
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +346,16 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set in a fqdn table name.
+	 */
+	if (tbl_count && schemas.head != NULL)
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,6 +367,13 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
+
+		if (schemas.head != NULL)
+		{
+			pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
 		if (tables.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
@@ -387,6 +409,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
+									&schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -395,6 +418,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
+								&schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -420,6 +444,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
+					SimpleStringList *schemas,
 					SimpleStringList *tables,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
@@ -436,6 +461,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			ntups;
 	bool		failed = false;
 	bool		tables_listed = false;
+	bool		schemas_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -618,6 +644,24 @@ vacuum_one_database(ConnParams *cparams,
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
+
+		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		{
+			if (!schemas_listed) {
+				appendPQExpBufferStr(&catalog_query,
+									 " AND pg_catalog.quote_ident(ns.nspname) IN (");
+				schemas_listed = true;
+			}
+			else
+				appendPQExpBufferStr(&catalog_query, ", ");
+
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+
+		}
+		/* Finish formatting schema filter */
+		if (schemas_listed)
+			appendPQExpBufferStr(&catalog_query, ")\n");
 	}
 
 	/*
@@ -814,6 +858,7 @@ vacuum_all_databases(ConnParams *cparams,
 				vacuum_one_database(cparams, vacopts,
 									stage,
 									NULL,
+									NULL,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -828,6 +873,7 @@ vacuum_all_databases(ConnParams *cparams,
 			vacuum_one_database(cparams, vacopts,
 								ANALYZE_NO_STAGE,
 								NULL,
+								NULL,
 								concurrentCons,
 								progname, echo, quiet);
 		}
@@ -1027,6 +1073,7 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

Justin Pryzby

pryzby@telsasoft.com

almost 4 years ago

In reply to: Gilles Darold (#1)

Re: [Proposal] vacuumdb --schema only

On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:

The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.

Yes, thanks.

I suggest there should also be an --exclude-schema.

I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that

I think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.

+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set in a fqdn table name.

set *to*

--
Justin

Dinesh Chemuduru

dinesh.kumar@migops.com

almost 4 years ago

In reply to: Gilles Darold (#1)

Re: [Proposal] vacuumdb --schema only

On Fri, 4 Mar 2022 at 14:41, Gilles Darold <gilles@migops.com> wrote:

Hi,

When we want to vacuum and/or analyze all tables in a dedicated schema,
let's say pg_catalog for example, there is no easy way to do that. The
VACUUM command doesn't allow it so we have to use \gexec or a SQL script
to do that. We have an external command vacuumdb that could be used to
simplify this task. For example the following command can be used to
clean all tables stored in the pg_catalog schema:

vacuumdb --schema pg_catalog -d foo

+1
This gives much better flexibility to users.

Show quoted text

The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.

Common use cases are an application that creates lot of temporary
objects then drop them which can bloat a lot the catalog or which have
heavy work in some schemas only. Of course the good practice is to find
the bloated tables and execute VACUUM on each table but if most of the
tables in the schema are regularly bloated the use of the vacuumdb
--schema script can save time.

I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that and also
because I'm not really in favor of such change. But if there is interest
in improving these commands I will be pleased to do that, with the
syntax suggested.

Best regards,

--
Gilles Darold

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#2)

Re: [Proposal] vacuumdb --schema only

Le 04/03/2022 à 11:56, Justin Pryzby a écrit :

On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:

The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.

Yes, thanks.

I suggest there should also be an --exclude-schema.

Ok, I will add it too.

I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that

I think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.

Yes this is what I've though, something a la EXPLAIN, for example :
"VACUUM (ANALYZE, SCHEMA foo)" but this is a change in the VACUUM syntax
that needs to keep the compatibility with the current syntax. We will
have two syntax something like "VACUUM ANALYZE FULL dbname" and "VACUUM
(ANALYZE, FULL) dbname". The other syntax "problem" is to be able to use
multiple schema values in the VACUUM command, perhaps "VACUUM (ANALYZE,
SCHEMA (foo,bar))".

+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set in a fqdn table name.

set *to*

Thanks, will be fixed in next patch version.

--
Gilles Darold

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#2)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 04/03/2022 à 11:56, Justin Pryzby a écrit :

On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:

The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.

Yes, thanks.

I suggest there should also be an --exclude-schema.

I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that

I think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.
+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set in a fqdn table name.
set *to*

Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.

I will add this patch to the commitfest unless there is cons about
adding these options.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v2.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v2.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..378328afb3 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-n</option></arg>
+	      <arg choice="plain"><option>--schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-N</option></arg>
+	      <arg choice="plain"><option>--exclude-schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..4c4f47e32a 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..3dca22e1c8 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,12 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+static bool schema_exclusion = false;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +96,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +129,8 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};
+	SimpleStringList schemas_excluded = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -140,7 +146,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -202,6 +208,13 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':                       /* include schema(s) */
+				simple_string_list_append(&schemas, optarg);
+				break;
+			case 'N':			/* exclude schema(s) */
+				simple_string_list_append(&schemas_excluded, optarg);
+				schema_exclusion = true;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +354,26 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && (schemas.head != NULL || schemas_excluded.head != NULL))
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
+	/*
+	 * Options -n | --schema and -N | --exclude-schema
+	 * can not be used together
+	 */
+	if (schemas.head != NULL && schemas_excluded.head != NULL)
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,6 +385,19 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
+
+		if (schemas.head != NULL)
+		{
+			pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
+		if (schemas_excluded.head != NULL)
+		{
+			pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
 		if (tables.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
@@ -387,6 +433,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
+									(schema_exclusion) ? &schemas_excluded : &schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -395,6 +442,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
+								(schema_exclusion) ? &schemas_excluded : &schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -420,6 +468,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
+					SimpleStringList *schemas,
 					SimpleStringList *tables,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
@@ -436,6 +485,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			ntups;
 	bool		failed = false;
 	bool		tables_listed = false;
+	bool		schemas_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -618,6 +668,29 @@ vacuum_one_database(ConnParams *cparams,
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
+
+		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		{
+			if (!schemas_listed) {
+				appendPQExpBufferStr(&catalog_query,
+									 " AND pg_catalog.quote_ident(ns.nspname)");
+				if (schema_exclusion)
+					appendPQExpBufferStr(&catalog_query, " NOT IN (");
+				else
+					appendPQExpBufferStr(&catalog_query, " IN (");
+
+				schemas_listed = true;
+			}
+			else
+				appendPQExpBufferStr(&catalog_query, ", ");
+
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+
+		}
+		/* Finish formatting schema filter */
+		if (schemas_listed)
+			appendPQExpBufferStr(&catalog_query, ")\n");
 	}
 
 	/*
@@ -814,6 +887,7 @@ vacuum_all_databases(ConnParams *cparams,
 				vacuum_one_database(cparams, vacopts,
 									stage,
 									NULL,
+									NULL,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -828,6 +902,7 @@ vacuum_all_databases(ConnParams *cparams,
 			vacuum_one_database(cparams, vacopts,
 								ANALYZE_NO_STAGE,
 								NULL,
+								NULL,
 								concurrentCons,
 								progname, echo, quiet);
 		}
@@ -1027,6 +1102,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

Justin Pryzby

pryzby@telsasoft.com

almost 4 years ago

In reply to: Gilles Darold (#5)

3 attachment(s)

Re: [Proposal] vacuumdb --schema only

On Sun, Mar 06, 2022 at 09:39:37AM +0100, Gilles Darold wrote:

Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.

+ pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");

and and

It's odd that schema_exclusion is a global var, but schemas/excluded are not.

Also, it seems unnecessary to have two schemas vars, since they can't be used
together. Maybe there's a better way than what I did in 003.

+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)

It's preferred to write cell != NULL

+ bool schemas_listed = false;

...

+		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		{
+			if (!schemas_listed) {
+				appendPQExpBufferStr(&catalog_query,
+									 " AND pg_catalog.quote_ident(ns.nspname)");
+				if (schema_exclusion)
+					appendPQExpBufferStr(&catalog_query, " NOT IN (");
+				else
+					appendPQExpBufferStr(&catalog_query, " IN (");
+
+				schemas_listed = true;
+			}
+			else
+				appendPQExpBufferStr(&catalog_query, ", ");
+
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+
+		}
+		/* Finish formatting schema filter */
+		if (schemas_listed)
+			appendPQExpBufferStr(&catalog_query, ")\n");
}

Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.

--
Justin

Attachments:

0001-vacuumdb-schema-only.patchtext/x-diff; charset=iso-8859-1Download

From 9f1b7f2fb0849a810dc1fa5c1c03d5ff4e2e7d55 Mon Sep 17 00:00:00 2001
From: Gilles Darold <gilles@migops.com>
Date: Sun, 6 Mar 2022 09:39:37 +0100
Subject: [PATCH 1/3] vacuumdb --schema only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Le 04/03/2022 Ã  11:56, Justin Pryzby a Ã©critÂ :
> On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:
>> The attached patch implements that. Option -n | --schema can be used
>> multiple time and can not be used together with options -a or -t.
> Yes, thanks.
>
> I suggest there should also be an --exclude-schema.
>
>> I do not propose to extend the VACUUM and ANALYZE commands because their
>> current syntax doesn't allow me to see an easy way to do that
> I think this would be easy with the parenthesized syntax.
> I'm not suggesting to do it there, though.
>
>> +	/*
>> +	 * When filtereing on schema name, filter by table is not allowed.
>> +	 * The schema name can already be set in a fqdn table name.
> set *to*
>

Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.

I will add this patch to the commitfest unless there is cons about
adding these options.

--
Gilles Darold
---
 doc/src/sgml/ref/vacuumdb.sgml        | 64 ++++++++++++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl     |  9 +++
 src/bin/scripts/t/101_vacuumdb_all.pl |  3 +
 src/bin/scripts/vacuumdb.c            | 79 ++++++++++++++++++++++++++-
 4 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cbc..378328afb3d 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-n</option></arg>
+	      <arg choice="plain"><option>--schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-N</option></arg>
+	      <arg choice="plain"><option>--exclude-schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c16..4c4f47e32a4 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf4117671..b122c995b15 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd392..3dca22e1c88 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,12 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+static bool schema_exclusion = false;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +96,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +129,8 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};
+	SimpleStringList schemas_excluded = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -140,7 +146,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -202,6 +208,13 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':                       /* include schema(s) */
+				simple_string_list_append(&schemas, optarg);
+				break;
+			case 'N':			/* exclude schema(s) */
+				simple_string_list_append(&schemas_excluded, optarg);
+				schema_exclusion = true;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +354,26 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && (schemas.head != NULL || schemas_excluded.head != NULL))
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
+	/*
+	 * Options -n | --schema and -N | --exclude-schema
+	 * can not be used together
+	 */
+	if (schemas.head != NULL && schemas_excluded.head != NULL)
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,6 +385,19 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
+
+		if (schemas.head != NULL)
+		{
+			pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
+		if (schemas_excluded.head != NULL)
+		{
+			pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
 		if (tables.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
@@ -387,6 +433,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
+									(schema_exclusion) ? &schemas_excluded : &schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -395,6 +442,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
+								(schema_exclusion) ? &schemas_excluded : &schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -420,6 +468,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
+					SimpleStringList *schemas,
 					SimpleStringList *tables,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
@@ -436,6 +485,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			ntups;
 	bool		failed = false;
 	bool		tables_listed = false;
+	bool		schemas_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -618,6 +668,29 @@ vacuum_one_database(ConnParams *cparams,
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
+
+		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		{
+			if (!schemas_listed) {
+				appendPQExpBufferStr(&catalog_query,
+									 " AND pg_catalog.quote_ident(ns.nspname)");
+				if (schema_exclusion)
+					appendPQExpBufferStr(&catalog_query, " NOT IN (");
+				else
+					appendPQExpBufferStr(&catalog_query, " IN (");
+
+				schemas_listed = true;
+			}
+			else
+				appendPQExpBufferStr(&catalog_query, ", ");
+
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+
+		}
+		/* Finish formatting schema filter */
+		if (schemas_listed)
+			appendPQExpBufferStr(&catalog_query, ")\n");
 	}
 
 	/*
@@ -814,6 +887,7 @@ vacuum_all_databases(ConnParams *cparams,
 				vacuum_one_database(cparams, vacopts,
 									stage,
 									NULL,
+									NULL,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -828,6 +902,7 @@ vacuum_all_databases(ConnParams *cparams,
 			vacuum_one_database(cparams, vacopts,
 								ANALYZE_NO_STAGE,
 								NULL,
+								NULL,
 								concurrentCons,
 								progname, echo, quiet);
 		}
@@ -1027,6 +1102,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
-- 
2.17.1

0002-f1.patchtext/x-diff; charset=us-asciiDownload

From 14d518ae31e0110a4325f922882955ead554ca64 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 6 Mar 2022 08:18:20 -0600
Subject: [PATCH 2/3] f1

---
 src/bin/scripts/vacuumdb.c | 48 ++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 3dca22e1c88..f76d8472c1f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -485,7 +485,6 @@ vacuum_one_database(ConnParams *cparams,
 	int			ntups;
 	bool		failed = false;
 	bool		tables_listed = false;
-	bool		schemas_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -655,42 +654,41 @@ vacuum_one_database(ConnParams *cparams,
 	if (tables_listed)
 		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
 							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
-
-	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
-	 */
-	if (!tables_listed)
+	else
 	{
+		/*
+		 * If no tables were listed, filter for the relevant relation types.  If
+		 * tables were given via --table, don't bother filtering by relation type.
+		 * Instead, let the server decide whether a given relation can be
+		 * processed in which case the user will know about it.
+		 */
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
 
-		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		if (schemas != NULL && schemas->head != NULL)
 		{
-			if (!schemas_listed) {
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relnamespace");
+			if (schema_exclusion)
 				appendPQExpBufferStr(&catalog_query,
-									 " AND pg_catalog.quote_ident(ns.nspname)");
-				if (schema_exclusion)
-					appendPQExpBufferStr(&catalog_query, " NOT IN (");
-				else
-					appendPQExpBufferStr(&catalog_query, " IN (");
-
-				schemas_listed = true;
-			}
+									" OPERATOR(pg_catalog.!=) ALL (ARRAY[");
 			else
-				appendPQExpBufferStr(&catalog_query, ", ");
+				appendPQExpBufferStr(&catalog_query,
+									" OPERATOR(pg_catalog.=) ANY (ARRAY[");
 
-			appendStringLiteralConn(&catalog_query, cell->val, conn);
-			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+			for (cell = schemas->head; cell != NULL; cell = cell->next)
+			{
+				appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+				if (cell->next != NULL)
+					appendPQExpBufferStr(&catalog_query, ", ");
+			}
 
+			/* Finish formatting schema filter */
+			appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
 		}
-		/* Finish formatting schema filter */
-		if (schemas_listed)
-			appendPQExpBufferStr(&catalog_query, ")\n");
 	}
 
 	/*
-- 
2.17.1

0003-f2.patchtext/x-diff; charset=us-asciiDownload

From 139a9cb7acd969848ee55fc4c08619cff4376c66 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 6 Mar 2022 09:02:13 -0600
Subject: [PATCH 3/3] f2

---
 src/bin/scripts/vacuumdb.c | 51 +++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index f76d8472c1f..9ed2b95d7a5 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,7 +46,7 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
-static bool schema_exclusion = false;
+enum trivalue schema_is_exclude = TRI_DEFAULT;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
@@ -130,7 +130,6 @@ main(int argc, char *argv[])
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 	SimpleStringList schemas = {NULL, NULL};
-	SimpleStringList schemas_excluded = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -209,11 +208,24 @@ main(int argc, char *argv[])
 					exit(1);
 				break;
 			case 'n':                       /* include schema(s) */
+				if (schema_is_exclude == TRI_YES)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
 				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_NO;
 				break;
 			case 'N':			/* exclude schema(s) */
-				simple_string_list_append(&schemas_excluded, optarg);
-				schema_exclusion = true;
+				if (schema_is_exclude == TRI_NO)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_YES;
 				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
@@ -358,22 +370,12 @@ main(int argc, char *argv[])
 	 * When filtereing on schema name, filter by table is not allowed.
 	 * The schema name can already be set to a fqdn table name.
 	 */
-	if (tbl_count && (schemas.head != NULL || schemas_excluded.head != NULL))
+	if (tbl_count && (schemas.head != NULL))
 	{
 		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
 		exit(1);
 	}
 
-	/*
-	 * Options -n | --schema and -N | --exclude-schema
-	 * can not be used together
-	 */
-	if (schemas.head != NULL && schemas_excluded.head != NULL)
-	{
-		pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");
-		exit(1);
-	}
-
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -388,13 +390,10 @@ main(int argc, char *argv[])
 
 		if (schemas.head != NULL)
 		{
-			pg_log_error("cannot vacuum specific schema(s) in all databases");
-			exit(1);
-		}
-
-		if (schemas_excluded.head != NULL)
-		{
-			pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			if (schema_is_exclude == TRI_YES)
+				pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_log_error("cannot vacuum specific schema(s) in all databases");
 			exit(1);
 		}
 
@@ -433,7 +432,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									(schema_exclusion) ? &schemas_excluded : &schemas,
+									&schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -442,7 +441,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								(schema_exclusion) ? &schemas_excluded : &schemas,
+								&schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -671,10 +670,10 @@ vacuum_one_database(ConnParams *cparams,
 		{
 			appendPQExpBufferStr(&catalog_query,
 								 " AND c.relnamespace");
-			if (schema_exclusion)
+			if (schema_is_exclude == TRI_YES)
 				appendPQExpBufferStr(&catalog_query,
 									" OPERATOR(pg_catalog.!=) ALL (ARRAY[");
-			else
+			else if (schema_is_exclude == TRI_NO)
 				appendPQExpBufferStr(&catalog_query,
 									" OPERATOR(pg_catalog.=) ANY (ARRAY[");
 
-- 
2.17.1

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#6)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 06/03/2022 à 16:04, Justin Pryzby a écrit :

On Sun, Mar 06, 2022 at 09:39:37AM +0100, Gilles Darold wrote:

Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.

+ pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");

and and

It's odd that schema_exclusion is a global var, but schemas/excluded are not.

Also, it seems unnecessary to have two schemas vars, since they can't be used
together. Maybe there's a better way than what I did in 003.

+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)

It's preferred to write cell != NULL

+ bool schemas_listed = false;

...
+		for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
+		{
+			if (!schemas_listed) {
+				appendPQExpBufferStr(&catalog_query,
+									 " AND pg_catalog.quote_ident(ns.nspname)");
+				if (schema_exclusion)
+					appendPQExpBufferStr(&catalog_query, " NOT IN (");
+				else
+					appendPQExpBufferStr(&catalog_query, " IN (");
+
+				schemas_listed = true;
+			}
+			else
+				appendPQExpBufferStr(&catalog_query, ", ");
+
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name");
+
+		}
+		/* Finish formatting schema filter */
+		if (schemas_listed)
+			appendPQExpBufferStr(&catalog_query, ")\n");
}
Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.

I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v3.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v3.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..378328afb3 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-n</option></arg>
+	      <arg choice="plain"><option>--schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-N</option></arg>
+	      <arg choice="plain"><option>--exclude-schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..4c4f47e32a 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..9ed2b95d7a 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,12 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+enum trivalue schema_is_exclude = TRI_DEFAULT;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +96,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +129,7 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -140,7 +145,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -202,6 +207,26 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':                       /* include schema(s) */
+				if (schema_is_exclude == TRI_YES)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_NO;
+				break;
+			case 'N':			/* exclude schema(s) */
+				if (schema_is_exclude == TRI_NO)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_YES;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +366,16 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtereing on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && (schemas.head != NULL))
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,6 +387,16 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
+
+		if (schemas.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
 		if (tables.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
@@ -387,6 +432,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
+									&schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -395,6 +441,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
+								&schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -420,6 +467,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
+					SimpleStringList *schemas,
 					SimpleStringList *tables,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
@@ -605,19 +653,41 @@ vacuum_one_database(ConnParams *cparams,
 	if (tables_listed)
 		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
 							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
-
-	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
-	 */
-	if (!tables_listed)
+	else
 	{
+		/*
+		 * If no tables were listed, filter for the relevant relation types.  If
+		 * tables were given via --table, don't bother filtering by relation type.
+		 * Instead, let the server decide whether a given relation can be
+		 * processed in which case the user will know about it.
+		 */
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
+
+		if (schemas != NULL && schemas->head != NULL)
+		{
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relnamespace");
+			if (schema_is_exclude == TRI_YES)
+				appendPQExpBufferStr(&catalog_query,
+									" OPERATOR(pg_catalog.!=) ALL (ARRAY[");
+			else if (schema_is_exclude == TRI_NO)
+				appendPQExpBufferStr(&catalog_query,
+									" OPERATOR(pg_catalog.=) ANY (ARRAY[");
+
+			for (cell = schemas->head; cell != NULL; cell = cell->next)
+			{
+				appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+				if (cell->next != NULL)
+					appendPQExpBufferStr(&catalog_query, ", ");
+			}
+
+			/* Finish formatting schema filter */
+			appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
+		}
 	}
 
 	/*
@@ -814,6 +884,7 @@ vacuum_all_databases(ConnParams *cparams,
 				vacuum_one_database(cparams, vacopts,
 									stage,
 									NULL,
+									NULL,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -828,6 +899,7 @@ vacuum_all_databases(ConnParams *cparams,
 			vacuum_one_database(cparams, vacopts,
 								ANALYZE_NO_STAGE,
 								NULL,
+								NULL,
 								concurrentCons,
 								progname, echo, quiet);
 		}
@@ -1027,6 +1099,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Gilles Darold (#1)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Hi,

New version v4 of the patch to fix a typo in a comment.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v4.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v4.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..378328afb3 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-n</option></arg>
+	      <arg choice="plain"><option>--schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+
+	   <arg choice="plain">
+	    <arg choice="opt">
+	     <group choice="plain">
+	      <arg choice="plain"><option>-N</option></arg>
+	      <arg choice="plain"><option>--exclude-schema</option></arg>
+	     </group>
+	     <replaceable>schema</replaceable>
+	    </arg>
+	   </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..4c4f47e32a 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..d94f7459d5 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,12 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+enum trivalue schema_is_exclude = TRI_DEFAULT;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +96,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +129,7 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};
 
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
@@ -140,7 +145,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -202,6 +207,26 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':                       /* include schema(s) */
+				if (schema_is_exclude == TRI_YES)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_NO;
+				break;
+			case 'N':			/* exclude schema(s) */
+				if (schema_is_exclude == TRI_NO)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&schemas, optarg);
+				schema_is_exclude = TRI_YES;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +366,16 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && (schemas.head != NULL))
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,6 +387,16 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
+
+		if (schemas.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
 		if (tables.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
@@ -387,6 +432,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
+									&schemas,
 									&tables,
 									concurrentCons,
 									progname, echo, quiet);
@@ -395,6 +441,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
+								&schemas,
 								&tables,
 								concurrentCons,
 								progname, echo, quiet);
@@ -420,6 +467,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
+					SimpleStringList *schemas,
 					SimpleStringList *tables,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
@@ -605,19 +653,41 @@ vacuum_one_database(ConnParams *cparams,
 	if (tables_listed)
 		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
 							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
-
-	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
-	 */
-	if (!tables_listed)
+	else
 	{
+		/*
+		 * If no tables were listed, filter for the relevant relation types.  If
+		 * tables were given via --table, don't bother filtering by relation type.
+		 * Instead, let the server decide whether a given relation can be
+		 * processed in which case the user will know about it.
+		 */
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
 							 CppAsString2(RELKIND_MATVIEW) "])\n");
 		has_where = true;
+
+		if (schemas != NULL && schemas->head != NULL)
+		{
+			appendPQExpBufferStr(&catalog_query,
+								 " AND c.relnamespace");
+			if (schema_is_exclude == TRI_YES)
+				appendPQExpBufferStr(&catalog_query,
+									" OPERATOR(pg_catalog.!=) ALL (ARRAY[");
+			else if (schema_is_exclude == TRI_NO)
+				appendPQExpBufferStr(&catalog_query,
+									" OPERATOR(pg_catalog.=) ANY (ARRAY[");
+
+			for (cell = schemas->head; cell != NULL; cell = cell->next)
+			{
+				appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+				if (cell->next != NULL)
+					appendPQExpBufferStr(&catalog_query, ", ");
+			}
+
+			/* Finish formatting schema filter */
+			appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
+		}
 	}
 
 	/*
@@ -814,6 +884,7 @@ vacuum_all_databases(ConnParams *cparams,
 				vacuum_one_database(cparams, vacopts,
 									stage,
 									NULL,
+									NULL,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -828,6 +899,7 @@ vacuum_all_databases(ConnParams *cparams,
 			vacuum_one_database(cparams, vacopts,
 								ANALYZE_NO_STAGE,
 								NULL,
+								NULL,
 								concurrentCons,
 								progname, echo, quiet);
 		}
@@ -1027,6 +1099,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

Justin Pryzby

pryzby@telsasoft.com

almost 4 years ago

In reply to: Gilles Darold (#7)

Re: [Proposal] vacuumdb --schema only

On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:

Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.

I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/

I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.

$ ./src/bin/scripts/vacuumdb -h /tmp -d postgres --table foo
vacuumdb: vacuuming database "postgres"
2022-03-09 15:04:06.922 CST client backend[25540] vacuumdb ERROR: relation "foo" does not exist at character 60

$ ./src/bin/scripts/vacuumdb -h /tmp -d postgres --schema foo
vacuumdb: vacuuming database "postgres"
2022-03-09 15:02:59.926 CST client backend[23516] vacuumdb ERROR: schema "foo" does not exist at character 335

#10

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#9)

Re: [Proposal] vacuumdb --schema only

Le 09/03/2022 à 22:10, Justin Pryzby a écrit :

On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:

Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.

I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/

I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.

--
Gilles Darold

#11

Robert Treat

rob@xzilla.net

almost 4 years ago

In reply to: Gilles Darold (#10)

Re: [Proposal] vacuumdb --schema only

On Thu, Mar 10, 2022 at 1:32 AM Gilles Darold <gilles@migops.com> wrote:

Le 09/03/2022 à 22:10, Justin Pryzby a écrit :

On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:

Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.

I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/

I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.

+1

+1 for consistency.

Robert Treat
https://xzilla.net

#12

Nathan Bossart

nathandbossart@gmail.com

almost 4 years ago

In reply to: Robert Treat (#11)

Re: [Proposal] vacuumdb --schema only

I took a look at the v4 patch.

'git-apply' complains about whitespace errors:

0001-vacuumdb-schema-only-v4.patch:17: tab in indent.
<arg choice="plain">
0001-vacuumdb-schema-only-v4.patch:18: tab in indent.
<arg choice="opt">
0001-vacuumdb-schema-only-v4.patch:19: tab in indent.
<group choice="plain">
0001-vacuumdb-schema-only-v4.patch:20: tab in indent.
<arg choice="plain"><option>-n</option></arg>
0001-vacuumdb-schema-only-v4.patch:21: tab in indent.
<arg choice="plain"><option>--schema</option></arg>
warning: squelched 13 whitespace errors
warning: 18 lines add whitespace errors.

+ printf(_(" -N, --exclude-schema=PATTERN do NOT vacuum tables in the specified schema(s)\n"));

I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.

+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>

nitpicks: I think the phrasing should be "To only clean tables in the...".
Also, is there any reason to use a schema name with a capital letter as an
example? IMO that just adds unnecessary complexity to the example.

+$node->issues_sql_like(
+    [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+    qr/VACUUM "Foo".*/,
+    'vacuumdb --schema schema only');

IIUC there should only be one table in the schema. Can we avoid matching
"*" and check for the exact command instead?

I think there should be a few more test cases. For example, we should test
using -n and -N at the same time, and we should test what happens when
those options are used for missing schemas.

+    /*
+     * When filtering on schema name, filter by table is not allowed.
+     * The schema name can already be set to a fqdn table name.
+     */
+    if (tbl_count && (schemas.head != NULL))
+    {
+        pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+        exit(1);
+    }

I think there might be some useful refactoring we can do that would
simplify adding similar options in the future. Specifically, can we have a
global variable that stores the type of vacuumdb command (e.g., all,
tables, or schemas)? If so, perhaps the tables list could be renamed and
reused for schemas (and any other objects that need listing in the future).

+        if (schemas != NULL && schemas->head != NULL)
+        {
+            appendPQExpBufferStr(&catalog_query,
+                                 " AND c.relnamespace");
+            if (schema_is_exclude == TRI_YES)
+                appendPQExpBufferStr(&catalog_query,
+                                    " OPERATOR(pg_catalog.!=) ALL (ARRAY[");
+            else if (schema_is_exclude == TRI_NO)
+                appendPQExpBufferStr(&catalog_query,
+                                    " OPERATOR(pg_catalog.=) ANY (ARRAY[");
+
+            for (cell = schemas->head; cell != NULL; cell = cell->next)
+            {
+                appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+                if (cell->next != NULL)
+                    appendPQExpBufferStr(&catalog_query, ", ");
+            }
+
+            /* Finish formatting schema filter */
+            appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
+        }

IMO we should use a CTE for specified schemas like we do for the specified
tables. I wonder if we could even have a mostly-shared CTE code path for
all vacuumdb commands with a list of names.

-    /*
-     * If no tables were listed, filter for the relevant relation types.  If
-     * tables were given via --table, don't bother filtering by relation type.
-     * Instead, let the server decide whether a given relation can be
-     * processed in which case the user will know about it.
-     */
-    if (!tables_listed)
+    else
     {
+        /*
+         * If no tables were listed, filter for the relevant relation types.  If
+         * tables were given via --table, don't bother filtering by relation type.
+         * Instead, let the server decide whether a given relation can be
+         * processed in which case the user will know about it.
+         */

nitpick: This change seems unnecessary.

I noticed upthread that there was some discussion around adding a way to
specify a schema in VACUUM and ANALYZE commands. I think this patch is
useful even if such an option is eventually added, as we'll still want
vacuumdb to obtain the full list of tables to process so that it can
effectively parallelize.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#13

Justin Pryzby

pryzby@telsasoft.com

almost 4 years ago

In reply to: Nathan Bossart (#12)

Re: [Proposal] vacuumdb --schema only

On Wed, Mar 30, 2022 at 02:22:58PM -0700, Nathan Bossart wrote:

I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.

I suggested it because I would consider using it, even though I don't currently
use the vacuumdb script at all. I think this would allow partially
retiring/simplifying our existing vacuum script.

We 1) put all our partitions in a separate "child" schema (so \d is more
usable), and also 2) put some short-lived tables into their own schemas. Some
of those tables may only exist for ~1 day so I'd perfer to neither vacuum nor
analyze them (they're only used for SELECT *). But there can be a lot of them,
so a nightly job could do something like vacuumdb --schema public or vacuumdb
--exclude-schema ephemeral.

Everything would be processed nightly using vacuumdb --min-xid (to keep the
monitoring system happy).

The non-partitioned tables could be vacuumed nightly (without min-xid), with
--exclude ephemeral.

The partitioned tables could be processed monthly with vacuumdb --analyze.

I'd also want to be able to run vacuumdb --analyze nightly, but I'd want to
exclude the schema with short-lived tables. I'd also need a way to exclude
our partitioned tables from nightly analyze (they should run monthly only).

Maybe this could share something with this patch:
https://commitfest.postgresql.org/37/2573/
pg_dump - read data for some options from external file

The goal of that patch was to put it in a file, which isn't really needed here.
But if there were common infrastructure for matching tables, it could be
shared. The interesting part for this patch is to avoid adding separate
commandline arguments for --include-table, --exclude-table, --include-schema,
--exclude-schema (and anything else?)

#14

Nathan Bossart

nathandbossart@gmail.com

almost 4 years ago

In reply to: Justin Pryzby (#13)

Re: [Proposal] vacuumdb --schema only

On Fri, Apr 01, 2022 at 10:01:28AM -0500, Justin Pryzby wrote:

On Wed, Mar 30, 2022 at 02:22:58PM -0700, Nathan Bossart wrote:

I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.

I suggested it because I would consider using it, even though I don't currently
use the vacuumdb script at all. I think this would allow partially
retiring/simplifying our existing vacuum script.

We 1) put all our partitions in a separate "child" schema (so \d is more
usable), and also 2) put some short-lived tables into their own schemas. Some
of those tables may only exist for ~1 day so I'd perfer to neither vacuum nor
analyze them (they're only used for SELECT *). But there can be a lot of them,
so a nightly job could do something like vacuumdb --schema public or vacuumdb
--exclude-schema ephemeral.

Everything would be processed nightly using vacuumdb --min-xid (to keep the
monitoring system happy).

The non-partitioned tables could be vacuumed nightly (without min-xid), with
--exclude ephemeral.

The partitioned tables could be processed monthly with vacuumdb --analyze.

I'd also want to be able to run vacuumdb --analyze nightly, but I'd want to
exclude the schema with short-lived tables. I'd also need a way to exclude
our partitioned tables from nightly analyze (they should run monthly only).

Thanks for elaborating. I retract my -1 vote.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#15

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Nathan Bossart (#12)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 30/03/2022 à 23:22, Nathan Bossart a écrit :

I took a look at the v4 patch.

'git-apply' complains about whitespace errors:

Fixed.

+   <para>
+    To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
nitpicks: I think the phrasing should be "To only clean tables in the...".
Also, is there any reason to use a schema name with a capital letter as an
example? IMO that just adds unnecessary complexity to the example.

I have though that an example of a schema with case sensitivity was
missing in the documentation but I agree with your comment, this is
probably not he best place to do that. Fixed.

+$node->issues_sql_like(
+    [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+    qr/VACUUM "Foo".*/,
+    'vacuumdb --schema schema only');
IIUC there should only be one table in the schema. Can we avoid matching
"*" and check for the exact command instead?

Fixed.

I think there should be a few more test cases. For example, we should test
using -n and -N at the same time, and we should test what happens when
those options are used for missing schemas.

Fixed

+    /*
+     * When filtering on schema name, filter by table is not allowed.
+     * The schema name can already be set to a fqdn table name.
+     */
+    if (tbl_count && (schemas.head != NULL))
+    {
+        pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+        exit(1);
+    }
I think there might be some useful refactoring we can do that would
simplify adding similar options in the future. Specifically, can we have a
global variable that stores the type of vacuumdb command (e.g., all,
tables, or schemas)? If so, perhaps the tables list could be renamed and
reused for schemas (and any other objects that need listing in the future).

I don't think there will be much more options like this one that will be
added to this command but anyway I have changed the patch that way.

+        if (schemas != NULL && schemas->head != NULL)
+        {
+            appendPQExpBufferStr(&catalog_query,
+                                 " AND c.relnamespace");
+            if (schema_is_exclude == TRI_YES)
+                appendPQExpBufferStr(&catalog_query,
+                                    " OPERATOR(pg_catalog.!=) ALL (ARRAY[");
+            else if (schema_is_exclude == TRI_NO)
+                appendPQExpBufferStr(&catalog_query,
+                                    " OPERATOR(pg_catalog.=) ANY (ARRAY[");
+
+            for (cell = schemas->head; cell != NULL; cell = cell->next)
+            {
+                appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+                if (cell->next != NULL)
+                    appendPQExpBufferStr(&catalog_query, ", ");
+            }
+
+            /* Finish formatting schema filter */
+            appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
+        }

IMO we should use a CTE for specified schemas like we do for the specified
tables. I wonder if we could even have a mostly-shared CTE code path for
all vacuumdb commands with a list of names.

Fixed

-    /*
-     * If no tables were listed, filter for the relevant relation types.  If
-     * tables were given via --table, don't bother filtering by relation type.
-     * Instead, let the server decide whether a given relation can be
-     * processed in which case the user will know about it.
-     */
-    if (!tables_listed)
+    else
{
+        /*
+         * If no tables were listed, filter for the relevant relation types.  If
+         * tables were given via --table, don't bother filtering by relation type.
+         * Instead, let the server decide whether a given relation can be
+         * processed in which case the user will know about it.
+         */
nitpick: This change seems unnecessary.

Fixed

Thanks for the review, all these changes are available in new version v6
of the patch and attached here.

Best regards,

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v6.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v6.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..0de001ef24 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..7bbfb97246 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,15 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-N',  '"Foo"', 'postgres' ],
+	'cannot use option -n | --schema and -N | --exclude-schema at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..87261ebd2b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,18 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object filter to use. '0': none, 'n': schema, 't': table
+ * these values correspond to the -n | -N and -t command line options.
+ */
+char objfilter = '0';
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *schemas,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -94,6 +101,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -122,7 +131,7 @@ main(int argc, char *argv[])
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
 	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +149,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -181,11 +190,19 @@ main(int argc, char *argv[])
 				alldb = true;
 				break;
 			case 't':
+			{
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (schema_is_exclude != TRI_DEFAULT)
 				{
-					simple_string_list_append(&tables, optarg);
-					tbl_count++;
-					break;
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
 				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 't';
+				tbl_count++;
+				break;
+			}
 			case 'f':
 				vacopts.full = true;
 				break;
@@ -202,6 +219,41 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':			/* include schema(s) */
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (tbl_count)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
+				}
+
+				if (schema_is_exclude == TRI_YES)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 'n';
+				schema_is_exclude = TRI_NO;
+				break;
+			case 'N':			/* exclude schema(s) */
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (tbl_count)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
+				}
+				if (schema_is_exclude == TRI_NO)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 'n';
+				schema_is_exclude = TRI_YES;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +393,16 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && objfilter == 'n' && objects.head != NULL)
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,7 +414,17 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
-		if (tables.head != NULL)
+
+		if (objfilter == 'n' && objects.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
+		if (objfilter == 't' && objects.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
 			exit(1);
@@ -387,7 +459,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -395,7 +467,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -420,7 +492,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -435,7 +507,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -549,31 +621,43 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
 		char	   *just_table;
 		const char *just_columns;
 
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
-
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+
+		switch (objfilter)
+		{
+			case 'n':
+				appendStringLiteralConn(&catalog_query, cell->val, conn);
+				appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.oid, ");
+				break;
+			case 't':
+				/*
+				 * Split relation and column names given by the user, this is used to
+				 * feed the CTE with values on which are performed pre-run validity
+				 * checks as well.  For now these happen only on the relation name.
+				 */
+				splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+									  &just_table, &just_columns);
+
+				appendStringLiteralConn(&catalog_query, just_table, conn);
+				appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+				break;
+			default:
+				break;
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -586,13 +670,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -601,10 +685,21 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+				 " ON listed_objects.object_oid OPERATOR(pg_catalog.=) ");
+		switch (objfilter)
+		{
+			case 't':
+				appendPQExpBufferStr(&catalog_query, "c.oid\n");
+				break;
+			case 'n':
+				appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+				break;
+		}
+	}
 
 	/*
 	 * If no tables were listed, filter for the relevant relation types.  If
@@ -612,7 +707,7 @@ vacuum_one_database(ConnParams *cparams,
 	 * Instead, let the server decide whether a given relation can be
 	 * processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if (!objects_listed || objfilter == 'n')
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -683,7 +778,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -1027,6 +1122,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

#16

Justin Pryzby

pryzby@telsasoft.com

almost 4 years ago

In reply to: Gilles Darold (#15)

Re: [Proposal] vacuumdb --schema only

On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:

Thanks for the review, all these changes are available in new version v6
of the patch and attached here.

This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.html

https://api.cirrus-ci.com/v1/artifact/task/5379693443547136/log/src/bin/scripts/tmp_check/log/regress_log_100_vacuumdb

not ok 59 - vacuumdb --schema "Foo" postgres exit code 0

# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log

# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'

#17

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#16)

Re: [Proposal] vacuumdb --schema only

Le 08/04/2022 à 02:46, Justin Pryzby a écrit :

On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:

Thanks for the review, all these changes are available in new version v6
of the patch and attached here.

This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.html

https://api.cirrus-ci.com/v1/artifact/task/5379693443547136/log/src/bin/scripts/tmp_check/log/regress_log_100_vacuumdb

not ok 59 - vacuumdb --schema "Foo" postgres exit code 0

# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log

# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'

I'm surprised because make check do do not reports errors running on an
Ubuntu 20.04 and CentOs 8:

t/010_clusterdb.pl ........ ok
t/011_clusterdb_all.pl .... ok
t/020_createdb.pl ......... ok
t/040_createuser.pl ....... ok
t/050_dropdb.pl ........... ok
t/070_dropuser.pl ......... ok
t/080_pg_isready.pl ....... ok
t/090_reindexdb.pl ........ ok
t/091_reindexdb_all.pl .... ok
t/100_vacuumdb.pl ......... ok
t/101_vacuumdb_all.pl ..... ok
t/102_vacuumdb_stages.pl .. ok
t/200_connstr.pl .......... ok
All tests successful.
Files=13, Tests=233, 17 wallclock secs ( 0.09 usr 0.02 sys + 6.63 cusr
2.68 csys = 9.42 CPU)
Result: PASS

In tmp_check/log/regress_log_100_vacuumdb:

# Running: vacuumdb --schema "Foo" postgres
vacuumdb: vacuuming database "postgres"
ok 59 - vacuumdb --schema "Foo" postgres exit code 0
ok 60 - vacuumdb --schema schema only: SQL found in server log

In PG log:

2022-04-08 11:01:44.519 CEST [17223] 100_vacuumdb.pl LOG: statement:
RESET search_path;
2022-04-08 11:01:44.519 CEST [17223] 100_vacuumdb.pl LOG: statement:
WITH listed_objects (object_oid, column_list) AS (
          VALUES ('"Foo"'::pg_catalog.regnamespace::pg_catalog.oid,
NULL::pg_catalog.text)
        )
        SELECT c.relname, ns.nspname, listed_objects.column_list FROM
pg_catalog.pg_class c
         JOIN pg_catalog.pg_namespace ns ON c.relnamespace
OPERATOR(pg_catalog.=) ns.oid
         LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid
OPERATOR(pg_catalog.=) t.oid
         JOIN listed_objects ON listed_objects.object_oid
OPERATOR(pg_catalog.=) ns.oid
         WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
         ORDER BY c.relpages DESC;
2022-04-08 11:01:44.521 CEST [17223] 100_vacuumdb.pl LOG: statement:
SELECT pg_catalog.set_config('search_path', '', false);
2022-04-08 11:01:44.521 CEST [17223] 100_vacuumdb.pl LOG: statement:
VACUUM "Foo".bar;

And if I run the command manually:

$ /usr/local/pgsql/bin/vacuumdb -e -h localhost --schema '"Foo"' -d
contrib_regress -U postgres
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "contrib_regress"
RESET search_path;
WITH listed_objects (object_oid, column_list) AS (
VALUES ('"Foo"'::pg_catalog.regnamespace::pg_catalog.oid,
NULL::pg_catalog.text)
)
SELECT c.relname, ns.nspname, listed_objects.column_list FROM
pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace ns ON c.relnamespace
OPERATOR(pg_catalog.=) ns.oid
LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid
OPERATOR(pg_catalog.=) t.oid
JOIN listed_objects ON listed_objects.object_oid
OPERATOR(pg_catalog.=) ns.oid
WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
ORDER BY c.relpages DESC;
SELECT pg_catalog.set_config('search_path', '', false);

VACUUM "Foo".bar;

$ echo $?
0

I don't know what happen on cfbot, investigating...

--
Gilles Darold

#18

Gilles Darold

gilles@migops.com

almost 4 years ago

In reply to: Justin Pryzby (#16)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 08/04/2022 à 02:46, Justin Pryzby a écrit :

On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:

Thanks for the review, all these changes are available in new version v6
of the patch and attached here.

This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.html

https://api.cirrus-ci.com/v1/artifact/task/5379693443547136/log/src/bin/scripts/tmp_check/log/regress_log_100_vacuumdb

not ok 59 - vacuumdb --schema "Foo" postgres exit code 0

# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log

# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'

Ok, got it with the help of rjuju. Actually it was compiling well using
gcc but clang give some warnings. A fix of these warning makes CI happy.

Attached v7 of the patch that should pass cfbot.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v7.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v7.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..0de001ef24 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..7bbfb97246 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,15 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-N',  '"Foo"', 'postgres' ],
+	'cannot use option -n | --schema and -N | --exclude-schema at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..f118b05169 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,18 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object filter to use. '0': none, 'n': schema, 't': table
+ * these values correspond to the -n | -N and -t command line options.
+ */
+char objfilter = '0';
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *schemas,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -94,6 +101,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -122,7 +131,7 @@ main(int argc, char *argv[])
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
 	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +149,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -181,11 +190,19 @@ main(int argc, char *argv[])
 				alldb = true;
 				break;
 			case 't':
+			{
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (schema_is_exclude != TRI_DEFAULT)
 				{
-					simple_string_list_append(&tables, optarg);
-					tbl_count++;
-					break;
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
 				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 't';
+				tbl_count++;
+				break;
+			}
 			case 'f':
 				vacopts.full = true;
 				break;
@@ -202,6 +219,41 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':			/* include schema(s) */
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (tbl_count)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
+				}
+
+				if (schema_is_exclude == TRI_YES)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 'n';
+				schema_is_exclude = TRI_NO;
+				break;
+			case 'N':			/* exclude schema(s) */
+				/* When filtering on schema name, filter by table is not allowed. */
+				if (tbl_count)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					exit(1);
+				}
+				if (schema_is_exclude == TRI_NO)
+				{
+					pg_log_error("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					exit(1);
+				}
+
+				simple_string_list_append(&objects, optarg);
+				objfilter = 'n';
+				schema_is_exclude = TRI_YES;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -341,6 +393,16 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && objfilter == 'n' && objects.head != NULL)
+	{
+		pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+		exit(1);
+	}
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -352,7 +414,17 @@ main(int argc, char *argv[])
 			pg_log_error("cannot vacuum all databases and a specific one at the same time");
 			exit(1);
 		}
-		if (tables.head != NULL)
+
+		if (objfilter == 'n' && objects.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_log_error("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_log_error("cannot vacuum specific schema(s) in all databases");
+			exit(1);
+		}
+
+		if (objfilter == 't' && objects.head != NULL)
 		{
 			pg_log_error("cannot vacuum specific table(s) in all databases");
 			exit(1);
@@ -387,7 +459,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -395,7 +467,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -420,7 +492,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -435,7 +507,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -549,31 +621,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
-
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+
+		switch (objfilter)
+		{
+			case 'n':
+				appendStringLiteralConn(&catalog_query, cell->val, conn);
+				appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.oid, ");
+				break;
+			case 't':
+				/*
+				 * Split relation and column names given by the user, this is used to
+				 * feed the CTE with values on which are performed pre-run validity
+				 * checks as well.  For now these happen only on the relation name.
+				 */
+				splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+									  &just_table, &just_columns);
+
+				appendStringLiteralConn(&catalog_query, just_table, conn);
+				appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+				break;
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -586,13 +668,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -601,10 +683,21 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+				 " ON listed_objects.object_oid OPERATOR(pg_catalog.=) ");
+		switch (objfilter)
+		{
+			case 't':
+				appendPQExpBufferStr(&catalog_query, "c.oid\n");
+				break;
+			case 'n':
+				appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+				break;
+		}
+	}
 
 	/*
 	 * If no tables were listed, filter for the relevant relation types.  If
@@ -612,7 +705,7 @@ vacuum_one_database(ConnParams *cparams,
 	 * Instead, let the server decide whether a given relation can be
 	 * processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if (!objects_listed || objfilter == 'n')
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -683,7 +776,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -1027,6 +1120,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

#19

Nathan Bossart

nathandbossart@gmail.com

almost 4 years ago

In reply to: Gilles Darold (#18)

Re: [Proposal] vacuumdb --schema only

On Fri, Apr 08, 2022 at 05:16:06PM +0200, Gilles Darold wrote:

Attached v7 of the patch that should pass cfbot.

Thanks for the new patch! Unfortunately, it looks like some recent changes
have broken it again.

+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object filter to use. '0': none, 'n': schema, 't': table
+ * these values correspond to the -n | -N and -t command line options.
+ */
+char objfilter = '0';

I think these should be combined into a single enum for simplicity and
readability (e.g., OBJFILTER_NONE, OBJFILTER_INCLUDE_SCHEMA,
OBJFILTER_EXCLUDE_SCHEMA, OBJFILTER_TABLE).

* Instead, let the server decide whether a given relation can be
* processed in which case the user will know about it.
*/
-	if (!tables_listed)
+	if (!objects_listed || objfilter == 'n')
{
appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
CppAsString2(RELKIND_RELATION) ", "

I think this deserveѕ a comment.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#20

Gilles Darold

gilles@migops.com

over 3 years ago

In reply to: Nathan Bossart (#19)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 11/04/2022 à 20:37, Nathan Bossart a écrit :

On Fri, Apr 08, 2022 at 05:16:06PM +0200, Gilles Darold wrote:

Attached v7 of the patch that should pass cfbot.

Thanks for the new patch! Unfortunately, it looks like some recent changes
have broken it again.
+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object filter to use. '0': none, 'n': schema, 't': table
+ * these values correspond to the -n | -N and -t command line options.
+ */
+char objfilter = '0';
I think these should be combined into a single enum for simplicity and
readability (e.g., OBJFILTER_NONE, OBJFILTER_INCLUDE_SCHEMA,
OBJFILTER_EXCLUDE_SCHEMA, OBJFILTER_TABLE).
* Instead, let the server decide whether a given relation can be
* processed in which case the user will know about it.
*/
-	if (!tables_listed)
+	if (!objects_listed || objfilter == 'n')
{
appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
CppAsString2(RELKIND_RELATION) ", "
I think this deserveѕ a comment.

Attached v8 of the patch that tries to address the remarks above, fixes
patch apply failure to master and replace calls to pg_log_error+exit
with pg_fatal.

.Thanks.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v8.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v8.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..0de001ef24 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..7bbfb97246 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,15 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-N',  '"Foo"', 'postgres' ],
+	'cannot use option -n | --schema and -N | --exclude-schema at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..ce353593ce 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,28 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object use in the command line filter.
+ *   OBJFILTER_NONE: no filter used
+ *   OBJFILTER_SCHEMA: -n | --schema or -N | --exclude-schema
+ *   OBJFILTER_TABLE: -t | --table
+ */
+enum VacObjectFilter
+{
+	OBJFILTER_NONE,
+	OBJFILTER_TABLE,
+	OBJFILTER_SCHEMA
+};
+
+enum VacObjectFilter objfilter = OBJFILTER_NONE;
+
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -72,7 +89,6 @@ static void help(const char *progname);
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
 
-
 int
 main(int argc, char *argv[])
 {
@@ -94,6 +110,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -122,7 +140,7 @@ main(int argc, char *argv[])
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
 	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +158,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -182,7 +200,11 @@ main(int argc, char *argv[])
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (schema_is_exclude != TRI_DEFAULT)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_TABLE;
 					tbl_count++;
 					break;
 				}
@@ -202,6 +224,32 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':			/* include schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+					if (schema_is_exclude == TRI_YES)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_NO;
+					break;
+				}
+			case 'N':			/* exclude schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					if (schema_is_exclude == TRI_NO)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_YES;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -312,6 +360,13 @@ main(int argc, char *argv[])
 
 	setup_cancel_handler(NULL);
 
+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
 	/* Avoid opening extra connections. */
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
@@ -320,7 +375,15 @@ main(int argc, char *argv[])
 	{
 		if (dbname)
 			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
+		if (objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+		}
+
+		if (objfilter == OBJFILTER_TABLE && objects.head != NULL)
 			pg_fatal("cannot vacuum specific table(s) in all databases");
 
 		cparams.dbname = maintenance_db;
@@ -352,7 +415,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +423,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -385,7 +448,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +463,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +562,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
-
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+
+		if (objfilter == OBJFILTER_SCHEMA)
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.oid, ");
+		}
+
+		if (objfilter == OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used to
+			 * feed the CTE with values on which are performed pre-run validity
+			 * checks as well.  For now these happen only on the relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +609,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,18 +624,26 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+				 " ON listed_objects.object_oid OPERATOR(pg_catalog.=) ");
+		if (objfilter == OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		if (objfilter == OBJFILTER_SCHEMA)
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
+	 * If no tables were listed or that a filter by schema is used, filter
+	 * for the relevant relation types.  If tables were given via --table,
+	 * don't bother filtering by relation type.  Instead, let the server
+	 * decide whether a given relation can be processed in which case the
+	 * user will know about it.  If there is a filter by schema the use of
+	 * --table is not possible so we have to filter by relation type too.
 	 */
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +714,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1058,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

#21

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Gilles Darold (#20)

Re: [Proposal] vacuumdb --schema only

On Thu, Apr 14, 2022 at 10:27:46PM +0200, Gilles Darold wrote:

Attached v8 of the patch that tries to address the remarks above, fixes
patch apply failure to master and replace calls to pg_log_error+exit
with pg_fatal.

Thanks for the new patch.

+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object use in the command line filter.
+ *   OBJFILTER_NONE: no filter used
+ *   OBJFILTER_SCHEMA: -n | --schema or -N | --exclude-schema
+ *   OBJFILTER_TABLE: -t | --table
+ */
+enum VacObjectFilter
+{
+	OBJFILTER_NONE,
+	OBJFILTER_TABLE,
+	OBJFILTER_SCHEMA
+};
+
+enum VacObjectFilter objfilter = OBJFILTER_NONE;

I still think we ought to remove schema_is_exclude in favor of adding
OBJFILTER_SCHEMA_EXCLUDE to the enum. I think that would simplify some of
the error handling and improve readability. IMO we should add
OBJFILTER_ALL, too.

-					simple_string_list_append(&tables, optarg);
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (schema_is_exclude != TRI_DEFAULT)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_TABLE;
tbl_count++;
break;
}
@@ -202,6 +224,32 @@ main(int argc, char *argv[])
&vacopts.parallel_workers))
exit(1);
break;
+			case 'n':			/* include schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+					if (schema_is_exclude == TRI_YES)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_NO;
+					break;
+				}
+			case 'N':			/* exclude schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					if (schema_is_exclude == TRI_NO)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_YES;
+					break;

I was expecting these to check objfilter. For example:

case 'N':
{
if (objfilter == OBJFILTER_TABLE)
pg_fatal("...");
else if (objfilter == OBJFILTER_SCHEMA)
pg_fatal("...");
else if (objfilter == OBJFILTER_ALL)
pg_fatal("...");

simple_string_list_append(&objects, optarg);
objfilter = OBJFILTER_SCHEMA_EXCLUDE;
break;
}

Another possible improvement could be to move the pg_fatal() calls to a
helper function that generates the message based on the current objfilter
setting and the current option. I'm envisioning something like
check_objfilter(VacObjFilter curr_objfilter, VacObjFilter curr_option).

+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");

Isn't this redundant with the error in the option handling?

-		if (tables.head != NULL)
+		if (objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+		}
+
+		if (objfilter == OBJFILTER_TABLE && objects.head != NULL)
pg_fatal("cannot vacuum specific table(s) in all databases");

I think we could move all these into check_objfilter(), too.

nitpick: Why do we need to check that objects.head is not NULL? Isn't the
objfilter check enough?

/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
+	 * If no tables were listed or that a filter by schema is used, filter
+	 * for the relevant relation types.  If tables were given via --table,
+	 * don't bother filtering by relation type.  Instead, let the server
+	 * decide whether a given relation can be processed in which case the
+	 * user will know about it.  If there is a filter by schema the use of
+	 * --table is not possible so we have to filter by relation type too.
*/
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)

Do we need to check for objects_listed here? IIUC we can just check for
objfilter != OBJFILTER_TABLE.

Unless I'm missing something, schema_is_exclude appears to only be used for
error checking and doesn't impact the generated catalog query. It looks
like the relevant logic disappeared after v4 of the patch.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#22

Gilles Darold

gilles@migops.com

over 3 years ago

In reply to: Nathan Bossart (#21)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 18/04/2022 à 23:56, Nathan Bossart a écrit :

On Thu, Apr 14, 2022 at 10:27:46PM +0200, Gilles Darold wrote:

Attached v8 of the patch that tries to address the remarks above, fixes
patch apply failure to master and replace calls to pg_log_error+exit
with pg_fatal.

Thanks for the new patch.
+enum trivalue schema_is_exclude = TRI_DEFAULT;
+
+/*
+ * The kind of object use in the command line filter.
+ *   OBJFILTER_NONE: no filter used
+ *   OBJFILTER_SCHEMA: -n | --schema or -N | --exclude-schema
+ *   OBJFILTER_TABLE: -t | --table
+ */
+enum VacObjectFilter
+{
+	OBJFILTER_NONE,
+	OBJFILTER_TABLE,
+	OBJFILTER_SCHEMA
+};
+
+enum VacObjectFilter objfilter = OBJFILTER_NONE;
I still think we ought to remove schema_is_exclude in favor of adding
OBJFILTER_SCHEMA_EXCLUDE to the enum. I think that would simplify some of
the error handling and improve readability. IMO we should add
OBJFILTER_ALL, too.

Fixed.

-					simple_string_list_append(&tables, optarg);
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (schema_is_exclude != TRI_DEFAULT)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_TABLE;
tbl_count++;
break;
}
@@ -202,6 +224,32 @@ main(int argc, char *argv[])
&vacopts.parallel_workers))
exit(1);
break;
+			case 'n':			/* include schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+					if (schema_is_exclude == TRI_YES)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_NO;
+					break;
+				}
+			case 'N':			/* exclude schema(s) */
+				{
+					/* When filtering on schema name, filter by table is not allowed. */
+					if (tbl_count)
+						pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+					if (schema_is_exclude == TRI_NO)
+						pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					schema_is_exclude = TRI_YES;
+					break;

I was expecting these to check objfilter.

Fixed.

Another possible improvement could be to move the pg_fatal() calls to a
helper function that generates the message based on the current objfilter
setting and the current option. I'm envisioning something like
check_objfilter(VacObjFilter curr_objfilter, VacObjFilter curr_option).

I agree, done.

+	/*
+	 * When filtering on schema name, filter by table is not allowed.
+	 * The schema name can already be set to a fqdn table name.
+	 */
+	if (tbl_count && objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");

Isn't this redundant with the error in the option handling?

Fixed.

-		if (tables.head != NULL)
+		if (objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
+		{
+			if (schema_is_exclude == TRI_YES)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			else if (schema_is_exclude == TRI_NO)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+		}
+
+		if (objfilter == OBJFILTER_TABLE && objects.head != NULL)
pg_fatal("cannot vacuum specific table(s) in all databases");

I think we could move all these into check_objfilter(), too.

nitpick: Why do we need to check that objects.head is not NULL? Isn't the
objfilter check enough?

Done.

/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
+	 * If no tables were listed or that a filter by schema is used, filter
+	 * for the relevant relation types.  If tables were given via --table,
+	 * don't bother filtering by relation type.  Instead, let the server
+	 * decide whether a given relation can be processed in which case the
+	 * user will know about it.  If there is a filter by schema the use of
+	 * --table is not possible so we have to filter by relation type too.
*/
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)

Do we need to check for objects_listed here? IIUC we can just check for
objfilter != OBJFILTER_TABLE.

Yes we need it otherwise test 'vacuumdb with view' fail because we are
not trying to vacuum the view so the PG doesn't report:

WARNING: cannot vacuum non-tables or special system tables

Unless I'm missing something, schema_is_exclude appears to only be used for
error checking and doesn't impact the generated catalog query. It looks
like the relevant logic disappeared after v4 of the patch.

Right, removed.

New patch attached v9.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v9.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v9.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..0de001ef24 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..7bbfb97246 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,15 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-N',  '"Foo"', 'postgres' ],
+	'cannot use option -n | --schema and -N | --exclude-schema at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..de2671955e 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,32 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+/*
+ * The kind of object use in the command line filter.
+ *   OBJFILTER_NONE: no filter used
+ *   OBJFILTER_TABLE: -t | --table
+ *   OBJFILTER_SCHEMA: -n | --schema
+ *   OBJFILTER_SCHEMA_EXCLUDE: -N | --exclude-schema
+ *   OBJFILTER_DATABASE: -d | --dbname
+ *   OBJFILTER_ALL: -a | --all
+ */
+typedef enum
+{
+	OBJFILTER_NONE,
+	OBJFILTER_ALL,
+	OBJFILTER_DATABASE,
+	OBJFILTER_TABLE,
+	OBJFILTER_SCHEMA,
+	OBJFILTER_SCHEMA_EXCLUDE
+} VacObjectFilter;
+
+VacObjectFilter objfilter = OBJFILTER_NONE;
+
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -68,11 +89,12 @@ static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
 
 static void help(const char *progname);
 
+void check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option);
+
 /* For analyze-in-stages mode */
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
 
-
 int
 main(int argc, char *argv[])
 {
@@ -94,6 +116,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -122,7 +146,7 @@ main(int argc, char *argv[])
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
 	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +164,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -166,6 +190,8 @@ main(int argc, char *argv[])
 				quiet = true;
 				break;
 			case 'd':
+				check_objfilter(objfilter, OBJFILTER_DATABASE);
+				objfilter = OBJFILTER_DATABASE;
 				dbname = pg_strdup(optarg);
 				break;
 			case 'z':
@@ -178,11 +204,15 @@ main(int argc, char *argv[])
 				vacopts.freeze = true;
 				break;
 			case 'a':
+				check_objfilter(objfilter, OBJFILTER_ALL);
+				objfilter = OBJFILTER_ALL;
 				alldb = true;
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					check_objfilter(objfilter, OBJFILTER_TABLE);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_TABLE;
 					tbl_count++;
 					break;
 				}
@@ -202,6 +232,20 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':			/* include schema(s) */
+				{
+					check_objfilter(objfilter, OBJFILTER_SCHEMA);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					break;
+				}
+			case 'N':			/* exclude schema(s) */
+				{
+					check_objfilter(objfilter, OBJFILTER_SCHEMA_EXCLUDE);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA_EXCLUDE;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -318,11 +362,6 @@ main(int argc, char *argv[])
 
 	if (alldb)
 	{
-		if (dbname)
-			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
-			pg_fatal("cannot vacuum specific table(s) in all databases");
-
 		cparams.dbname = maintenance_db;
 
 		vacuum_all_databases(&cparams, &vacopts,
@@ -352,7 +391,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +399,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -368,6 +407,65 @@ main(int argc, char *argv[])
 	exit(0);
 }
 
+/*
+ * Verify that the filters used at command line are compatible
+ */
+void
+check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option)
+{
+	switch (curr_option)
+	{
+		case OBJFILTER_NONE:
+			break;
+		case OBJFILTER_DATABASE:
+			/* When filtering on database name, vacuum on all database is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			break;
+		case OBJFILTER_ALL:
+			/* When vacuuming all database, filter on database name is not allowed. */
+			if (curr_objfilter == OBJFILTER_DATABASE)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			/* When vacuuming all database, filter on schema name is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, filter on table name is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum specific table(s) in all databases");
+			break;
+		case OBJFILTER_TABLE:
+			/* When filtering on table name, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on table name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum specific table(s) and exclude specific schema(s) at the same time");
+			break;
+		case OBJFILTER_SCHEMA:
+			/* When filtering on schema name, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on schema name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			/* filtering on schema name can not be use on all database. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			break;
+		case OBJFILTER_SCHEMA_EXCLUDE:
+			/* When filtering on schema exclusion, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filetring on schema exclusion, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			break;
+	}
+}
+
 /*
  * vacuum_one_database
  *
@@ -385,7 +483,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +498,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +597,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
-
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+
+		if (objfilter == OBJFILTER_SCHEMA || objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.oid, ");
+		}
+
+		if (objfilter == OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used to
+			 * feed the CTE with values on which are performed pre-run validity
+			 * checks as well.  For now these happen only on the relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +644,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,18 +659,24 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+				 " ON listed_objects.object_oid OPERATOR(pg_catalog.=) ");
+		if (objfilter == OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		if (objfilter == OBJFILTER_SCHEMA || objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
+	 * If no tables were listed, filter for the relevant relation types.
+	 * If tables were given via --table, don't bother filtering by relation
+	 * type.  Instead, let the server decide whether a given relation can
+	 * be processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +747,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1091,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

#23

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Gilles Darold (#22)

Re: [Proposal] vacuumdb --schema only

Thanks for the new patch! I think this is on the right track.

On Wed, Apr 20, 2022 at 05:15:02PM +0200, Gilles Darold wrote:

Le 18/04/2022 ï¿½ 23:56, Nathan Bossart a ï¿½critï¿½:
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)
Do we need to check for objects_listed here? IIUC we can just check for
objfilter != OBJFILTER_TABLE.
Yes we need it otherwise test 'vacuumdb with view' fail because we are not
trying to vacuum the view so the PG doesn't report:

ï¿½ï¿½ï¿½ WARNING:ï¿½ cannot vacuum non-tables or special system tables

My point is that the only time we don't want to filter for relevant
relation types is when the user provides a list of tables. So my
suggestion would be to simplify this to the following:

if (objfilter != OBJFILTER_TABLE)
{
appendPQExpBufferStr(...);
has_where = true;
}

Unless I'm missing something, schema_is_exclude appears to only be used for
error checking and doesn't impact the generated catalog query. It looks
like the relevant logic disappeared after v4 of the patch.

Right, removed.

I don't think -N works at the moment. I tested it out, and vacuumdb was
still processing tables in schemas I excluded. Can we add a test case for
this, too?

+/*
+ * Verify that the filters used at command line are compatible
+ */
+void
+check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option)
+{
+	switch (curr_option)
+	{
+		case OBJFILTER_NONE:
+			break;
+		case OBJFILTER_DATABASE:
+			/* When filtering on database name, vacuum on all database is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			break;
+		case OBJFILTER_ALL:
+			/* When vacuuming all database, filter on database name is not allowed. */
+			if (curr_objfilter == OBJFILTER_DATABASE)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			/* When vacuuming all database, filter on schema name is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, filter on table name is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum specific table(s) in all databases");
+			break;
+		case OBJFILTER_TABLE:
+			/* When filtering on table name, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on table name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum specific table(s) and exclude specific schema(s) at the same time");
+			break;
+		case OBJFILTER_SCHEMA:
+			/* When filtering on schema name, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on schema name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			/* filtering on schema name can not be use on all database. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			break;
+		case OBJFILTER_SCHEMA_EXCLUDE:
+			/* When filtering on schema exclusion, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filetring on schema exclusion, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			break;
+	}
+}

I don't think this handles all combinations. For example, the following
command does not fail:

vacuumdb -a -N test postgres

Furthermore, do you think it'd be possible to dynamically generate the
message? If it doesn't add too much complexity, this might be a nice way
to simplify this function.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#24

Justin Pryzby

pryzby@telsasoft.com

over 3 years ago

In reply to: Nathan Bossart (#23)

Re: [Proposal] vacuumdb --schema only

On Wed, Apr 20, 2022 at 10:38:46AM -0700, Nathan Bossart wrote:

+void
+check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option)
+{
+	switch (curr_option)
+	{
+		case OBJFILTER_NONE:
+			break;
+		case OBJFILTER_DATABASE:
+			/* When filtering on database name, vacuum on all database is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			break;
+		case OBJFILTER_ALL:
+			/* When vacuuming all database, filter on database name is not allowed. */
+			if (curr_objfilter == OBJFILTER_DATABASE)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			/* When vacuuming all database, filter on schema name is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
+			/* When vacuuming all database, filter on table name is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum specific table(s) in all databases");
+			break;
+		case OBJFILTER_TABLE:
+			/* When filtering on table name, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on table name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum specific table(s) and exclude specific schema(s) at the same time");
+			break;
+		case OBJFILTER_SCHEMA:
+			/* When filtering on schema name, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filtering on schema name, schema exclusion is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			/* filtering on schema name can not be use on all database. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum specific schema(s) in all databases");
+			break;
+		case OBJFILTER_SCHEMA_EXCLUDE:
+			/* When filtering on schema exclusion, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			/* When filetring on schema exclusion, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA)
+				pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same time");
+			break;
+	}
+}

I don't think this handles all combinations. For example, the following
command does not fail:

vacuumdb -a -N test postgres

Furthermore, do you think it'd be possible to dynamically generate the
message?

Not in the obvious way, because that breaks translatability.

--
Justin

#25

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Justin Pryzby (#24)

Re: [Proposal] vacuumdb --schema only

On Wed, Apr 20, 2022 at 12:40:52PM -0500, Justin Pryzby wrote:

On Wed, Apr 20, 2022 at 10:38:46AM -0700, Nathan Bossart wrote:

Furthermore, do you think it'd be possible to dynamically generate the
message?

Not in the obvious way, because that breaks translatability.

Ah, right.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#26

Gilles Darold

gilles@migops.com

over 3 years ago

In reply to: Nathan Bossart (#23)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

Le 20/04/2022 à 19:38, Nathan Bossart a écrit :

Thanks for the new patch! I think this is on the right track.

On Wed, Apr 20, 2022 at 05:15:02PM +0200, Gilles Darold wrote:
Le 18/04/2022 à 23:56, Nathan Bossart a écrit :
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA)
Do we need to check for objects_listed here? IIUC we can just check for
objfilter != OBJFILTER_TABLE.
Yes we need it otherwise test 'vacuumdb with view' fail because we are not
trying to vacuum the view so the PG doesn't report:

WARNING: cannot vacuum non-tables or special system tables
My point is that the only time we don't want to filter for relevant
relation types is when the user provides a list of tables. So my
suggestion would be to simplify this to the following:

if (objfilter != OBJFILTER_TABLE)
{
appendPQExpBufferStr(...);
has_where = true;
}

Right, I must have gotten mixed up in the test results. Fixed.

Unless I'm missing something, schema_is_exclude appears to only be used for
error checking and doesn't impact the generated catalog query. It looks
like the relevant logic disappeared after v4 of the patch.

Right, removed.

I don't think -N works at the moment. I tested it out, and vacuumdb was
still processing tables in schemas I excluded. Can we add a test case for
this, too?

Fixed and regression tests added as well as some others to test the
filter options compatibility.

+/*
+ * Verify that the filters used at command line are compatible
+ */
+void
+check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option)
+{
+	switch (curr_option)
+	{
+		case OBJFILTER_NONE:
+			break;
+		case OBJFILTER_DATABASE:
+			/* When filtering on database name, vacuum on all database is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			break;
[...]
+	}
+}
I don't think this handles all combinations.  For example, the following
command does not fail:

vacuumdb -a -N test postgres

Right, I have fix them all in this new patch.

Furthermore, do you think it'd be possible to dynamically generate the
message? If it doesn't add too much complexity, this might be a nice way
to simplify this function.

I have tried to avoid reusing the same error message several time by
using a new enum and function filter_error(). I also use the same
messages with --schema and --exclude-schema related errors.

Patch v10 attached.

--
Gilles Darold

Attachments:

0001-vacuumdb-schema-only-v10.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v10.patchDownload

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..0de001ef24 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,28 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in <replaceable class="parameter">schema</replaceable> only.
+        Multiple schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables NOT in <replaceable class="parameter">schema</replaceable>.
+        Multiple schemas can be excluded from the vacuum by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +675,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..9ef5c789e0 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,38 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--exclude-schema', '"Foo"', 'postgres' ],
+	qr/(?:(?!VACUUM "Foo".bar).)*/,
+	'vacuumdb --exclude-schema schema');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-n', 'pg_catalog', '-N',  '"Foo"', 'postgres' ],
+	'cannot use option -n | --schema and -N | --exclude-schema at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', '-N',  '"Foo"' ],
+	'cannot use option -a and -N | --exclude-schema at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', '-n',  '"Foo"' ],
+	'cannot use option -a and -n | --schema at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', '-t',  '"Foo".bar' ],
+	'cannot use option -a and -t | --table at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', '-n',  '"Foo"', 'postgres' ],
+	'cannot use option -a and -n | --schema at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', '-d',  'postgres' ],
+	'cannot use option -a and -d | --dbname at the same time');
+$node->command_fails(
+	[ 'vacuumdb',   '-a', 'postgres' ],
+	'cannot use option -a and a dbname as argument at the same time');
+
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a',  '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..28408ed36b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,41 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+/*
+ * The kind of object use in the command line filter.
+ *   OBJFILTER_NONE: no filter used
+ *   OBJFILTER_TABLE: -t | --table
+ *   OBJFILTER_SCHEMA: -n | --schema
+ *   OBJFILTER_SCHEMA_EXCLUDE: -N | --exclude-schema
+ *   OBJFILTER_DATABASE: -d | --dbname
+ *   OBJFILTER_ALL: -a | --all
+ */
+typedef enum
+{
+	OBJFILTER_NONE,
+	OBJFILTER_ALL,
+	OBJFILTER_DATABASE,
+	OBJFILTER_TABLE,
+	OBJFILTER_SCHEMA,
+	OBJFILTER_SCHEMA_EXCLUDE
+} VacObjectFilter;
+
+VacObjectFilter objfilter = OBJFILTER_NONE;
+
+
+typedef enum
+{
+	FILTER_ERR_ALL,
+	FILTER_ERR_SCHEMA_ALL,
+	FILTER_ERR_TABLE_ALL,
+	FILTER_ERR_SCHEMA_TABLE
+} VacFilterErrorr;
+
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -68,11 +98,14 @@ static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
 
 static void help(const char *progname);
 
+void check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option);
+
+void filter_error(VacFilterErrorr filter_error);
+
 /* For analyze-in-stages mode */
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
 
-
 int
 main(int argc, char *argv[])
 {
@@ -94,6 +127,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -122,7 +157,7 @@ main(int argc, char *argv[])
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
 	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +175,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -166,6 +201,8 @@ main(int argc, char *argv[])
 				quiet = true;
 				break;
 			case 'd':
+				check_objfilter(objfilter, OBJFILTER_DATABASE);
+				objfilter = OBJFILTER_DATABASE;
 				dbname = pg_strdup(optarg);
 				break;
 			case 'z':
@@ -178,11 +215,15 @@ main(int argc, char *argv[])
 				vacopts.freeze = true;
 				break;
 			case 'a':
+				check_objfilter(objfilter, OBJFILTER_ALL);
+				objfilter = OBJFILTER_ALL;
 				alldb = true;
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					check_objfilter(objfilter, OBJFILTER_TABLE);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_TABLE;
 					tbl_count++;
 					break;
 				}
@@ -202,6 +243,20 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':			/* include schema(s) */
+				{
+					check_objfilter(objfilter, OBJFILTER_SCHEMA);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA;
+					break;
+				}
+			case 'N':			/* exclude schema(s) */
+				{
+					check_objfilter(objfilter, OBJFILTER_SCHEMA_EXCLUDE);
+					simple_string_list_append(&objects, optarg);
+					objfilter = OBJFILTER_SCHEMA_EXCLUDE;
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -249,6 +304,7 @@ main(int argc, char *argv[])
 	 */
 	if (optind < argc && dbname == NULL)
 	{
+		check_objfilter(objfilter, OBJFILTER_DATABASE);
 		dbname = argv[optind];
 		optind++;
 	}
@@ -318,11 +374,6 @@ main(int argc, char *argv[])
 
 	if (alldb)
 	{
-		if (dbname)
-			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
-			pg_fatal("cannot vacuum specific table(s) in all databases");
-
 		cparams.dbname = maintenance_db;
 
 		vacuum_all_databases(&cparams, &vacopts,
@@ -352,7 +403,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +411,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -368,6 +419,83 @@ main(int argc, char *argv[])
 	exit(0);
 }
 
+/*
+ * Verify that the filters used at command line are compatible
+ */
+void
+check_objfilter(VacObjectFilter curr_objfilter, VacObjectFilter curr_option)
+{
+	switch (curr_option)
+	{
+		case OBJFILTER_NONE:
+			break;
+		case OBJFILTER_DATABASE:
+			/* When filtering on database name, vacuum on all database is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				filter_error(FILTER_ERR_ALL);
+			break;
+		case OBJFILTER_ALL:
+			/* When vacuuming all database, filter on database name is not allowed. */
+			if (curr_objfilter == OBJFILTER_DATABASE)
+				filter_error(FILTER_ERR_ALL);
+			/* When vacuuming all database, filter on schema name is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA
+					|| curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				filter_error(FILTER_ERR_SCHEMA_ALL);
+			/* When vacuuming all database, filter on table name is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE)
+				filter_error(FILTER_ERR_TABLE_ALL);
+			break;
+		case OBJFILTER_TABLE:
+			/* When filtering on table name, filter by schema is not allowed. */
+			if (curr_objfilter == OBJFILTER_SCHEMA
+					|| curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				filter_error(FILTER_ERR_SCHEMA_TABLE);
+			/* When vacuuming all database, filter on table name is not allowed. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				filter_error(FILTER_ERR_TABLE_ALL);
+			break;
+		case OBJFILTER_SCHEMA:
+			/* When filtering on schema name, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE
+					|| curr_objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+				filter_error(FILTER_ERR_SCHEMA_TABLE);
+			/* filtering on schema name can not be use on all database. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				filter_error(FILTER_ERR_SCHEMA_ALL);
+			break;
+		case OBJFILTER_SCHEMA_EXCLUDE:
+			/* When filtering on schema exclusion, filter by table is not allowed. */
+			if (curr_objfilter == OBJFILTER_TABLE
+					|| curr_objfilter == OBJFILTER_SCHEMA)
+				filter_error(FILTER_ERR_SCHEMA_TABLE);
+			/* filtering on schema name can not be use on all database. */
+			if (curr_objfilter == OBJFILTER_ALL)
+				filter_error(FILTER_ERR_SCHEMA_ALL);
+			break;
+	}
+}
+
+void
+filter_error(VacFilterErrorr filter_error)
+{
+	switch (filter_error)
+	{
+		case FILTER_ERR_ALL:
+			pg_fatal("cannot vacuum all databases and a specific one at the same time");
+			break;
+		case FILTER_ERR_SCHEMA_ALL:
+			pg_fatal("cannot vacuum specific schema(s) in all databases");
+			break;
+		case FILTER_ERR_TABLE_ALL:
+			pg_fatal("cannot vacuum specific table(s) in all databases");
+			break;
+		case FILTER_ERR_SCHEMA_TABLE:
+			pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+			break;
+	}
+}
+
 /*
  * vacuum_one_database
  *
@@ -385,7 +513,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +528,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +627,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
-
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+
+		if (objfilter == OBJFILTER_SCHEMA || objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.oid, ");
+		}
+
+		if (objfilter == OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used to
+			 * feed the CTE with values on which are performed pre-run validity
+			 * checks as well.  For now these happen only on the relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +674,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,18 +689,31 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+				 " ON listed_objects.object_oid ");
+
+		if (objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.!=) ");
+		else
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.=) ");
+
+		if (objfilter == OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		else if (objfilter == OBJFILTER_SCHEMA || objfilter == OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
-	 * If no tables were listed, filter for the relevant relation types.  If
-	 * tables were given via --table, don't bother filtering by relation type.
-	 * Instead, let the server decide whether a given relation can be
-	 * processed in which case the user will know about it.
+	 * If no tables were listed, filter for the relevant relation types.
+	 * If tables were given via --table, don't bother filtering by relation
+	 * type.  Instead, let the server decide whether a given relation can
+	 * be processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if (!objects_listed || objfilter == OBJFILTER_SCHEMA
+			|| objfilter == OBJFILTER_SCHEMA_EXCLUDE)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +784,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1128,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do NOT vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));

#27

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Gilles Darold (#26)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

On Fri, Apr 22, 2022 at 11:57:05AM +0200, Gilles Darold wrote:

Patch v10 attached.

Thanks! I've attached a v11 with some minor editorialization. I think I
was able to improve the error handling for invalid combinations of
command-line options a bit, but please let me know what you think.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v11-0001-Add-schema-and-exclude-schema-options-to-vacuumd.patchtext/x-diff; charset=us-asciiDownload

From c180a56055bb9c61c4bb81d586aeb99241596457 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Fri, 22 Apr 2022 22:34:38 -0700
Subject: [PATCH v11 1/1] Add --schema and --exclude-schema options to
 vacuumdb.

These two new options can be used to either process all tables in
specific schemas or to skip processing all tables in specific
schemas.  This change also refactors the handling of invalid
combinations of command-line options to a new helper function.

Author: Gilles Darold
Reviewed-by: Justin Pryzby, Nathan Bossart
Discussion: https://postgr.es/m/929fbf3c-24b8-d454-811f-1d5898ab3e91%40migops.com
---
 doc/src/sgml/ref/vacuumdb.sgml        |  66 ++++++++++
 src/bin/scripts/t/100_vacuumdb.pl     |  35 ++++++
 src/bin/scripts/t/101_vacuumdb_all.pl |   3 +
 src/bin/scripts/vacuumdb.c            | 170 +++++++++++++++++++-------
 4 files changed, 233 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..841aced3bd 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,30 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in
+        <replaceable class="parameter">schema</replaceable> only.  Multiple
+        schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Do not clean or analyze any tables in
+        <replaceable class="parameter">schema</replaceable>.  Multiple schemas
+        can be excluded by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +677,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..e4aac53249 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,38 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--exclude-schema', '"Foo"', 'postgres' ],
+	qr/(?:(?!VACUUM "Foo".bar).)*/,
+	'vacuumdb --exclude-schema schema');
+$node->command_fails(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot use options -n and -t at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-N', '"Foo"', 'postgres' ],
+	'cannot use options -n and -N at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-N', '"Foo"' ],
+	'cannot use options -a and -N at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', '"Foo"' ],
+	'cannot use options -a and -n at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-t', '"Foo".bar' ],
+	'cannot use options -a and -t at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', '"Foo"', 'postgres' ],
+	'cannot use options -a and -n at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-d', 'postgres' ],
+	'cannot use options -a and -d at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', 'postgres' ],
+	'cannot use option -a and a dbname as argument at the same time');
+
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..ccb05166df 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..9ff81e5a17 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,23 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+/* object filter options */
+typedef enum
+{
+	OBJFILTER_NONE = 0,					/* no filter used */
+	OBJFILTER_ALL_DBS = (1 << 0),		/* -a | --all */
+	OBJFILTER_DATABASE = (1 << 1),		/* -d | --dbname */
+	OBJFILTER_TABLE = (1 << 2),			/* -t | --table */
+	OBJFILTER_SCHEMA = (1 << 3),		/* -n | --schema */
+	OBJFILTER_SCHEMA_EXCLUDE = (1 << 4)	/* -N | --exclude-schema */
+} VacObjFilter;
+
+VacObjFilter objfilter = OBJFILTER_NONE;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -68,6 +80,8 @@ static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
 
 static void help(const char *progname);
 
+void check_objfilter(VacObjFilter curr_option);
+
 /* For analyze-in-stages mode */
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
@@ -94,6 +108,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -121,8 +137,7 @@ main(int argc, char *argv[])
 	bool		quiet = false;
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
-	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +155,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -166,6 +181,7 @@ main(int argc, char *argv[])
 				quiet = true;
 				break;
 			case 'd':
+				check_objfilter(OBJFILTER_DATABASE);
 				dbname = pg_strdup(optarg);
 				break;
 			case 'z':
@@ -178,11 +194,12 @@ main(int argc, char *argv[])
 				vacopts.freeze = true;
 				break;
 			case 'a':
-				alldb = true;
+				check_objfilter(OBJFILTER_ALL_DBS);
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					check_objfilter(OBJFILTER_TABLE);
+					simple_string_list_append(&objects, optarg);
 					tbl_count++;
 					break;
 				}
@@ -202,6 +219,18 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':
+				{
+					check_objfilter(OBJFILTER_SCHEMA);
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
+			case 'N':
+				{
+					check_objfilter(OBJFILTER_SCHEMA_EXCLUDE);
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -249,6 +278,7 @@ main(int argc, char *argv[])
 	 */
 	if (optind < argc && dbname == NULL)
 	{
+		check_objfilter(OBJFILTER_DATABASE);
 		dbname = argv[optind];
 		optind++;
 	}
@@ -316,13 +346,8 @@ main(int argc, char *argv[])
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
 
-	if (alldb)
+	if (objfilter & OBJFILTER_ALL_DBS)
 	{
-		if (dbname)
-			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
-			pg_fatal("cannot vacuum specific table(s) in all databases");
-
 		cparams.dbname = maintenance_db;
 
 		vacuum_all_databases(&cparams, &vacopts,
@@ -352,7 +377,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +385,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -368,6 +393,45 @@ main(int argc, char *argv[])
 	exit(0);
 }
 
+/*
+ * Verify that the filters used at command line are compatible.
+ */
+void
+check_objfilter(VacObjFilter curr_option)
+{
+	VacObjFilter	invalid_combo;
+
+	objfilter |= curr_option;
+
+	invalid_combo = (OBJFILTER_ALL_DBS | OBJFILTER_DATABASE);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum all databases and a specific one at the same time");
+
+	invalid_combo = (OBJFILTER_ALL_DBS | OBJFILTER_TABLE);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum specific table(s) in all databases");
+
+	invalid_combo = (OBJFILTER_ALL_DBS | OBJFILTER_SCHEMA);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum specific schema(s) in all databases");
+
+	invalid_combo = (OBJFILTER_ALL_DBS | OBJFILTER_SCHEMA_EXCLUDE);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot exclude specific schema(s) in all databases");
+
+	invalid_combo = (OBJFILTER_TABLE | OBJFILTER_SCHEMA);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+	invalid_combo = (OBJFILTER_TABLE | OBJFILTER_SCHEMA_EXCLUDE);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum specific table(s) and exclude schema(s) at the same time");
+
+	invalid_combo = (OBJFILTER_SCHEMA | OBJFILTER_SCHEMA_EXCLUDE);
+	if ((objfilter & invalid_combo) == invalid_combo)
+		pg_fatal("cannot vacuum all tables in schema(s) and exclude schema(s) at the same time");
+}
+
 /*
  * vacuum_one_database
  *
@@ -385,7 +449,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +464,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +563,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
-
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		if (objfilter & (OBJFILTER_SCHEMA | OBJFILTER_SCHEMA_EXCLUDE))
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace, ");
+		}
+
+		if (objfilter & OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used
+			 * to feed the CTE with values on which are performed pre-run
+			 * validity checks as well.  For now these happen only on the
+			 * relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +610,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,10 +625,22 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+							 " ON listed_objects.object_oid ");
+
+		if (objfilter & OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.!=) ");
+		else
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.=) ");
+
+		if (objfilter & OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		else
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
 	 * If no tables were listed, filter for the relevant relation types.  If
@@ -562,7 +648,7 @@ vacuum_one_database(ConnParams *cparams,
 	 * Instead, let the server decide whether a given relation can be
 	 * processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if ((objfilter & OBJFILTER_TABLE) == 0)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +719,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1063,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do not vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
-- 
2.25.1

#28

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Nathan Bossart (#27)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

On Fri, Apr 22, 2022 at 10:57:46PM -0700, Nathan Bossart wrote:

On Fri, Apr 22, 2022 at 11:57:05AM +0200, Gilles Darold wrote:

Patch v10 attached.

Thanks! I've attached a v11 with some minor editorialization. I think I
was able to improve the error handling for invalid combinations of
command-line options a bit, but please let me know what you think.

I've attached a v12 of the patch that further simplifieѕ check_objfilter().
Apologies for the noise.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v12-0001-Add-schema-and-exclude-schema-options-to-vacuumd.patchtext/x-diff; charset=us-asciiDownload

From b920a5d32d67d45b6cf25879c7bac81393612e59 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Fri, 22 Apr 2022 22:34:38 -0700
Subject: [PATCH v12 1/1] Add --schema and --exclude-schema options to
 vacuumdb.

These two new options can be used to either process all tables in
specific schemas or to skip processing all tables in specific
schemas.  This change also refactors the handling of invalid
combinations of command-line options to a new helper function.

Author: Gilles Darold
Reviewed-by: Justin Pryzby, Nathan Bossart
Discussion: https://postgr.es/m/929fbf3c-24b8-d454-811f-1d5898ab3e91%40migops.com
---
 doc/src/sgml/ref/vacuumdb.sgml        |  66 ++++++++++
 src/bin/scripts/t/100_vacuumdb.pl     |  35 ++++++
 src/bin/scripts/t/101_vacuumdb_all.pl |   3 +
 src/bin/scripts/vacuumdb.c            | 168 +++++++++++++++++++-------
 4 files changed, 231 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..841aced3bd 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,30 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in
+        <replaceable class="parameter">schema</replaceable> only.  Multiple
+        schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Do not clean or analyze any tables in
+        <replaceable class="parameter">schema</replaceable>.  Multiple schemas
+        can be excluded by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +677,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..e4aac53249 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,38 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema schema only');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--exclude-schema', '"Foo"', 'postgres' ],
+	qr/(?:(?!VACUUM "Foo".bar).)*/,
+	'vacuumdb --exclude-schema schema');
+$node->command_fails(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	'cannot use options -n and -t at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-N', '"Foo"', 'postgres' ],
+	'cannot use options -n and -N at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-N', '"Foo"' ],
+	'cannot use options -a and -N at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', '"Foo"' ],
+	'cannot use options -a and -n at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-t', '"Foo".bar' ],
+	'cannot use options -a and -t at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', '"Foo"', 'postgres' ],
+	'cannot use options -a and -n at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-d', 'postgres' ],
+	'cannot use options -a and -d at the same time');
+$node->command_fails(
+	[ 'vacuumdb', '-a', 'postgres' ],
+	'cannot use option -a and a dbname as argument at the same time');
+
 
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..ccb05166df 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..5e24e63919 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,23 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+/* object filter options */
+typedef enum
+{
+	OBJFILTER_NONE = 0,					/* no filter used */
+	OBJFILTER_ALL_DBS = (1 << 0),		/* -a | --all */
+	OBJFILTER_DATABASE = (1 << 1),		/* -d | --dbname */
+	OBJFILTER_TABLE = (1 << 2),			/* -t | --table */
+	OBJFILTER_SCHEMA = (1 << 3),		/* -n | --schema */
+	OBJFILTER_SCHEMA_EXCLUDE = (1 << 4)	/* -N | --exclude-schema */
+} VacObjFilter;
+
+VacObjFilter objfilter = OBJFILTER_NONE;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -68,6 +80,8 @@ static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
 
 static void help(const char *progname);
 
+void check_objfilter(VacObjFilter curr_option);
+
 /* For analyze-in-stages mode */
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
@@ -94,6 +108,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -121,8 +137,7 @@ main(int argc, char *argv[])
 	bool		quiet = false;
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
-	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +155,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -166,6 +181,7 @@ main(int argc, char *argv[])
 				quiet = true;
 				break;
 			case 'd':
+				check_objfilter(OBJFILTER_DATABASE);
 				dbname = pg_strdup(optarg);
 				break;
 			case 'z':
@@ -178,11 +194,12 @@ main(int argc, char *argv[])
 				vacopts.freeze = true;
 				break;
 			case 'a':
-				alldb = true;
+				check_objfilter(OBJFILTER_ALL_DBS);
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					check_objfilter(OBJFILTER_TABLE);
+					simple_string_list_append(&objects, optarg);
 					tbl_count++;
 					break;
 				}
@@ -202,6 +219,18 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':
+				{
+					check_objfilter(OBJFILTER_SCHEMA);
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
+			case 'N':
+				{
+					check_objfilter(OBJFILTER_SCHEMA_EXCLUDE);
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -249,6 +278,7 @@ main(int argc, char *argv[])
 	 */
 	if (optind < argc && dbname == NULL)
 	{
+		check_objfilter(OBJFILTER_DATABASE);
 		dbname = argv[optind];
 		optind++;
 	}
@@ -316,13 +346,8 @@ main(int argc, char *argv[])
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
 
-	if (alldb)
+	if (objfilter & OBJFILTER_ALL_DBS)
 	{
-		if (dbname)
-			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
-			pg_fatal("cannot vacuum specific table(s) in all databases");
-
 		cparams.dbname = maintenance_db;
 
 		vacuum_all_databases(&cparams, &vacopts,
@@ -352,7 +377,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +385,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -368,6 +393,43 @@ main(int argc, char *argv[])
 	exit(0);
 }
 
+/*
+ * Verify that the filters used at command line are compatible.
+ */
+void
+check_objfilter(VacObjFilter curr_option)
+{
+	objfilter |= curr_option;
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_DATABASE))
+		pg_fatal("cannot vacuum all databases and a specific one at the same time");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_TABLE))
+		pg_fatal("cannot vacuum specific table(s) in all databases");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_SCHEMA))
+		pg_fatal("cannot vacuum specific schema(s) in all databases");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot exclude specific schema(s) in all databases");
+
+	if ((objfilter & OBJFILTER_TABLE) &&
+		(objfilter & OBJFILTER_SCHEMA))
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+	if ((objfilter & OBJFILTER_TABLE) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot vacuum specific table(s) and exclude schema(s) at the same time");
+
+	if ((objfilter & OBJFILTER_SCHEMA) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot vacuum all tables in schema(s) and exclude schema(s) at the same time");
+}
+
 /*
  * vacuum_one_database
  *
@@ -385,7 +447,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +462,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +561,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
-
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		if (objfilter & (OBJFILTER_SCHEMA | OBJFILTER_SCHEMA_EXCLUDE))
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace, ");
+		}
+
+		if (objfilter & OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used
+			 * to feed the CTE with values on which are performed pre-run
+			 * validity checks as well.  For now these happen only on the
+			 * relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +608,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,10 +623,22 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+							 " ON listed_objects.object_oid ");
+
+		if (objfilter & OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.!=) ");
+		else
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.=) ");
+
+		if (objfilter & OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		else
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
 	 * If no tables were listed, filter for the relevant relation types.  If
@@ -562,7 +646,7 @@ vacuum_one_database(ConnParams *cparams,
 	 * Instead, let the server decide whether a given relation can be
 	 * processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if ((objfilter & OBJFILTER_TABLE) == 0)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +717,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1061,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do not vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
-- 
2.25.1

#29

Gilles Darold

gilles@migops.com

over 3 years ago

In reply to: Nathan Bossart (#28)

Re: [Proposal] vacuumdb --schema only

goo
Le 25/04/2022 à 03:27, Nathan Bossart a écrit :

On Fri, Apr 22, 2022 at 10:57:46PM -0700, Nathan Bossart wrote:

On Fri, Apr 22, 2022 at 11:57:05AM +0200, Gilles Darold wrote:

Patch v10 attached.

Thanks! I've attached a v11 with some minor editorialization. I think I
was able to improve the error handling for invalid combinations of
command-line options a bit, but please let me know what you think.

I've attached a v12 of the patch that further simplifieѕ check_objfilter().
Apologies for the noise.

Looks good for me, there is a failure on cfbot on FreeBSD but I have run
a CI with latest master and it pass.

Can I change the commitfest status to ready for committers?

--
Gilles Darold

#30

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Gilles Darold (#29)

Re: [Proposal] vacuumdb --schema only

On Mon, Apr 25, 2022 at 08:50:09AM +0200, Gilles Darold wrote:

Can I change the commitfest status to ready for committers?

I've marked it as ready-for-committer.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#31

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Nathan Bossart (#30)

Re: [Proposal] vacuumdb --schema only

On Mon, Apr 25, 2022 at 09:18:53AM -0700, Nathan Bossart wrote:

I've marked it as ready-for-committer.

The refactoring logic to build the queries is clear to follow. I have
a few comments about the shape of the patch, though.

            case 'a':
-               alldb = true;
+               check_objfilter(OBJFILTER_ALL_DBS);
                break;
The cross-option checks are usually done after all the options
switches are check.  Why does this need to be different?  It does not
strike me as a huge problem to do one filter check at the end.

+void
+check_objfilter(VacObjFilter curr_option)
+{
+   objfilter |= curr_option;
+
+   if ((objfilter & OBJFILTER_ALL_DBS) &&
+       (objfilter & OBJFILTER_DATABASE))
+       pg_fatal("cannot vacuum all databases and a specific one at the same time");
The addition of more OBJFILTER_* (unlikely going to happen, but who
knows) would make it hard to know which option should not interact
with each other.  Wouldn't it be better to use a kind of compatibility
table for that?  As one OBJFILTER_* matches with one option, you could
simplify the number of strings in need of translation by switching to
an error message like "cannot use options %s and %s together", or
something like that?

+$node->command_fails(
+   [ 'vacuumdb', '-a', '-d', 'postgres' ],
+   'cannot use options -a and -d at the same time');
This set of tests had better use command_fails_like() to make sure
that the correct error patterns from check_objfilter() show up?
--
Michael

#32

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Michael Paquier (#31)

1 attachment(s)

Re: [Proposal] vacuumdb --schema only

On Tue, Apr 26, 2022 at 11:36:02AM +0900, Michael Paquier wrote:

The refactoring logic to build the queries is clear to follow. I have
a few comments about the shape of the patch, though.

Thanks for taking a look!

case 'a':
-               alldb = true;
+               check_objfilter(OBJFILTER_ALL_DBS);
break;
The cross-option checks are usually done after all the options
switches are check.  Why does this need to be different?  It does not
strike me as a huge problem to do one filter check at the end.

Makes sense. I fixed this in v13.

+void
+check_objfilter(VacObjFilter curr_option)
+{
+   objfilter |= curr_option;
+
+   if ((objfilter & OBJFILTER_ALL_DBS) &&
+       (objfilter & OBJFILTER_DATABASE))
+       pg_fatal("cannot vacuum all databases and a specific one at the same time");
The addition of more OBJFILTER_* (unlikely going to happen, but who
knows) would make it hard to know which option should not interact
with each other.  Wouldn't it be better to use a kind of compatibility
table for that?  As one OBJFILTER_* matches with one option, you could
simplify the number of strings in need of translation by switching to
an error message like "cannot use options %s and %s together", or
something like that?

I think this might actually make things more complicated. In addition to
the compatibility table, we'd need to define the strings to use in the
error message somewhere. I can give this a try if you feel strongly about
it.

+$node->command_fails(
+   [ 'vacuumdb', '-a', '-d', 'postgres' ],
+   'cannot use options -a and -d at the same time');
This set of tests had better use command_fails_like() to make sure
that the correct error patterns from check_objfilter() show up?

Yes. I did this in v13.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v13-0001-Add-schema-and-exclude-schema-options-to-vacuumd.patchtext/x-diff; charset=us-asciiDownload

From 7559e6a289058a36eb3b10414b929e52b0bc5cbf Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Fri, 22 Apr 2022 22:34:38 -0700
Subject: [PATCH v13 1/1] Add --schema and --exclude-schema options to
 vacuumdb.

These two new options can be used to either process all tables in
specific schemas or to skip processing all tables in specific
schemas.  This change also refactors the handling of invalid
combinations of command-line options to a new helper function.

Author: Gilles Darold
Reviewed-by: Justin Pryzby, Nathan Bossart
Discussion: https://postgr.es/m/929fbf3c-24b8-d454-811f-1d5898ab3e91%40migops.com
---
 doc/src/sgml/ref/vacuumdb.sgml    |  66 ++++++++++++
 src/bin/scripts/t/100_vacuumdb.pl |  42 ++++++++
 src/bin/scripts/vacuumdb.c        | 172 +++++++++++++++++++++++-------
 3 files changed, 239 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..841aced3bd 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,40 @@ PostgreSQL documentation
    <arg choice="opt"><replaceable>dbname</replaceable></arg>
   </cmdsynopsis>
 
+  <cmdsynopsis>
+   <command>vacuumdb</command>
+   <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+
+   <arg choice="plain" rep="repeat">
+    <arg choice="opt">
+     <group choice="plain">
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-n</option></arg>
+          <arg choice="plain"><option>--schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+
+       <arg choice="plain">
+        <arg choice="opt">
+         <group choice="plain">
+          <arg choice="plain"><option>-N</option></arg>
+          <arg choice="plain"><option>--exclude-schema</option></arg>
+         </group>
+         <replaceable>schema</replaceable>
+        </arg>
+       </arg>
+     </group>
+    </arg>
+   </arg>
+
+   <arg choice="opt"><replaceable>dbname</replaceable></arg>
+  </cmdsynopsis>
+
   <cmdsynopsis>
    <command>vacuumdb</command>
    <arg rep="repeat"><replaceable>connection-option</replaceable></arg>
@@ -244,6 +278,30 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Clean or analyze all tables in
+        <replaceable class="parameter">schema</replaceable> only.  Multiple
+        schemas can be vacuumed by writing multiple <option>-n</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-N <replaceable class="parameter">schema</replaceable></option></term>
+      <term><option>--exclude-schema=<replaceable class="parameter">schema</replaceable></option></term>
+      <listitem>
+       <para>
+        Do not clean or analyze any tables in
+        <replaceable class="parameter">schema</replaceable>.  Multiple schemas
+        can be excluded by writing multiple <option>-N</option> switches.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--no-index-cleanup</option></term>
       <listitem>
@@ -619,6 +677,14 @@ PostgreSQL documentation
 <prompt>$ </prompt><userinput>vacuumdb --analyze --verbose --table='foo(bar)' xyzzy</userinput>
 </screen></para>
 
+   <para>
+    To clean all tables in the <literal>foo</literal> and <literal>bar</literal> schemas
+    in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='foo' --schema='bar' xyzzy</userinput>
+</screen></para>
+
+
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..e5343774fe 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
   CREATE TABLE funcidx (x int);
   INSERT INTO funcidx VALUES (0),(1),(2),(3);
   CREATE INDEX i0 ON funcidx ((f1(x)));
+  CREATE SCHEMA "Foo";
+  CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,45 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".bar/,
+	'vacuumdb --schema');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--exclude-schema', '"Foo"', 'postgres' ],
+	qr/(?:(?!VACUUM "Foo".bar).)*/,
+	'vacuumdb --exclude-schema');
+$node->command_fails_like(
+	[ 'vacuumdb', '-N', 'pg_catalog', '-t', 'pg_class', 'postgres', ],
+	qr/cannot vacuum specific table\(s\) and exclude schema\(s\) at the same time/,
+	'cannot use options -N and -t at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-t', 'pg_class', 'postgres' ],
+	qr/cannot vacuum all tables in schema\(s\) and specific table\(s\) at the same time/,
+	'cannot use options -n and -t at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-N', '"Foo"', 'postgres' ],
+	qr/cannot vacuum all tables in schema\(s\) and exclude schema\(s\) at the same time/,
+	'cannot use options -n and -N at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-a', '-N', '"Foo"' ],
+	qr/cannot exclude specific schema\(s\) in all databases/,
+	'cannot use options -a and -N at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-a', '-n', '"Foo"' ],
+	qr/cannot vacuum specific schema\(s\) in all databases/,
+	'cannot use options -a and -n at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-a', '-t', '"Foo".bar' ],
+	qr/cannot vacuum specific table\(s\) in all databases/,
+	'cannot use options -a and -t at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-a', '-d', 'postgres' ],
+	qr/cannot vacuum all databases and a specific one at the same time/,
+	'cannot use options -a and -d at the same time');
+$node->command_fails_like(
+	[ 'vacuumdb', '-a', 'postgres' ],
+	qr/cannot vacuum all databases and a specific one at the same time/,
+	'cannot use option -a and a dbname as argument at the same time');
 
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 92f1ffe147..0482aa9e88 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,11 +46,23 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;
 
+/* object filter options */
+typedef enum
+{
+	OBJFILTER_NONE = 0,					/* no filter used */
+	OBJFILTER_ALL_DBS = (1 << 0),		/* -a | --all */
+	OBJFILTER_DATABASE = (1 << 1),		/* -d | --dbname */
+	OBJFILTER_TABLE = (1 << 2),			/* -t | --table */
+	OBJFILTER_SCHEMA = (1 << 3),		/* -n | --schema */
+	OBJFILTER_SCHEMA_EXCLUDE = (1 << 4)	/* -N | --exclude-schema */
+} VacObjFilter;
+
+VacObjFilter objfilter = OBJFILTER_NONE;
 
 static void vacuum_one_database(ConnParams *cparams,
 								vacuumingOptions *vacopts,
 								int stage,
-								SimpleStringList *tables,
+								SimpleStringList *objects,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
 
@@ -68,6 +80,8 @@ static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
 
 static void help(const char *progname);
 
+void check_objfilter(void);
+
 /* For analyze-in-stages mode */
 #define ANALYZE_NO_STAGE	-1
 #define ANALYZE_NUM_STAGES	3
@@ -94,6 +108,8 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
+		{"exclude-schema", required_argument, NULL, 'N'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -121,8 +137,7 @@ main(int argc, char *argv[])
 	bool		quiet = false;
 	vacuumingOptions vacopts;
 	bool		analyze_in_stages = false;
-	bool		alldb = false;
-	SimpleStringList tables = {NULL, NULL};
+	SimpleStringList objects = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
 
@@ -140,7 +155,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "vacuumdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:P:n:N:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -166,6 +181,7 @@ main(int argc, char *argv[])
 				quiet = true;
 				break;
 			case 'd':
+				objfilter |= OBJFILTER_DATABASE;
 				dbname = pg_strdup(optarg);
 				break;
 			case 'z':
@@ -178,11 +194,12 @@ main(int argc, char *argv[])
 				vacopts.freeze = true;
 				break;
 			case 'a':
-				alldb = true;
+				objfilter |= OBJFILTER_ALL_DBS;
 				break;
 			case 't':
 				{
-					simple_string_list_append(&tables, optarg);
+					objfilter |= OBJFILTER_TABLE;
+					simple_string_list_append(&objects, optarg);
 					tbl_count++;
 					break;
 				}
@@ -202,6 +219,18 @@ main(int argc, char *argv[])
 									  &vacopts.parallel_workers))
 					exit(1);
 				break;
+			case 'n':
+				{
+					objfilter |= OBJFILTER_SCHEMA;
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
+			case 'N':
+				{
+					objfilter |= OBJFILTER_SCHEMA_EXCLUDE;
+					simple_string_list_append(&objects, optarg);
+					break;
+				}
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -249,6 +278,7 @@ main(int argc, char *argv[])
 	 */
 	if (optind < argc && dbname == NULL)
 	{
+		objfilter |= OBJFILTER_DATABASE;
 		dbname = argv[optind];
 		optind++;
 	}
@@ -261,6 +291,12 @@ main(int argc, char *argv[])
 		exit(1);
 	}
 
+	/*
+	 * Validate the combination of filters specified in the command-line
+	 * options.
+	 */
+	check_objfilter();
+
 	if (vacopts.analyze_only)
 	{
 		if (vacopts.full)
@@ -316,13 +352,8 @@ main(int argc, char *argv[])
 	if (tbl_count && (concurrentCons > tbl_count))
 		concurrentCons = tbl_count;
 
-	if (alldb)
+	if (objfilter & OBJFILTER_ALL_DBS)
 	{
-		if (dbname)
-			pg_fatal("cannot vacuum all databases and a specific one at the same time");
-		if (tables.head != NULL)
-			pg_fatal("cannot vacuum specific table(s) in all databases");
-
 		cparams.dbname = maintenance_db;
 
 		vacuum_all_databases(&cparams, &vacopts,
@@ -352,7 +383,7 @@ main(int argc, char *argv[])
 			{
 				vacuum_one_database(&cparams, &vacopts,
 									stage,
-									&tables,
+									&objects,
 									concurrentCons,
 									progname, echo, quiet);
 			}
@@ -360,7 +391,7 @@ main(int argc, char *argv[])
 		else
 			vacuum_one_database(&cparams, &vacopts,
 								ANALYZE_NO_STAGE,
-								&tables,
+								&objects,
 								concurrentCons,
 								progname, echo, quiet);
 	}
@@ -368,6 +399,41 @@ main(int argc, char *argv[])
 	exit(0);
 }
 
+/*
+ * Verify that the filters used at command line are compatible.
+ */
+void
+check_objfilter(void)
+{
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_DATABASE))
+		pg_fatal("cannot vacuum all databases and a specific one at the same time");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_TABLE))
+		pg_fatal("cannot vacuum specific table(s) in all databases");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_SCHEMA))
+		pg_fatal("cannot vacuum specific schema(s) in all databases");
+
+	if ((objfilter & OBJFILTER_ALL_DBS) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot exclude specific schema(s) in all databases");
+
+	if ((objfilter & OBJFILTER_TABLE) &&
+		(objfilter & OBJFILTER_SCHEMA))
+		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+
+	if ((objfilter & OBJFILTER_TABLE) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot vacuum specific table(s) and exclude schema(s) at the same time");
+
+	if ((objfilter & OBJFILTER_SCHEMA) &&
+		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
+		pg_fatal("cannot vacuum all tables in schema(s) and exclude schema(s) at the same time");
+}
+
 /*
  * vacuum_one_database
  *
@@ -385,7 +451,7 @@ static void
 vacuum_one_database(ConnParams *cparams,
 					vacuumingOptions *vacopts,
 					int stage,
-					SimpleStringList *tables,
+					SimpleStringList *objects,
 					int concurrentCons,
 					const char *progname, bool echo, bool quiet)
 {
@@ -400,7 +466,7 @@ vacuum_one_database(ConnParams *cparams,
 	int			i;
 	int			ntups;
 	bool		failed = false;
-	bool		tables_listed = false;
+	bool		objects_listed = false;
 	bool		has_where = false;
 	const char *initcmd;
 	const char *stage_commands[] = {
@@ -499,31 +565,41 @@ vacuum_one_database(ConnParams *cparams,
 	 * catalog query will fail.
 	 */
 	initPQExpBuffer(&catalog_query);
-	for (cell = tables ? tables->head : NULL; cell; cell = cell->next)
+	for (cell = objects ? objects->head : NULL; cell; cell = cell->next)
 	{
-		char	   *just_table;
-		const char *just_columns;
+		char	   *just_table = NULL;
+		const char *just_columns = NULL;
 
-		/*
-		 * Split relation and column names given by the user, this is used to
-		 * feed the CTE with values on which are performed pre-run validity
-		 * checks as well.  For now these happen only on the relation name.
-		 */
-		splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
-							  &just_table, &just_columns);
-
-		if (!tables_listed)
+		if (!objects_listed)
 		{
 			appendPQExpBufferStr(&catalog_query,
-								 "WITH listed_tables (table_oid, column_list) "
+								 "WITH listed_objects (object_oid, column_list) "
 								 "AS (\n  VALUES (");
-			tables_listed = true;
+			objects_listed = true;
 		}
 		else
 			appendPQExpBufferStr(&catalog_query, ",\n  (");
 
-		appendStringLiteralConn(&catalog_query, just_table, conn);
-		appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		if (objfilter & (OBJFILTER_SCHEMA | OBJFILTER_SCHEMA_EXCLUDE))
+		{
+			appendStringLiteralConn(&catalog_query, cell->val, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace, ");
+		}
+
+		if (objfilter & OBJFILTER_TABLE)
+		{
+			/*
+			 * Split relation and column names given by the user, this is used
+			 * to feed the CTE with values on which are performed pre-run
+			 * validity checks as well.  For now these happen only on the
+			 * relation name.
+			 */
+			splitTableColumnsSpec(cell->val, PQclientEncoding(conn),
+								  &just_table, &just_columns);
+
+			appendStringLiteralConn(&catalog_query, just_table, conn);
+			appendPQExpBufferStr(&catalog_query, "::pg_catalog.regclass, ");
+		}
 
 		if (just_columns && just_columns[0] != '\0')
 			appendStringLiteralConn(&catalog_query, just_columns, conn);
@@ -536,13 +612,13 @@ vacuum_one_database(ConnParams *cparams,
 	}
 
 	/* Finish formatting the CTE */
-	if (tables_listed)
+	if (objects_listed)
 		appendPQExpBufferStr(&catalog_query, "\n)\n");
 
 	appendPQExpBufferStr(&catalog_query, "SELECT c.relname, ns.nspname");
 
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, ", listed_tables.column_list");
+	if (objects_listed)
+		appendPQExpBufferStr(&catalog_query, ", listed_objects.column_list");
 
 	appendPQExpBufferStr(&catalog_query,
 						 " FROM pg_catalog.pg_class c\n"
@@ -551,10 +627,22 @@ vacuum_one_database(ConnParams *cparams,
 						 " LEFT JOIN pg_catalog.pg_class t"
 						 " ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid\n");
 
-	/* Used to match the tables listed by the user */
-	if (tables_listed)
-		appendPQExpBufferStr(&catalog_query, " JOIN listed_tables"
-							 " ON listed_tables.table_oid OPERATOR(pg_catalog.=) c.oid\n");
+	/* Used to match the tables or schemas listed by the user */
+	if (objects_listed)
+	{
+		appendPQExpBufferStr(&catalog_query, " JOIN listed_objects"
+							 " ON listed_objects.object_oid ");
+
+		if (objfilter & OBJFILTER_SCHEMA_EXCLUDE)
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.!=) ");
+		else
+			appendPQExpBufferStr(&catalog_query, "OPERATOR(pg_catalog.=) ");
+
+		if (objfilter & OBJFILTER_TABLE)
+			appendPQExpBufferStr(&catalog_query, "c.oid\n");
+		else
+			appendPQExpBufferStr(&catalog_query, "ns.oid\n");
+	}
 
 	/*
 	 * If no tables were listed, filter for the relevant relation types.  If
@@ -562,7 +650,7 @@ vacuum_one_database(ConnParams *cparams,
 	 * Instead, let the server decide whether a given relation can be
 	 * processed in which case the user will know about it.
 	 */
-	if (!tables_listed)
+	if ((objfilter & OBJFILTER_TABLE) == 0)
 	{
 		appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array["
 							 CppAsString2(RELKIND_RELATION) ", "
@@ -633,7 +721,7 @@ vacuum_one_database(ConnParams *cparams,
 							 fmtQualifiedId(PQgetvalue(res, i, 1),
 											PQgetvalue(res, i, 0)));
 
-		if (tables_listed && !PQgetisnull(res, i, 2))
+		if (objects_listed && !PQgetisnull(res, i, 2))
 			appendPQExpBufferStr(&buf, PQgetvalue(res, i, 2));
 
 		simple_string_list_append(&dbtables, buf.data);
@@ -977,6 +1065,8 @@ help(const char *progname)
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-toast          skip the TOAST table associated with the table to vacuum\n"));
 	printf(_("      --no-truncate               don't truncate empty pages at the end of the table\n"));
+	printf(_("  -n, --schema=PATTERN            vacuum tables in the specified schema(s) only\n"));
+	printf(_("  -N, --exclude-schema=PATTERN    do not vacuum tables in the specified schema(s)\n"));
 	printf(_("  -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
 	printf(_("  -q, --quiet                     don't write any messages\n"));
 	printf(_("      --skip-locked               skip relations that cannot be immediately locked\n"));
-- 
2.25.1

#33

Andrew Dunstan

andrew@dunslane.net

over 3 years ago

In reply to: Nathan Bossart (#32)

Re: [Proposal] vacuumdb --schema only

On 2022-04-26 Tu 00:46, Nathan Bossart wrote:

On Tue, Apr 26, 2022 at 11:36:02AM +0900, Michael Paquier wrote:

The refactoring logic to build the queries is clear to follow. I have
a few comments about the shape of the patch, though.

Thanks for taking a look!

case 'a':
-               alldb = true;
+               check_objfilter(OBJFILTER_ALL_DBS);
break;
The cross-option checks are usually done after all the options
switches are check.  Why does this need to be different?  It does not
strike me as a huge problem to do one filter check at the end.

Makes sense. I fixed this in v13.

+void
+check_objfilter(VacObjFilter curr_option)
+{
+   objfilter |= curr_option;
+
+   if ((objfilter & OBJFILTER_ALL_DBS) &&
+       (objfilter & OBJFILTER_DATABASE))
+       pg_fatal("cannot vacuum all databases and a specific one at the same time");
The addition of more OBJFILTER_* (unlikely going to happen, but who
knows) would make it hard to know which option should not interact
with each other.  Wouldn't it be better to use a kind of compatibility
table for that?  As one OBJFILTER_* matches with one option, you could
simplify the number of strings in need of translation by switching to
an error message like "cannot use options %s and %s together", or
something like that?

+$node->command_fails(
+   [ 'vacuumdb', '-a', '-d', 'postgres' ],
+   'cannot use options -a and -d at the same time');
This set of tests had better use command_fails_like() to make sure
that the correct error patterns from check_objfilter() show up?

Yes. I did this in v13.

committed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com