[Proposal] vacuumdb --schema only
Hi,
When we want to vacuum and/or analyze all tables in a dedicated schema,
let's say pg_catalog for example, there is no easy way to do that. The
VACUUM command doesn't allow it so we have to use \gexec or a SQL script
to do that. We have an external command vacuumdb that could be used to
simplify this task. For example the following command can be used to
clean all tables stored in the pg_catalog schema:
vacuumdb --schema pg_catalog -d foo
The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.
Common use cases are an application that creates lot of temporary
objects then drop them which can bloat a lot the catalog or which have
heavy work in some schemas only. Of course the good practice is to find
the bloated tables and execute VACUUM on each table but if most of the
tables in the schema are regularly bloated the use of the vacuumdb
--schema script can save time.
I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that and also
because I'm not really in favor of such change. But if there is interest
in improving these commands I will be pleased to do that, with the
syntax suggested.
Best regards,
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only.patchDownload+98-2
On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:
The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.
Yes, thanks.
I suggest there should also be an --exclude-schema.
I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that
I think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.
+ /* + * When filtereing on schema name, filter by table is not allowed. + * The schema name can already be set in a fqdn table name.
set *to*
--
Justin
On Fri, 4 Mar 2022 at 14:41, Gilles Darold <gilles@migops.com> wrote:
Hi,
When we want to vacuum and/or analyze all tables in a dedicated schema,
let's say pg_catalog for example, there is no easy way to do that. The
VACUUM command doesn't allow it so we have to use \gexec or a SQL script
to do that. We have an external command vacuumdb that could be used to
simplify this task. For example the following command can be used to
clean all tables stored in the pg_catalog schema:vacuumdb --schema pg_catalog -d foo
+1
This gives much better flexibility to users.
Show quoted text
The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.Common use cases are an application that creates lot of temporary
objects then drop them which can bloat a lot the catalog or which have
heavy work in some schemas only. Of course the good practice is to find
the bloated tables and execute VACUUM on each table but if most of the
tables in the schema are regularly bloated the use of the vacuumdb
--schema script can save time.I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do that and also
because I'm not really in favor of such change. But if there is interest
in improving these commands I will be pleased to do that, with the
syntax suggested.Best regards,
--
Gilles Darold
Le 04/03/2022 à 11:56, Justin Pryzby a écrit :
On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:
The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.Yes, thanks.
I suggest there should also be an --exclude-schema.
Ok, I will add it too.
I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do thatI think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.
Yes this is what I've though, something a la EXPLAIN, for example :
"VACUUM (ANALYZE, SCHEMA foo)" but this is a change in the VACUUM syntax
that needs to keep the compatibility with the current syntax. We will
have two syntax something like "VACUUM ANALYZE FULL dbname" and "VACUUM
(ANALYZE, FULL) dbname". The other syntax "problem" is to be able to use
multiple schema values in the VACUUM command, perhaps "VACUUM (ANALYZE,
SCHEMA (foo,bar))".
+ /* + * When filtereing on schema name, filter by table is not allowed. + * The schema name can already be set in a fqdn table name.set *to*
Thanks, will be fixed in next patch version.
--
Gilles Darold
Le 04/03/2022 à 11:56, Justin Pryzby a écrit :
On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:
The attached patch implements that. Option -n | --schema can be used
multiple time and can not be used together with options -a or -t.Yes, thanks.
I suggest there should also be an --exclude-schema.
I do not propose to extend the VACUUM and ANALYZE commands because their
current syntax doesn't allow me to see an easy way to do thatI think this would be easy with the parenthesized syntax.
I'm not suggesting to do it there, though.+ /* + * When filtereing on schema name, filter by table is not allowed. + * The schema name can already be set in a fqdn table name.set *to*
Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.
I will add this patch to the commitfest unless there is cons about
adding these options.
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only-v2.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v2.patchDownload+154-1
On Sun, Mar 06, 2022 at 09:39:37AM +0100, Gilles Darold wrote:
Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.+ pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");
and and
It's odd that schema_exclusion is a global var, but schemas/excluded are not.
Also, it seems unnecessary to have two schemas vars, since they can't be used
together. Maybe there's a better way than what I did in 003.
+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
It's preferred to write cell != NULL
+ bool schemas_listed = false;
...
+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next) + { + if (!schemas_listed) { + appendPQExpBufferStr(&catalog_query, + " AND pg_catalog.quote_ident(ns.nspname)"); + if (schema_exclusion) + appendPQExpBufferStr(&catalog_query, " NOT IN ("); + else + appendPQExpBufferStr(&catalog_query, " IN ("); + + schemas_listed = true; + } + else + appendPQExpBufferStr(&catalog_query, ", "); + + appendStringLiteralConn(&catalog_query, cell->val, conn); + appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name"); + + } + /* Finish formatting schema filter */ + if (schemas_listed) + appendPQExpBufferStr(&catalog_query, ")\n"); }
Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.
--
Justin
Le 06/03/2022 à 16:04, Justin Pryzby a écrit :
On Sun, Mar 06, 2022 at 09:39:37AM +0100, Gilles Darold wrote:
Attached a new patch version that adds the -N | --exclude-schema option
to the vacuumdb command as suggested. Documentation updated too.+ pg_log_error("cannot vacuum all tables in schema(s) and and exclude specific schema(s) at the same time");
and and
It's odd that schema_exclusion is a global var, but schemas/excluded are not.
Also, it seems unnecessary to have two schemas vars, since they can't be used
together. Maybe there's a better way than what I did in 003.+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next)
It's preferred to write cell != NULL
+ bool schemas_listed = false;
...
+ for (cell = schemas ? schemas->head : NULL; cell; cell = cell->next) + { + if (!schemas_listed) { + appendPQExpBufferStr(&catalog_query, + " AND pg_catalog.quote_ident(ns.nspname)"); + if (schema_exclusion) + appendPQExpBufferStr(&catalog_query, " NOT IN ("); + else + appendPQExpBufferStr(&catalog_query, " IN ("); + + schemas_listed = true; + } + else + appendPQExpBufferStr(&catalog_query, ", "); + + appendStringLiteralConn(&catalog_query, cell->val, conn); + appendPQExpBufferStr(&catalog_query, "::pg_catalog.regnamespace::pg_catalog.name"); + + } + /* Finish formatting schema filter */ + if (schemas_listed) + appendPQExpBufferStr(&catalog_query, ")\n"); }Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.
I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only-v3.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v3.patchDownload+159-9
Hi,
New version v4 of the patch to fix a typo in a comment.
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only-v4.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v4.patchDownload+159-9
On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:
Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/
I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.
$ ./src/bin/scripts/vacuumdb -h /tmp -d postgres --table foo
vacuumdb: vacuuming database "postgres"
2022-03-09 15:04:06.922 CST client backend[25540] vacuumdb ERROR: relation "foo" does not exist at character 60
$ ./src/bin/scripts/vacuumdb -h /tmp -d postgres --schema foo
vacuumdb: vacuuming database "postgres"
2022-03-09 15:02:59.926 CST client backend[23516] vacuumdb ERROR: schema "foo" does not exist at character 335
Le 09/03/2022 à 22:10, Justin Pryzby a écrit :
On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:
Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.
+1
--
Gilles Darold
On Thu, Mar 10, 2022 at 1:32 AM Gilles Darold <gilles@migops.com> wrote:
Le 09/03/2022 à 22:10, Justin Pryzby a écrit :
On Mon, Mar 07, 2022 at 08:38:04AM +0100, Gilles Darold wrote:
Maybe it's clearer to write this with =ANY() / != ALL() ?
See 002.I have applied your changes and produced a new version v3 of the patch,
thanks for the improvements. The patch have been added to commitfest
interface, see here https://commitfest.postgresql.org/38/3587/I wondered whether my patches were improvements, and it occurred to me that
your patch didn't fail if the specified schema didn't exist. That's arguably
preferable, but that's the pre-existing behavior for tables. So I think the
behavior of my patch is more consistent.+1
+1 for consistency.
Robert Treat
https://xzilla.net
I took a look at the v4 patch.
'git-apply' complains about whitespace errors:
0001-vacuumdb-schema-only-v4.patch:17: tab in indent.
<arg choice="plain">
0001-vacuumdb-schema-only-v4.patch:18: tab in indent.
<arg choice="opt">
0001-vacuumdb-schema-only-v4.patch:19: tab in indent.
<group choice="plain">
0001-vacuumdb-schema-only-v4.patch:20: tab in indent.
<arg choice="plain"><option>-n</option></arg>
0001-vacuumdb-schema-only-v4.patch:21: tab in indent.
<arg choice="plain"><option>--schema</option></arg>
warning: squelched 13 whitespace errors
warning: 18 lines add whitespace errors.
+ printf(_(" -N, --exclude-schema=PATTERN do NOT vacuum tables in the specified schema(s)\n"));
I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.
+ <para>
+ To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas
+ only in a database named <literal>xyzzy</literal>:
+<screen>
+<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput>
+</screen></para>
nitpicks: I think the phrasing should be "To only clean tables in the...".
Also, is there any reason to use a schema name with a capital letter as an
example? IMO that just adds unnecessary complexity to the example.
+$node->issues_sql_like(
+ [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+ qr/VACUUM "Foo".*/,
+ 'vacuumdb --schema schema only');
IIUC there should only be one table in the schema. Can we avoid matching
"*" and check for the exact command instead?
I think there should be a few more test cases. For example, we should test
using -n and -N at the same time, and we should test what happens when
those options are used for missing schemas.
+ /*
+ * When filtering on schema name, filter by table is not allowed.
+ * The schema name can already be set to a fqdn table name.
+ */
+ if (tbl_count && (schemas.head != NULL))
+ {
+ pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
+ exit(1);
+ }
I think there might be some useful refactoring we can do that would
simplify adding similar options in the future. Specifically, can we have a
global variable that stores the type of vacuumdb command (e.g., all,
tables, or schemas)? If so, perhaps the tables list could be renamed and
reused for schemas (and any other objects that need listing in the future).
+ if (schemas != NULL && schemas->head != NULL)
+ {
+ appendPQExpBufferStr(&catalog_query,
+ " AND c.relnamespace");
+ if (schema_is_exclude == TRI_YES)
+ appendPQExpBufferStr(&catalog_query,
+ " OPERATOR(pg_catalog.!=) ALL (ARRAY[");
+ else if (schema_is_exclude == TRI_NO)
+ appendPQExpBufferStr(&catalog_query,
+ " OPERATOR(pg_catalog.=) ANY (ARRAY[");
+
+ for (cell = schemas->head; cell != NULL; cell = cell->next)
+ {
+ appendStringLiteralConn(&catalog_query, cell->val, conn);
+
+ if (cell->next != NULL)
+ appendPQExpBufferStr(&catalog_query, ", ");
+ }
+
+ /* Finish formatting schema filter */
+ appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n");
+ }
IMO we should use a CTE for specified schemas like we do for the specified
tables. I wonder if we could even have a mostly-shared CTE code path for
all vacuumdb commands with a list of names.
- /*
- * If no tables were listed, filter for the relevant relation types. If
- * tables were given via --table, don't bother filtering by relation type.
- * Instead, let the server decide whether a given relation can be
- * processed in which case the user will know about it.
- */
- if (!tables_listed)
+ else
{
+ /*
+ * If no tables were listed, filter for the relevant relation types. If
+ * tables were given via --table, don't bother filtering by relation type.
+ * Instead, let the server decide whether a given relation can be
+ * processed in which case the user will know about it.
+ */
nitpick: This change seems unnecessary.
I noticed upthread that there was some discussion around adding a way to
specify a schema in VACUUM and ANALYZE commands. I think this patch is
useful even if such an option is eventually added, as we'll still want
vacuumdb to obtain the full list of tables to process so that it can
effectively parallelize.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 30, 2022 at 02:22:58PM -0700, Nathan Bossart wrote:
I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.
I suggested it because I would consider using it, even though I don't currently
use the vacuumdb script at all. I think this would allow partially
retiring/simplifying our existing vacuum script.
We 1) put all our partitions in a separate "child" schema (so \d is more
usable), and also 2) put some short-lived tables into their own schemas. Some
of those tables may only exist for ~1 day so I'd perfer to neither vacuum nor
analyze them (they're only used for SELECT *). But there can be a lot of them,
so a nightly job could do something like vacuumdb --schema public or vacuumdb
--exclude-schema ephemeral.
Everything would be processed nightly using vacuumdb --min-xid (to keep the
monitoring system happy).
The non-partitioned tables could be vacuumed nightly (without min-xid), with
--exclude ephemeral.
The partitioned tables could be processed monthly with vacuumdb --analyze.
I'd also want to be able to run vacuumdb --analyze nightly, but I'd want to
exclude the schema with short-lived tables. I'd also need a way to exclude
our partitioned tables from nightly analyze (they should run monthly only).
Maybe this could share something with this patch:
https://commitfest.postgresql.org/37/2573/
pg_dump - read data for some options from external file
The goal of that patch was to put it in a file, which isn't really needed here.
But if there were common infrastructure for matching tables, it could be
shared. The interesting part for this patch is to avoid adding separate
commandline arguments for --include-table, --exclude-table, --include-schema,
--exclude-schema (and anything else?)
On Fri, Apr 01, 2022 at 10:01:28AM -0500, Justin Pryzby wrote:
On Wed, Mar 30, 2022 at 02:22:58PM -0700, Nathan Bossart wrote:
I'm personally -1 for the --exclude-schema option. I don't see any
existing "exclude" options in vacuumdb, and the uses for such an option
seem rather limited. If we can point to specific use-cases for this
option, I might be willing to change my vote.I suggested it because I would consider using it, even though I don't currently
use the vacuumdb script at all. I think this would allow partially
retiring/simplifying our existing vacuum script.We 1) put all our partitions in a separate "child" schema (so \d is more
usable), and also 2) put some short-lived tables into their own schemas. Some
of those tables may only exist for ~1 day so I'd perfer to neither vacuum nor
analyze them (they're only used for SELECT *). But there can be a lot of them,
so a nightly job could do something like vacuumdb --schema public or vacuumdb
--exclude-schema ephemeral.Everything would be processed nightly using vacuumdb --min-xid (to keep the
monitoring system happy).The non-partitioned tables could be vacuumed nightly (without min-xid), with
--exclude ephemeral.The partitioned tables could be processed monthly with vacuumdb --analyze.
I'd also want to be able to run vacuumdb --analyze nightly, but I'd want to
exclude the schema with short-lived tables. I'd also need a way to exclude
our partitioned tables from nightly analyze (they should run monthly only).
Thanks for elaborating. I retract my -1 vote.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Le 30/03/2022 à 23:22, Nathan Bossart a écrit :
I took a look at the v4 patch.
'git-apply' complains about whitespace errors:
Fixed.
+ <para> + To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas + only in a database named <literal>xyzzy</literal>: +<screen> +<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput> +</screen></para>nitpicks: I think the phrasing should be "To only clean tables in the...".
Also, is there any reason to use a schema name with a capital letter as an
example? IMO that just adds unnecessary complexity to the example.
I have though that an example of a schema with case sensitivity was
missing in the documentation but I agree with your comment, this is
probably not he best place to do that. Fixed.
+$node->issues_sql_like( + [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ], + qr/VACUUM "Foo".*/, + 'vacuumdb --schema schema only');IIUC there should only be one table in the schema. Can we avoid matching
"*" and check for the exact command instead?
Fixed.
I think there should be a few more test cases. For example, we should test
using -n and -N at the same time, and we should test what happens when
those options are used for missing schemas.
Fixed
+ /* + * When filtering on schema name, filter by table is not allowed. + * The schema name can already be set to a fqdn table name. + */ + if (tbl_count && (schemas.head != NULL)) + { + pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time"); + exit(1); + }I think there might be some useful refactoring we can do that would
simplify adding similar options in the future. Specifically, can we have a
global variable that stores the type of vacuumdb command (e.g., all,
tables, or schemas)? If so, perhaps the tables list could be renamed and
reused for schemas (and any other objects that need listing in the future).
I don't think there will be much more options like this one that will be
added to this command but anyway I have changed the patch that way.
+ if (schemas != NULL && schemas->head != NULL) + { + appendPQExpBufferStr(&catalog_query, + " AND c.relnamespace"); + if (schema_is_exclude == TRI_YES) + appendPQExpBufferStr(&catalog_query, + " OPERATOR(pg_catalog.!=) ALL (ARRAY["); + else if (schema_is_exclude == TRI_NO) + appendPQExpBufferStr(&catalog_query, + " OPERATOR(pg_catalog.=) ANY (ARRAY["); + + for (cell = schemas->head; cell != NULL; cell = cell->next) + { + appendStringLiteralConn(&catalog_query, cell->val, conn); + + if (cell->next != NULL) + appendPQExpBufferStr(&catalog_query, ", "); + } + + /* Finish formatting schema filter */ + appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n"); + }IMO we should use a CTE for specified schemas like we do for the specified
tables. I wonder if we could even have a mostly-shared CTE code path for
all vacuumdb commands with a list of names.
Fixed
- /* - * If no tables were listed, filter for the relevant relation types. If - * tables were given via --table, don't bother filtering by relation type. - * Instead, let the server decide whether a given relation can be - * processed in which case the user will know about it. - */ - if (!tables_listed) + else { + /* + * If no tables were listed, filter for the relevant relation types. If + * tables were given via --table, don't bother filtering by relation type. + * Instead, let the server decide whether a given relation can be + * processed in which case the user will know about it. + */ nitpick: This change seems unnecessary.
Fixed
Thanks for the review, all these changes are available in new version v6
of the patch and attached here.
Best regards,
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only-v6.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v6.patchDownload+210-34
On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:
Thanks for the review, all these changes are available in new version v6
of the patch and attached here.
This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.html
not ok 59 - vacuumdb --schema "Foo" postgres exit code 0
# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log
# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'
Le 08/04/2022 à 02:46, Justin Pryzby a écrit :
On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:
Thanks for the review, all these changes are available in new version v6
of the patch and attached here.This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.htmlnot ok 59 - vacuumdb --schema "Foo" postgres exit code 0
# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'
I'm surprised because make check do do not reports errors running on an
Ubuntu 20.04 and CentOs 8:
t/010_clusterdb.pl ........ ok
t/011_clusterdb_all.pl .... ok
t/020_createdb.pl ......... ok
t/040_createuser.pl ....... ok
t/050_dropdb.pl ........... ok
t/070_dropuser.pl ......... ok
t/080_pg_isready.pl ....... ok
t/090_reindexdb.pl ........ ok
t/091_reindexdb_all.pl .... ok
t/100_vacuumdb.pl ......... ok
t/101_vacuumdb_all.pl ..... ok
t/102_vacuumdb_stages.pl .. ok
t/200_connstr.pl .......... ok
All tests successful.
Files=13, Tests=233, 17 wallclock secs ( 0.09 usr 0.02 sys + 6.63 cusr
2.68 csys = 9.42 CPU)
Result: PASS
In tmp_check/log/regress_log_100_vacuumdb:
# Running: vacuumdb --schema "Foo" postgres
vacuumdb: vacuuming database "postgres"
ok 59 - vacuumdb --schema "Foo" postgres exit code 0
ok 60 - vacuumdb --schema schema only: SQL found in server log
In PG log:
2022-04-08 11:01:44.519 CEST [17223] 100_vacuumdb.pl LOG: statement:
RESET search_path;
2022-04-08 11:01:44.519 CEST [17223] 100_vacuumdb.pl LOG: statement:
WITH listed_objects (object_oid, column_list) AS (
VALUES ('"Foo"'::pg_catalog.regnamespace::pg_catalog.oid,
NULL::pg_catalog.text)
)
SELECT c.relname, ns.nspname, listed_objects.column_list FROM
pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace ns ON c.relnamespace
OPERATOR(pg_catalog.=) ns.oid
LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid
OPERATOR(pg_catalog.=) t.oid
JOIN listed_objects ON listed_objects.object_oid
OPERATOR(pg_catalog.=) ns.oid
WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
ORDER BY c.relpages DESC;
2022-04-08 11:01:44.521 CEST [17223] 100_vacuumdb.pl LOG: statement:
SELECT pg_catalog.set_config('search_path', '', false);
2022-04-08 11:01:44.521 CEST [17223] 100_vacuumdb.pl LOG: statement:
VACUUM "Foo".bar;
And if I run the command manually:
$ /usr/local/pgsql/bin/vacuumdb -e -h localhost --schema '"Foo"' -d
contrib_regress -U postgres
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "contrib_regress"
RESET search_path;
WITH listed_objects (object_oid, column_list) AS (
VALUES ('"Foo"'::pg_catalog.regnamespace::pg_catalog.oid,
NULL::pg_catalog.text)
)
SELECT c.relname, ns.nspname, listed_objects.column_list FROM
pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace ns ON c.relnamespace
OPERATOR(pg_catalog.=) ns.oid
LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid
OPERATOR(pg_catalog.=) t.oid
JOIN listed_objects ON listed_objects.object_oid
OPERATOR(pg_catalog.=) ns.oid
WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
ORDER BY c.relpages DESC;
SELECT pg_catalog.set_config('search_path', '', false);
VACUUM "Foo".bar;
$ echo $?
0
I don't know what happen on cfbot, investigating...
--
Gilles Darold
Le 08/04/2022 à 02:46, Justin Pryzby a écrit :
On Wed, Apr 06, 2022 at 07:43:42PM +0200, Gilles Darold wrote:
Thanks for the review, all these changes are available in new version v6
of the patch and attached here.This is failing in CI (except on macos, which is strangely passing).
http://cfbot.cputube.org/gilles-darold.htmlnot ok 59 - vacuumdb --schema "Foo" postgres exit code 0
# Failed test 'vacuumdb --schema "Foo" postgres exit code 0'
# at t/100_vacuumdb.pl line 151.
not ok 60 - vacuumdb --schema schema only: SQL found in server log# Failed test 'vacuumdb --schema schema only: SQL found in server log'
# at t/100_vacuumdb.pl line 151.
# '2022-04-06 18:15:36.313 UTC [34857][not initialized] [[unknown]][:0] LOG: connection received: host=[local]
# 2022-04-06 18:15:36.314 UTC [34857][client backend] [[unknown]][3/2801:0] LOG: connection authorized: user=postgres database=postgres application_name=100_vacuumdb.pl
# 2022-04-06 18:15:36.318 UTC [34857][client backend] [100_vacuumdb.pl][3/2802:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
# 2022-04-06 18:15:36.586 UTC [34857][client backend] [100_vacuumdb.pl][:0] LOG: disconnection: session time: 0:00:00.273 user=postgres database=postgres host=[local]
# '
# doesn't match '(?^:VACUUM "Foo".bar)'
Ok, got it with the help of rjuju. Actually it was compiling well using
gcc but clang give some warnings. A fix of these warning makes CI happy.
Attached v7 of the patch that should pass cfbot.
--
Gilles Darold
Attachments:
0001-vacuumdb-schema-only-v7.patchtext/x-patch; charset=UTF-8; name=0001-vacuumdb-schema-only-v7.patchDownload+210-36
On Fri, Apr 08, 2022 at 05:16:06PM +0200, Gilles Darold wrote:
Attached v7 of the patch that should pass cfbot.
Thanks for the new patch! Unfortunately, it looks like some recent changes
have broken it again.
+enum trivalue schema_is_exclude = TRI_DEFAULT; + +/* + * The kind of object filter to use. '0': none, 'n': schema, 't': table + * these values correspond to the -n | -N and -t command line options. + */ +char objfilter = '0';
I think these should be combined into a single enum for simplicity and
readability (e.g., OBJFILTER_NONE, OBJFILTER_INCLUDE_SCHEMA,
OBJFILTER_EXCLUDE_SCHEMA, OBJFILTER_TABLE).
* Instead, let the server decide whether a given relation can be * processed in which case the user will know about it. */ - if (!tables_listed) + if (!objects_listed || objfilter == 'n') { appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array[" CppAsString2(RELKIND_RELATION) ", "
I think this deserveѕ a comment.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Le 11/04/2022 à 20:37, Nathan Bossart a écrit :
On Fri, Apr 08, 2022 at 05:16:06PM +0200, Gilles Darold wrote:
Attached v7 of the patch that should pass cfbot.
Thanks for the new patch! Unfortunately, it looks like some recent changes
have broken it again.+enum trivalue schema_is_exclude = TRI_DEFAULT; + +/* + * The kind of object filter to use. '0': none, 'n': schema, 't': table + * these values correspond to the -n | -N and -t command line options. + */ +char objfilter = '0';I think these should be combined into a single enum for simplicity and
readability (e.g., OBJFILTER_NONE, OBJFILTER_INCLUDE_SCHEMA,
OBJFILTER_EXCLUDE_SCHEMA, OBJFILTER_TABLE).* Instead, let the server decide whether a given relation can be * processed in which case the user will know about it. */ - if (!tables_listed) + if (!objects_listed || objfilter == 'n') { appendPQExpBufferStr(&catalog_query, " WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array[" CppAsString2(RELKIND_RELATION) ", "I think this deserveѕ a comment.
Attached v8 of the patch that tries to address the remarks above, fixes
patch apply failure to master and replace calls to pg_log_error+exit
with pg_fatal.
.Thanks.
--
Gilles Darold