[Proposal] Allow users to specify multiple tables in VACUUM commands

Started by Nathan Bossartalmost 9 years ago119 messageshackers
Jump to latest
#1Nathan Bossart
nathandbossart@gmail.com

Hi Hackers,

Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?

I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)). The attached patch requires that only one table be specified in order to specify a column list. But before I spend too much more time working on this, I thought I’d see how pgsql-hackers@ felt about this feature.

Thanks,
Nathan Bossart

Attachments:

vacuum_multiple_tables_v1.patchapplication/octet-stream; name=vacuum_multiple_tables_v1.patchDownload+125-77
#2David Fetter
david@fetter.org
In reply to: Nathan Bossart (#1)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Wed, May 10, 2017 at 08:10:48PM +0000, Bossart, Nathan wrote:

Hi Hackers,

Currently, VACUUM commands allow you to specify one table or all of
the tables in the current database to vacuum. I’ve recently found
myself wishing I could specify multiple tables in a single VACUUM
statement. For example, this would be convenient when there are
several large tables in a database and only a few need cleanup for
XID purposes. Is this a feature that the community might be
interested in?

I haven't read the implementation yet, but I've wanted this feature
several times.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#1)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

"Bossart, Nathan" <bossartn@amazon.com> writes:

Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?

I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.

You realize of course that ANALYZE should grow this capability as well.

I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).

The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#3)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

Great, I’ll keep working on this patch and will update this thread with a more complete implementation.

Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#3)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Bossart, Nathan" <bossartn@amazon.com> writes:

Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?

I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.

You realize of course that ANALYZE should grow this capability as well.

Yup. It is just a matter of extending ExecVacuum() to handle a list of
qualified names with a quick look at the grammar as we are talking
only about manual commands. One question I am wondering though is do
we want to have everything happening in the same transaction? I would
say yes to that to simplify the code. I think that VERBOSE should also
report the per-table information, so this can be noisy with many
tables but that's more helpful than gathering all the results.

I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).

The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.

The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#5)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

Michael Paquier <michael.paquier@gmail.com> writes:

On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.

The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.

It would be hard to reject at the grammar level, and not very friendly
either because you'd only get "syntax error". We could certainly make
the runtime code throw an error if you gave a column list without saying
ANALYZE. But on the other hand, why bother? I do not remember ever
seeing a question that boiled down to somebody being confused by this.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#5)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Thu, May 11, 2017 at 11:13 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Bossart, Nathan" <bossartn@amazon.com> writes:

Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?

I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.

You realize of course that ANALYZE should grow this capability as well.

Yup. It is just a matter of extending ExecVacuum() to handle a list of
qualified names with a quick look at the grammar as we are talking
only about manual commands. One question I am wondering though is do
we want to have everything happening in the same transaction? I would
say yes to that to simplify the code. I think that VERBOSE should also
report the per-table information, so this can be noisy with many
tables but that's more helpful than gathering all the results.

I agree to report per-table information. Especially In case of one of
tables specified failed during vacuuming, I think we should report at
least information of tables that is done successfully so far.

I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).

The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.

The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Nathan Bossart
nathandbossart@gmail.com
In reply to: Masahiko Sawada (#7)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On 5/10/17, 8:10 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

I agree to report per-table information. Especially In case of one of
tables specified failed during vacuuming, I think we should report at
least information of tables that is done successfully so far.

+1

I believe you already get all per-table information when you do not specify a table name (“VACUUM VERBOSE;”), so I would expect to get this for free as long as we are building this into the existing ExecVacuum(…) and vacuum(…) functions.

Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#6)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Thu, May 11, 2017 at 11:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It would be hard to reject at the grammar level, and not very friendly
either because you'd only get "syntax error". We could certainly make
the runtime code throw an error if you gave a column list without saying
ANALYZE. But on the other hand, why bother? I do not remember ever
seeing a question that boiled down to somebody being confused by this.

The docs also say that adding a column list implies an ANALYZE even if
other keywords are added, and that's the case. Sorry for the noise.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#8)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

Hi again,

Attached is a more complete first attempt at adding this functionality. I added two node types: one for parsing the “relation and columns” list in the grammar, and one for holding the relation information we need for each call to vacuum_rel(…)/analyze_rel(…). I also added assertions and comments for some undocumented assumptions that we currently rely upon.

Adjustments to the documentation for VACUUM/ANALYZE and new checks in the VACUUM regression test are included in this patch as well.

Looking forward to any feedback that you have.

Nathan

Attachments:

vacuum_multiple_tables_v2.patchapplication/octet-stream; name=vacuum_multiple_tables_v2.patchDownload+294-117
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#10)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

"Bossart, Nathan" <bossartn@amazon.com> writes:

Looking forward to any feedback that you have.

You probably won't get much in the near term, because we're in
stabilize-the-release mode not develop-new-features mode.
Please add your patch to the pending commitfest
https://commitfest.postgresql.org/14/
so that we remember to look at it when the time comes.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#11)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On 5/11/17, 6:32 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

You probably won't get much in the near term, because we're in
stabilize-the-release mode not develop-new-features mode.
Please add your patch to the pending commitfest
https://commitfest.postgresql.org/14/
so that we remember to look at it when the time comes.

Understood. I’ve added it to the commitfest.

Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#10)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Fri, May 12, 2017 at 9:47 AM, Bossart, Nathan <bossartn@amazon.com> wrote:

Attached is a more complete first attempt at adding this functionality. I added two node types: one for parsing the “relation and columns” list in the grammar, and one for holding the relation information we need for each call to vacuum_rel(…)/analyze_rel(…). I also added assertions and comments for some undocumented assumptions that we currently rely upon.

Adjustments to the documentation for VACUUM/ANALYZE and new checks in the VACUUM regression test are included in this patch as well.

Looking forward to any feedback that you have.

Browsing the code....

<synopsis>
-ANALYZE [ VERBOSE ] [ <replaceable
class="PARAMETER">table_name</replaceable> [ ( <replaceable
class="PARAMETER">column_name</replaceable> [, ...] ) ] ]
+ANALYZE [ VERBOSE ] [ <replaceable
class="PARAMETER">table_name</replaceable> [ [ ( <replaceable
class="PARAMETER">column_name</replaceable> [, ...] ) ] [, ...] ]
</synopsis>
It seems to me that you don't need those extra square brackets around
the column list. Same remark for vacuum.sgml.

     <listitem>
      <para>
-      The name (possibly schema-qualified) of a specific table to
+      The name (possibly schema-qualified) of the specific tables to
       analyze.  If omitted, all regular tables, partitioned tables, and
       materialized views in the current database are analyzed (but not
-      foreign tables).  If the specified table is a partitioned table, both the
+      foreign tables).  If a specified table is a partitioned table, both the
       inheritance statistics of the partitioned table as a whole and
       statistics of the individual partitions are updated.
      </para>
Don't think that's needed. table_name is still referencing a single
table name. And similar remark for vacuum.sgml.

In short for all that, it is enough to have "[, ... ]" to document
that a list is accepted.

    /* Now go through the common routine */
-   vacuum(vacstmt->options, vacstmt->relation, InvalidOid, &params,
-          vacstmt->va_cols, NULL, isTopLevel);
+   vacuum(vacstmt->options, vacstmt->rels, InvalidOid, &params,
+          NULL, isTopLevel);
 }
It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.
+ * used for error messages.  In the case that relid is valid, rels
+ * must have exactly one element corresponding to the specified relid.
s/rels/relations/ as variable name?
-- 
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#13)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On 5/11/17, 7:20 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:

Browsing the code....

Thanks for taking a look.

It seems to me that you don't need those extra square brackets around
the column list. Same remark for vacuum.sgml.

I’ll remove them. My intent was to ensure that we didn’t accidentally suggest that statements like “VACUUM , foo, bar;” were accepted, but the extra brackets don’t seem to fix that, and I don’t foresee much confusion anyway.

In short for all that, it is enough to have "[, ... ]" to document
that a list is accepted.

That makes sense, I’ll fix it.

It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.

That was the approach I first prototyped. The main disadvantage that I found was that the command wouldn’t fail-fast if one of the tables or columns didn’t exist, and I thought that it might be frustrating to encounter such an error in the middle of vacuuming several large tables. It’s easy enough to change the logic to emit a warning and simply move on to the next table, but that seemed like it could be easily missed among the rest of the vacuum log statements (especially with the verbose option specified). What are your thoughts on this?

In the spirit of simplifying things a bit, I do think it is possible to eliminate one of the new node types, since the fields for each are almost identical.

+ * used for error messages.  In the case that relid is valid, rels
+ * must have exactly one element corresponding to the specified relid.
s/rels/relations/ as variable name?

Agreed, that seems nicer.

Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#14)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Fri, May 12, 2017 at 1:06 PM, Bossart, Nathan <bossartn@amazon.com> wrote:

On 5/11/17, 7:20 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:

It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.

That was the approach I first prototyped. The main disadvantage that I found was that the command wouldn’t fail-fast if one of the tables or columns didn’t exist, and I thought that it might be frustrating to encounter such an error in the middle of vacuuming several large tables. It’s easy enough to change the logic to emit a warning and simply move on to the next table, but that seemed like it could be easily missed among the rest of the vacuum log statements (especially with the verbose option specified). What are your thoughts on this?

Hm. If multiple tables are specified and that some of them take a long
time, it could be possible that an error still happens if the
definition of one of those tables changes while VACUUM is in the
middle of running. And this makes moot the error checks that happened
at first step. So it seems to me that we definitely should have a
WARNING if multiple tables are defined anyway, and that to avoid code
duplication we may want to just do those checks once, before
processing one of the listed tables. It is true that is would be easy
to miss a WARNING in the VERBOSE logs, but issuing an ERROR would
really be frustrating in the middle of a nightly run of VACUUM.

In the spirit of simplifying things a bit, I do think it is possible to eliminate one of the new node types, since the fields for each are almost identical.

Two looks too much for a code just aiming at scaling up vacuum to
handle N items. It may be possible to make things even more simple,
but I have not put much thoughts into that to be honest.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#15)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

I’ve attached a v3 patch that addresses your feedback:

Changes:
- removed extra square brackets in documentation changes
- removed unnecessary documentation changes for parameter list
- eliminated one of the new node types
- renamed ‘rel’ argument to ‘relations’ in vacuum(…)
- moved relations list to vacuum memory context in vacuum(…)
- minor addition to VACUUM regression test
- rebased with master

On 5/15/17, 11:00 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:

Hm. If multiple tables are specified and that some of them take a long
time, it could be possible that an error still happens if the
definition of one of those tables changes while VACUUM is in the
middle of running. And this makes moot the error checks that happened
at first step. So it seems to me that we definitely should have a
WARNING if multiple tables are defined anyway, and that to avoid code
duplication we may want to just do those checks once, before
processing one of the listed tables. It is true that is would be easy
to miss a WARNING in the VERBOSE logs, but issuing an ERROR would
really be frustrating in the middle of a nightly run of VACUUM.

I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:

/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);

With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…). If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.

However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.

Does this seem like a sane approach?

1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.
3. Emit WARNING and skip any specified columns that vanish before processing.

Nathan

Attachments:

vacuum_multiple_tables_v3.patchapplication/octet-stream; name=vacuum_multiple_tables_v3.patchDownload+264-140
#17Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#16)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:

I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:

/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);

With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…).

Yes, that's the bits using try_relation_open() which returns NULL if
the relation is gone. I don't think that we want VACUUM to be noisy
about that when running it on a database.

If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.

We would do no favor by reporting nothing to the user. Without any
information, the user triggering a manual vacuum may believe that the
relation has been vacuumed as it was listed but that would not be the
case.

However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.

Does this seem like a sane approach?

1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.

And issue an ERROR, right?

3. Emit WARNING and skip any specified columns that vanish before processing.

This looks like a sensible approach to me.

+           RelationAndColumns *relinfo = (RelationAndColumns *)
lfirst(relation);
+           int per_table_opts = options | relinfo->options;  /*
VACOPT_ANALYZE may be set per-table */
+           ListCell *oid;
I have just noticed this bit in your patch. So you can basically do
something like that:
VACUUM (ANALYZE) foo, (FULL) bar;
Do we really want to go down to this level of control? I would keep
things a bit more restricted as users may be confused by the different
lock levels involved by each operation, and make use of the same
options for all the relations listed. Opinions from others is welcome.
-- 
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#17)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On 5/16/17, 11:21 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:

On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:

I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:

/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);

With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…).

Yes, that's the bits using try_relation_open() which returns NULL if
the relation is gone. I don't think that we want VACUUM to be noisy
about that when running it on a database.

Agreed.

If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.

We would do no favor by reporting nothing to the user. Without any
information, the user triggering a manual vacuum may believe that the
relation has been vacuumed as it was listed but that would not be the
case.

Agreed. Unfortunately, I think this is already possible when you specify a table to be vacuumed.

However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.

Does this seem like a sane approach?

1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.

And issue an ERROR, right?

Correct. This means that both the relations and columns specified would cause an ERROR if they do not exist and a WARNING if they disappear before we can actually process them.

+           RelationAndColumns *relinfo = (RelationAndColumns *)
lfirst(relation);
+           int per_table_opts = options | relinfo->options;  /*
VACOPT_ANALYZE may be set per-table */
+           ListCell *oid;
I have just noticed this bit in your patch. So you can basically do
something like that:
VACUUM (ANALYZE) foo, (FULL) bar;
Do we really want to go down to this level of control? I would keep
things a bit more restricted as users may be confused by the different
lock levels involved by each operation, and make use of the same
options for all the relations listed. Opinions from others is welcome.

I agree with you here, too. I stopped short of allowing customers to explicitly provide per-table options, so the example you provided wouldn’t work here. This is more applicable for something like the following:

VACUUM (FREEZE, VERBOSE) foo, bar (a);

In this case, the FREEZE and VERBOSE options are used for both tables. However, we have a column list specified for ‘bar’, and the ANALYZE option is implied when we specify a column list. So when we process ‘bar’, we need to apply the ANALYZE option, but we do not need it for ‘foo’. For now, that is all that this per-table options variable is used for.

Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#18)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

I’ve attached a v4 of the patch.

Changes:
- Early in vacuum(…), emit an ERROR if any specified columns do not exist.
- Emit a WARNING and skip any specified tables or columns that disappear before they are actually processed.
- Small additions to the VACUUM regression test.
- Rebased with master.

Nathan

Attachments:

vacuum_multiple_tables_v4.patchapplication/octet-stream; name=vacuum_multiple_tables_v4.patchDownload+367-142
#20Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#18)
Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

On Thu, May 18, 2017 at 12:06 AM, Bossart, Nathan <bossartn@amazon.com> wrote:

I agree with you here, too. I stopped short of allowing customers to explicitly provide per-table options, so the example you provided wouldn’t work here. This is more applicable for something like the following:

VACUUM (FREEZE, VERBOSE) foo, bar (a);

In this case, the FREEZE and VERBOSE options are used for both tables. However, we have a column list specified for ‘bar’, and the ANALYZE option is implied when we specify a column list. So when we process ‘bar’, we need to apply the ANALYZE option, but we do not need it for ‘foo’. For now, that is all that this per-table options variable is used for.

Hm. One argument can be made here: having a column list defined in one
of the tables implies that ANALYZE is enforced for all the relations
listed instead of doing that only on the relations listing columns.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Nathan Bossart (#19)
#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#20)
#23Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#22)
#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#25)
#27Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#26)
#28Nathan Bossart
nathandbossart@gmail.com
In reply to: Masahiko Sawada (#27)
#29Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#26)
#30Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#29)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#30)
#32Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#31)
#33Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#33)
#35Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#34)
#36Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#35)
#37Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#36)
#38Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#37)
#39Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#38)
#40Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#40)
#42Nathan Bossart
nathandbossart@gmail.com
In reply to: Robert Haas (#41)
#43Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#42)
#44Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#43)
#45Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#44)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#43)
#47Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#46)
#48Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#47)
#49Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#48)
#50Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#49)
#51Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#50)
#52Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#51)
#53David G. Johnston
david.g.johnston@gmail.com
In reply to: Nathan Bossart (#52)
#54Michael Paquier
michael@paquier.xyz
In reply to: David G. Johnston (#53)
#55Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#54)
#56Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Nathan Bossart (#55)
#57Nathan Bossart
nathandbossart@gmail.com
In reply to: Masahiko Sawada (#56)
#58Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#57)
#59Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#58)
#60Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#59)
#61Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#60)
#62Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#61)
#63Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#62)
#64Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#63)
#65Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#64)
#66Simon Riggs
simon@2ndQuadrant.com
In reply to: Michael Paquier (#63)
#67Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#65)
#68Nathan Bossart
nathandbossart@gmail.com
In reply to: Simon Riggs (#66)
#69Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#68)
#70Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#63)
#71Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#69)
#72Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#70)
#73Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#72)
#74Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#73)
#75Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#74)
#76Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#75)
#77Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#76)
#78Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#77)
#79Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#78)
#80Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#79)
#81Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#80)
#82Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#81)
#83Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#82)
#84Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#83)
#85Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#84)
#86Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#85)
#87Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#85)
#88Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#87)
#89Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#86)
#90Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#88)
#91Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#90)
#92Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#86)
#93Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#92)
#94Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#93)
#95Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#94)
#96Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#95)
#97Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#96)
#98Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#97)
#99Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#98)
#100Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#99)
#101Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#100)
#102Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#101)
#103Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#102)
#104Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#103)
#105Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#104)
#106Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#105)
#107Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#106)
#108Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#107)
#109Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#108)
#110Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#109)
#111Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#110)
#112Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#111)
#113Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#112)
#114Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#113)
#115Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#114)
#116Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#115)
#117Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#115)
#118Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#117)
#119Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#118)