[Proposal] Allow users to specify multiple tables in VACUUM commands
Hi Hackers,
Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?
I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)). The attached patch requires that only one table be specified in order to specify a column list. But before I spend too much more time working on this, I thought I’d see how pgsql-hackers@ felt about this feature.
Thanks,
Nathan Bossart
Attachments:
vacuum_multiple_tables_v1.patchapplication/octet-stream; name=vacuum_multiple_tables_v1.patchDownload+125-77
On Wed, May 10, 2017 at 08:10:48PM +0000, Bossart, Nathan wrote:
Hi Hackers,
Currently, VACUUM commands allow you to specify one table or all of
the tables in the current database to vacuum. I’ve recently found
myself wishing I could specify multiple tables in a single VACUUM
statement. For example, this would be convenient when there are
several large tables in a database and only a few need cleanup for
XID purposes. Is this a feature that the community might be
interested in?
I haven't read the implementation yet, but I've wanted this feature
several times.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
"Bossart, Nathan" <bossartn@amazon.com> writes:
Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?
I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.
You realize of course that ANALYZE should grow this capability as well.
I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).
The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Great, I’ll keep working on this patch and will update this thread with a more complete implementation.
Nathan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Bossart, Nathan" <bossartn@amazon.com> writes:
Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?
I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.You realize of course that ANALYZE should grow this capability as well.
Yup. It is just a matter of extending ExecVacuum() to handle a list of
qualified names with a quick look at the grammar as we are talking
only about manual commands. One question I am wondering though is do
we want to have everything happening in the same transaction? I would
say yes to that to simplify the code. I think that VERBOSE should also
report the per-table information, so this can be noisy with many
tables but that's more helpful than gathering all the results.
I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).
The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.
The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.
The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
It would be hard to reject at the grammar level, and not very friendly
either because you'd only get "syntax error". We could certainly make
the runtime code throw an error if you gave a column list without saying
ANALYZE. But on the other hand, why bother? I do not remember ever
seeing a question that boiled down to somebody being confused by this.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, May 11, 2017 at 11:13 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Thu, May 11, 2017 at 6:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Bossart, Nathan" <bossartn@amazon.com> writes:
Currently, VACUUM commands allow you to specify one table or all of the tables in the current database to vacuum. I’ve recently found myself wishing I could specify multiple tables in a single VACUUM statement. For example, this would be convenient when there are several large tables in a database and only a few need cleanup for XID purposes. Is this a feature that the community might be interested in?
I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.You realize of course that ANALYZE should grow this capability as well.
Yup. It is just a matter of extending ExecVacuum() to handle a list of
qualified names with a quick look at the grammar as we are talking
only about manual commands. One question I am wondering though is do
we want to have everything happening in the same transaction? I would
say yes to that to simplify the code. I think that VERBOSE should also
report the per-table information, so this can be noisy with many
tables but that's more helpful than gathering all the results.
I agree to report per-table information. Especially In case of one of
tables specified failed during vacuuming, I think we should report at
least information of tables that is done successfully so far.
I’ve attached my first attempt at introducing this functionality. In the patch, I’ve extended the table_name parameter in the VACUUM grammar to a qualified_name_list. While this fits into the grammar decently well, I suspect that it may be desirable to be able to specify a column list for each table as well (e.g. VACUUM foo (a), bar (b)).
The column list only matters for ANALYZE (or VACUUM ANALYZE). But yes,
it should be per-table.The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
--
Michael--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/10/17, 8:10 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
I agree to report per-table information. Especially In case of one of
tables specified failed during vacuuming, I think we should report at
least information of tables that is done successfully so far.
+1
I believe you already get all per-table information when you do not specify a table name (“VACUUM VERBOSE;”), so I would expect to get this for free as long as we are building this into the existing ExecVacuum(…) and vacuum(…) functions.
Nathan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, May 11, 2017 at 11:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
It would be hard to reject at the grammar level, and not very friendly
either because you'd only get "syntax error". We could certainly make
the runtime code throw an error if you gave a column list without saying
ANALYZE. But on the other hand, why bother? I do not remember ever
seeing a question that boiled down to somebody being confused by this.
The docs also say that adding a column list implies an ANALYZE even if
other keywords are added, and that's the case. Sorry for the noise.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi again,
Attached is a more complete first attempt at adding this functionality. I added two node types: one for parsing the “relation and columns” list in the grammar, and one for holding the relation information we need for each call to vacuum_rel(…)/analyze_rel(…). I also added assertions and comments for some undocumented assumptions that we currently rely upon.
Adjustments to the documentation for VACUUM/ANALYZE and new checks in the VACUUM regression test are included in this patch as well.
Looking forward to any feedback that you have.
Nathan
Attachments:
vacuum_multiple_tables_v2.patchapplication/octet-stream; name=vacuum_multiple_tables_v2.patchDownload+294-117
"Bossart, Nathan" <bossartn@amazon.com> writes:
Looking forward to any feedback that you have.
You probably won't get much in the near term, because we're in
stabilize-the-release mode not develop-new-features mode.
Please add your patch to the pending commitfest
https://commitfest.postgresql.org/14/
so that we remember to look at it when the time comes.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/11/17, 6:32 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
You probably won't get much in the near term, because we're in
stabilize-the-release mode not develop-new-features mode.
Please add your patch to the pending commitfest
https://commitfest.postgresql.org/14/
so that we remember to look at it when the time comes.
Understood. I’ve added it to the commitfest.
Nathan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 12, 2017 at 9:47 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
Attached is a more complete first attempt at adding this functionality. I added two node types: one for parsing the “relation and columns” list in the grammar, and one for holding the relation information we need for each call to vacuum_rel(…)/analyze_rel(…). I also added assertions and comments for some undocumented assumptions that we currently rely upon.
Adjustments to the documentation for VACUUM/ANALYZE and new checks in the VACUUM regression test are included in this patch as well.
Looking forward to any feedback that you have.
Browsing the code....
<synopsis>
-ANALYZE [ VERBOSE ] [ <replaceable
class="PARAMETER">table_name</replaceable> [ ( <replaceable
class="PARAMETER">column_name</replaceable> [, ...] ) ] ]
+ANALYZE [ VERBOSE ] [ <replaceable
class="PARAMETER">table_name</replaceable> [ [ ( <replaceable
class="PARAMETER">column_name</replaceable> [, ...] ) ] [, ...] ]
</synopsis>
It seems to me that you don't need those extra square brackets around
the column list. Same remark for vacuum.sgml.
<listitem>
<para>
- The name (possibly schema-qualified) of a specific table to
+ The name (possibly schema-qualified) of the specific tables to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
+ foreign tables). If a specified table is a partitioned table, both the
inheritance statistics of the partitioned table as a whole and
statistics of the individual partitions are updated.
</para>
Don't think that's needed. table_name is still referencing a single
table name. And similar remark for vacuum.sgml.
In short for all that, it is enough to have "[, ... ]" to document
that a list is accepted.
/* Now go through the common routine */
- vacuum(vacstmt->options, vacstmt->relation, InvalidOid, ¶ms,
- vacstmt->va_cols, NULL, isTopLevel);
+ vacuum(vacstmt->options, vacstmt->rels, InvalidOid, ¶ms,
+ NULL, isTopLevel);
}
It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.
+ * used for error messages. In the case that relid is valid, rels
+ * must have exactly one element corresponding to the specified relid.
s/rels/relations/ as variable name?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/11/17, 7:20 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
Browsing the code....
Thanks for taking a look.
It seems to me that you don't need those extra square brackets around
the column list. Same remark for vacuum.sgml.
I’ll remove them. My intent was to ensure that we didn’t accidentally suggest that statements like “VACUUM , foo, bar;” were accepted, but the extra brackets don’t seem to fix that, and I don’t foresee much confusion anyway.
In short for all that, it is enough to have "[, ... ]" to document
that a list is accepted.
That makes sense, I’ll fix it.
It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.
That was the approach I first prototyped. The main disadvantage that I found was that the command wouldn’t fail-fast if one of the tables or columns didn’t exist, and I thought that it might be frustrating to encounter such an error in the middle of vacuuming several large tables. It’s easy enough to change the logic to emit a warning and simply move on to the next table, but that seemed like it could be easily missed among the rest of the vacuum log statements (especially with the verbose option specified). What are your thoughts on this?
In the spirit of simplifying things a bit, I do think it is possible to eliminate one of the new node types, since the fields for each are almost identical.
+ * used for error messages. In the case that relid is valid, rels + * must have exactly one element corresponding to the specified relid. s/rels/relations/ as variable name?
Agreed, that seems nicer.
Nathan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 12, 2017 at 1:06 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
On 5/11/17, 7:20 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
It seems to me that it would have been less invasive to loop through
vacuum() for each relation. Do you foresee advantages in allowing
vacuum() to handle multiple? I am not sure if is is worth complicating
the current logic more considering that you have as well so extra
logic to carry on option values.That was the approach I first prototyped. The main disadvantage that I found was that the command wouldn’t fail-fast if one of the tables or columns didn’t exist, and I thought that it might be frustrating to encounter such an error in the middle of vacuuming several large tables. It’s easy enough to change the logic to emit a warning and simply move on to the next table, but that seemed like it could be easily missed among the rest of the vacuum log statements (especially with the verbose option specified). What are your thoughts on this?
Hm. If multiple tables are specified and that some of them take a long
time, it could be possible that an error still happens if the
definition of one of those tables changes while VACUUM is in the
middle of running. And this makes moot the error checks that happened
at first step. So it seems to me that we definitely should have a
WARNING if multiple tables are defined anyway, and that to avoid code
duplication we may want to just do those checks once, before
processing one of the listed tables. It is true that is would be easy
to miss a WARNING in the VERBOSE logs, but issuing an ERROR would
really be frustrating in the middle of a nightly run of VACUUM.
In the spirit of simplifying things a bit, I do think it is possible to eliminate one of the new node types, since the fields for each are almost identical.
Two looks too much for a code just aiming at scaling up vacuum to
handle N items. It may be possible to make things even more simple,
but I have not put much thoughts into that to be honest.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I’ve attached a v3 patch that addresses your feedback:
Changes:
- removed extra square brackets in documentation changes
- removed unnecessary documentation changes for parameter list
- eliminated one of the new node types
- renamed ‘rel’ argument to ‘relations’ in vacuum(…)
- moved relations list to vacuum memory context in vacuum(…)
- minor addition to VACUUM regression test
- rebased with master
On 5/15/17, 11:00 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
Hm. If multiple tables are specified and that some of them take a long
time, it could be possible that an error still happens if the
definition of one of those tables changes while VACUUM is in the
middle of running. And this makes moot the error checks that happened
at first step. So it seems to me that we definitely should have a
WARNING if multiple tables are defined anyway, and that to avoid code
duplication we may want to just do those checks once, before
processing one of the listed tables. It is true that is would be easy
to miss a WARNING in the VERBOSE logs, but issuing an ERROR would
really be frustrating in the middle of a nightly run of VACUUM.
I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:
/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);
With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…). If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.
However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.
Does this seem like a sane approach?
1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.
3. Emit WARNING and skip any specified columns that vanish before processing.
Nathan
Attachments:
vacuum_multiple_tables_v3.patchapplication/octet-stream; name=vacuum_multiple_tables_v3.patchDownload+264-140
On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:
/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…).
Yes, that's the bits using try_relation_open() which returns NULL if
the relation is gone. I don't think that we want VACUUM to be noisy
about that when running it on a database.
If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.
We would do no favor by reporting nothing to the user. Without any
information, the user triggering a manual vacuum may believe that the
relation has been vacuumed as it was listed but that would not be the
case.
However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.
Does this seem like a sane approach?
1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.
And issue an ERROR, right?
3. Emit WARNING and skip any specified columns that vanish before processing.
This looks like a sensible approach to me.
+ RelationAndColumns *relinfo = (RelationAndColumns *)
lfirst(relation);
+ int per_table_opts = options | relinfo->options; /*
VACOPT_ANALYZE may be set per-table */
+ ListCell *oid;
I have just noticed this bit in your patch. So you can basically do
something like that:
VACUUM (ANALYZE) foo, (FULL) bar;
Do we really want to go down to this level of control? I would keep
things a bit more restricted as users may be confused by the different
lock levels involved by each operation, and make use of the same
options for all the relations listed. Opinions from others is welcome.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/16/17, 11:21 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:
/*
* Since we don't take a lock here, the relation might be gone, or the
* RangeVar might no longer refer to the OID we look up here. In the
* former case, VACUUM will do nothing; in the latter case, it will
* process the OID we looked up here, rather than the new one. Neither
* is ideal, but there's little practical alternative, since we're
* going to commit this transaction and begin a new one between now
* and then.
*/
relid = RangeVarGetRelid(vacrel, NoLock, false);With the patch applied, I believe this statement still holds true. So if the relation disappears before we get to vacuum_rel(…), we will simply skip it and move on to the next one. The vacuum_rel(…) code provides a WARNING in many cases (e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it disappears before the call to vacuum_rel(…).
Yes, that's the bits using try_relation_open() which returns NULL if
the relation is gone. I don't think that we want VACUUM to be noisy
about that when running it on a database.
Agreed.
If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case, I think what you are suggesting would be satisfied.
We would do no favor by reporting nothing to the user. Without any
information, the user triggering a manual vacuum may believe that the
relation has been vacuumed as it was listed but that would not be the
case.
Agreed. Unfortunately, I think this is already possible when you specify a table to be vacuumed.
However, ANALYZE has a slight caveat. While analyze_rel(…) silently skips the relation if it no longer exists like vacuum_rel(…) does, we do not pre-validate the columns list at all. So, in an ANALYZE statement with multiple tables and columns specified, it’ll only fail once we get to the undefined column. To fix this, we could add a check for the column lists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the meantime.
Does this seem like a sane approach?
1. Emit WARNING when skipping if relation disappears before we get to it.
2. Early in vacuum(…), check that the specified columns exist.And issue an ERROR, right?
Correct. This means that both the relations and columns specified would cause an ERROR if they do not exist and a WARNING if they disappear before we can actually process them.
+ RelationAndColumns *relinfo = (RelationAndColumns *) lfirst(relation); + int per_table_opts = options | relinfo->options; /* VACOPT_ANALYZE may be set per-table */ + ListCell *oid; I have just noticed this bit in your patch. So you can basically do something like that: VACUUM (ANALYZE) foo, (FULL) bar; Do we really want to go down to this level of control? I would keep things a bit more restricted as users may be confused by the different lock levels involved by each operation, and make use of the same options for all the relations listed. Opinions from others is welcome.
I agree with you here, too. I stopped short of allowing customers to explicitly provide per-table options, so the example you provided wouldn’t work here. This is more applicable for something like the following:
VACUUM (FREEZE, VERBOSE) foo, bar (a);
In this case, the FREEZE and VERBOSE options are used for both tables. However, we have a column list specified for ‘bar’, and the ANALYZE option is implied when we specify a column list. So when we process ‘bar’, we need to apply the ANALYZE option, but we do not need it for ‘foo’. For now, that is all that this per-table options variable is used for.
Nathan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I’ve attached a v4 of the patch.
Changes:
- Early in vacuum(…), emit an ERROR if any specified columns do not exist.
- Emit a WARNING and skip any specified tables or columns that disappear before they are actually processed.
- Small additions to the VACUUM regression test.
- Rebased with master.
Nathan
Attachments:
vacuum_multiple_tables_v4.patchapplication/octet-stream; name=vacuum_multiple_tables_v4.patchDownload+367-142
On Thu, May 18, 2017 at 12:06 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
I agree with you here, too. I stopped short of allowing customers to explicitly provide per-table options, so the example you provided wouldn’t work here. This is more applicable for something like the following:
VACUUM (FREEZE, VERBOSE) foo, bar (a);
In this case, the FREEZE and VERBOSE options are used for both tables. However, we have a column list specified for ‘bar’, and the ANALYZE option is implied when we specify a column list. So when we process ‘bar’, we need to apply the ANALYZE option, but we do not need it for ‘foo’. For now, that is all that this per-table options variable is used for.
Hm. One argument can be made here: having a column list defined in one
of the tables implies that ANALYZE is enforced for all the relations
listed instead of doing that only on the relations listing columns.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers